One of the most promising applications of artificial intelligence (AI) prediction is in digital healthcare, particularly in precision or personalized medicine. As AI algorithms make headway in healthcare and precision medicine, it becomes essential to understand not only their strengths but also their limitations. A recent study led by the Yale School of Medicine shows that AI algorithms used to forecast patient outcomes can lack generalizability: they excel within the specific clinical trial they were developed on but falter when applied to other clinical trials of schizophrenia treatments.
Using AI machine learning to predict whether a given patient will respond to drug therapy is a central goal of precision medicine. According to the Yale-led researchers, depending on how clinical outcome is defined, over 50 percent of individuals experiencing a relapse, and up to 20-30 percent of those in their first episode, fail to show a substantial clinical response to the antipsychotic drugs used to treat schizophrenia.
In artificial intelligence, a crucial yardstick for the robustness of a machine-learning algorithm is generalizability: the model’s ability to deliver high accuracy on novel data the algorithm has not previously encountered. Ideally, an AI algorithm intended to predict outcomes of precision medicine treatments should be robust in this sense. The new research is significant because it examines how AI algorithms behave at this level.
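Generalizability is typically gauged by scoring a model on data held out from training. The snippet below is a minimal sketch of that check, using entirely synthetic data and an assumed scikit-learn installation; it is illustrative only, not the study’s actual pipeline.

```python
# Illustrative sketch: compare a model's accuracy on the data it was
# trained on versus data it has never seen. Synthetic data throughout.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))                         # 10 synthetic predictors
y = (X[:, 0] + rng.normal(size=500) > 0).astype(int)   # noisy binary outcome

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
print("accuracy on training data:", model.score(X_train, y_train))
print("accuracy on unseen data:  ", model.score(X_test, y_test))
```

A large gap between the two scores is the classic sign of overfitting; the harder problem raised by the Yale-led study is that even held-out data from the same trial may not represent the patients a model will face elsewhere.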
“The field at large, including the present authors, envisions that machine learning methodologies can ultimately refine the allocation of treatments in medicine; nonetheless, we should maintain a degree of skepticism towards any predictive model discoveries lacking an independent sample for validation,” remarked corresponding author Adam Chekroud, Ph.D., a recipient of the Forbes “30 Under 30 2018: Consumer Technology” award, an adjunct assistant professor of psychiatry at Yale, and co-founder of Spring Health. He collaborated with an interdisciplinary team of co-authors from medicine, psychiatry, data science, and neuroscience.
The Yale researchers on the study included Professor John Krystal, M.D., Chair of Yale’s Department of Psychiatry; Associate Professor of Psychiatry Philip Corlett, Ph.D.; Professor Harlan Krumholz, M.D., S.M., Director of the Yale New Haven Hospital Center for Outcomes Research and Evaluation (CORE); and researchers Matt Hawrilenko, Ph.D., Ralitza Gueorguieva, Ph.D., and Hieronimus Loho. They collaborated with experts from institutions including the University of Augsburg, King’s College London, the University of Cologne, University Hospital Cologne, Spring Health, and the Laureate Institute for Brain Research.
The researchers underscored the potential advantages of predicting treatment outcomes in schizophrenia, given the heterogeneous clinical response to pharmacological interventions influenced by various environmental factors like stress, drug abuse, homelessness, and social isolation.
Schizophrenia is a severe, chronic brain disorder that affects an estimated 24 million people worldwide, according to the World Health Organization (WHO). The National Institute of Mental Health (NIMH) describes it as a collection of symptoms affecting the mind that produce some disconnect from reality. Although there is no definitive cure, treatment may include antipsychotic medications, antidepressants, mood stabilizers, cognitive therapy, behavioral therapy, training, and support groups.
To determine how accurately AI machine-learning models predict outcomes for schizophrenia patients across independent clinical trials of antipsychotic medication, the team assessed a model’s performance both on its original training data and on data from separate clinical trials of patients diagnosed with schizophrenia under DSM-5 criteria. They drew on five multisite, randomized, controlled trials from the Yale Open Data Access (YODA) Project, covering more than 1,500 patients at over 190 sites across North America, Europe, Africa, and Asia.
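The cross-trial evaluation design described above can be sketched as follows: fit a model on one trial, then score it both within that trial and on an independent trial whose patient population is shifted. Everything here is a synthetic stand-in, not the study’s YODA data; scikit-learn is assumed to be available.

```python
# Hedged sketch of cross-trial evaluation with two made-up "trials"
# whose populations and predictor-outcome relationships differ.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

def make_trial(n, shift):
    """Synthetic trial: the predictor distribution and the
    predictor-outcome relationship both drift with `shift`,
    a stand-in for differences in population and protocol."""
    X = rng.normal(loc=shift, size=(n, 8))
    logits = X[:, 0] - shift * X[:, 1]
    y = (logits + rng.normal(size=n) > 0).astype(int)
    return X, y

X_a, y_a = make_trial(600, shift=0.0)   # "development" trial
X_b, y_b = make_trial(600, shift=1.5)   # independent, shifted trial

model = LogisticRegression().fit(X_a, y_a)
print("within-trial accuracy:", model.score(X_a, y_a))
print("cross-trial accuracy: ", model.score(X_b, y_b))
```

Under these assumptions the cross-trial score drops sharply, mirroring the kind of degradation the researchers observed when a model trained on one trial was applied to another.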
The researchers chose an elastic net regression algorithm, a penalized regression method that combines the ridge (L2 regularization) and lasso (L1 regularization) penalties and has proved effective in prior psychiatry research on predicting treatment outcomes. In machine learning, elastic net regression serves to mitigate overfitting and improve prediction accuracy. A linear regression model describes the relationship between a dependent (response) variable y and one or more independent (predictor) variables X.
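As a brief illustration of the method named above, the sketch below fits scikit-learn’s ElasticNet to synthetic data in which only a few predictors truly matter. The data, hyperparameters (alpha, l1_ratio), and library choice are assumptions for illustration, not details from the study.

```python
# Elastic net sketch: the L1 part drives irrelevant coefficients to
# zero (feature selection), the L2 part shrinks the rest (stability).
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 20))        # 20 candidate predictors
true_coef = np.zeros(20)
true_coef[:3] = [2.0, -1.5, 1.0]      # only the first 3 truly matter
y = X @ true_coef + rng.normal(size=200)

# alpha sets overall penalty strength; l1_ratio=0.5 blends the
# lasso (L1) and ridge (L2) penalties equally.
model = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
print("nonzero coefficients:", int(np.sum(model.coef_ != 0)))
```

In high-dimensional clinical data with many weak or redundant predictors, this built-in shrinkage and selection is precisely what makes penalized regression less prone to overfitting than ordinary least squares.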
The researchers identified three potential factors contributing to the observed lack of generalizability: disparities in patient populations across trials, variations in data quantity and types, and the context-driven nature of patient outcomes.
Patient cohorts falling under the same diagnostic category may exhibit differences across trials, owing to varying stages of disease progression among individuals diagnosed with schizophrenia according to DSM-5 criteria.
“If pivotal data distinguishing patients isn’t captured within the dataset or if the scope of such information is more constrained in the dataset used for model development compared to the target trial, predictions may prove inaccurate due to data quantity, type, and the context-dependent nature of patient outcomes,” the researchers explained.
AI machine learning requires substantial volumes of training data for the algorithm to learn features from the data, and both the volume and the types of data collected can influence an algorithm’s generalizability and prediction quality. While the current study incorporated sociodemographic, biomarker, and clinical patient data, it omitted psychosocial measures and social determinants of health. Citing a separate study by Professor Koutsouleris, M.D., et al., published in The Lancet Psychiatry in 2016, the researchers highlighted the value of psychosocial data and social determinants of health in machine-learning prediction of treatment outcomes in first-episode psychosis.
Interestingly, the researchers did not advocate for the inclusion of genetic and brain imaging data types to enhance AI accuracy. Although the precise etiology of schizophrenia remains unknown, a family history of schizophrenia poses a risk factor, and disparities in brain structure and the central nervous system have been observed via brain imaging techniques as per the Mayo Clinic.
“While some have proposed leveraging neuroimaging and genetic data, scant evidence currently supports the notion that such data would enhance predictions; moreover, incorporating these data types would pose additional hurdles for routine implementation,” noted the Yale-led researchers.
In essence, patient outcomes concerning antipsychotic medication in individuals with schizophrenia may prove excessively context-dependent. Trial-specific characteristics within treatment protocols, recruitment strategies, and inclusion criteria could exert an influence on patient outcomes.
“Our modeling scenarios focusing on predicting antipsychotic treatment outcomes in schizophrenia underscore the fragility of predictive models, emphasizing that exceptional performance within one clinical context doesn’t necessarily guarantee similar performance on future patients,” concluded the researchers.