Machine learning approach predicts emergence of psychosis
Using a machine learning approach to analyze language, researchers found that speech with low levels of semantic density and an increased tendency to talk about voices and sounds predicted the emergence of psychosis.
“The challenge for research is in how to detect the signs of future psychosis while symptoms are still subtle and indistinct. Recent advances in machine learning and natural language processing are making such detection possible,” Neguine Rezaii, MD, of the department of neurology, Massachusetts General Hospital, Harvard Medical School, and the department of psychiatry, Emory School of Medicine, and colleagues wrote in npj Schizophrenia.
Researchers examined two potential linguistic indicators of psychosis, auditory hallucination and low semantic density, in 40 participants of the North American Prodrome Longitudinal Study who were followed for up to 2 years or until conversion to psychosis. Using a technique called vector unpacking, which partitions the meaning of a sentence into its core ideas, they demonstrated how the linguistic marker of semantic density can be measured.
They also performed latent content analysis, a new computational method that identifies the latent content of an individual’s speech by comparing it with conversations generated on social media (in this case, those of 30,000 Reddit users). This comparison makes it possible to identify subtle ways in which the language content of people in the early stages of psychosis differs from the “norm,” according to Rezaii and colleagues.
The investigators found that two linguistic indicators of mental health, semantic density and talk about voices, predicted conversion to psychosis with 93% accuracy in the training dataset and 90% accuracy in the holdout dataset.
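For readers unfamiliar with the terms, accuracy is the share of participants whose outcome a model predicts correctly, and a holdout dataset is one the model never saw while it was being fitted. The Python sketch below illustrates only that calculation. The function name, the two feature names, the threshold rule and all of the numbers are hypothetical placeholders chosen for illustration; they are not the researchers' model or data.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical feature pair: (semantic_density, voice_talk_score).
Features = Tuple[float, float]


def accuracy(
    model: Callable[[Features], bool],
    features: Sequence[Features],
    labels: Sequence[bool],
) -> float:
    """Return the fraction of cases where the model's prediction matches the label."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(model(x) == y for x, y in zip(features, labels))
    return correct / len(labels)


def toy_model(x: Features) -> bool:
    """Placeholder rule (an assumption, not the study's classifier):
    predict conversion when semantic density is low and voice talk is high."""
    semantic_density, voice_talk_score = x
    return semantic_density < 0.5 and voice_talk_score > 0.5


# Made-up holdout cases, kept separate from any data used to fit a model.
holdout_features = [(0.2, 0.9), (0.8, 0.1), (0.3, 0.7), (0.9, 0.8)]
holdout_labels = [True, False, False, False]

holdout_accuracy = accuracy(toy_model, holdout_features, holdout_labels)
print(holdout_accuracy)  # 0.75: three of the four made-up cases are predicted correctly
```

Reporting accuracy separately for the training and holdout data, as the study did, shows whether a model's performance carries over to cases it has not seen.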
The results also showed that semantic density was a function of the way words were organized into sentences, not just how words were used across sentences, suggesting that people with psychosis have impairments in the integration of words to generate higher order meaning.
“In future studies, larger cohorts of patients, more variety in the neuropsychiatric disorders under investigation, and the inclusion of healthy controls could help clarify the generalizability and reliability of the results,” Rezaii and colleagues wrote. “Further research could also investigate the ways in which machine learning can extract and magnify the signs of mental illness. Such efforts could lead to not only an earlier detection of mental illness, but also a deeper understanding of the mechanism by which these disorders are caused.” – by Savannah Demko
Disclosure: The authors report no relevant financial disclosures.