Interoperability, transparency, validation inform care in computational psychiatry

To incorporate innovative technological approaches into mental health care research and add to the growing field of computational psychiatry, experts recommended interoperability, transparency and validation of each application, according to a paper published in Psychiatric Services.

Specifically, they advised using standardized systems or integrating new tools into existing systems; sharing deidentified data, software and source code; using consistent terminology; and confirming research conclusions using multisite validation.

Currently, more than 60% of psychiatrists use electronic health records, Juliet Beni Edgcomb, MD, PhD, and Bonnie Zima, MD, MPH, from the department of psychiatry and behavioral sciences at University of California, Los Angeles, wrote. However, techniques that combine psychiatry and informatics are still needed.

“With an estimated one billion U.S. medical visits documented each year, EHRs contain rich longitudinal data on large populations and can be linked to contextual data in complex networks of causation,” they explained.

In their paper, Edgcomb and Zima described three growing domains of EHR data science — EHR phenotyping, natural language processing and learning-based predictive modeling (machine learning) — and examined their benefits and challenges within mental health services research.

EHR phenotyping

EHR phenotyping, commonly used to identify patients based on cancer staging, communicable disease and tobacco use, refers to using EHR data to identify patient cohorts, the authors explained. These cohorts can be linked across numerous institutions, matched to fine-grained research data, and combined with genetic and genome-wide association studies.

However, Edgcomb and Zima cautioned researchers to remember that EHR data are largely missing, often inaccurate and complex.

“A well-specified and predefined validation approach is key, requiring clear exclusion and inclusion criteria, time frames for each variable, a defined episode of care and index start date, and consistent definitions,” they wrote. “Consistency and transparency in this evolving chain of specifications are critical.”
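As a rough illustration of what a predefined phenotype specification can look like in practice, the Python sketch below writes the inclusion and exclusion criteria, the study time frame and the index date rule down as explicit constants before any cohort is built. It is not taken from the paper: the `Encounter` record, the `build_cohort` function, the diagnosis codes, the dates and the 365-day lookback window are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Encounter:
    """One hypothetical EHR visit record (illustrative fields only)."""
    patient_id: str
    visit_date: date
    diagnosis_code: str


# Hypothetical phenotype specification, fixed up front so it can be shared.
INCLUSION_CODES = {"F32.1", "F33.1"}   # illustrative depression codes
EXCLUSION_CODES = {"F31.9"}            # illustrative exclusion code
STUDY_START = date(2018, 1, 1)         # assumed study time frame
STUDY_END = date(2018, 12, 31)
LOOKBACK = timedelta(days=365)         # assumed window before the index date


def build_cohort(encounters):
    """Return {patient_id: index_date} for patients who meet the phenotype."""
    by_patient = {}
    for enc in encounters:
        by_patient.setdefault(enc.patient_id, []).append(enc)

    cohort = {}
    for pid, visits in by_patient.items():
        # Inclusion: a qualifying diagnosis inside the study time frame.
        qualifying = [
            v.visit_date
            for v in visits
            if v.diagnosis_code in INCLUSION_CODES
            and STUDY_START <= v.visit_date <= STUDY_END
        ]
        if not qualifying:
            continue
        # Index start date: the first qualifying visit.
        index_date = min(qualifying)
        # Exclusion: an excluded diagnosis in the lookback window.
        excluded = any(
            v.diagnosis_code in EXCLUSION_CODES
            and index_date - LOOKBACK <= v.visit_date <= index_date
            for v in visits
        )
        if not excluded:
            cohort[pid] = index_date
    return cohort
```

Because every rule lives in a named constant, another site could run the same specification against its own records, which is the kind of consistency and transparency the authors describe.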

Natural language processing

EHRs contain narrative data, such as physician notes. Natural language processing (NLP), which helps computers understand and manipulate language, analyzes these narrative data and turns them into quantifiable variables (structured data). Recently, NLP has been used to identify depression, negative symptoms and prodromal/premorbid states, according to Edgcomb and Zima.
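To make the idea of turning narrative text into structured variables concrete, here is a deliberately simple Python sketch that maps a free-text note to binary symptom flags using keyword patterns and a crude negation check. It is an assumption-laden toy, not the method described in the paper: the term lists, the negation cues and the `note_to_features` function are hypothetical, and a real clinical NLP pipeline would need a validated lexicon and far more careful handling of context.

```python
import re

# Hypothetical term lists; a real lexicon would be clinically validated.
SYMPTOM_PATTERNS = {
    "low_mood": re.compile(r"\b(depressed mood|low mood|feeling down)\b", re.I),
    "insomnia": re.compile(r"\b(insomnia|trouble sleeping)\b", re.I),
    "anhedonia": re.compile(r"\b(anhedonia|loss of interest)\b", re.I),
}

# Crude negation cue: a negating word shortly before the symptom mention.
NEGATION = re.compile(r"\b(denies|no|without)\b[^.;]{0,40}$", re.I)


def note_to_features(note):
    """Map a free-text note to {symptom_name: 0 or 1}."""
    features = {}
    for name, pattern in SYMPTOM_PATTERNS.items():
        present = 0
        for match in pattern.finditer(note):
            preceding = note[:match.start()]
            # Only look back as far as the start of the current sentence.
            start = max(preceding.rfind("."), preceding.rfind(";")) + 1
            if not NEGATION.search(preceding[start:]):
                present = 1
                break
        features[name] = present
    return features
```

For the note "Patient reports low mood and trouble sleeping. Denies loss of interest." this sketch flags low mood and insomnia as present and anhedonia as absent, because the last mention follows a negation cue in the same sentence.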

However, analyzing text data is complicated. The authors recommended that researchers adopting NLP construct accurate, internally valid models and externally validate them across institutions.

“Although access to narrative data on a large scale has newly attracted the efforts of psychiatrists to harness this technology, rigorous standards are undefined,” they wrote. “Development and validation of simple, scalable, and transparent approaches are imperative to advancing NLP applications.”

Machine learning

Machine learning, which uses statistical techniques to give computer systems the ability to progressively improve performance from data, has the potential to uncover patterns in multivariate data sets, the experts explained.

Machine learning applications can model nonlinear relationships between complex interrelated variable sets, and because the data sets are so large, these methods are “highly robust to random errors,” Edgcomb and Zima wrote.

“Predictive performance of algorithms can be overestimated: if data are lacking, inappropriate validation procedures are used or models are overfit,” they wrote. “Collaboration with [machine learning] experts is critical, because even seemingly straightforward approaches may inadvertently lead to wrong conclusions.”
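The warning about overfit models and inappropriate validation can be illustrated with a small Python experiment. This is a sketch under stated assumptions, not an analysis from the paper: the data are synthetic, the labels are pure noise, and the one-nearest-neighbor classifier and the `demo` function are hypothetical choices made only because such a model memorizes its training set.

```python
import random


def nearest_neighbor_predict(train_X, train_y, x):
    """Predict the label of the closest training point (1-nearest neighbor)."""
    best = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return train_y[best]


def accuracy(train_X, train_y, X, y):
    """Fraction of points in (X, y) that the memorizing model gets right."""
    hits = sum(
        nearest_neighbor_predict(train_X, train_y, x) == label
        for x, label in zip(X, y)
    )
    return hits / len(y)


def demo(seed=0, n_train=200, n_test=200, n_features=5):
    """Return (accuracy on training data, accuracy on held-out data)."""
    rng = random.Random(seed)

    def sample(n):
        X = [[rng.random() for _ in range(n_features)] for _ in range(n)]
        y = [rng.randint(0, 1) for _ in range(n)]  # labels are pure noise
        return X, y

    train_X, train_y = sample(n_train)
    test_X, test_y = sample(n_test)
    return (
        accuracy(train_X, train_y, train_X, train_y),
        accuracy(train_X, train_y, test_X, test_y),
    )
```

Evaluated on the data it was fit to, the model scores perfectly because each point is its own nearest neighbor. On held-out data the accuracy should sit near chance, since there is no real signal to learn. Reporting only the first number is one way predictive performance gets overestimated.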

To conclude, the authors wrote that adopting data science approaches to inform care within computational psychiatry shows promise but requires caution.

“Clear, tangible benefits are tightly connected to multiple foreseeable, and many likely yet unknown, challenges,” Edgcomb and Zima wrote. “Through collaboration with computer scientists and clinical informaticists, mental health services research offers complex research questions that will likely stimulate further advancement in these methods.” – by Savannah Demko

Disclosure: Zima reports funding from the Behavioral Health Centers of Excellence for California, Illinois Children’s Healthcare Foundation, Mental Health Services Act, Patient-Centered Outcomes Research Institute and the State of California Department of Healthcare Services.
