Accurate and reliable measures of mental health and functioning are critical for patient care and clinical research. Current gold standards for measurement of disease severity and treatment response are based on interviewer-led assessments or patient self-report questionnaires, typically grounded in nosology defined by the Diagnostic and Statistical Manual of Mental Disorders, fifth edition.1 However, such “traditional” clinical assessments of psychiatric health present both practical and clinical challenges.
When simultaneous participation by patients and clinicians is required, it is more often the patients' lives that must be scheduled around the assessment. The interviews themselves can be burdensome for both parties, at times taking as long as 90 minutes to administer. Because clinicians have less and less time to spend with each patient, assessments must be scheduled far apart, leading to infrequent measures of disease progression. Even in scenarios where telehealth technologies allow for remote assessments, the accuracy of the measures acquired remains in question.2
There has been growing concern over the clinical validity of traditional assessments. Most require subjective observation, with potential for clinician bias and poor inter-rater and test-retest reliability.3 Given the heterogeneous symptomatology associated with psychiatric illness, traditional assessments can be insensitive to change, particularly when treatment affects only a subclass of symptoms. Such challenges are well established, with ongoing efforts to update disease classification to align with the current understanding of neurobiology. As a result, there is a need for objective, sensitive, and scalable measures of mental health.
Digital Measurement of Mental Health
Digital tools for measurement of mental health are becoming increasingly prevalent, aiming to address the challenges posed by traditional assessments through objective measurement of biomarkers of mental health. Despite the varied nature of these efforts, they are grounded in a common understanding: neuropsychiatric disorders manifest themselves in observable ways that can be measured through digital tools. There have been several attempts to classify digital phenotyping tools and the biomarkers they measure.4 Here, we categorize digital phenotyping according to the way biomarker data is collected (Figure 1), focusing on passive monitoring, active assessment, individual self-report, and biological measurement.
Figure 1. Different types of data collection strategies for digital measurement tools. Although some devices and measurement tools fall exclusively into one category (eg, magnetic resonance scanners are exclusively meant for in-clinic biological measurements), other tools, depending on how their technology is used, can provide multiple types of measures (eg, smartphones can be used for both passive behavioral monitoring and active behavioral assessments). EHR, electronic health record; ePRO, electronic patient-reported outcome; GPS, global positioning system; MRI, magnetic resonance imaging.
Digital phenotyping through passive monitoring uses behavioral data collected as a person goes about their daily life. The approach rests on the hypothesis that specific characteristics of behavior can serve as biomarkers of mental health, based on an understanding of how neuropsychiatric illnesses affect behavior. Although passive biomarkers of mental health are a relatively novel area of research, they have demonstrated marked success as measurement tools.
Actigraphy and tremor. Actigraphy is arguably the most common passive measure of mental health. Gyroscopes, pedometers, accelerometers, and GPS (global positioning system) trackers, often part of smartphones or wearable devices, provide a detailed view into patient motor and sleep/wake behavior. Efforts to use these measures to characterize mental health have shown they correlate strongly with severity of psychiatric illness.5 Actigraphy is particularly useful in the context of motor disorders. Smartwatches have been shown to provide reliable measures of tremor, correlating strongly with clinical scales.6 Given the ease of access to such measures, efforts to bring them into patient care and clinical research are well underway.7
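As a rough illustration of how a wearable accelerometer trace might be turned into a tremor measure, the sketch below estimates the dominant frequency of a signal by evaluating a direct discrete Fourier transform over a chosen frequency band. The function name, the 3 to 12 Hz search band, the step size, and the synthetic signal are all assumptions made for this example. They are not taken from the studies cited above, and validated pipelines would typically add filtering, calibration, and artifact rejection.

```python
import math


def dominant_frequency(samples, sample_rate_hz, f_min=3.0, f_max=12.0, step_hz=0.1):
    """Return the frequency (Hz) with the largest spectral power in [f_min, f_max].

    Hypothetical helper: the band limits and step size are illustrative defaults.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the constant (gravity) offset

    best_freq, best_power = f_min, -1.0
    steps = int(round((f_max - f_min) / step_hz))
    for k in range(steps + 1):
        freq = f_min + k * step_hz
        # Direct DFT at a single frequency: project onto cosine and sine.
        re = sum(x * math.cos(2 * math.pi * freq * i / sample_rate_hz)
                 for i, x in enumerate(centered))
        im = sum(x * math.sin(2 * math.pi * freq * i / sample_rate_hz)
                 for i, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_freq, best_power = freq, power
    return best_freq


# Synthetic example: a 5 Hz oscillation on top of a constant 1 g offset,
# sampled at 50 Hz for 4 seconds.
rate = 50.0
signal = [1.0 + 0.2 * math.sin(2 * math.pi * 5.0 * i / rate) for i in range(200)]
estimated_hz = dominant_frequency(signal, rate)
```

For this clean synthetic input the estimate should land at or very near 5 Hz. Real recordings mix voluntary movement with tremor, which is one reason the signal-to-noise characteristics of actigraphy matter in practice.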
Electronic behavior. With integration of electronic devices into everyday life, certain aspects of how a person interacts with devices are emerging as biomarkers of mental health. Keystroke activity and social media interactions have been demonstrated as correlates of psychiatric functioning, particularly in the context of cognition and social functioning.8 In one study, natural language characteristics of tweets posted by patients with clinical depression were predictive of depression severity.9 Such measurements, not possible before integration of technology into daily behavior, use a rich and prevalent data source, although how to integrate such measurements into patient care and clinical research remains unclear.
In-home sensors. There are efforts to install sensors within patient homes for passive behavioral monitoring. These tools, which can range from simple device-based sleep monitors to motion-activated cameras, are not designed for widespread adoption, but rather positioned toward patient populations that would benefit from more direct measures of their health and functioning. One study demonstrated that wireless sensing technology can monitor breathing and heart rate without body contact, acquiring accurate measures even from adjacent rooms.10
Active assessments directly engage people in pre-designed tasks for the collection of short bursts of behavioral data that allow for the measurement of biomarkers of mental health. They are most similar to traditional clinical assessments in that they are meant to elicit specific behaviors for targeted measurement of disease symptomatology, with the exception that the subsequent measures acquired do not rely on manual or subjective observation by a clinician or interviewer.
Facial expressivity. Facial behavior is a well-established biomarker of mental health. Traditionally, facial activity could be quantified only through manual human coding, rendering it impractical for clinical use despite its reliability as a measure.11 With the development of computer vision-based objective measurement of facial expressivity,12 the use of facial expressivity as a biomarker of disease severity in patients with psychiatric illness is becoming increasingly relevant.13 If the collection of video data can be made scalable, quantification of facial expressivity is poised to be a promising and reliable tool for measurement of neuropsychiatric functioning using digital tools.
Voice and speech. There has been a detailed characterization of how acoustic properties of voice and natural language characteristics of speech can serve as biomarkers of mental health.14 With recent advancements in automated measurement of voice and speech characteristics, efforts to use them in patient care and clinical research have emerged. For example, one web-based tool allows any person to participate in brief verbal assessments for the measurement of psychiatric functioning.15 Similar to the measurement of facial expressivity, if the collection of audio data can be scaled, vocal and speech biomarkers could provide accurate and sensitive measures of mental health functioning.
Movement and tremor. Distinct from actigraphy-based monitoring of motor behavior, active assessments of motor functioning that rely on computer vision-based quantification are not subject to the signal-to-noise challenges often associated with actigraphy. For example, one study used measurements of head movement acquired through smartphone-based assessments to accurately predict the severity of negative symptoms in people with schizophrenia.16 Active assessments are particularly useful for measurement of tremor, with studies successfully demonstrating computer vision-based quantification of tremor and providing greater measurement sensitivity than is possible with clinician observation.17
Cognitive testing. Measures of cognition warrant a separate categorization given the availability of established performance-based digital cognitive assessments that have replaced traditional paper-based measures.18 Several smartphone- or tablet-based cognitive assessments are already being established as valid clinical measures. For example, one study used such cognitive assessments to measure disease severity in people with dementia.19 Another study implemented a digital version of the number-size Stroop test in patients with Parkinson's disease for measurement of cognition.20
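To make the idea of a performance-based digital measure concrete, the following sketch computes one common summary of a Stroop-type task: an interference score, taken here as the mean reaction time on correct incongruent trials minus the mean on correct congruent trials. In a number-size Stroop task, a trial is congruent when the numerically larger digit is also the physically larger one. The trial format, field names, and values are hypothetical and are not drawn from the cited study, which may have scored the task differently.

```python
from statistics import mean


def stroop_interference(trials):
    """Mean correct-trial reaction time: incongruent minus congruent (ms).

    Each trial is a dict with the assumed keys "congruent" (bool),
    "correct" (bool), and "reaction_ms" (float).
    """
    congruent = [t["reaction_ms"] for t in trials
                 if t["congruent"] and t["correct"]]
    incongruent = [t["reaction_ms"] for t in trials
                   if not t["congruent"] and t["correct"]]
    if not congruent or not incongruent:
        raise ValueError("need at least one correct trial of each type")
    return mean(incongruent) - mean(congruent)


# Made-up trials for illustration only.
trials = [
    {"congruent": True, "correct": True, "reaction_ms": 520.0},
    {"congruent": True, "correct": True, "reaction_ms": 540.0},
    {"congruent": False, "correct": True, "reaction_ms": 610.0},
    {"congruent": False, "correct": True, "reaction_ms": 650.0},
    {"congruent": False, "correct": False, "reaction_ms": 400.0},  # error trial, excluded
]
interference_ms = stroop_interference(trials)
```

Because the score is computed directly from logged responses, it does not depend on a rater's judgment, which is the property that distinguishes such assessments from interviewer-led scales.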
Patient self-report is not typically contemplated alongside other novel digital phenotyping tools, given the subjective nature of the measurement. However, it serves as a key data source in both patient care and clinical research.21 Despite associated issues, electronic patient-reported outcomes and ecological momentary assessments have demonstrated sensitivity to change and validity against traditional assessments when integrated as measurement tools.22 In some cases, they provide metrics otherwise difficult to acquire through clinician observation or objective digital measurement (eg, substance use, dosing, self-harm, dietary behavior). Notably, patient perspective on treatment efficacy, although subjective, is arguably the most important measure of health, considering the ultimate result of successful treatment is improved patient experience. In a future of digital phenotyping where multiple data streams are integrated for clinical decision-making, self-report will remain an important feature of health and functioning. Digital tools for acquisition of these measures and their integration into digital health data streams will be critical.
Thus far, the discussion has focused on data collection that occurs in the absence of a clinician and likely outside of clinical settings. However, there has been considerable research on how to objectively extract clinical insights from biological and health data collected within clinical environments through neuroimaging, genetic sequencing, and clinician report. These data form an additional stream that, through advances in computational tools, has demonstrated accuracy and sensitivity as an objective measure of mental health and warrants discussion alongside other efforts in digital phenotyping of neuropsychiatric illness.
Neuroimaging. Neuroimaging data sources such as computed tomography, magnetic resonance imaging, magnetoencephalography, positron emission tomography, or electroencephalography provide spatially and temporally rich information upon which machine learning can be applied to identify characteristics of mental health and functioning.23 Application of machine learning-based classification of disease through neuroimaging data has received popular attention, at times performing as well as or better than clinician observation. However, its application in prognosis of individual outcomes remains an important topic for discussion.24
Genetics. With increased accessibility to gene sequencing, its use as an informative data source is becoming more common. Indeed, most neuropsychiatric illnesses are associated with genetic contributors.25 Genome-wide association studies have shown that a person's genome can inform clinicians about the potential risk for neuropsychiatric disorders, the kinds of symptomatology the person is likely to exhibit, and the kinds of treatment the person is likely to respond to.26 Ease of access to insights from genetic sequencing could empower clinicians to provide a greater quality of care to their patients.
Electronic health records. In a future where digital phenotyping of health results from the integration of heterogeneous data sources, electronic health records (EHRs) will provide access to key information otherwise inaccessible. EHRs store information on patients ranging from demographics to laboratory results, immunizations, and clinician notes. There is considerable support for the use of EHR data to predict patient health and treatment outcomes.27 For example, one study demonstrated an accurate prediction of posttraumatic stress development after emergency department admission using solely data typically recorded in a patient's EHR.28 The integration of such insights into digital phenotyping can markedly increase the accuracy of the measures acquired.
Challenges in the Development of Digital Measures
Each data source discussed is associated with its own unique set of advantages and disadvantages. However, novel digital measures of mental health share common obstacles before they can be fully relied upon for patient care and clinical research. Here, we discuss challenges associated with validation of novel methodology, unclear or obscure regulatory pathways, and obstacles associated with integration into patient care and clinical research.
A common pathway for the validation of methodologies underlying novel measures remains elusive. Whereas some measurement tools provide open access to all methods, code, and datasets, others can be secretive about the developed technology. For example, neither Apple nor Google shares the algorithms used to calculate step counts or heart rates from its wearable devices. Yet, these measures have been used as reliable indicators of health and behavior. This leads to serious scientific and ethical concerns over the use of such tools to inform clinical decision-making. Although several pathways for validation have been proposed, not all aspects of these pathways are equally accessible to any one party; academic initiatives may be more open to publishing all methods, but they may not have the well-established and controlled software development processes of large technology companies. Any novel measure of health must be grounded in a strong, peer-reviewed scientific basis; its software must be validated by third parties, even if the underlying methods are not necessarily made public; and objective assessments of its accuracy as a measure of mental health must be open for evaluation.29
The discussion of validation raises the question of which party would determine the validity of a novel measure. The US Food and Drug Administration (FDA) is the ultimate gatekeeper for clinical decision-making tools. However, by the FDA's own account, it is not yet well-positioned to regulate emerging technologies for machine learning-based measurement of mental health.30 Certain FDA-led efforts, such as the Drug Development Tool Qualification Program, are attempting to directly address this issue. However, validation pathways and regulatory guidelines for algorithm-based measurements of health that can continually improve and update their calculations based on additional data remain a work in progress. Hence, even while some novel measures accumulate widespread scientific support as accurate and reliable measures of neuropsychiatric illness, their pathway to integration into patient care and clinical research can still be unclear.
Even tools determined to be valid measures of mental health face an ambiguous path toward integration into the regular process of patient care. The health care ecosystem has established processes that are subject to strict regulations under the Health Insurance Portability and Accountability Act (HIPAA) and the General Data Protection Regulation (GDPR). Any novel measurement tool must first be adapted to comply with these regulations and then integrated into existing data streams used by health care professionals. Efforts to overhaul how health care information is collected, stored, and accessed have thus far been unsuccessful. Although some EHR systems allow for integration of novel measurement tools through application programming interfaces (APIs), these systems are far from ubiquitous, and when available are not necessarily designed to support machine learning-based predictions of health and functioning. To achieve a future where digital phenotyping is conducted based on a variety of independent measures, strong partnerships between industry, health care, and regulatory agencies will be necessary.
Promises and Future of Digital Phenotyping of Mental Health
Given the early success of digital measures of mental health and the need for them in clinical settings, some tools discussed here will assuredly find applications in patient care and clinical research. A subset of the technologies discussed has already demonstrated usefulness as objective and scalable measures in pockets of the health care ecosystem. However, the promise of digital phenotyping lies beyond the isolated application of individual measures. Integration of data sources in a shared technological infrastructure will allow for the measurement of mental health using a more expansive set of predictive features than has been possible before. This would lead to comprehensive measures that truly encapsulate manifestations of mental illness in individual health and behavior.
Integration of Independent Measures
Historically, mental health measurements have been based on a single data source: clinician observation. With digital tools significantly expanding the availability of mental health measurements, their integration for more informed decision-making comes into play. It has been shown that digital biomarkers can be used for accurate classification of posttraumatic stress disorder and depression.31 Researchers who have been able to integrate multiple independent data sources (EHRs and psychological self-report) to train machine learning models of mental health have demonstrated significant improvements in accuracy of mental health measurements.28 However, merging data from different sources is difficult and requires manual work. If independent measurements can be consolidated into common data streams, machine learning-based indicators of mental health could provide more accurate and objective quantification of psychiatric illness and disease severity than has been historically possible with traditional or isolated measurement tools. This raises the question of how independent data sources can find a common infrastructure for both the training and application of machine learning-based measurements and how such an infrastructure could find a home in the health care ecosystem.
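A minimal sketch of the consolidation step, under simplifying assumptions, is shown below. It joins per-patient features from several named sources on a shared patient identifier and reports which patients had to be dropped because a source was missing. The function name, the source names, the feature names, and all values are hypothetical; real integration would also have to handle identity matching, timestamps, consent, and regulatory constraints, none of which are modeled here.

```python
def merge_streams(streams):
    """Combine per-patient feature dicts from several named sources.

    `streams` maps a source name to {patient_id: {feature: value}}.
    Returns ({patient_id: {"source.feature": value}}, dropped_ids), where
    only patients present in every source are kept.
    """
    all_ids = set()
    for records in streams.values():
        all_ids.update(records)

    merged, dropped = {}, set()
    for pid in sorted(all_ids):
        if all(pid in records for records in streams.values()):
            row = {}
            for source, records in streams.items():
                for feature, value in records[pid].items():
                    row[f"{source}.{feature}"] = value  # prefix avoids name clashes
            merged[pid] = row
        else:
            dropped.add(pid)  # incomplete record: missing from at least one source
    return merged, dropped


# Made-up records for illustration only.
streams = {
    "ehr": {
        "p01": {"age": 34, "heart_rate": 88},
        "p02": {"age": 51, "heart_rate": 72},
    },
    "self_report": {
        "p01": {"symptom_score": 14},
        "p03": {"symptom_score": 6},
    },
}
merged, dropped = merge_streams(streams)
```

In this example only one patient appears in both sources, so the merged table has a single row and two patients are dropped. Even this toy case shows how quickly usable sample size shrinks when data streams do not share a common infrastructure.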
A Common Technological Infrastructure
Efforts to consolidate different data streams are already taking shape. Partnerships between health care systems, large technology companies, and smaller providers of novel measurement tools are leading to an interconnected web that would allow for training and application of machine learning-based measurement tools that integrate multimodal data sources.32 Akin to a highway system with interspersed parking garages, such a web allows information to flow between storage centers so that any two data streams can be integrated. Such an infrastructure relies on the availability of two important technologies: secure and regulated data storage, and software tools that allow for safe inter-platform integration. Fortunately, these technologies are becoming increasingly ubiquitous. Cloud-based web services such as the Google Cloud Platform and Amazon Web Services provide HIPAA- and GDPR-compliant data solutions that novel technologies can be built upon. Software development efforts such as Apple HealthKit and Google Fit are allowing developers of novel measures to integrate their tools with minimal software development effort. Both technologies are being designed with the existing health care infrastructure in mind, allowing for their integration into EHRs and independent health care information systems. Early adopters of this vision are already conducting studies and collecting data from thousands of participants. As novel and validated digital measures begin to integrate into a common infrastructure, the accuracy and availability of mental health measures have the potential to transform patient care and clinical research (Figure 2).
Figure 2. A technological infrastructure for the integration of digital measurement tools. Independent platforms for measurement of health (eg, actigraphy, blood pressure, cognition) will have their own data repositories, depicted as clouds. These data could be safely transferred across platforms using transfer tools such as secure APIs (application programming interfaces), depicted using dashed arrows. Such tools could allow for both unidirectional and bidirectional movement of data. If a combination of measures can be made accessible from a single data repository, it allows for the application of machine learning-based measurement of health, integrating all the independent measures in its prediction. The results from this calculation can be integrated into the patient's electronic health record, which can be used for clinical decision-making.