Depression is a global mental health crisis,1 but there is a significant disconnect between the alerts raised by epidemiological reports and people's awareness that the physical and mental changes they are experiencing may indicate major depression and a need for treatment. The signs of depression are often subtle, have an insidious onset, and frequently go unrecognized even when people with depression present to health care providers. Studies have reported an average of 5 years between onset of depression and initial treatment contact in children and adults.2 Medical and mental health websites are replete with descriptions of the symptoms of depression, and helpline numbers are well advertised, but recognition of warning signs and rates of depression and suicide, especially among young people, remain unaffected.3,4
To date, the diagnosis of depression is dependent upon people reporting subjective symptoms and health care providers recognizing them as a mental health problem. Although laboratory tests can rule out medical conditions that produce symptoms similar to depression, at present there is no simple test for a clear biological marker of depression. To improve detection and treatment rates we need indicators that are more readily available and easier to use. In addition, we must find ways to draw people's attention to depressive symptoms and help them navigate their treatment options. This may be possible with everyday tools such as smartphones and wearable devices.5
Detection of Depression
According to the Diagnostic and Statistical Manual of Mental Disorders, fifth edition (DSM-5),6 major depressive episodes are characterized by sad mood or loss of interest in usual activities. There are typically changes in physical functioning such as sleep, appetite, and energy. Depending on baseline functioning, cognitive slowing might be noticeable and affect normal daily functioning at school or work, with friends or family, or during solitary activities like playing video games. Cognitions or thoughts during the course of depression can become more negative, hopeless, or self-critical.7 Thoughts of death or suicide may accompany feelings of hopelessness and the belief that relief from the emotional pain is beyond reach.
Standard Methods of Screening for Depression
A diagnosis of depression requires a thorough interview by a trained clinician who is familiar with DSM-5 diagnostic criteria. Electronically administered self-report questionnaires can streamline this process by screening people for depressive symptoms.8 The Patient Health Questionnaire-9 (PHQ-9)9 is a good example of a commonly used screening measure in which respondents rate the presence of nine depressive symptoms over the prior 2 weeks. Each symptom is rated on a scale from 0 (“not at all”) to 3 (“nearly every day”) and the scores are summed for a total score. A PHQ-9 score of 10 or higher has a sensitivity of 88% and a specificity of 88% to detect major depression.10 A score of 5 to 9 suggests mild depression, 10 to 14 moderate depression, 15 to 19 moderately severe depression, and 20 or more severe depression.11 Triage through self-report measures administered electronically or on paper can improve efficiency by identifying symptoms that warrant further evaluation. Administering such questionnaires to all patients can improve detection.
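The scoring rule described above can be sketched in a few lines of code. This is a minimal illustration that uses only the cutoffs quoted in this section; the function names are hypothetical, and the label used for totals below 5 is an assumption because the text does not name that band. It is a screening aid, not a diagnostic instrument.

```python
def phq9_total(item_scores):
    """Sum the nine item ratings, each 0 ("not at all") to 3 ("nearly every day")."""
    if len(item_scores) != 9:
        raise ValueError("the PHQ-9 has exactly nine items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item must be rated 0, 1, 2, or 3")
    return sum(item_scores)


def phq9_severity(total):
    """Map a total score (0 to 27) to the severity bands quoted in the text."""
    if not 0 <= total <= 27:
        raise ValueError("a PHQ-9 total must be between 0 and 27")
    if total >= 20:
        return "severe"
    if total >= 15:
        return "moderately severe"
    if total >= 10:
        return "moderate"
    if total >= 5:
        return "mild"
    # Assumption: the text does not label scores of 0 to 4.
    return "below mild threshold"


def screens_positive(total, cutoff=10):
    """Apply the cutoff of 10 or higher used for detecting major depression."""
    return total >= cutoff
```

For example, ratings of 2, 1, 2, 1, 1, 2, 1, 1, 0 sum to 11, which falls in the moderate band and meets the screening cutoff, so the respondent would be referred for a clinical interview.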
After determination that a diagnosis of major depression is present, measurement of the severity of symptoms over time and in response to treatment is helpful for managing the disorder. Reduction in symptom severity scores can indicate improvement, whereas lack of change or worsening of symptoms can aid the clinician in making needed adjustments to the treatment plan. Rating scales like the Quick Inventory of Depressive Symptomatology12 are available in clinician rated (QIDS-C) and self-report (QIDS-SR) formats. Both versions include 16 items rated on a scale of 0 to 3; the items are scored across nine symptom domains to yield a total score of 0 to 27, with higher scores reflecting greater severity of symptoms.
Self-report measures are convenient and may not always require a doctor's visit, but the ratings, which reflect a person's subjective judgments, are influenced by the emotional state of the respondent. It is therefore not uncommon for symptoms to be either over- or under-rated, which reduces the overall accuracy of self-reports. Clinician rating scales are generally considered a more accurate representation of a patient's symptoms, but they require specialized training and sufficient patient contact time to administer. Accuracy is gained, but efficiency is compromised.
Standardized methods to diagnose depression are designed to provide valid and reliable symptom measures, but in doing so do not capture the phenotypic variability of depression. Two people can have the same total score on self-reports or clinician ratings but present with two different symptom patterns.13 Personalized classification of depression using sensor data may be a viable alternative.
Sensors to Detect Symptoms of Depression
Consider this possible scenario. The patient is a college student who uses a mobile phone and has a wearable sensor on her wrist. These devices passively collect data about her activities and habits. Initially, the patient is not depressed, and these devices collect her baseline activity and habits. As her semester progresses into late autumn, when stress is high, her behaviors change. She stops exercising, her sleeping patterns become erratic, and she loses interest in social contacts and activities. The sensors in her devices detect these changes, provide her with that feedback, and prompt her to complete a survey of depression symptoms. The survey produces a score, the device informs the patient that she is showing signs of depression, and it provides her with a local helpline number and encourages her to seek help. If the patient does not signal that she has made that call within a few days, the device asks her to complete another mood rating and prompts her again to seek treatment. This is an example of making use of everyday sensors to detect changes in behavior that may signal the onset of depression and prompt help-seeking.
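The detection step in this scenario can be sketched as a comparison of recent behavior against a personal baseline. The example below is an illustrative assumption, not a method taken from the studies reviewed here: the metric names, the sample values, and the threshold of 2 standard deviations are all hypothetical.

```python
import math
from statistics import mean, stdev


def deviation_from_baseline(baseline, recent):
    """Return how many baseline standard deviations the recent mean has shifted."""
    diff = mean(recent) - mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any change counts as an unbounded shift.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / sigma


def metrics_to_flag(baseline_by_metric, recent_by_metric, threshold=2.0):
    """List the metrics whose recent values differ markedly from baseline.

    The threshold is a hypothetical choice; a non-empty result would be the
    point at which the device prompts the user to complete a symptom survey.
    """
    flagged = []
    for name, baseline in baseline_by_metric.items():
        z = deviation_from_baseline(baseline, recent_by_metric[name])
        if abs(z) >= threshold:
            flagged.append(name)
    return flagged


# Hypothetical daily values: a week of baseline data and the last 3 days.
baseline = {
    "steps": [8000, 8200, 7900, 8100, 8000, 7800, 8300],
    "sleep_hours": [7.0, 7.5, 7.0, 6.5, 7.0, 7.5, 7.0],
}
recent = {
    "steps": [3000, 2500, 3200],
    "sleep_hours": [7.0, 7.0, 7.5],
}
flagged = metrics_to_flag(baseline, recent)
```

With these made-up values only the step count is flagged, because sleep stays within its usual range. A deployed system would need a longer baseline and a threshold validated against clinical ratings.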
Depression symptoms vary in severity but often include loss of interest or pleasure in usual activities, which may show up as reduced time using smartphone applications (apps) or as sedentary behavior indicated by fewer changes in location based on global positioning system (GPS) data.14 There are often changes in appetite and weight, which may be logged in weight management apps,15 and low energy is detectable through self-report or accelerometer readouts.16 Similarly, difficulty sleeping or excessive sleeping is detectable with wrist sensors,16 and cognitive changes such as poor concentration and inability to make decisions could be reflected in reaction time on game apps.17
In recent years, there has been a growing interest in using sensors from mobile devices for depression sensing. Saeb et al.14 collected geolocation and phone usage data as proxies for activity using a mobile phone app (Purple Robot) developed at Northwestern University. Participants (n = 40 adults) were grouped based on high (≥5) or low (<5) scores on the PHQ-9. A logistic regression of sensor data collected for 2 weeks on PHQ-9 score classification achieved 86.5% accuracy in identifying high and low levels of depression. Additionally, the authors trained a linear regression model to predict the participants' PHQ-9 scores from the collected data and obtained an average error of 23.5%.14
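To make the classification step concrete, the sketch below fits a logistic regression that separates high and low PHQ-9 groups from two sensor-derived features. It is not the model or the data from Saeb et al.; the feature names ("mobility" and "phone use"), the standardized values, and the training settings are all synthetic assumptions chosen only to show the technique.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 (high PHQ-9 group) or 0 (low PHQ-9 group)."""
    return 1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) >= 0.5 else 0


def accuracy(w, b, X, y):
    return sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(y)


# Synthetic, standardized features per participant: [mobility, phone use].
# Labels: 1 = PHQ-9 of 5 or higher, 0 = PHQ-9 below 5. Not real study data.
TRAIN_X = [[1.0, -1.0], [0.8, -0.7], [1.2, -1.1], [0.6, -0.9],
           [-1.0, 1.0], [-0.8, 0.9], [-1.1, 1.2], [-0.7, 0.6]]
TRAIN_Y = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = fit_logistic(TRAIN_X, TRAIN_Y)
```

Because the toy groups are cleanly separated, the fitted model classifies its own training data perfectly, which real sensor data would not allow. Accuracy figures such as those reported above come from evaluation on participants held out of training.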
Farhan et al.18 developed a mobile app that collected user data through smartphones and applied multiple machine learning models for detection of depression. Their app detected behavioral features extracted from changes in location (GPS coordinates), physical activity (stationary, walking, running), environment (in darkness for more than 1 hour), audio (silence or noise), and conversation data. They examined average daily activity, variations from day to day, and transfer between geo-locations. Using machine learning, they were able to categorize the behavioral indicators of 60 participants with low, medium, or high PHQ-9 scores with an overall accuracy of 87%.18
There are several technical challenges to using smartphone apps to collect behavioral data. These include variation in platforms (eg, Android, iPhone), limitations in GPS access, fluctuating phone power, and excessive battery use of apps,5 all of which can lead to periods of missing data. Yue et al.19 addressed the problem of gaps in GPS data by developing the LifeRhythm app, which collects WiFi association logs in addition to physical activity and location data. Feature extraction and data fusion techniques filled in the gaps in missing GPS data with WiFi logs. Seventy-nine college students used the app over an 8-month period and completed the PHQ-9 at baseline and every 14 days thereafter until study completion. Based on the PHQ-9, clinicians classified participants as either depressed or nondepressed. Data fusion led to detection of a considerably stronger relationship between self-reported depression scores and activity data than relying solely on GPS data.
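The idea of filling GPS gaps from WiFi association logs can be illustrated with a simple sketch. This is not the LifeRhythm implementation, whose feature extraction and fusion methods are not detailed here; the time-slot representation, the access point names, and the averaging rule are assumptions made for illustration.

```python
def learn_access_point_locations(gps_by_slot, ap_by_slot):
    """Average the GPS fixes recorded while the phone was on each access point."""
    sums = {}
    for slot, fix in gps_by_slot.items():
        ap = ap_by_slot.get(slot)
        if fix is None or ap is None:
            continue
        lat, lon, count = sums.get(ap, (0.0, 0.0, 0))
        sums[ap] = (lat + fix[0], lon + fix[1], count + 1)
    return {ap: (lat / count, lon / count) for ap, (lat, lon, count) in sums.items()}


def fuse_locations(gps_by_slot, ap_by_slot):
    """Keep GPS fixes where present; otherwise fall back to the access point's location.

    Slots with neither a GPS fix nor a known access point remain None.
    """
    ap_locations = learn_access_point_locations(gps_by_slot, ap_by_slot)
    fused = {}
    for slot, fix in gps_by_slot.items():
        if fix is not None:
            fused[slot] = fix
        else:
            fused[slot] = ap_locations.get(ap_by_slot.get(slot))
    return fused


# Hypothetical data: time slot -> (latitude, longitude) or None when GPS is missing.
gps = {0: (41.0, -72.0), 1: None, 2: (41.5, -72.5), 3: None, 4: None}
# Hypothetical WiFi association log: time slot -> access point identifier.
wifi = {0: "ap_dorm", 1: "ap_dorm", 2: "ap_library", 3: "ap_library"}

fused = fuse_locations(gps, wifi)
```

In this example slots 1 and 3 are recovered from the access points seen at those times, whereas slot 4 stays missing because the phone was not associated with any known access point.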
Lu et al.16 also used the LifeRhythm app to collect location data from 103 college students during a 4-month period in an effort to explore a more comprehensive measurement approach that included traditional assessment methods with sensor technology. In addition to the smartphone data, they collected data on sleep, physical activity, and heart rate from fitness bands. Clinicians used structured interviews based on the DSM-5 to conduct baseline assessments of depression and rated each participant's depressive symptoms as “stable (0),” “mild (1),” “moderate (2),” or “severe (3).” Participants also completed the QIDS-SR12 every 7 days to assess severity of depressive symptoms. The authors used a heterogeneous multitask learning method that jointly builds inference models for related tasks, including classification and regression tasks that allow for prediction of depression score based on these indicators. Results showed that activity variables such as time staying at home and total time asleep were strong predictors of depression ratings, supporting the use of passive data collection for depression detection.
Although it was not their original intent, these studies provide validation that self-report and clinician-rated depression scales are tapping trackable behaviors such as activity and sleep, which are two commonly reported symptoms of depression. They open the door to broadening our conceptualization of depression screening to make use of all available tools to help treat it. This movement also brings with it challenges with regard to privacy and safety inherent in the use of all mobile health tools.
Mobile Mental Health Apps and User Safety
Using mobile devices to detect symptoms of depression in lieu of more traditional clinical assessment measures raises questions of user safety and privacy. Although access to mental health care is limited in many areas, might it still be better to encourage use of available mental health services rather than rely on mobile devices so that risk of self-harm can be evaluated and established treatment methods implemented? Alternatively, given the increasing rate of depression and suicide in this country, perhaps the only way to stem the tide is to increase overall surveillance by mobilizing all available tools to improve detection. This alternative raises concerns not only about safety of the use of such tools, but also about medical and legal accountability; that is, who is responsible if someone uses an app to measure symptoms of depression, fails to seek treatment, and later dies by suicide? If a clinician is responsible for adequately evaluating symptoms of depression and suicidal risk, and for creating a safety plan when indicated, might an app developer have similar ethical responsibilities? These are some of the underlying clinical and ethical questions that are of particular concern to mobile mental health, where the risk of suicidal behaviors is a reality in people with depression. As with all clinical tools, mobile apps are intended as tools to support clinical decision-making rather than to replace it.
Of similar concern is the acceptability of devices recommending self-help activities based on user ratings or behavioral monitoring with sensors in portable devices. Randomized controlled trials where manualized treatment protocols are strictly followed are the foundation for evidence-based treatment recommendations. If evidence of efficacy is established, clinicians are encouraged to use these methods after receiving appropriate training in their implementation and undergoing quality control assessments to assure proper adherence to protocols. Until similarly rigorous tests of efficacy have occurred, it may not be appropriate to use mobile devices to provide mental health recommendations.
Yang et al.,20 for example, proposed a system to predict depression and provide personalized assistance to improve mental health. Their mobile phone app collected self-reported symptom data of users and found five external characteristics of depression. They used these characteristics to implement a personalized hierarchical recommendation service, which proposed self-care recommendations based on detected levels of depression. For example, the recommendations included encouragement for music therapy or informing family and friends when symptoms were severe. However, the recommendation service did not provide guidance to the user to seek professional treatment. This raises the concern that distribution of mobile self-help apps can mislead users into believing that self-help is sufficient for the treatment of depression.
Rabbi et al.15 built a smartphone app that provides automatic feedback to users on changes in their behaviors, such as eating habits or physical activity. These investigators used the sequential decision-making algorithm called “multi-armed bandit” to generate suggestions for activities and the “Pareto-frontier algorithm” to modify the suggestions to adapt to the user's preferences, with the aim of personalizing recommended activities to improve well-being. For example, the mobile app might suggest physical activities such as walking or exercising based on the baseline behavior of the user. Raising user awareness provides an opportunity for the person to actively decide to take an action, one of which might be to seek treatment for depressive symptoms.
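For readers unfamiliar with multi-armed bandits, the sketch below shows one common strategy, epsilon-greedy selection, applied to activity suggestions. It is a generic illustration and should not be read as the algorithm used in the app described above; the suggestion names, the reward rule, and the parameter values are hypothetical.

```python
import random


class EpsilonGreedyBandit:
    """Pick suggestions, mostly favoring the one with the best average reward so far."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon  # fraction of rounds spent exploring at random
        self.rng = random.Random(seed)
        self.counts = {arm: 0 for arm in self.arms}
        self.values = {arm: 0.0 for arm in self.arms}  # running mean reward

    def select(self):
        # Try every suggestion once before comparing averages.
        untried = [arm for arm in self.arms if self.counts[arm] == 0]
        if untried:
            return untried[0]
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.values[arm])

    def update(self, arm, reward):
        # Incremental update of the mean reward for the chosen suggestion.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(rounds=200):
    """Simulate a hypothetical user who only ever follows the 'short walk' suggestion."""
    bandit = EpsilonGreedyBandit(["gym session", "short walk", "stretching"])
    for _ in range(rounds):
        suggestion = bandit.select()
        reward = 1.0 if suggestion == "short walk" else 0.0  # 1.0 = suggestion followed
        bandit.update(suggestion, reward)
    return bandit
```

In this simulation the suggestion the user follows ends up with the highest estimated value and is offered most often, while occasional random exploration lets the system notice if preferences change.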
The analysis of sensor data collected by everyday devices, such as smartphones and wearable smart watches/bands, has the potential to detect depression symptoms. A major challenge is how to connect depression detection with professional treatment. Personalized decision support is vital to provide the user with the necessary information about which depression symptoms have been detected and what they mean, as well as possible treatment options. However, the user's privacy and safety are challenging ethical topics for the development of online health tools and need to be addressed.
- Alatab S, Sepanlou SG, Ikuta K, et al. GBD 2017 Inflammatory Bowel Disease Collaborators. The global, regional, and national burden of inflammatory bowel disease in 195 countries and territories, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017. Lancet Gastroenterol Hepatol. 2020;5(1):17–30. doi:10.1016/S2468-1253(19)30333-4 PMID:31648971
- Kazak AE, Nash JM, Hiroto K, Kaslow NJ. Psychologists in patient-centered medical homes (PCMHs): roles, evidence, opportunities, and challenges. Am Psychol. 2017;72(1):1–12. doi:10.1037/a0040382 PMID:28068134
- Centers for Disease Control and Prevention. Data and statistics on children's mental health. https://www.cdc.gov/childrensmentalhealth/data.html. Accessed May 5, 2020.
- Bose J, Hedden SL, Lipari RN, Park-Lee E; Substance Abuse and Mental Health Services Administration. Key substance use and mental health indicators in the United States: results from the 2016 National Survey on Drug Use and Health. https://www.samhsa.gov/data/sites/default/files/NSDUH-FFR1-2016/NSDUH-FFR1-2016.htm. Accessed May 14, 2020.
- Boonstra TW, Nicholas J, Wong QJ, Shaw F, Townsend S, Christensen H. Using mobile phone sensor technology for mental health research: integrated analysis to identify hidden challenges and potential solutions. J Med Internet Res. 2018;20(7):e10131. doi:10.2196/10131 PMID:30061092
- American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders. 5th ed. Arlington, VA: American Psychiatric Publishing; 2013.
- Wright JH, Basco MR, Thase M, Brown G. Learning Cognitive Therapy: An Illustrated Guide. 2nd ed. Arlington, VA: American Psychiatric Publishing; 2017.
- Jha MK, Grannemann BD, Trombello JM, et al. A structured approach to detecting and treating depression in primary care: VitalSign6 project. Ann Fam Med. 2019;17(4):326–335. doi:10.1370/afm.2418 PMID:31285210
- Kroenke K, Spitzer RL, Williams JBW. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med. 2001;16(9):606–613. doi:10.1046/j.1525-1497.2001.016009606.x PMID:11556941
- Rush AJ, Trivedi MH, Stewart JW, et al. Combining medications to enhance depression outcomes (CO-MED): acute and long-term outcomes of a single-blind randomized study. Am J Psychiatry. 2011;168(7):689–701. doi:10.1176/appi.ajp.2011.10111645 PMID:21536692
- Spitzer RL, Williams JBW, Kroenke K. Instruction Manual: Instructions for Patient Health Questionnaire (PHQ) and GAD-7 Measures. https://www.ons.org/sites/default/files/PHQandGAD7_InstructionManual.pdf. Accessed May 6, 2020.
- Rush AJ, Trivedi MH, Ibrahim HM, et al. The 16-Item Quick Inventory of Depressive Symptomatology (QIDS), clinician rating (QIDS-C), and self-report (QIDS-SR): a psychometric evaluation in patients with chronic major depression. Biol Psychiatry. 2003;54(5):573–583. doi:10.1016/S0006-3223(02)01866-8 PMID:12946886
- Fried EI. Moving forward: how depression heterogeneity hinders progress in treatment and research. Expert Rev Neurother. 2017;17(5):423–425. doi:10.1080/14737175.2017.1307737 PMID:28293960
- Saeb S, Zhang M, Karr CJ, et al. Mobile phone sensor correlates of depressive symptom severity in daily-life behavior: an exploratory study. J Med Internet Res. 2015;17(7):e175. doi:10.2196/jmir.4273 PMID:26180009
- Rabbi M, Aung MH, Zhang M, Choudhury T. MyBehavior: automatic personalized health feedback from user behaviors and preferences using smartphones. In: UbiComp 2015 - Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing. ACM Digital Library; 2015. doi:10.1145/2750858.2805840
- Lu J, Bi J, Shang C, et al. Joint modeling of heterogeneous sensing data for depression assessment via multi-task learning. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies. 2018. doi:10.1145/3191753
- Rudovic O, Utsumi Y, Guerrero R, Peterson K, Rueckert D, Picard RW. Meta-weighted gaussian process experts for personalized forecasting of AD cognitive changes. Proceed Machine Learning Res. 2019;106:1–15.
- Farhan AA, Lu J, Bi J, Russell A, Wang B, Bamis A. Multi-view bi-clustering to identify smartphone sensing features indicative of depression. In: 2016 IEEE 1st International Conference on Connected Health: Applications, Systems and Engineering Technologies. 2016;264–273. doi:10.1109/CHASE.2016.27
- Yue C, Ware S, Morillo R, et al. Fusing location data for depression prediction. In: 2017 IEEE SmartWorld Ubiquitous Intelligence and Computing, Advanced and Trusted Computed, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People and Smart City Innovation. 2018;1–8. doi:10.1109/UIC-ATC.2017.8397515
- Yang S, Zhou P, Duan K, Hossain MS, Alhamid MF. emHealth: towards emotion health through depression prediction and intelligent health recommender system. Mob Netw Appl. 2018;23(2):216–226. doi:10.1007/s11036-017-0929-3