Automated assessment of psychiatric disorders using speech: A systematic review.

Published on Jan 31, 2020 · DOI: 10.1002/LIO2.354
Daniel M. Low
Estimated H-index: 6
(MIT: Massachusetts Institute of Technology),
Kate H. Bentley
Estimated H-index: 11
(Harvard University),
Satrajit S. Ghosh
Estimated H-index: 40
(Harvard University)
Objective: There are many barriers to accessing mental health assessments including cost and stigma. Even when individuals receive professional care, assessments are intermittent and may be limited partly due to the episodic nature of psychiatric symptoms. Therefore, machine-learning technology using speech samples obtained in the clinic or remotely could one day be a biomarker to improve diagnosis and treatment. To date, reviews have only focused on using acoustic features from speech to detect depression and schizophrenia. Here, we present the first systematic review of studies using speech for automated assessments across a broader range of psychiatric disorders.
Methods: We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines. We included studies from the last 10 years using speech to identify the presence or severity of disorders within the Diagnostic and Statistical Manual of Mental Disorders (DSM-5). For each study, we describe sample size, clinical evaluation method, speech-eliciting tasks, machine learning methodology, performance, and other relevant findings.
Results: 1395 studies were screened, of which 127 studies met the inclusion criteria. The majority of studies were on depression, schizophrenia, and bipolar disorder, and the remaining on post-traumatic stress disorder, anxiety disorders, and eating disorders. 63% of studies built machine learning predictive models, and the remaining 37% performed null-hypothesis testing only. We provide an online database with our search results and synthesize how acoustic features appear in each disorder.
Conclusion: Speech processing technology could aid mental health assessments, but there are many obstacles to overcome, especially the need for comprehensive transdiagnostic and longitudinal studies. Given the diverse types of data sets, feature extraction, computational methodologies, and evaluation criteria, we provide guidelines for both acquiring data and building machine learning models with a focus on testing hypotheses, open science, reproducibility, and generalizability.
Level of Evidence: 3a.
Papers frequently viewed together
6 Authors (Zhixin Yang, ..., Liu Yuzhong)
3 Authors (Xiaoyong Lu, ..., Hongwu Yang)
Sep 15, 2019 in INTERSPEECH (Conference of the International Speech Communication Association)
#1 John Gideon (UM: University of Michigan), H-Index: 8
#2 Heather T. Schatten (Butler Hospital), H-Index: 10
Last. Emily Mower Provost (UM: University of Michigan), H-Index: 20
#1 Kate Allsopp (University of Liverpool), H-Index: 5
#2 John Read (UEL: University of East London), H-Index: 61
Last. Peter Kinderman (University of Liverpool), H-Index: 47
Abstract: The theory and practice of psychiatric diagnosis are central yet contentious. This paper examines the heterogeneous nature of categories within the DSM-5, how this heterogeneity is expressed across diagnostic criteria, and its consequences for clinicians, clients, and the diagnostic model. Selected chapters of the DSM-5 were thematically analysed: schizophrenia spectrum and other psychotic disorders; bipolar and related disorders; depressive disorders; anxiety disorders; and trauma- and...
#1 Wei Pan (CAS: Chinese Academy of Sciences), H-Index: 3
#2 Jonathan Flint (Semel Institute for Neuroscience and Human Behavior), H-Index: 103
Last. Tingshao Zhu (CAS: Chinese Academy of Sciences), H-Index: 25
A large proportion of Depression Disorder patients do not receive an effective diagnosis, which makes it necessary to find a more objective assessment to facilitate a more rapid and accurate diagnosis of depression. Speech data is easy to acquire clinically, and its association with depression has been studied, although the actual predictive effect of voice features has not been examined. Thus, we do not have a general understanding of the extent to which voice features contribute to the identific...
Jun 17, 2019 in EC (Economics and Computation)
#1 Jon Kleinberg (Cornell University), H-Index: 119
#2 Sendhil Mullainathan (U of C: University of Chicago), H-Index: 84
Algorithms can be a powerful aid to decision-making - particularly when decisions rely, even implicitly, on predictions [7]. We are already seeing algorithms play this role in domains including hiring, education, lending, medicine, and criminal justice [2, 6, 10]. As is typical in machine learning applications, accuracy is an important measure for these tasks.
Black box machine learning models are currently being used for high-stakes decision making throughout society, causing problems in healthcare, criminal justice and other domains. Some people hope that creating methods for explaining these black box models will alleviate some of the problems, but trying to explain black box models, rather than creating models that are interpretable in the first place, is likely to perpetuate bad practice and can potentially cause great harm to society. The way fo...
#6 Meng Li (NYU: New York University), H-Index: 30
BACKGROUND: The diagnosis of posttraumatic stress disorder (PTSD) is usually based on clinical interviews or self-report measures. Both approaches are subject to under- and over-reporting of symptoms. An objective test is lacking. We have developed a classifier of PTSD based on objective speech-marker features that discriminate PTSD cases from controls. METHODS: Speech samples were obtained from warzone-exposed veterans, 52 cases with PTSD and 77 controls, assessed with the Clinician-Administere...
#1 Yasir Tahir (NTU: Nanyang Technological University), H-Index: 6
#2 Zixu Yang, H-Index: 6
Last. Justin Dauwels (NTU: Nanyang Technological University), H-Index: 32
Negative symptoms in schizophrenia are associated with significant burden and have few to no robust treatments in clinical practice today. One key obstacle impeding the development of better treatment methods is the lack of an objective measure. Since negative symptoms almost always adversely affect speech production in patients, speech dysfunction has been considered a viable objective measure. However, researchers have mostly focused on the verbal aspects of speech, with scant attent...
#1 Danilo Bzdok (RWTH Aachen University), H-Index: 39
#2 John P. A. Ioannidis (Stanford University), H-Index: 205
Recent decades have seen dramatic progress in brain research. These advances were often buttressed by probing single variables to make circumscribed discoveries, typically through null hypothesis significance testing. New ways for generating massive data fueled tension between the traditional methodology that is used to infer statistically relevant effects in carefully chosen variables, and pattern-learning algorithms that are used to identify predictive signatures by searching through abundant ...
#1 Daniel Smilkov, H-Index: 12
#2 Nikhil Thorat, H-Index: 7
Last. Martin Wattenberg, H-Index: 52
TensorFlow.js is a library for building and executing machine learning algorithms in JavaScript. TensorFlow.js models run in a web browser and in the Node.js environment. The library is part of the TensorFlow ecosystem, providing a set of APIs that are compatible with those in Python, allowing models to be ported between the Python and JavaScript ecosystems. TensorFlow.js has empowered a new set of developers from the extensive JavaScript community to build and deploy machine learning models and...
Cited By 40
#1 Flavio Bertini (University of Parma), H-Index: 8
#2 Davide Allevi (UNIBO: University of Bologna), H-Index: 1
Last. Danilo Montesi (UNIBO: University of Bologna), H-Index: 20
Abstract: According to the World Health Organization, the number of people suffering from dementia worldwide will grow to 150 million by mid-century, and Alzheimer's disease is the most common form of dementia, contributing to 60%–70% of cases. The problem is compounded by the fact that current pharmacologic treatments are only symptomatic, and therapies are ineffective at slowing down or curing the degenerative process. An automatic and standardized classifier for Alzheimer's disease is there...
#1 Flavio Bertini (UNIBO: University of Bologna)
#2 Davide Allevi (UNIBO: University of Bologna)
Last. Laura Calzà (UNIBO: University of Bologna)
The World Health Organization estimates that 50 million people are currently living with dementia worldwide and this figure will almost triple by 2050. Current pharmacological treatments are only s...
#1 Hussein Sarwat, H-Index: 1
#2 Hassan Sarwat, H-Index: 1
Last. Mohammed I. Awad, H-Index: 11
The large number of poststroke recovery patients poses a burden on rehabilitation centers, hospitals, and physiotherapists. The advent of rehabilitation robotics and automated assessment systems can ease this burden by assisting in the rehabilitation of patients with a high level of recovery. This assistance will enable medical professionals to either better provide for patients with severe injuries or treat more patients. It also translates into financial assistance in the long run. Thi...
#1 Alexandra König (IRIA: French Institute for Research in Computer Science and Automation), H-Index: 19
#2 Elisa Mallick (IRIA: French Institute for Research in Computer Science and Automation)
Last. Philippe Robert (IRIA: French Institute for Research in Computer Science and Automation), H-Index: 30
BACKGROUND: Certain neuropsychiatric symptoms (NPS), namely apathy, depression, and anxiety, have demonstrated great value in predicting dementia progression, eventually representing a window of opportunity for timely diagnosis and treatment. However, sensitive and objective markers of these symptoms are still missing. Therefore, the present study aims to investigate the association between automatically extracted speech features and NPS in patients with mild neurocognitive disorders. METHODS: Speech of 1...
#2 Jessica Robin, H-Index: 12
Last. Anthony Yeung, H-Index: 4
#1 Hansen L, H-Index: 1
#2 Yan-Ping Zhang (Hoffmann-La Roche), H-Index: 1
Last. Riccardo Fusaroli (AU: Aarhus University), H-Index: 20
Abstract: Objective: Affective disorders have long been associated with atypical voice patterns; however, current work on automated voice analysis often suffers from small sample sizes and untested generalizability. This study investigated a generalizable approach to aid clinical evaluation of depression and remission from voice. Methods: A Mixture-of-Experts machine learning model was trained to infer happy/sad emotional state using three publicly available emotional speech co...
#1 Lena Palaniyappan (UWO: University of Western Ontario), H-Index: 36
Automated extraction of quantitative linguistic features has the potential to predict objectively the onset and progression of psychosis. These linguistic variables are often considered to be biomarkers, with a large emphasis placed on the pathological aberrations in the biological processes that underwrite the faculty of language in psychosis. This perspective offers a reminder that human language is primarily a social device that is biologically implemented. As such, linguistic aberrations in ...
#2 Joel Schwartz (Biogen Idec)
Last. Eleftheria Kyriaki Pissadaki (Biogen Idec), H-Index: 8
Abstract: Although speech and language biomarker (SLB) research studies have shown methodological and clinical promise, some common limitations of these studies include small sample sizes, limited longitudinal data, and a lack of a standardized survey protocol. Here, we introduce the Voiceome Protocol and the corresponding Voiceome Dataset as standards which can be utilized and adapted by other SLB researchers. The Voiceome Protocol includes 12 types of voice tasks, along with health and dem...
#1 Sanne Brederoo (UMCG: University Medical Center Groningen), H-Index: 6
#2 F.G. Nadema (UMCG: University Medical Center Groningen)
Last. Iris E. C. Sommer (UMCG: University Medical Center Groningen), H-Index: 77
Abstract: Psychiatry is in dire need of a method to aid early detection of symptoms. Recent developments in automatic speech analysis prove promising in this regard, and open avenues for implementation of speech-based applications to detect psychiatric symptoms. The current survey was conducted to assess positions with regard to speech recordings among a group (n = 675) of individuals who experience psychiatric symptoms. Overall, respondents are open to the idea of speech recordings in ...