Abstract
Background Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is characterised by heterogeneous levels of disease severity, and it is not always apparent whether a patient will develop severe disease. This cross-sectional study explores whether acoustic properties of the cough sound of patients with COVID-19 correlate with their disease and pneumonia severity, with the aim of identifying patients with severe disease.
Methods Voluntary cough sounds were recorded using a smartphone in 70 COVID-19 patients within the first 24 h of their hospital arrival, between April 2020 and May 2021. Based on gas exchange abnormalities, patients were classified as mild, moderate or severe. Time- and frequency-based variables were obtained from each cough effort and analysed using a linear mixed-effects modelling approach.
Results Records from 62 patients (37% female) were eligible for inclusion in the analysis, with the mild, moderate and severe groups consisting of 31, 14 and 17 patients, respectively. Five of the parameters examined were found to differ significantly in the coughs of patients at different levels of disease severity, and a further two parameters were affected differently by disease severity in men and women.
Conclusions We suggest that these differences reflect the progressive pathophysiological alterations occurring in the respiratory system of COVID-19 patients. They could potentially provide an easy and cost-effective way to stratify patients initially, identifying those with more severe disease, and thereby to allocate healthcare resources most effectively.
Shareable abstract
Acoustic analysis of cough sounds recorded via smartphone in COVID-19 patients reveals features of cough that could potentially be used to provide a fast, easy, cost-effective way to identify patients’ disease severity at home or in any healthcare setting https://bit.ly/3XtTJOd
Introduction
The global coronavirus disease 2019 (COVID-19) pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1] continues to be a major health problem. Although most people affected by COVID-19 now have mild-to-moderate symptoms and recover within a few weeks, those who develop more severe disease and pneumonia often have a poorer prognosis. It is not known with certainty which factors predispose to severe disease, although certain genetic variants have been implicated [2]. Other risk factors, including age, comorbidities and determinants of cardiovascular risk, have also been identified [3]. There is also evidence that males are more susceptible to severe disease and death from COVID-19 [4, 5]. Although it has been suggested that the immune and inflammatory response may contribute to this sex disparity, the underlying pathophysiological mechanisms have not been fully elucidated.
Much research time and money has been invested in finding ways to obtain an early COVID-19 diagnosis. In this regard, laboratory methods, imaging techniques [6, 7], statistical models [8] and artificial intelligence [9] have all been investigated. The currently accepted gold-standard diagnostic test – reverse transcriptase polymerase chain reaction (RT-PCR) – is widely available and relatively accessible [10]. However, although risk stratification protocols have been developed [11], potential diagnostic and prognostic tools are mostly based on relatively expensive and, in many scenarios, difficult-to-access imaging methods (radiography, ultrasound, computed tomography (CT)) [12, 13]. There is a clinical need for a simpler and more widely available prognostic tool that would enable healthcare providers to identify patients who have developed, or are at risk of developing, severe disease, thereby facilitating triage of patients and early intervention, even at a patient's home or in primary care centres [14–16].
The analysis and interpretation of cough sounds in the initial stages of COVID-19 disease could potentially provide a predictive tool that would meet these criteria. A dry cough is one of the most common symptoms of COVID-19, which occurs during the initial disease phase in up to 70% of patients [17, 18]. To date, several studies have applied machine learning paradigms to the acoustic properties of cough sounds to develop a screening or diagnostic tool for COVID-19 [19–22]. Many of these studies have leveraged the recording capabilities of the ubiquitous and easy-to-use smartphone to collect data, often via crowdsourcing techniques [9]. Such devices are available to a large proportion of the population and allow cost-effective recording of coughs using built-in microphones, even outside more sophisticated healthcare settings [23].
We hypothesised that the acoustic properties of the cough sound of patients with COVID-19 would differ with disease and pneumonia severity. Given the different effects of COVID-19 on males and females, we expected that the results would differ by sex. In addition, we postulated that several other variables including age, smoking status, pre-existing respiratory conditions, length of time with symptoms and fraction of inspired oxygen (FiO2) could also potentially affect this relationship. To test this hypothesis, here we explore the correlation between the frequency content of cough sounds recorded via smartphone with the disease and pneumonia severity in COVID-19 patients.
Material and methods
Voluntary cough sounds were recorded in 70 COVID-19 patients over the age of 18 years, whose disease symptoms had been present for 15 days or less, within the first 24 h (in some exceptional cases 48 h) of their arrival at our teaching hospital. We used a cross-sectional study design, with data collection taking place on 21 different dates between April 2020 and May 2021. The sample size was determined by the number of participants available on these dates. The participants were divided into three groups according to the severity of their disease at the time of cough recording. The mild group consisted of patients without pneumonia, the moderate group of patients with pneumonia not requiring supplemental oxygen, and the severe group of patients with moderate or severe pneumonia necessitating oxygen therapy with invasive or noninvasive respiratory support [24]. The pre-existing respiratory conditions present in some patients were asthma, COPD and interstitial pulmonary fibrosis. The respiratory rate of the patients was between 18 and 24 breaths·min−1, and there were no apparent limitations to their production of voluntary cough sounds. The study was conducted in accordance with the Declaration of Helsinki and approved by the institutional Ethics Committee (CEIM, ref. 10231I). Fully informed written consent was obtained from patients or their relatives prior to their inclusion in the study. In some cases, verbal consent was initially obtained, as recommended by the Ethics Committee, and was subsequently confirmed in writing.
Voluntary cough sounds were recorded by respiratory medicine specialists with a smartphone (Samsung Galaxy S21). Recordings were made in a room with as little background noise as possible. Patients were instructed to take a deep breath and then cough voluntarily; this often caused them to trigger an involuntary cough. Patients coughed three to four times in the direction of the smartphone, which was positioned 15–20 cm from their mouth. Patients who required a low-flow oxygen device were asked to remove the mask briefly to perform the manoeuvres. Cough sounds were acquired and sampled at 48 kHz using the built-in hardware on the smartphone and the Easy Voice Recorder application (available at Google Play Store). For infection control purposes, the smartphone was encased in a disposable latex cover prior to each recording. We previously confirmed that the addition of this cover did not affect the fidelity of the cough recording by comparing the temporal, spectral and time-frequency characteristics of audio signals recorded with and without the cover.
In this study, a single expiratory effort has been labelled a “cough effort” (CE), with a “cough bout” consisting of two or more CEs following a single initial inspiration [25]. This is illustrated in figure 1a. Each CE can be further segmented into three constituent parts: a first sound (CS1), an intermediate part (CINT) and, if present, a second sound (CS2). This segmentation was performed manually using both the aural and visual representation of each cough sound and is illustrated in figure 1b [26]. The cough sound occurs during the expulsive phase of the cough, with the first sound happening at the moment of glottal opening. The intermediate part follows this and represents the steady-state flow of air with the glottis open. The second cough sound, which is not always present, occurs at the end of the expulsive phase as the glottis narrows. Cough recordings were included in the database for further analysis only if at least one valid cough bout, consisting of a minimum of two CEs, could be identified and isolated from the recording. Single CEs were excluded because a single effort may be random and thus not representative of the cough sounds of a participant, and to ensure a balanced data set (an equal number of coughs in the first and second positions in the bout).
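As a concrete illustration of this segmentation scheme, the following minimal sketch shows one way a manually segmented CE could be represented in code. Python is used here purely for illustration (the study's own analysis used Matlab and R), and the field names and sample-index convention are assumptions, not the authors' actual data format:

```python
# Hypothetical representation of one manually segmented cough effort (CE).
# Boundaries are (start, end) sample indices into the 48 kHz recording;
# CS2 is optional because a second cough sound is not always present.
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class CoughEffort:
    subject_id: str                        # anonymised participant identifier
    position_in_bout: int                  # 1 = first CE after inspiration, 2 = second, ...
    cs1: Tuple[int, int]                   # first cough sound (glottal opening)
    cint: Tuple[int, int]                  # intermediate part (steady-state airflow)
    cs2: Optional[Tuple[int, int]] = None  # second sound (glottal narrowing), if present
```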
Individual CEs were identified in each recording by visual and aural inspection of the recorded signals (Audacity Team 2021, https://audacityteam.org/). A flowchart outlining the analysis methodology is shown in figure 2. A total of 459 CEs were isolated, with a median of six CEs (range 5–10) from each recording. Information on the CE number within the recording as well as the CE position within a bout was also annotated. The CEs were filtered using a 20th-order Chebyshev Type II low-pass filter with a cut-off frequency of 6 kHz to minimise background noise. The envelope of each identified CE was then calculated using the root mean square. This was used as an aid to identify and manually split each CE into its above-mentioned constituent parts – CS1, CINT and, when present, CS2. For each CE, as well as its constituent parts, the power spectrum was estimated using Welch's method with a Hanning window and 50% overlap applied to compute the modified periodograms. Several time- and frequency-based parameters of each whole CE signal and its constituent parts were obtained and analysed (table 1). Data analysis was performed offline using custom-developed scripts in Matlab (version 9.9.0.R2020b; The MathWorks Inc., Natick, MA, USA).
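The following Python sketch illustrates this preprocessing and spectral-estimation pipeline. The study used custom Matlab scripts; the stop-band attenuation of the filter, the envelope window length and the exact definitions of the frequency parameters in table 1 are not fully specified in the text, so the values and formulas below are assumptions:

```python
# Illustrative Python analogue of the Matlab preprocessing pipeline:
# Chebyshev Type II low-pass filtering, RMS envelope, and Welch power
# spectrum with example frequency-based parameters.
import numpy as np
from scipy.signal import cheby2, sosfiltfilt, welch

FS = 48_000  # smartphone sampling frequency (Hz)

def lowpass_ce(ce: np.ndarray, fs: int = FS) -> np.ndarray:
    """20th-order Chebyshev Type II low-pass, 6 kHz cut-off, to minimise
    background noise (40 dB stop-band attenuation is an assumption)."""
    sos = cheby2(N=20, rs=40, Wn=6000, btype="low", fs=fs, output="sos")
    return sosfiltfilt(sos, ce)

def rms_envelope(ce: np.ndarray, win: int = 480) -> np.ndarray:
    """Moving RMS envelope (10 ms window assumed), used as an aid when
    manually splitting each CE into CS1, CINT and CS2."""
    return np.sqrt(np.convolve(ce.astype(float) ** 2,
                               np.ones(win) / win, mode="same"))

def spectral_features(segment: np.ndarray, fs: int = FS) -> dict:
    """Welch power spectrum (Hann window, 50% overlap) and example
    frequency parameters analogous to those analysed in table 1."""
    f, pxx = welch(segment, fs=fs, window="hann", nperseg=1024, noverlap=512)
    centroid = np.sum(f * pxx) / np.sum(pxx)       # spectral centroid (Hz)
    cdf = np.cumsum(pxx) / np.sum(pxx)             # normalised spectral CDF
    f25, f75 = np.interp([0.25, 0.75], cdf, f)     # spectral quartiles
    return {
        "FPK": f[np.argmax(pxx)],                  # peak frequency
        "FIQR": f75 - f25,                         # spectral interquartile range
        "FVAR": np.sqrt(np.sum(pxx * (f - centroid) ** 2) / np.sum(pxx)),
    }                                              # spread about the centroid
```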
Statistical analysis was performed in RStudio (RStudio Team, Boston, MA, USA). The relationship between each parameter and disease severity was investigated with a linear mixed-effects model with maximum likelihood optimisation using the lme4 library [27]. The mean value of each parameter for each CE position (nesting level one) was nested within each subject (nesting level two), which was in turn nested according to disease severity (nesting level three). Disease severity, sex and CE position were entered as fixed effects into the model. An interaction term (disease severity * sex) was included in the model to investigate whether the parameter being examined was affected differently by disease severity in males and females. An intercept for individual subjects was included as a random effect to account for differences between subjects. Several further models were then defined, each with an additional fixed effect added to the main model: the patients' age, smoking status, length of time with symptoms, presence/absence of a pre-existing respiratory condition and FiO2. In each case, visual inspection of residual plots did not reveal any obvious deviations from homoscedasticity or normality. The variance inflation factor (VIF) was calculated for each independent variable to ensure that there was no collinearity between them (VIF <5). A type II ANOVA with F-tests, using Satterthwaite's method for the denominator degrees of freedom, was applied to test whether each fixed effect contributed significantly to the derived model. Non-parametric Kruskal–Wallis and Wilcoxon rank sum tests were applied as post hoc tests, with Benjamini–Hochberg p-value adjustment. In those cases where a significant interaction effect between disease severity and sex was observed, male and female data were also examined separately. An α value of 0.05 was used to indicate significance throughout.
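To make the model structure concrete, here is a hedged Python sketch using statsmodels' MixedLM and SciPy in place of the lme4/R pipeline actually used. It reproduces the fixed effects, the disease severity * sex interaction, the per-subject random intercept and the post hoc tests, but not the Satterthwaite type II ANOVA, which is specific to lme4/lmerTest; the data frame columns and severity labels are assumptions:

```python
# Sketch of the mixed-effects analysis; `df` is a hypothetical data frame
# with one row per CE-position mean per subject and columns: subject,
# severity ("mild"/"moderate"/"severe"), sex, position, and one column per
# acoustic parameter (e.g. FVAR).
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import kruskal, mannwhitneyu
from statsmodels.stats.multitest import multipletests

def fit_main_model(df: pd.DataFrame, param: str = "FVAR"):
    """Fixed effects: severity, sex and CE position, plus a severity*sex
    interaction; random intercept per subject; ML (not REML) optimisation."""
    model = smf.mixedlm(f"{param} ~ severity * sex + position",
                        data=df, groups=df["subject"])
    return model.fit(reml=False)

def posthoc_pairwise(df: pd.DataFrame, param: str = "FVAR"):
    """Kruskal-Wallis across the three severity groups, then pairwise
    Wilcoxon rank sum (Mann-Whitney U) tests with Benjamini-Hochberg
    p-value adjustment."""
    groups = {g: d[param].to_numpy() for g, d in df.groupby("severity")}
    _, p_kw = kruskal(*groups.values())
    pairs = [("mild", "moderate"), ("mild", "severe"), ("moderate", "severe")]
    raw = [mannwhitneyu(groups[a], groups[b]).pvalue for a, b in pairs]
    adjusted = multipletests(raw, method="fdr_bh")[1]
    return p_kw, dict(zip(pairs, adjusted))
```

Each covariate model described above would correspond to adding a single term (e.g. `+ age`) to the model formula.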
Results
A total of 70 participant recordings were initially examined for eligibility. Of these, recordings from six individuals were excluded due to technical problems with the recording quality (five recordings had inadvertently been made at too low a sampling frequency, and a sixth was excluded due to the presence of a second person coughing simultaneously), and a further two recordings were excluded because no clearly discernible cough sounds were present. The remaining 62 recordings were deemed eligible and included in all analyses. The main participant characteristics and relevant clinical data are presented in table 2.
All participants had a minimum of one CE at position 1 (CE1) and one at position 2 (CE2). Only 33 participants (53%) had a CE at position 3 (CE3). Therefore, only CE1s and CE2s were included in the analysis, to ensure balanced representation of each participant and to allow the effect of CE position within the bout on the extracted features to be examined.
Regarding the effect of sex, higher frequency content was found in female coughs than in male coughs. This mirrors the expected difference in the frequency content of male and female voices.
Five of the parameters examined were found to be significantly different in the cough recordings of patients at different levels of disease severity (figures 3 and 4); the values given in parentheses below are the median values for the mild, moderate and severe groups, respectively, with this order followed throughout the results. In the whole CE signal, these parameters were the frequency variability (FVAR) (760.3 versus 767.4 versus 614.9 Hz, p=0.0031) and peak frequency (FPK) (473.9 versus 340.1 versus 610.3 Hz, p=0.0025). In CS1, the FVAR (729.5 versus 601.9 versus 526.7 Hz, p=0.0010) and frequency of maximum energy (FMAX) (2191.8 versus 1898.4 versus 1620.4 Hz, p=0.0130) differed significantly, and in CINT, the interquartile range (FIQR) (674.7 versus 825.7 versus 527.6 Hz, p=0.0260) also differed significantly. Pairwise comparisons using the Wilcoxon rank sum test with continuity correction revealed significant differences for all five parameters between individuals with mild and severe disease. A significant difference between those with moderate and severe disease was also found for FVAR and FPK in the whole CE signal and for the FIQR of CINT.
A significant interaction term for disease severity and sex was found for two parameters: FVAR and FMAX of CINT (figure 5). Moreover, significant differences were observed in FVAR of CINT for males (697.2 versus 752.0 versus 588.6 Hz, p=0.0131) and females (1054.5 versus 768.8 versus 597.5 Hz, p<0.0001), and in FMAX of CINT for females only (3567.0 versus 2691.0 versus 2190.0 Hz, p<0.001). Pairwise comparisons using the Wilcoxon rank sum test with continuity correction revealed significant differences for FVAR of CINT between female patients at all disease levels, and for FMAX of CINT between female patients with mild and moderate, and mild and severe disease. For male patients a significant difference was observed for FVAR in CINT between individuals with moderate and severe disease.
The position of the CE within the cough bout had a significant effect on the duration of the whole CE signal, and the duration of CS1 and CINT individually. However, there was no significant difference found between the frequency parameters reported here for CE positions 1 and 2. The addition of a fixed effect of either patients’ age, smoking status, length of time with symptoms, presence/absence of a pre-existing respiratory condition or FiO2 to the main model was found to have no significant effect on the model at a significance level of α=0.05.
Discussion
This study describes the relationship between frequency-based features of the cough sound in COVID-19 and the disease and pneumonia severity in the patient. These relationships were explored, considering patient and disease profiles (sex, age and smoking status, as well as duration of COVID-19 symptoms, presence or absence of a pre-existing respiratory condition and oxygen requirements), using linear mixed-effects models. The analysis of cough recordings is a relatively easy way to obtain information about some respiratory diseases. A qualitative assessment of cough sounds may be made by a medical professional in usual care scenarios. However, this relatively coarse assessment is subjective and depends on the expertise and hearing acuity of the professional involved. Healthcare professionals usually differentiate only between dry and productive cough; indeed, this is the most common comment in standard clinical records. Therefore, although a high-level distinction between disease types may be observed, more subtle nuances of cough sounds may be missed, and healthcare professionals may encounter difficulties in diagnosing from cough sounds [28]. Automatic algorithms can help to extract objective information from cough sounds, simplifying the process and supporting medical staff.
In our quantitative analysis, five frequency-based features (FVAR and FPK of the whole CE signal, FVAR and FMAX of CS1 and FIQR of CINT), were found to differ significantly with disease severity, the classification of which is based on the presence and/or severity of pneumonia [24]. We suggest that these differences reflect the progressive pathophysiological alterations of the respiratory system in patients with COVID-19 [29]. Differences have been previously noted in chest CT scans between patients with mild and severe/critical disease [13]. Although similar analysis of acoustic properties of cough sounds has been used to diagnose respiratory illnesses [30–32], we are not aware of any studies that explore a possible relationship between cough sounds and varying disease severity levels of a respiratory illness.
Two further frequency-based features, FVAR and FMAX of CINT, were observed to be affected differently by disease severity in male and female patients. CINT occurs between CS1 and CS2 and is the part of the cough sound produced by steady-state airflow with the glottis open. It is possible that, in this part of the cough, the pathophysiology of COVID-19 differs between the sexes owing to well-known differences between male and female airway anatomy, which would be reflected in the sound differences we observed.
The data used in the present study consist of a clinically recorded and validated dataset, collected from a relatively large cohort of well-characterised patients. The cough recordings were acquired with an easy-to-use smartphone application in an early period of a patient's first contact with the health system, and by the healthcare professional caring for the patient. This helped to ensure that the recordings were of a consistently high quality across participants. The availability of relevant patient information enabled us to explore the effect of possible covariates – age, smoking status, length of time with symptoms, presence/absence of a pre-existing respiratory condition and required FiO2 – on the results obtained in our analysis. These strengths offer distinct advantages over other studies that use datasets that have been crowdsourced or collected using less stringent methodology. In addition, the use of the smartphone enables the cough recordings to be acquired in virtually any setting, thus overcoming limitations posed by location-dependent imaging, and other techniques.
There are also some possible limitations to our study. Recording spontaneous coughs could be considered the optimal way to assess the pathophysiological situation in a respiratory patient. However, such recordings can be difficult to acquire, as patients can have long periods without spontaneous coughing. Therefore, we collected voluntary, induced coughs, which are easy to perform and have previously been validated as a good surrogate for the spontaneous cough from an acoustic perspective [33]. Our study included the analysis of CEs from positions 1 (CE1) and 2 (CE2) within a cough bout. The definition of a classical cough includes an inhalation prior to the cough sound occurring [34]; the second cough sound in a bout is therefore likely to be an expiration reflex (ER) rather than a true cough. However, as the two sounds are indistinguishable to the human ear, no distinction was made between them for clinical purposes. Our results suggest that the frequency content of the classical cough (CE1) and the ER (CE2) does not differ, although we noted that the duration of the ER appears shorter than that of the classical cough. Finally, although we found some apparent differences in the acoustic features of male and female cough sounds, our database was not completely balanced, consisting of 37% female patients.
Our study highlights acoustic features of the cough sound in COVID-19 patients that differ significantly with disease and pneumonia severity. The results obtained suggest that it might be possible to identify and predict the severity and extent of COVID-19 from the cough sound of a particular patient. However, despite the significant differences reported, it must be noted that there is substantial variability and overlap between parameters from patients with different COVID-19 severity. Moreover, these parameters can also vary within individual patients, and this might be another important source of variability. For example, the mean of the intra-subject standard deviation for the FVAR of the whole CE signal was 106 Hz in the mild group, 87 Hz in the moderate group and 75 Hz in the severe group. Interestingly, the intra-subject variation was lower in the severe group than in the mild group. The potential of the proposed features for classification purposes has not yet been studied and remains a topic for further research. Using machine learning techniques, and perhaps adding some extra features, the potential of this approach for discriminating different severities could be confirmed. This could result in early stratification and prediction of probable clinical outcomes, allowing correct triage and corresponding allocation of healthcare resources, which would be of great benefit to both patients and healthcare providers. Further studies should elucidate whether this methodology can also be extended to long COVID, to analyse whether the evolution of the cough signal reflects the presence or severity of respiratory sequelae (organising pneumonia, interstitial fibrosis, hyperreactivity).
Footnotes
Provenance: Submitted article, peer reviewed.
Support statement: This study was supported by Spanish Ministry of Science, Innovation and Universities grant RTI2018-098472-B-I00 MCIU/AEI/FEDER, UE; the Severo Ochoa programme of the Spanish Ministry of Science and Competitiveness grant SEV-2014-0425 (2015–2019); the CERCA Programme/Generalitat de Catalunya and Secretaria d'Universitats i Recerca de la Generalitat de Catalunya grant GRC 2017 SGR 01770; and the European Commission under Horizon 2020's Marie Skłodowska-Curie Actions COFUND scheme grant GA 712754. Funding information for this article has been deposited with the Crossref Funder Registry.
Conflict of interest: C. Davidson declares receipt of a BEST postdoctoral fellowship funded by the European Commission under Horizon 2020's Marie Skłodowska-Curie Actions COFUND scheme (GA 712754) and the Severo Ochoa programme of the Spanish Ministry of Science and Competitiveness (SEV-2014-0425, 2015–2019), and funding from the Spanish Ministry of Science, Innovation and Universities under grant RTI2018-098472-B-I00 MCIU/AEI/FEDER.
Conflict of interest: O.A. Caguana declares payment or honoraria to them and their institution from GlaxoSmithKline, AstraZeneca, Menarini and FAES, and to their institution alone from Chiesi, and support for attending meetings and/or travel from Chiesi, FAES and Menarini.
Conflict of interest: I. Ferrer-Lluis declares funding for the present study from La Caixa INPhINIT grant LCF/BQ/DI17/11620029, the European Union Horizon 2020 Research and Innovation Program (grant number 713673), the Spanish Ministry of Science, Innovation and Universities (grant RTI2018-098472-B-I00 MCIU/AEI/FEDER), and CIBER-BBN (mobility grant 2019 CB06/01/1050).
Conflict of interest: Y. Castillo-Escario declares funding for the present study as follows from La Caixa Foundation (ID 100010434; fellowship code LCF/BQ/DE18/11670019), Centro de Investigación Biomédica en Red en Bioingeniería, Biomateriales y Nanomedicina (CIBER-BBN) contract “PREDOCTORAL LEY DE LA CIENCIA”, and the Spanish Ministry of Science, Innovation and Universities and European Regional Development Fund (Grant RTI2018-098472-B-I00), as well as receipt of a travel grant from CIBER-BBN.
Conflict of interest: P. Ausín declares payment or honoraria to them and their institution from Sanofi, GlaxoSmithKline, AstraZeneca and Menarini, and to their institution alone from Chiesi, and payment for expert testimony from AstraZeneca, support for attending meetings and/or travel from GlaxoSmithKline and Sanofi, and participation on a Data Safety Monitoring Board or Advisory Board for Sanofi, all in the 36 months prior to manuscript submission.
Conflict of interest: J. Gea declares grants or contracts from SEPAR, FIS (ISCiii) and EC, and the Horizon Programme of the European Union, and consulting fees from GlaxoSmithKline and Menarini, all in the 36 months prior to manuscript submission.
Conflict of interest: R. Jané declares funding for the present study paid to their institution from the Spanish Ministry of Science, Innovation and Universities under grant RTI2018-098472-B-I00 MCIU/AEI/FEDER (Spain).
Conflict of interest: All other authors declare no competing interests.
- Received May 18, 2022.
- Accepted January 7, 2023.
- Copyright ©The authors 2023
This version is distributed under the terms of the Creative Commons Attribution Non-Commercial Licence 4.0. For commercial reproduction rights and permissions contact permissions@ersnet.org