Abstract
Research question The assessment of cough frequency in clinical practice relies predominantly on the patient's history. Currently, objective evaluation of cough is feasible only with bulky equipment and for a brief time (i.e. hours up to 1 day). Thus, monitoring of cough has rarely been performed outside clinical studies. We developed a small wearable cough detector (SIVA-P3) that uses deep neural networks for the automatic counting of coughs. This study examined the performance of the SIVA-P3 in an outpatient setting.
Methods We recorded cough epochs with SIVA-P3 over eight consecutive days in patients suffering from chronic cough. During the first 24 h, the detector was validated against cough events counted by trained human listeners. The wearing comfort and the device usage were assessed using a questionnaire.
Results In total, 27 participants (mean±sd age 50±14 years) with either chronic unexplained cough (n=12), COPD (n=4), asthma (n=5) or interstitial lung disease (n=6) were studied. During the daytime, the sensitivity of SIVA-P3 cough detection was 88.5±2.49% and the specificity was 99.97±0.01%. During the night-time, the sensitivity was 84.15±5.04% and the specificity was 99.97±0.02%. The wearing comfort and usage of the device were rated as very high by most participants.
Conclusion SIVA-P3 enables automatic continuous cough monitoring in an outpatient setting for objective assessment of cough over days and weeks. It shows comparable or higher sensitivity than other devices with fully automatic cough counting. Thanks to its wearing comfort and its high performance for cough detection, it has the potential to be used in routine clinical practice.
This novel cough detector is comfortable to wear, and can reliably detect and quantify coughs over an extended period of time. Its performance is equal to or better than other automatic cough detectors on the market. https://bit.ly/3SGhbWJ
Introduction
Chronic cough in adults is defined as cough lasting ≥8 weeks [1] and is estimated to affect approximately one in 10 people worldwide [2]. It can result from different aetiologies, such as bronchial asthma, upper respiratory tract pathology, gastro-oesophageal reflux and rarer causes such as a lung tumour, chronic inflammation or interstitial lung disease. Additionally, patients can be diagnosed with unexplained chronic cough, called refractory chronic cough [3]. Currently, there is a lack of effective treatments for patients suffering from chronic cough, especially if an underlying cause has not been identified. Due to its high prevalence and severe impact on quality of life, there is a high interest in developing treatments for chronic cough [4].
In clinical practice, the assessment of cough relies on the patient's history and is evaluated using questionnaires and scored using a visual analogue scale (VAS) for cough intensity [5]. Additionally, cough has been monitored by electronic sound registration. Early versions of automated cough monitors, such as the Hull Automated Cough Counter (HACC), the LifeShirt or the PulmoTrack, suffered from insufficient accuracy; therefore, their use for automatic cough detection has been limited [6–9]. Later versions, such as the Leicester Cough Monitor (LCM) [10] and the VitaloJAK [11], have demonstrated good validity and have been used more widely as part of clinical trials. The VitaloJAK is a device worn around a patient's hip that uses a microphone to record sounds. The final assessment of cough counts requires a listener to count coughs in condensed cough recordings [11]. The LCM combines continuous recording with automatic cough detection and counting [6, 10, 12]. However, for optimal cough detection, a manual calibration is needed for every patient [6, 13]. Due to the relatively time-consuming evaluation or initialisation, neither system has been widely used in clinical practice.
Additionally, better understanding of chronic cough is hindered by the lack of objective long-term continuous data that do not rely on patient recall [14]. Due to the size of the devices and time-consuming analysis, both LCM and VitaloJAK have been mainly used for up to 24 h [6]. The registration of cough in a nonobtrusive way for >24 h and the evaluation of cough frequency (and, potentially, other qualities of cough) in a fully automatic way, could be used in clinical research for developing novel therapies and in clinical practice for symptom and treatment monitoring.
We have developed a small wearable cough detector (SIVA-P3) for continuous cough monitoring that uses deep neural networks for the automatic detection and counting of coughs. We evaluated the performance of the SIVA-P3 cough detector in an outpatient setting in 27 patients diagnosed with chronic cough.
Material and methods
Study subjects
Patients were recruited from the pulmonary outpatient clinic at the University Hospital Zurich (Zurich, Switzerland) between April 2021 and September 2021. Patients had to be aged ≥18 years and have a pulmonologist's diagnosis of a chronic cough of unknown origin, COPD, asthma or interstitial lung disease. Patients were required to use a smartphone running at least iOS 13 or Android 10 to install the mandatory smartphone application.
Pregnant patients or patients with a mental or physical disability diagnosis that precluded informed consent or compliance with the study protocol were excluded. The study was conducted in accordance with the Declaration of Helsinki and all subjects provided written informed consent. The ethics committee of the Canton of Zurich approved the study (EK-ZH-NR: 2021-00330).
Study design
The main study objective was to validate the SIVA-P3 system in a real-world setting of patients diagnosed with chronic cough. This explorative study had a monocentric, noncontrolled, open-label framework. It was conducted at the outpatient division of the pulmonary department of the University Hospital Zurich. Due to the observational nature of this study, no formal sample size calculation was made and a sample size of 27 was considered to be sufficient to test the SIVA-P3 system adequately. First, patients who consented to participate in the study were asked to rate their baseline cough severity on a VAS [5] and to install the SIVA-P3 app on their smartphones. Second, the patients received the cough detector including accessories and an envelope with a second VAS. Patients had to place the cough detector between two layers of clothing on the chest (using a commercially available sports strap to secure it around the neck; figure 1) and wear it for eight consecutive days. At night, they had to charge the cough detector on their bedside table, where it continued to record. Study participants were instructed to remove their cough detector only during showering or swimming. Every day, the smartphone application prompted patients once in the evening to indicate the timing of their main meals. After eight days, a research associate called the study participants for an interview. During the phone interview, the research associate instructed the patient to fill in the follow-up VAS, administered the participant user feedback questionnaire, and instructed the patient to send the device back to the trial site using the return envelope.
The SIVA-P3 cough detector worn on the patient's chest. a, b) The cough detector. Note that in this pilot study, it was intended to be worn between layers of clothing, but for illustration purposes, it is worn over the clothes in this figure; c) smartphone application; d) cloud storage, where the data are saved and processed; e) data are presented on a dashboard in a user-friendly way.
As primary outcome, the number of detected time-stamped cough epochs in the first 24 h was compared between the SIVA-P3 algorithm and the human listener. The algorithm's performance was evaluated by specificity and sensitivity and related metrics (positive and negative predictive values, rates of false positive and false negative detections per hour).
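The metrics named above follow from the counts of true and false detections. The following Python sketch is an illustration only, not the study's analysis code: the `Counts` container, the function name and the example numbers are hypothetical, and the text does not state how a "negative" unit is defined, so the true-negative count is simply taken as given.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Hypothetical container for detection counts over a recording."""
    tp: int  # coughs detected by algorithm and confirmed by listener
    fp: int  # algorithm detections rated "non-cough" by listener
    tn: int  # non-cough sound events correctly rejected
    fn: int  # listener-confirmed coughs missed by the algorithm


def ratio(numerator: float, denominator: float) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def performance_metrics(c: Counts, hours: float) -> dict:
    """Compute the standard diagnostic metrics and hourly error rates."""
    return {
        "sensitivity": ratio(c.tp, c.tp + c.fn),
        "specificity": ratio(c.tn, c.tn + c.fp),
        "ppv": ratio(c.tp, c.tp + c.fp),
        "npv": ratio(c.tn, c.tn + c.fn),
        "fp_per_hour": ratio(c.fp, hours),
        "fn_per_hour": ratio(c.fn, hours),
    }


# Made-up counts for a single 24-h recording (not study data).
example = performance_metrics(Counts(tp=90, fp=5, tn=9900, fn=10), hours=24.0)
```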
As a secondary outcome, we assessed the user-friendliness and wearing comfort over the 8-day study period using a questionnaire via a structured interview at the end of each participant's study period.
Methods
SIVA-P3 system
The SIVA-P3 system consists of several components that interact and share data (figure 1). The cough detector records audio and movement data and pre-processes the data so that only relevant data are stored (i.e. sound segments that exceed a specific loudness). The cough-detection algorithm processes the data, resulting in a stream of time-stamped cough events, which does not allow for the reconstruction of conversations or other audio information. For the first 24 h of the study, segments of original audio data have been stored to enable the evaluation of the performance of the cough-detection algorithm. In this explorative study, movement data were not analysed.
Automatic cough counting (SIVA-P3 algorithm)
The SIVA-P3 cough detection algorithm analyses the audio data and labels single cough events at the resolution of individual cough explosion sounds using a deep neural network. The deep neural network has been trained using a large dataset of cough and non-cough sounds. The training dataset consists of data that have been collected in preliminary experiments with previous prototypes of the SIVA device and data that are publicly available. The COUGHVID [15] dataset is used as an additional source for cough training data, and the Audio Set [16] dataset is used as additional training data for background noise. The training dataset consists of a total of ∼82 000 coughs and ∼1100 h of background noise. The final algorithm's performance was analysed for detection of cough epochs, which are defined as continuous coughing sounds without a 2-s pause between the coughs [14].
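The cough epoch definition can be illustrated with a short Python sketch that groups time-stamped cough explosions into epochs. It is a minimal example under stated assumptions rather than the SIVA-P3 implementation: the function name is hypothetical, and the pause is assumed to be measured between consecutive explosion timestamps, which the text does not specify.

```python
def group_into_epochs(timestamps, max_pause_s=2.0):
    """Group cough explosion timestamps (in seconds) into cough epochs.

    A new epoch starts whenever the gap to the previous explosion is
    at least max_pause_s (assumption: gap measured between timestamps).
    """
    epochs = []
    for t in sorted(timestamps):
        if epochs and t - epochs[-1][-1] < max_pause_s:
            epochs[-1].append(t)  # still within the current epoch
        else:
            epochs.append([t])  # pause of >= max_pause_s: open a new epoch
    return epochs


# Made-up explosion timestamps in seconds (not study data).
explosions = [0.0, 0.8, 1.5, 10.0, 11.9, 20.0]
epochs = group_into_epochs(explosions)
explosions_per_epoch = [len(e) for e in epochs]
```

In this toy input, six explosions collapse into three epochs containing three, two and one explosions.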
Confidence intervals
A bootstrapping method was used for determining confidence intervals for the performance metrics [17]. The data were split into 1-h segments, resulting in 24 h × 27 patients = 648 segments (449 day segments, 199 night segments). We chose to analyse 1-h segments because the data are displayed to healthcare professionals with the same resolution. For the day and night distributions, 10 000 new datasets were sampled with replacement from the respective segments; the performance metric was calculated for each resampled dataset, and the 95% confidence intervals were determined from the resulting distribution.
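The resampling procedure can be sketched in Python as follows. This is an illustrative percentile bootstrap, not the authors' code: the function names, the per-segment `(true positives, false negatives)` representation and the toy data are assumptions, and the number of resamples is reduced in the example to keep it fast.

```python
import random


def bootstrap_ci(segments, metric, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for a metric computed over 1-h segments."""
    rng = random.Random(seed)
    n = len(segments)
    stats = []
    for _ in range(n_resamples):
        # Draw n segments with replacement and recompute the metric.
        sample = [segments[rng.randrange(n)] for _ in range(n)]
        stats.append(metric(sample))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


def pooled_sensitivity(sample):
    """Sensitivity pooled over segments given as (tp, fn) pairs.

    Assumes the sample contains at least one cough, as in the toy data.
    """
    tp = sum(s[0] for s in sample)
    fn = sum(s[1] for s in sample)
    return tp / (tp + fn)


# Made-up (tp, fn) counts for 40 one-hour segments (not study data).
toy_segments = [(9, 1), (8, 2), (10, 0), (7, 3)] * 10
ci_lower, ci_upper = bootstrap_ci(toy_segments, pooled_sensitivity, n_resamples=2000)
```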
Consistency analysis between ground truth and the device measurements
To evaluate how consistently the cough detection algorithm performs between different patients, the average rate of cough epochs per hour during the first 24 h was determined (separately for day and night) for each participant based on the results from the algorithm and a human listener. Spearman's rank correlation coefficients were calculated and a Bland–Altman analysis was performed to determine the agreement between the SIVA-P3 algorithm and a human listener.
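Both agreement measures can be computed from paired per-patient rates, as in the Python sketch below. It is a sketch under assumptions: the function names and example rates are hypothetical, and the limits of agreement use the conventional bias ± 1.96 SD, which may differ from the limits drawn in the study's figures.

```python
from statistics import mean, stdev


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def bland_altman(algorithm, listener):
    """Return bias and conventional limits of agreement (bias ± 1.96 SD)."""
    diffs = [a - b for a, b in zip(algorithm, listener)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Made-up cough epochs per hour for five patients (not study data).
listener_rate = [1.0, 2.5, 4.0, 8.0, 15.0]
algorithm_rate = [1.2, 2.4, 4.5, 8.3, 16.0]
rho = spearman(algorithm_rate, listener_rate)
bias, lower_limit, upper_limit = bland_altman(algorithm_rate, listener_rate)
```

A positive bias in this convention corresponds to the algorithm counting more cough epochs per hour than the human listener.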
Participant user feedback questionnaire
The participant user feedback questionnaire included five questions asking participants to rate wearing comfort, acceptance, likelihood of wear for a longer period of time and data privacy on a five-degree Likert scale (supplementary material).
Analysis tools
The descriptive statistics were calculated using R (version 4.1.2; R Core Team 2019, R Foundation for Statistical Computing, Vienna, Austria). The cough detection algorithm was developed and run using the machine learning framework TensorFlow (version 2.4.2; Google Brain Team, Mountain View, CA, USA) in Python (version 3.8.0; Python Software Foundation, Delaware, USA).
Analysis
The performance of the algorithm was determined using the first 24 h of the audio recordings. First, a pre-screening algorithm identified all sound events starting with a sudden increase in loudness (preliminary testing has shown that this procedure detects cough events with 100% sensitivity and is thus suited as a privacy-enhancing pre-screening). Second, a research assistant trained in recognising cough sounds inspected 1 s of audio data for each identified sound event and rated it as “cough” or “non-cough” for producing the ground truth data. The labelled cough samples were double-checked by a second listener and in case of disagreement, a third listener. During the cough epochs, every explosive cough phase triggers the pre-screening algorithm, allowing for the inspection of the individual cough explosions within one cough epoch. During the inspection the research assistant was unaware of the predictions of the algorithm and therefore unbiased. Third, the ground truth was compared to the automatic cough detection of the SIVA-P3 algorithm to determine the performance metrics. The algorithm predictions were compared to the ground truth on a local (time-wise) level, meaning the algorithm had to detect the cough at the correct point in time to count as a true positive. Because of this local comparison, false positives and false negatives cannot cancel out, as would be the case if the predictions were calculated on a time-segment level, e.g. during 1 day.
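The difference between a local (time-wise) comparison and a segment-level count comparison can be made concrete with a small Python sketch. It is a hypothetical illustration: the function name, the greedy one-to-one matching and the 0.5-s tolerance are assumptions, as the text does not state how closely in time a prediction had to coincide with a labelled cough.

```python
def match_events(truth, predicted, tolerance_s=0.5):
    """Match predicted to ground-truth cough times one-to-one.

    Returns (true positives, false positives, false negatives).
    tolerance_s is a hypothetical matching window, not a study parameter.
    """
    truth = sorted(truth)
    predicted = sorted(predicted)
    i = j = tp = 0
    while i < len(truth) and j < len(predicted):
        delta = predicted[j] - truth[i]
        if abs(delta) <= tolerance_s:
            tp += 1  # prediction coincides with a labelled cough
            i += 1
            j += 1
        elif delta < 0:
            j += 1  # prediction precedes any remaining cough: false positive
        else:
            i += 1  # labelled cough has no matching prediction: false negative
    return tp, len(predicted) - tp, len(truth) - tp


# Made-up timestamps in seconds (not study data).
truth_times = [10.0, 20.0, 30.0]
predicted_times = [10.2, 25.0, 30.1]
tp, fp, fn = match_events(truth_times, predicted_times)
```

Here both lists contain three events, so a segment-level count comparison would report perfect agreement, whereas the local comparison exposes one false positive and one false negative.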
The 24-h data were split into day and night data points, i.e. data points where the device had been worn or was placed on the charger, as the sound profiles for day and night differ due to the greater distance between the device and the patient during the night. The algorithm uses a different cough acceptance threshold for daytime and night-time predictions. This parametrisation was optimised on the validation set, which was distinct from the test set. The different parametrisation of the algorithm leads to different performance profiles during day and night. The 24-h device data from the patients were evaluated with a four-fold cross-validation approach with separated validation and test sets. During the evaluation and testing, the data were used as recorded to ensure that the evaluated performance represents the real-world use case without the algorithm overfitting on data augmentation patterns.
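One way to organise a four-fold cross-validation with separate validation and test sets is sketched below in Python. The splitting scheme is an assumption made for illustration: the text does not state whether folds were formed per patient or how the validation fold was chosen, and the function name is hypothetical.

```python
def four_fold_splits(items, k=4):
    """Yield (train, validation, test) lists for k-fold cross-validation.

    Each fold serves as the test set exactly once; the following fold is
    used as the validation set (e.g. for tuning the day and night
    thresholds) and the remaining folds form the training set.
    """
    folds = [items[i::k] for i in range(k)]
    for test_idx in range(k):
        val_idx = (test_idx + 1) % k
        train = [
            x
            for f in range(k)
            if f not in (test_idx, val_idx)
            for x in folds[f]
        ]
        yield train, folds[val_idx], folds[test_idx]


# Hypothetical patient identifiers 1..27; splitting by patient is an assumption.
patients = list(range(1, 28))
splits = list(four_fold_splits(patients))
```

Splitting by patient rather than by segment keeps recordings of the same person out of both the tuning and the test data, which is one common way to avoid optimistic estimates.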
Results
Patient population
46 patients were screened between April 2021 and September 2021. Of these, 15 patients did not meet the inclusion criteria. After including 31 patients in the study, four had to be excluded (due to technical problems n=2, wrong way of wearing the device n=1 or withdrawal of consent n=1). This resulted in a sample size of 27 patients, analysed in this study (supplementary figure S1). Among 27 participants, there were 15 men and 12 women with a mean±sd (range) age of 50.6±13.7 years (25–79 years). 12 (44%) participants had a diagnosis of chronic cough of unknown origin; four (15%) were diagnosed with COPD; five (18%) with asthma; and six (22%) with interstitial lung disease (table 1).
Study participant demographics
Out of the 27 included participants, 24 wore the device over the full 7 days (one participant stopped early due to holidays, and two lost the motivation to keep wearing the device). 21 of these 24 participants provided data over 7 days. Data gaps in the participants with <7 days of coverage were caused by usability-related issues with charging the device and connectivity issues with older smartphone models.
We evaluated the performance of the algorithm using cough epochs, defined as continuous coughing sounds without a 2-s pause between the coughs [14]. Each cough epoch is made up of one or several cough explosions. Next, we compared the number of cough epochs and cough explosions (labelled first-day data) (figure 2). We observed a strong correlation between cough explosions and cough epochs, with a Spearman's rank correlation of 0.88 for the day and 0.89 for the night. On average, there were 2.39±0.95 cough explosions per cough epoch during the day and 2.36±1.06 cough explosions per cough epoch during the night.
Relationship between cough epochs and cough explosions for a) daytime and b) night-time. Each point depicts the data of one patient during the first day (n=27).
The number of cough explosions and cough epochs varied widely between patients and individual days. On average, the number of cough epochs per patient across days had a variation of 65% (69% for cough explosions) (figure 3). The overall maximum recorded rate was 38.9 epochs·h−1 (78.55 explosions·h−1) and the minimum was 0.17 epochs·h−1 (0.26 explosions·h−1).
Box plots depicting the a) average cough explosions per hour for the full week and b) average cough epochs per hour for the full week.
Cough detection performance
We first evaluated the algorithm's performance by calculating specificity, sensitivity and other related metrics (positive and negative predictive values, rates of false positive and false negative detections per hour). The algorithm reached a sensitivity of 88.55% (95% CI 85.86–90.85%) and a negative predictive value of 99.97% (95% CI 99.96–99.97%) (table 2). The specificity and false positive rate per hour of the device during daytime were 99.97% and 0.4 coughs·h−1, respectively. During night-time, the specificity and false positive rate per hour of the device were 99.97% and 0.3 coughs·h−1, respectively. The algorithm's performance for daytime and night-time was not significantly different for most measures, except for specificity (two-sample Kolmogorov–Smirnov test p=0.028) and negative predictive value (two-sample Kolmogorov–Smirnov test p=0.043). A receiver operating characteristic curve (figure 4a) and a plot of the positive predictive value against sensitivity (figure 4b) confirmed a high performance of the cough detection algorithm across different operating points.
Mean values (95% CI) for daytime and night-time usage
a) Receiver operating characteristic curve for day and night, including the area under the curve (AUC); b) curve plotting positive predictive value against sensitivity for day and night, including the AUC.
As part of quality assurance, we examined the agreement between the algorithm and the ground truth for the average number of cough epochs per hour during the first 24 h using Spearman's rank correlation and Bland–Altman plots. We observed a high correlation for the average cough epochs per hour for both daytime (rs 0.95, 95% CI 0.90–0.98) and night-time (rs 0.94, 95% CI 0.88–0.97) measurements (figure 5). For the Bland–Altman analysis (figure 6), we plotted the difference in cough epochs per hour between the algorithm and a human listener against the average cough epochs per hour for each patient. We observed high agreement between the SIVA-P3 algorithm and the human listener, with a slight overestimation of the number of coughs by the algorithm for daytime.
Correlation plot of daily average cough epochs per hour between the SIVA-P3 algorithm and a human listener for a) daytime use and b) night-time use. Each point depicts the data of a single patient over the first 24 h. The Spearman's correlation coefficient for daytime use is 0.95 (95% CI 0.90–0.98) and for night-time use is 0.94 (95% CI 0.88–0.97).
Bland–Altman plot of the daily average cough epochs per hour showing the difference between a human listener and the SIVA-P3 algorithm for a) daytime use and b) night-time use. Data are presented as the data points of single patients over the first 24 h, means (dashed red lines) and 95th percentiles (black dashed lines).
Secondary outcomes
As secondary outcomes, we assessed wearing comfort using a participant user feedback questionnaire (n=25). 44% (n=11) of the participants agreed, 20% (n=5) rather agreed, 12% (n=3) neither agreed nor disagreed, 24% (n=6) rather disagreed and 0% (n=0) disagreed with SIVA-P3 being comfortable (figure 7). The main reasons why the six participants found the cough detector less comfortable were that it interfered with changing clothes or that the collar was too wide, which made them sweat more. Assuming that wearing the SIVA-P3 continuously for a longer time will facilitate better diagnosis of cough-related pathologies, we asked the participants if they would be willing to use SIVA-P3 daily for a period of 8 weeks if it was recommended by the doctor. 36% (n=9) of participants agreed, 24% (n=6) rather agreed, 20% (n=5) neither agreed nor disagreed, 8% (n=2) rather disagreed and 12% (n=3) disagreed with the statement. In addition, the questionnaire results showed that the users were not bothered by the cough monitor during the night-time, and they were not concerned about data privacy.
Likert scale responses to the statement “I found wearing the cough detector comfortable”. n=25.
Discussion
The SIVA-P3 automatic cough monitoring device showed high performance for cough detection (sensitivity 88.5% for daytime and 84.2% for night-time), and was rated highly for wearing comfort over the 8-day wearing period. This is the first study that has monitored coughing patterns continuously over an 8-day period among patients with common chronic cough aetiologies such as asthma, COPD and interstitial lung disease.
Compared to other electronic cough monitoring approaches, the SIVA-P3 algorithm's performance is better than fully automatic cough monitoring systems such as the LifeShirt (sensitivity 78.1%, specificity 99.6%) [8] and the HACC (sensitivity 80%, specificity 98%) [7]. PulmoTrack (sensitivity 96%, specificity 94%) is the only fully automatic cough monitoring system with better performance than SIVA-P3; however, these results were obtained under fixed conditions such as sitting, walking or lying down in healthy subjects during 25 min of measurement time [9]. SIVA-P3 is comparable to approaches limited to night-time use in a quiet environment, such as the LEOSound monitor (sensitivity 89.7%, specificity 98.7%) [18], and to approaches requiring additional manual calibration such as the LCM (sensitivity 91%, specificity 99%, false positive rate 2.5 events·h−1) [13]. Only cough detection by human listeners as used by VitaloJAK has both higher sensitivity and specificity than SIVA-P3 [11]. However, VitaloJAK's processing requires a trained professional to perform the cough count by listening to an abbreviated version of the 24-h recording (median 62.4 min) [12]. Cough counting with a fully automated system such as SIVA-P3 allows continuous cough monitoring over extended time periods without manual calibration, facilitating its use across large numbers of patients and in routine clinical use.
In addition to the approaches with dedicated body-worn devices, solutions are emerging that rely solely on the patient's smartphone for data recording, such as the Hyfe Cough Tracker App (Hyfe, Wilmington, DE, USA; no published performance metric available) and the Asthma Guardian smartphone application (Resmonics, Zürich, Switzerland; sensitivity 99.9%, specificity 91.5% for night-time use) [19]. The main limitation of such approaches is that recording with consistent quality can only be guaranteed under well-defined circumstances (mobile phone openly placed on table or nightstand next to the user) and during certain times of day (at home or in other silent environments). For conditions where, for example, the nightly cough burden can be used as a surrogate marker for the general patient state, these approaches may be a cost-effective alternative to monitor disease progression [20–22]. However, obtaining a complete (24/7) unbiased picture of the patient's coughing patterns during all activities of daily life requires a dedicated device with robust cough detection performance and high patient acceptance. SIVA-P3 facilitates such usage, and further studies are necessary to investigate the value of these cough patterns with respect to differential diagnostics.
This study has some limitations. We used a convenience sampling approach that yielded a sample balanced across diseases, and the sample size was chosen to serve as a pre-test for a larger experimental study. For a more generalisable statement about the validity of the device, follow-up studies with a larger sample size need to be conducted.
Conclusion
In summary, SIVA-P3, a fully automated ambulatory cough monitoring system, was successfully tested in a clinical outpatient setting and shows similar or better performance metrics than other automated cough detection solutions, while requiring significantly less evaluation and installation effort from personnel than alternatives that are not fully automated. Furthermore, the SIVA-P3 cough detector is small and lightweight, and patients tolerate it very well even over a longer period (24 h per day, 7 days).
Supplementary material
Please note: supplementary material is not edited by the Editorial Office, and is uploaded as it has been supplied by the author.
Supplementary material 00279-2022_figure_S1
Footnotes
Provenance: Submitted article, peer reviewed.
Conflict of interest: C.F. Clarenbach reports consulting fees from GSK, Novartis, Vifor, Boehringer Ingelheim, AstraZeneca, Sanofi and Daiichi Sankyo outside the submitted work. He reports payment or honoraria for lectures, presentations, speakers bureaus, manuscript writing or educational events from GSK, Novartis, Vifor, Boehringer Ingelheim, AstraZeneca and Sanofi. M. Alge is employed by and owns shares in SIVA Health AG. E. Nalbant has received consulting fees from SIVA Health AG. L. Kuett is employed by SIVA Health AG. E.W. Russi has received consulting fees from SIVA Health AG. He participates in the ESTxENDS Trial (study supported by SNF, University of Bern) and is a participant in the Data and Safety Monitoring Board. N.A. Sievi, A. Arvaji, D. Kohlbrenner and M. Kuhn have no conflicts of interest.
Support statement: The study was sponsored by Evoleen AG, a company that owns shares in SIVA Health AG. SIVA Health AG intends to commercialise the SIVA-P3 system. Funding information for this article has been deposited with the Crossref Funder Registry.
- Received June 8, 2022.
- Accepted September 23, 2022.
- Copyright ©The authors 2023
This version is distributed under the terms of the Creative Commons Attribution Non-Commercial Licence 4.0. For commercial reproduction rights and permissions contact permissions{at}ersnet.org