Abstract
Introduction The multiple breath nitrogen washout (MBNW) test provides important clinical information in obstructive airways diseases. Recently, a significant cross-sensitivity error in the O2 and CO2 sensors of a widely used commercial MBNW device (Exhalyzer D, Eco Medics AG, Duernten, Switzerland) was detected, which leads to overestimation of N2 concentrations. Significant errors in functional residual capacity (FRC) and lung clearance index (LCI) have been reported in infants and children. This study investigated the impact in adults, and on additional important indices reflecting conductive (Scond) and acinar (Sacin) ventilation heterogeneity, in health and disease.
Methods Existing MBNW measurements of 27 healthy volunteers, 20 participants with asthma and 16 smokers were reanalysed using SPIROWARE V 3.3.1, which incorporates an error correction algorithm. Uncorrected and corrected indices were compared using paired t-tests and Bland–Altman plots.
Results Correction of the sensor error significantly lowered FRC (mean difference 9%) and LCI (8–10%) across all three groups. Scond was higher following correction (11%, 14% and 36% in health, asthma and smokers, respectively) with significant proportional bias. Sacin was significantly lower following correction in the asthma and smoker groups, but the effect was small (2–5%) and with no proportional bias.
Discussion The O2 and CO2 cross-sensitivity sensor error significantly overestimated FRC and LCI in adults, consistent with data in infants and children. There was a high degree of underestimation of Scond but minimal impact on Sacin. The presence of significant proportional bias indicates that previous studies will require reanalysis to confirm previous findings and to allow comparability with future studies.
O2 and CO2 cross-sensitivity sensor error in the Exhalyzer D device significantly overestimates FRC and LCI in adults, consistent with infants and children. Importantly, there was a high degree of underestimation of Scond, but minimal impact on Sacin. https://bit.ly/3HcH3Tp
Introduction
The multiple breath nitrogen washout (MBNW) test assesses ventilation heterogeneity, often increased in respiratory diseases such as asthma and COPD [1, 2]. The test involves measurement of the concentration of an inert tracer gas of interest (i.e. N2) in expired breath, which is progressively washed out by inhalation of 100% oxygen over a series of tidal breaths. Analysis of the exhaled N2 concentration versus exhaled volume of each breath allows calculation of a global measure of heterogeneity (lung clearance index, LCI), heterogeneity arising predominantly within the convection-dependent airways (Scond), heterogeneity arising in the more peripheral, diffusion-dependent acinar airways (Sacin) and functional residual capacity (FRC) [3].
MBNW has been extensively used as a research tool in various respiratory diseases, particularly in obstructive airways diseases. With the availability of commercially available devices and international guidelines, it has emerging utility in clinical care, especially in cystic fibrosis (CF). The LCI has proved to be a sensitive marker of early disease progression in children with CF and has also been included as a primary end-point in several therapeutic trials [4, 5]. MBNW has yet to be a part of clinical management in other lung diseases, but studies have shown utility of Sacin and Scond in guiding up- versus down-titration of treatment [6, 7] and sensitivity to detect improvement in symptoms in response to treatment with high-dose inhaled corticosteroid [8] or monoclonal antibody therapy in asthma [9]. These indices are also sensitive markers of small airway dysfunction and its reversibility in smokers with normal spirometry [10, 11].
Recently, the presence and impact of a critical sensor error in a commercial device used to perform MBNW (Exhalyzer D, Eco Medics AG, Duernten, Switzerland) has been reported in infants and older children [12, 13]. This MBNW device relies on accurate measurements from O2 and CO2 sensors to calculate N2 concentration indirectly. It was found that both sensors exhibit cross-sensitivities, i.e. the O2 sensor estimation is dependent on CO2 concentrations and vice versa, such that as the washout progresses, O2 and CO2 concentrations are underestimated and N2 concentrations increasingly overestimated, prolonging the washout. This has been shown to result in significant errors of up to 12% and 15–19% in the assessment of FRC and LCI, respectively [12–14]. A software update (V 3.3.1) has now been released by the manufacturer with an implemented correction algorithm, which recalculates the N2 concentration trace.
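The way errors in the O2 and CO2 signals propagate into an indirectly calculated N2 signal can be sketched as follows. This is a minimal illustration, not the device's algorithm: it assumes a dry gas sample containing only O2, CO2, N2 and argon, with argon in a fixed atmospheric ratio to N2. The function name, the ratio constant and the example concentrations are hypothetical.

```python
# Assumed atmospheric argon-to-nitrogen ratio (illustrative, not a device setting).
AR_TO_N2 = 0.0093 / 0.7808


def indirect_n2(o2_fraction, co2_fraction):
    """Estimate the N2 fraction of a dry gas sample from its O2 and CO2 fractions.

    Assumes O2 + CO2 + N2 + Ar = 1 and Ar = AR_TO_N2 * N2.
    """
    return (1.0 - o2_fraction - co2_fraction) / (1.0 + AR_TO_N2)


# Hypothetical late-washout breath: true O2 93.0%, true CO2 5.0%.
true_n2 = indirect_n2(0.930, 0.050)

# Hypothetical sensor readings that slightly underestimate both gases.
biased_n2 = indirect_n2(0.928, 0.049)

# Relative overestimation of N2 caused by the small O2 and CO2 errors.
relative_error = (biased_n2 - true_n2) / true_n2
```

Because N2 is obtained by subtraction, an absolute error of a few tenths of a percent in O2 or CO2 becomes a large relative error once the N2 concentration is low, which is why the effect grows as the washout progresses.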
The magnitude of effect of this sensor error correction on these MBNW indices in adults is currently unknown, and to date there has been no description of the effects on additional important indices such as Scond and Sacin. This is essential to understand the validity of changes reported in previously published studies. Therefore, this study aimed to determine the effect of the CO2 and O2 sensor correction on MBNW parameters in both health and disease by examining three different adult cohorts: 1) healthy volunteers, 2) patients with asthma, and 3) long-term smokers. Secondly, we investigated whether correction of the sensor error affected the within- and between-session repeatability of MBNW parameters in health. Some of the data from the healthy and asthma participants have been previously published [15, 16].
Methods
Research participants
In this study we retrospectively reanalysed MBNW measurements from healthy volunteers, participants with asthma and long-term smokers that were recruited from Royal North Shore Hospital and the Woolcock Institute of Medical Research. Healthy participants were current nonsmokers with a smoking history of <10 pack-years and no respiratory disease. Patients with asthma had a physician diagnosis of asthma and were current nonsmokers with a smoking history of <10 pack-years. Long-term smokers were current smokers with at least a 10 pack-year smoking exposure; these data were collected as part of a larger clinical trial (Australian Clinical Trials Registration Number (ACTRN): 12616001208493) in smokers with normal post-bronchodilator (BD) spirometry or GOLD Stage 1 (post-BD FEV1/FVC <0.7 but FEV1 >80% predicted), with the additional inclusion criteria of abnormal Scond and/or Sacin as assessed by z-score <−1.64 using published predicted equations [11]. The original studies were approved by the local Human Research Ethics Committee (Northern Sydney Local Health District, LNR/16/HAWKE/11 and HREC/15/HAWKE/489).
Standard pulmonary function testing
After obtaining written informed consent, all participants underwent conventional lung function testing including spirometry, plethysmography and diffusing capacity for carbon monoxide (DLCO). These were performed according to American Thoracic Society (ATS)/European Respiratory Society (ERS) criteria. All parameters were expressed as percent predicted using published predicted equations [17, 18].
MBNW testing
In the original studies, after a period of at least 10 min of rest, the healthy and asthmatic participants underwent MBNW testing by two commonly used breathing protocols: controlled and free-breathing protocols, in randomised order (assigned by a computer-based random number generator); the group of smokers performed MBNW using the controlled breathing protocol only. A subset of healthy participants returned for testing within 3 months of their first visit, in which all measurements were repeated in the same order. Both controlled and free-breathing protocols were included as several published studies showed that indices of conductive and acinar ventilation heterogeneity were not comparable between breathing protocols [15, 16, 19].
MBNW was performed using the Exhalyzer D with SPIROWARE V 3.1.6 (Eco Medics AG, Duernten, Switzerland). Both the controlled breathing and free-breathing protocols were performed according to ERS consensus and have been previously described in detail [15, 20]. In brief, after establishing a stable breathing pattern and end-expiratory lung volume (EELV), nitrogen washout during 100% O2 inhalation was commenced. The controlled breathing protocol required participants to breathe at a respiratory rate (RR) between 8 and 12 breaths.min−1 and a tidal volume (VT) between 0.95 and 1.3 L following visual feedback until the N2 concentration decreased to 1/40th of the starting end-expiratory N2 concentration. In the free-breathing protocol, participants were encouraged to adopt relaxed tidal breathing but advised to adjust tidal volumes if insufficient expired N2 phase III slope was observed; calculated Scond and Sacin were adjusted for VT, as per consensus guidelines [20]. At least three technically acceptable trials with FRC values within 10% of the mean were obtained for each breathing protocol.
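The link between the 1/40th end-point and the global indices can be sketched with a simplified calculation. This is an illustration under stated assumptions, not the SPIROWARE implementation: FRC is taken as the net expired N2 volume divided by the fall in end-tidal N2 concentration, LCI as cumulative expired volume divided by FRC at the first breath below 1/40th of the starting concentration, and dead-space, re-inspired gas and tissue N2 are ignored. The function name and the synthetic single-compartment data are hypothetical.

```python
def washout_indices(start_n2, end_tidal_n2, expired_volume_l, expired_n2_volume_l):
    """Return (FRC in L, LCI) from per-breath washout data.

    Simplified: stops at the first breath whose end-tidal N2 falls below
    1/40th of the starting concentration.
    """
    target = start_n2 / 40.0
    cumulative_expired = 0.0
    n2_washed_out = 0.0
    for cet, volume, n2_volume in zip(end_tidal_n2, expired_volume_l, expired_n2_volume_l):
        cumulative_expired += volume
        n2_washed_out += n2_volume
        if cet < target:
            frc = n2_washed_out / (start_n2 - cet)
            return frc, cumulative_expired / frc
    raise ValueError("washout did not reach 1/40th of the starting concentration")


# Synthetic, perfectly mixed lung: FRC 3 L, tidal volume 1 L, no dead space,
# so each breath dilutes the N2 concentration by FRC / (FRC + VT) = 0.75.
start = 0.78
end_tidal = [start * 0.75 ** k for k in range(1, 21)]
volumes = [1.0] * 20
n2_volumes = [1.0 * c for c in end_tidal]

frc, lci = washout_indices(start, end_tidal, volumes, n2_volumes)
```

In this framework, an N2 signal that is overestimated late in the washout both delays the breath at which the end-point is reached and inflates the washed-out N2 volume, consistent with the direction of the FRC and LCI errors examined in this study.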
MBNW analysis
The effect of the sensor error was assessed by comparing the parameters of standard (uncorrected) analysis in SPIROWARE V 3.1.6 with corrected parameters reanalysed in the new SPIROWARE V 3.3.1, applying the sensor error correction algorithm. The correction algorithm has been described in detail by Sandvik et al. [13] and Wyler et al. [12]. Briefly, the algorithm was derived using Exhalyzer D sensors and a mass spectrometer to measure the O2 and CO2 concentrations of a wide range of well-defined technical gas mixtures under various conditions, and used a polynomial function to correct for the errors observed. System settings, delay correction and quality control remained unaltered (i.e. selection of breaths and any correction made to phase III slopes were consistent between both versions).
Statistical analysis
Statistical analysis was carried out with GraphPad Prism 8 (GraphPad Software Inc., La Jolla, CA, USA). All data are expressed as mean±sd, unless otherwise stated. Differences between uncorrected and corrected parameters were examined using paired Student's t-tests and Pearson's correlation. To investigate bias, we generated Bland–Altman plots as the difference (corrected minus uncorrected) versus the average, plotting the mean difference and 95% limit of agreement (95% LoA). We performed linear regression of the difference versus average to determine any proportional bias.
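The Bland–Altman and proportional bias calculations described above can be sketched as follows. The analysis in this study was performed in GraphPad Prism; this is only a minimal Python illustration of the same quantities, and the function name and example values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(uncorrected, corrected):
    """Mean difference, 95% limits of agreement and proportional bias slope.

    Differences are corrected minus uncorrected; the slope comes from an
    ordinary least-squares regression of the difference on the average.
    """
    diffs = [c - u for u, c in zip(uncorrected, corrected)]
    avgs = [(c + u) / 2.0 for u, c in zip(uncorrected, corrected)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    avg_mean = mean(avgs)
    sxx = sum((a - avg_mean) ** 2 for a in avgs)
    sxy = sum((a - avg_mean) * (d - bias) for a, d in zip(avgs, diffs))
    slope = sxy / sxx
    return {
        "bias": bias,
        "loa_lower": bias - half_width,
        "loa_upper": bias + half_width,
        "slope": slope,
        "intercept": bias - slope * avg_mean,
    }


# Hypothetical LCI values in which correction removes 10% of each value.
lci_uncorrected = [6.0, 7.0, 8.0, 10.0]
lci_corrected = [0.9 * value for value in lci_uncorrected]
result = bland_altman(lci_uncorrected, lci_corrected)
```

A slope that differs from zero indicates proportional bias: in this made-up example the difference grows in magnitude as the average increases, so a single fixed offset could not convert uncorrected values into corrected ones.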
To make clear the consequence of the correction of the sensor error on prior studies, we present these results as the change in the outcome parameters of existing studies that results from this correction, i.e. with the uncorrected parameters as reference. For example, the sensor error results in expired N2 being erroneously high towards the end of the washout, which in turn causes an overestimation of FRC. Our results are therefore presented in terms of how FRC is altered when the sensor error is corrected, in this case a reduction in calculated FRC.
Within-session variability was expressed as the coefficient of variation (CoV) calculated as the ratio of the sd to the mean from three separate trials. To determine between-session variability, we calculated the difference (visit 2 minus visit 1) and 95% LoA separately for corrected and uncorrected parameters. We also report the between-session intra-class correlation coefficients (ICC), calculated using a two-way mixed effects ANOVA model based on absolute agreement, multiple measurements (k=3). A p-value below 0.05 was considered statistically significant.
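As a minimal sketch of the within-session measure, the CoV of one index across three trials can be computed as below. The function name and trial values are hypothetical, and the sample standard deviation is assumed.

```python
from statistics import mean, stdev


def coefficient_of_variation(trials):
    """Within-session CoV (%) as the sample sd divided by the mean."""
    return 100.0 * stdev(trials) / mean(trials)


# Hypothetical FRC values (L) from three technically acceptable trials.
frc_trials = [3.0, 3.1, 2.9]
frc_cov = coefficient_of_variation(frc_trials)
```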
Results
Patient demographics
We reanalysed MBNW measurements from 27 healthy volunteers, 20 asthmatic patients and 16 long-term smokers. The patients’ demographics and lung function are summarised in table 1. The healthy volunteers were slightly younger than the asthmatic patients and smokers. The group of smokers had a mean±sd smoking history of 19.3±8.6 pack-years. Both plethysmography and MBNW-derived FRC were comparable across the groups, whereas MBNW indices of heterogeneity were significantly higher in the asthma and smoker groups compared to health, and higher in the smokers compared to asthma (in terms of Scond and Sacin).
Effects of sensor correction on MBNW parameters
Correction of CO2 and O2 sensor error had a significant effect on all MBNW parameters measured by the controlled breathing protocol (table 2). Following correction, mean (95% CI) FRC and LCI decreased by 7.8 (7.0–8.4)% and 9.8 (8.8–10.8)%, respectively, in health. Similar decreases in FRC and LCI were observed in asthma and long-term smokers. While uncorrected FRC values measured by MBNW were comparable to FRC measured by body plethysmography, corrected FRC values were significantly lower compared to FRCpleth in all three groups (mean±sd differences of −0.26±0.47 L (p=0.008), −0.26±0.37 L (p=0.006) and −0.64±0.71 L (p=0.003) in health, asthma and smokers, respectively).
Notably, mean (95% CI) Scond significantly increased by 11.1 (−1.4–23.5)%, 14.0 (4.2–23.9)% and 36 (19.8–52.2)% following sensor correction in health, asthma and smokers, respectively. In contrast, Sacin was significantly lower following sensor correction in the asthma and smoker groups, with a trend to significance in the healthy group (p=0.08). The impact on Sacin, however, was minimal with mean decreases (95% CI) of 1.8 (0.44–4.0)%, 2.9 (0.9–4.9)% and 4.8 (0.7–8.9)% observed in health, asthma and smokers, respectively. When using the free-breathing protocol, similar effects for LCI, FRC, Scond and Sacin were observed in health and asthma (Online Supplement, Table S1).
There were strong correlations between all corrected and uncorrected MBNW values across the three groups (all r-values >0.85) (figures 1–3, panels A–D) and for both breathing protocols (Online Supplement, Figures S1 and S2). Bland–Altman plots showed that the effect of sensor correction on LCI and FRC demonstrated strong proportional bias in all three groups (greater difference with higher mean value) (figures 1–3, panels E–H). The Bland–Altman plots also revealed large variance in Scond and significant proportional bias in health and smokers, but not in asthma. Less variance in differences was seen in Sacin and there was no evidence of proportional bias in any of the three groups.
Effects on within- and between-session repeatability in health
Fifteen healthy volunteers underwent repeat testing. Within-session and between-session variability measurements are presented in table 3. There were no differences observed in within-session CoVs between corrected and uncorrected FRC (p=0.46) or LCI (p=0.84). Between-session variability was minimally affected by the sensor error. Corrected FRC and LCI showed narrower 95% LoAs, whereas Sacin and Scond showed slightly wider 95% LoAs. Between-session ICC values were numerically comparable between corrected and uncorrected values. Similar impact on within- and between-session repeatability was observed with the free-breathing protocol (Online Supplement, Table S2).
Discussion
In this study, we demonstrate that correction of the O2 and CO2 sensor error in the Exhalyzer D system results in significantly lower FRC and LCI, and higher Scond values in three different adult patient groups. The impact on Sacin, although statistically significant, was minimal. There were strong correlations between the corrected and uncorrected values for all MBNW parameters in all three groups. Importantly, the effect of the correction showed a significant proportional bias in FRC and LCI in all three groups, and significant proportional bias in Scond was also evident in health and smokers, although not in asthma. The O2 and CO2 sensor error correction produced less variance in Sacin compared to other parameters and there was no evidence of proportional bias. Furthermore, sensor error correction had minimal impact on within-session and between-session variability, with a smaller 95% LoA for LCI between sessions.
Overestimation of FRC and LCI by the Exhalyzer system was first suggested when comparing the use of sulfur hexafluoride (SF6) to N2 as a tracer gas. Jensen et al. [21] found in children with CF that N2 resulted in higher estimates of FRC and LCI compared to SF6 obtained using mass spectrometry. In addition to differences in the diffusion front, the assumption was that back-secretion of N2 from the tissues probably contributed to overestimation of FRC by MBNW. In fact, subsequent device comparison studies in adults tended to show FRC by the Exhalyzer D system to be larger than FRCpleth [22, 23]. However, these findings are at odds with the idea that gas dilution techniques during tidal breathing can only access communicating lung units and not trapped gas compartments, such that the estimated FRC in disease should be lower than FRC obtained from plethysmography, which includes all compressible gas volume within the lungs. Prior to reanalysis, there were no differences between FRCpleth and FRCMBNW in smokers, patients with asthma or in health, but sensor error correction resulted in a significantly lower FRCMBNW compared to FRCpleth in all groups, more consistent with expectation. These results suggest that the sensor error explains most of the overestimation of FRC seen in the Exhalyzer device, just as Sandvik et al. [13] found that sensor error correction of MBNW removed the discrepancy in FRC between N2 and SF6. It is unknown whether the error affects other commercially available MBW devices utilising O2 and/or CO2 sensors, a subject that warrants further investigation.
Our study is the first to demonstrate the impact of the O2 and CO2 sensor error correction on FRC and LCI in adults, and the first to investigate the impact on Scond and Sacin. The effect of the sensor error on LCI and FRC has been described previously in infants and children, and our data are consistent with their findings in both magnitude and presence of proportional bias [12, 13]. The alignment of these findings is important to understand consistency in the correction algorithm. The high correlations between uncorrected and corrected values suggest that previous findings involving correlations with MBNW indices may be preserved, but the presence of significant proportional bias indicates that previous studies examining interventional effects will require reanalysis, both to reconfirm previous findings and to allow comparability with future studies. A recent reanalysis of CF clinical trials was, however, reassuring to a degree: it showed that although treatment effects were reduced, they were maintained following sensor correction [14].
Previous studies investigating the effect of sensor error correction were in infants and children [12–14], hence they did not include a comparison of phase III slope indices Scond and Sacin, which are not as commonly used in paediatric compared to adult age groups. Scond is calculated as the slope of the plot of normalised phase III slope (SnIII) versus lung turnover (TO), between TO 1.5 and 6, where SnIII is the slope of phase III in the N2 expirogram normalised by mean or end-tidal N2 concentration. Errors in Scond arise from two sources. First, the observed overestimation of FRC results in a lower TO, shortening the SnIII versus TO plot leftward and slightly elevating Scond. Second, as the washout progresses towards higher values of TO, the phase III slope is normalised by an increasingly overestimated N2 concentration. The effect is a less steep SnIII versus TO plot, thus lowering calculated Scond. These effects are demonstrated in figure 4, where corrected SnIII values for three different patients are increased, resulting in larger Scond as calculated between TO 1.5 and 6. In particular, the dominant effect of the impact on SnIII is clearly seen in panel 4C where uncorrected SnIII values deviate markedly from the corrected values at high TO. However, the change in SnIII in the first breath was minimal, both because the sensor error is smallest at high N2 concentrations, and because the N2 concentration used for normalisation is large at this point in the washout. Much of the effect of sensor correction on Sacin probably comes from propagation of the Scond error into the correction applied to SnIII(1) to obtain Sacin [20].
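The Scond calculation described above, and the flattening effect of normalising by an overestimated N2 concentration, can be sketched as follows. This is a simplified illustration rather than the SPIROWARE implementation; the function name, the synthetic SnIII values and the size of the late-washout attenuation are hypothetical.

```python
def scond(turnover, sn3, lower=1.5, upper=6.0):
    """Least-squares slope of SnIII versus lung turnover between two TO limits."""
    points = [(t, s) for t, s in zip(turnover, sn3) if lower <= t <= upper]
    n = len(points)
    t_mean = sum(t for t, _ in points) / n
    s_mean = sum(s for _, s in points) / n
    sxx = sum((t - t_mean) ** 2 for t, _ in points)
    sxy = sum((t - t_mean) * (s - s_mean) for t, s in points)
    return sxy / sxx


# Synthetic washout: one breath every 0.5 lung turnovers, with SnIII rising
# linearly at 0.03 per turnover (the "corrected" trace in this sketch).
to_values = [0.5 * i for i in range(1, 15)]
sn3_corrected = [0.05 + 0.03 * t for t in to_values]

# Hypothetical uncorrected trace: dividing the phase III slope by an N2
# concentration that is increasingly overestimated at high TO shrinks SnIII.
sn3_uncorrected = [s * (1.0 - 0.03 * t) for t, s in zip(to_values, sn3_corrected)]

scond_corrected = scond(to_values, sn3_corrected)
scond_uncorrected = scond(to_values, sn3_uncorrected)
```

Because the attenuation grows with TO, the uncorrected regression line is less steep, so the uncorrected Scond is lower than the corrected value, in line with the second error source described above.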
Our comparison found Scond to be significantly increased by the sensor error correction, and furthermore with a significant proportional bias in health and in smokers, but not in asthma. However, this distinction between groups is probably a manifestation of small numbers in each cohort, coupled with the inherent variability in the measurement of Scond. Indeed, when the three cohorts are combined into a single dataset (Online Supplement, Figure S3), it is clear that the sensor error correction results in comparable effects on Scond regardless of the underlying pathophysiology.
Correction of the sensor error resulted in minimal impact on within-session and between-session variability in health. Within-session CoV remained small in FRC and LCI, demonstrating that trial repeatability for MBNW was high even after reanalysis. Similarly, all parameters had minimal change in between-session difference, with a small change in the LoA for LCI, which is probably attributed to the overall reduction in LCI caused by correction. Furthermore, we also reanalysed previously published data collected using both free breathing and controlled breathing [15, 16]. Sensor error correction did not affect the between-protocol differences in Scond and Sacin in health [15] or asthma [16], nor their dependences on the breathing pattern.
This study is limited by the selection criteria for the previous studies that we have included for reanalysis. Patients with asthma had relatively mild disease, and smokers were recruited for a larger study based on having abnormal ventilation heterogeneity as described in the Methods, and thus may not be representative of the population in general. Future reanalysis of MBNW data is required to understand the effect of sensor error correction in disease more broadly and the associated implications. Moreover, in our reanalysis, we chose to retain the same breath exclusions and other settings in the original analysis, to allow us to solely examine the effect of corrected N2 concentrations on MBNW indices. There is a chance that the adjusted washout traces may result in, for example, changes in the shape of the expirogram, which may result in different quality control decisions by a manual operator. However, we attempted to maintain a consistent approach for quality control. The new software version also includes changes in the way in which delay between flow and gas concentration sampling is calculated, to include a dynamic delay correction [24], which was not implemented in our reanalysis, but which may be a factor affecting comparability between old and new studies in the literature involving the Exhalyzer D. This was intentionally done to focus on the effects of the cross-talk sensor error correction.
In conclusion, our study is the first to describe the effect of O2 and CO2 sensor error correction on the Exhalyzer D MBNW system in adults, and the first to investigate the effect on Scond and Sacin. Our results confirm the LCI and FRC effects seen in infants and children and demonstrate strong underestimation with proportional bias for Scond, with errors up to 50% observed in those with the greatest ventilation heterogeneity, but minimal effects on Sacin. While the discovery of the error is an important step towards improved accuracy of MBNW devices, it also represents an important hurdle for ongoing efforts to support MBNW as a clinical tool or an end-point for clinical studies. These findings provide important considerations for the interpretation of previously published adult MBNW studies, and those in younger age groups incorporating phase III slope analysis. The magnitude of effect supports reanalysis of those data to better understand the true findings.
Acknowledgements
We would like to acknowledge the study participants for volunteering the time and effort required to conduct this study. We would also like to thank Blake Handley (Dept of Respiratory Medicine, Royal North Shore Hospital, St Leonards, NSW, Australia) and Stephen Milne (Centre for Heart Lung Innovation and Division of Respiratory Medicine, University of British Columbia, Vancouver, Canada) for assistance with the healthy and asthma datasets, and Prof. Christine Jenkins (ECOS Study; Dept of Thoracic Medicine, Concord Hospital, Concord, Australia and The George Institute for Global Health, Sydney, Australia) for provision of the smokers dataset used for this study.
Footnotes
Provenance: Submitted article, peer reviewed.
Data sharing statement: The study protocol and raw data that support the findings of this study are available from the corresponding author upon reasonable request.
Conflict of interest: No conflicts of interest, financial or otherwise, relating to this study are declared by the authors.
Support statement: S. Rutting was supported by the Berg Family Foundation. The smokers dataset was from a larger study funded by an investigator-initiated grant from GlaxoSmithKline, Australia. Funding information for this article has been deposited with the Crossref Funder Registry.
- Received October 31, 2021.
- Accepted February 17, 2022.
- Copyright ©The authors 2022
This version is distributed under the terms of the Creative Commons Attribution Non-Commercial Licence 4.0. For commercial reproduction rights and permissions contact permissions@ersnet.org