Abstract
Functional residual capacity (FRC) accuracy is essential for deriving multiple-breath nitrogen washout (MBNW) indices, and is the basis for device validation. Few studies have compared existing MBNW devices. We evaluated in vitro and in vivo FRC using two commercial MBNW devices, the Exhalyzer D (EM) and the EasyOne Pro LAB (ndd), and an in-house device (Woolcock in-house device, WIMR).
FRC measurements were performed using a novel syringe-based lung model and in adults (20 healthy and nine with asthma), followed by plethysmography (FRCpleth). The data were analysed using device-specific software. Following the results seen with ndd, we also compared its standard clinical software (ndd v.2.00) with a recent upgrade (ndd v.2.01).
WIMR and EM fulfilled formal in vitro FRC validation recommendations (>95% of FRC within 5% of known volume). Ndd v.2.00 underestimated in vitro FRC by >20%. Reanalysis using ndd v.2.01 reduced this to 11%, with 36% of measurements ≤5%. In vivo differences from FRCpleth (mean±sd) were 4.4±13.1%, 3.3±11.8%, −20.6±11% (p<0.0001) and −10.5±10.9% (p=0.005) using WIMR, EM, ndd v.2.00 and ndd v.2.01, respectively.
Direct device comparison highlighted important differences in measurement accuracy. FRC discrepancies between devices were larger in vivo, compared to in vitro results; however, the pattern of difference was similar. These results represent progress in ongoing standardisation efforts.
Multiple-breath washout devices are not yet comparable http://ow.ly/bB7b30eAs0c
Introduction
Abnormalities in the small airways can be detected using the multiple-breath nitrogen washout (MBNW) test, which provides a more sensitive measure of small airway function than spirometry [1–3]. MBNW yields indices that assess gas-mixing efficiency in the lung (Lung Clearance Index, LCI) and provide mechanistic information about ventilation heterogeneity in the convection-dependent and diffusion-convection-dependent airways (Scond and Sacin, respectively). These indices rely on accurate functional residual capacity (FRC) measurements [4].
Advances in technology have allowed the development of new commercial MBNW devices; however, there are still limited published data comparing measurements between devices. Previous studies have compared in-house and/or commercial devices against standard body plethysmography as well as mass spectrometry, often considered the gold standard [5–8]. Many of these studies have used lung models, assessed measurements in the paediatric population and used inert tracer gases other than N2 [9–11]. Limited published data have also shown inconsistencies between lung model and adult measurements [6].
Furthermore, the effect of software upgrades in a rapidly evolving field has not been studied extensively. Previous studies in one specific device have shown that software changes can have a significant impact on results [12, 13]. Recommendations for MBNW techniques have been published in the European Respiratory Society/American Thoracic Society (ERS/ATS) Consensus statement [4]; however, ongoing work is still needed to improve the standardisation of equipment, technical specifications and algorithms for the calculation of indices.
The aim of this study was to evaluate FRC measured using two commercial MBNW devices, the Exhalyzer D (EM) from Eco Medics AG (Duernten, Switzerland) and the EasyOne Pro LAB (ndd) from ndd Medical Technologies (Zurich, Switzerland). We also examined a third, independent, previously published [2] in-house device at the Woolcock Institute of Medical Research (WIMR), which measures N2 directly, in contrast to the commercial devices. FRC measurements from the devices were compared against a known volume using a syringe lung model (in vitro) and against plethysmographic lung volume in healthy and asthmatic adult subjects (in vivo). In addition, we compared analyses using two different software versions of the ndd device, because a software update involving major changes to N2 calculation has recently become available. Previous data [14, 15] have shown significant FRC underestimation using the older, widely available software.
Methods
Study design
In vitro and in vivo FRC measurements were performed using the three different MBNW devices, in random order as determined by a computer-generated randomisation sequence. The study was approved by the Northern Sydney Local Health District Human Research Ethics Committee (protocol no. LNR/16/HAWKE/11). Written informed consent was obtained from all recruited participants.
The lung model
In vitro measurements were performed using an optimised syringe lung model (figure 1). A 3 L volume calibration Hans Rudolph syringe (5530 series) was modified to produce the physiological expirogram encountered during in vivo testing. Two components were added. First, an attachment on the front of the syringe, consisting of 18 flexible Tygon (S3 E-3603) tubes of varying lengths (internal diameter 0.048 cm; lengths ranging from 10 to 49.5 cm), produced the phase I and II portions of the expirogram. The tubing had permeability coefficients of 360×10⁻¹¹ cm²·s⁻¹·cmHg⁻¹ for CO2, 40×10⁻¹¹ cm²·s⁻¹·cmHg⁻¹ for N2 and 80×10⁻¹¹ cm²·s⁻¹·cmHg⁻¹ for O2. Second, a 3D-printed helical mixer device inserted at the syringe entrance optimised gas mixing and produced a smooth phase III.
Figure 1. Physical lung model composed of a) a 3 L Hans Rudolph syringe, b) an attachment made from 18 flexible tubes, c) a helical mixer device inserted at the syringe entrance, and d) an adjustable syringe stopper.
The target FRC was calculated by adding the known syringe volume to the dead space of the lung model attachments. The syringe volume was adjusted via a stopper on the syringe plunger to predetermined positions. The dead space of the attachments was 0.310 L, determined from the computer-aided design specifications and confirmed with water displacement.
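The target volume arithmetic can be sketched as follows. This is a minimal illustration only: the function name and the example syringe volume are assumptions for demonstration, and only the 0.310 L dead space comes from the text.

```python
ATTACHMENT_DEAD_SPACE_L = 0.310  # dead space of the lung model attachments (from the text)


def target_frc(syringe_volume_l, dead_space_l=ATTACHMENT_DEAD_SPACE_L):
    """Return the lung model target FRC (L): syringe volume plus attachment dead space."""
    if syringe_volume_l <= 0:
        raise ValueError("syringe volume must be positive")
    return syringe_volume_l + dead_space_l


# Hypothetical stopper position giving a 1.50 L syringe volume.
print(f"Target FRC: {target_frc(1.50):.2f} L")
```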
In vitro study
In vitro measurements were performed in triplicate on each device using four different FRC volumes (1.51 L, 1.81 L, 2.11 L and 2.31 L). A standardised adult protocol (tidal volume of 1–1.3 L) [2, 4, 16–18] was used. After at least 5 syringe strokes with a stable end expiratory volume, the washout phase was commenced and 100% oxygen was switched on. Syringe strokes were continued until end-tidal N2 concentration decreased to 1/40th of the starting end-tidal N2 concentration [4]. Between measurements, at least 10 strokes were first performed to expel any residual oxygen within the syringe lung model and to ensure the N2 had returned to baseline. A single operator (K.O. Tonga) performed all measurements under ambient temperature and pressure dry (ATPD) conditions.
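The end-of-test rule described above (washout continues until end-tidal N2 falls to 1/40th of its starting value) could be checked with a sketch like the one below. The function name and the synthetic concentration series are hypothetical, and the device software may apply the criterion differently.

```python
def washout_end_breath(end_tidal_n2, fraction=1 / 40):
    """Return the index of the first breath whose end-tidal N2 is at or below
    `fraction` of the starting end-tidal N2, or None if it is never reached.

    end_tidal_n2[0] is taken as the starting (pre-washout) concentration.
    """
    if not end_tidal_n2:
        raise ValueError("need at least one end-tidal N2 value")
    target = end_tidal_n2[0] * fraction
    for breath, value in enumerate(end_tidal_n2):
        if breath > 0 and value <= target:
            return breath
    return None


# Synthetic washout: each stroke removes 30% of the remaining N2 (illustrative only).
series = [0.79 * 0.7 ** n for n in range(15)]
print(washout_end_breath(series))
```

A return value of `None` marks a washout that stopped before the criterion was met.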
In vivo study
Healthy volunteers were recruited from the WIMR and defined as current non-smokers with <10 pack-years smoking history and with no history of acute respiratory illness within the preceding month. Subjects with asthma were recruited if they had a physician diagnosis of asthma, defined as a history of asthma symptoms, previously documented significant bronchodilator reversibility on spirometry and/or a positive bronchoprovocation challenge test, and current use of inhaled bronchodilator and/or inhaled corticosteroid medication. Short-acting β-agonists were withheld for 6 h and long-acting β-agonists for 24 h before testing.
All participants were over the age of 18 years and completed a standardised interview on respiratory and general health before performing pulmonary function tests (PFTs) in the following order: MBNW, spirometry and body plethysmography. All tests were performed in a single session at the Woolcock Institute of Medical Research.
Spirometry and body plethysmography were performed according to ATS/ERS Guidelines, using a BodyBox 5500 (Medisoft Corporation, Sorinnes, Belgium). MBNW was performed in triplicate in the seated position, using a noseclip and device-specific bacterial filter and mouthpiece attachments. Tests were conducted according to the ERS/ATS consensus statement [4], using a standardised adult protocol (tidal volume of 1–1.3 L) [2, 16–18]. After at least 5 breaths with a stable end expiratory lung volume, the washout phase was commenced and 100% oxygen was switched on. Breaths were continued until end-tidal N2 concentration decreased to 1/40th of the starting end-tidal N2 concentration [4]. The time interval between measurements was standardised to twice the previous washout time [4].
MBNW hardware and software
Details of the WIMR, EM and ndd devices are provided in the supplementary material. Briefly, the WIMR device measures N2 directly via a side-stream N2 analyser [2]. The commercial devices both measure N2 indirectly, based on side-stream CO2 and O2 in the EM device and molar mass and CO2 in the ndd device [6]. Daily calibration and/or verification of each device was performed as described in the supplementary material.
FRC was calculated as the net volume of N2 expired divided by the difference between end-tidal N2 concentration at the start and end of the washout portion of the test [4]. FRC values were corrected for the pre-capillary dead space volume for each device. MBNW data were analysed using software specific to each device. For the WIMR device, custom-written software (Solver Version 1.3.2.18) was used. For the EM device, Spiroware Version 3.1.6 was used. For the ndd device, the same measurements were analysed using two different software versions (described in the supplementary material): clinical software Version 2.00.01.05 (termed “ndd v.2.00”) and the recent upgrade released in March 2016, Version 2.01.00.09 (termed “ndd v.2.01”). The ndd v.2.01 software contains updates to the N2 calculation method and improved flow-gas delay synchronisation.
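The FRC definition given at the start of this paragraph can be written as a short sketch. The function and parameter names are illustrative, the numbers in the example are made up, and applying the dead space correction as a simple subtraction is an assumption; the device-specific software may differ in detail.

```python
def frc_from_washout(net_n2_volume_l, cet_n2_start, cet_n2_end, dead_space_l=0.0):
    """FRC (L) = net expired N2 volume / (start minus end end-tidal N2 fraction),
    with an equipment dead space subtracted (assumed form of the correction).

    Concentrations are fractions (e.g. 0.79), not percentages.
    """
    delta = cet_n2_start - cet_n2_end
    if delta <= 0:
        raise ValueError("end-tidal N2 must fall during the washout")
    return net_n2_volume_l / delta - dead_space_l


# Made-up example: 2.31 L of N2 washed out while end-tidal N2 fell from 0.79 to 0.02.
print(f"{frc_from_washout(2.31, 0.79, 0.02, dead_space_l=0.05):.2f} L")
```

Because the denominator is a concentration difference, any systematic error in the estimated N2 signal affects both the numerator and the denominator.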
Measurements were included for analyses if acceptability criteria were met. For the in vitro study, measurements were deemed acceptable if three tests were within 10% of the mean FRC [19] value across the triplicate measures. For the in vivo study, measurements were deemed acceptable if two or more tests were performed adequately according to the ERS/ATS Consensus statement [4]. Tests were excluded if there was evidence of leak during testing, if the tidal volume of the first breath or of more than a third of breaths during washout was outside 1–1.3 L, or if the end-tidal N2 concentration did not reach the recommended 1/40th of the initial concentration during data acquisition.
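Two of these acceptability rules lend themselves to a short sketch: the in vitro triplicate rule and the tidal volume exclusion rule. The function names and example values are hypothetical, and leak detection and the end-tidal N2 criterion are not covered here.

```python
from statistics import mean


def triplicate_acceptable(frc_values_l, tolerance=0.10):
    """In vitro rule: all three tests lie within 10% of the triplicate mean FRC."""
    if len(frc_values_l) != 3:
        raise ValueError("expected exactly three FRC values")
    m = mean(frc_values_l)
    return all(abs(v - m) <= tolerance * m for v in frc_values_l)


def tidal_volume_acceptable(tidal_volumes_l, low=1.0, high=1.3):
    """Tidal volume rule: reject the test if the first washout breath, or more
    than a third of all washout breaths, falls outside the 1-1.3 L range."""
    if not tidal_volumes_l:
        raise ValueError("need at least one breath")
    outside = [not (low <= v <= high) for v in tidal_volumes_l]
    if outside[0]:
        return False
    return sum(outside) <= len(outside) / 3


# Illustrative values only.
print(triplicate_acceptable([2.00, 2.10, 2.05]))
print(tidal_volume_acceptable([1.1, 1.2, 0.9, 1.1, 1.2, 1.0]))
```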
Statistical analysis
FRC data were analysed using IBM SPSS Version 22, and graphs were generated using GraphPad Prism Version 6.0. Summary data are presented as mean±sd. The accuracy of in vitro FRC was assessed according to the consensus statement [4] (FRC values within 5% of the known volume for at least 95% of values) and expressed as absolute (L) difference (measured FRC – lung model FRC) and relative (%) difference (absolute difference×100/lung model FRC). In vivo FRC was compared with body plethysmography FRC (FRCpleth). FRC differences between devices were assessed using Bland and Altman plots with 95% limits of agreement and by non-parametric one-way repeated measures ANOVA (Friedman's test) and post hoc tests. p-values <0.05 were considered statistically significant.
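For readers unfamiliar with the Bland and Altman approach, the sketch below computes the mean difference (bias) and 95% limits of agreement, assuming the conventional definition of bias ± 1.96 × SD of the paired differences. It is not the SPSS or Prism procedure used in the study; the function name and sample data are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(measured, reference):
    """Return bias, 95% limits of agreement and the plotted points
    (pairwise mean, pairwise difference) for two paired series."""
    if len(measured) != len(reference) or len(measured) < 2:
        raise ValueError("need two paired series with at least two values")
    diffs = [m - r for m, r in zip(measured, reference)]
    means = [(m + r) / 2 for m, r in zip(measured, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return {
        "bias": bias,
        "lower": bias - 1.96 * sd,
        "upper": bias + 1.96 * sd,
        "points": list(zip(means, diffs)),
    }


# Made-up paired FRC values (L), for illustration only.
result = bland_altman([3.0, 3.2, 2.8, 3.4], [3.1, 3.0, 3.0, 3.3])
print(f"bias={result['bias']:.2f} L, "
      f"limits=({result['lower']:.2f}, {result['upper']:.2f}) L")
```

The sign of the bias depends on the order of subtraction, so the convention (here, measured minus reference) should be stated with every plot.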
Results
In vitro comparison
A total of 108 measurements were performed across the three MBNW devices (table 1). Differences between measured and lung model FRC values are shown in figure 2 and table 2. FRC accuracy was within the specified 5% accuracy range of the lung model FRC for 100% of WIMR measurements and 97% of EM measurements. All FRC measurements using ndd v.2.00 were underestimated and not within the specified 5% accuracy range. Using ndd v.2.01, accuracy improved, although only 36% of measurements were within the 5% accuracy range.
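The accuracy criterion applied here (at least 95% of FRC values within 5% of the known volume) can be expressed as a minimal sketch. The function names and measurement values below are invented for illustration and are not the study data.

```python
def relative_difference_pct(measured_l, known_l):
    """Relative difference (%) = (measured FRC - lung model FRC) x 100 / lung model FRC."""
    return (measured_l - known_l) * 100 / known_l


def meets_consensus_criterion(measured, known, tolerance_pct=5.0, required_fraction=0.95):
    """Return (passed, fraction_within_tolerance) for paired measured/known FRC values."""
    if len(measured) != len(known) or not measured:
        raise ValueError("need two non-empty paired series")
    within = [
        abs(relative_difference_pct(m, k)) <= tolerance_pct
        for m, k in zip(measured, known)
    ]
    fraction = sum(within) / len(within)
    return fraction >= required_fraction, fraction


# Invented measurements against the four lung model volumes used in the study.
known = [1.51, 1.81, 2.11, 2.31]
passed, fraction = meets_consensus_criterion([1.50, 1.83, 2.10, 2.05], known)
print(passed, fraction)
```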
Table 1. In vitro functional residual capacity (FRC) measurements on each device
Table 2. Differences in in vitro functional residual capacity (FRC) measurements from the lung model functional residual capacity for four different lung volumes
Figure 2. Bland–Altman plots of in vitro functional residual capacity (FRC) measurements on each device. Data are plotted as measured FRC minus lung model FRC, expressed as the absolute difference versus mean of measured and lung model FRC values. Absolute differences are shown as circles; the mean difference and the upper and lower limits of agreement (mean difference±1.96 sd of differences) are shown as dashed lines. a) Woolcock in-house device; b) Exhalyzer D device; c) EasyOne Pro LAB device analysed in Version 2.01.00.09.
In vivo comparison
A total of 29 subjects (20 healthy controls and 9 asthmatics) were included in the analyses, and their characteristics are outlined in table 3. The mean (±sd) FRC measured by the WIMR (3.27±0.82 L) and EM (3.56±0.92 L) devices did not differ significantly from FRCpleth (3.44±0.77 L) or from each other, whereas FRC measured by ndd v.2.00 (2.71±0.64 L) was significantly lower than FRCpleth, WIMR and EM (p<0.0001) (figure 3 and table 4). The same pattern of FRC differences was observed in healthy control and asthmatic subjects.
Table 3. Subject demographics and standard lung function measurements
Table 4. Differences in in vivo functional residual capacity (FRC) measurements from body plethysmography FRC on each device
Figure 3. Bland–Altman plots of in vivo functional residual capacity (FRC) measurements on each device. Data are plotted as body plethysmography FRC (FRCpleth) minus multiple-breath nitrogen washout (MBNW) FRC, expressed as the absolute difference versus mean of FRCpleth and MBNW FRC. Absolute differences are shown as circles; the mean difference and the upper and lower limits of agreement (mean difference±1.96 sd of differences) are shown as dashed lines. a) Woolcock in-house device; b) Exhalyzer D device; c) EasyOne Pro LAB device analysed in Version 2.01.00.09.
When the ndd data were reanalysed using ndd v.2.01 software, end-tidal N2 concentrations became systematically higher, such that only 9 of the 29 subjects had measurements that reached an acceptable end-tidal N2 concentration. However, a majority of the measurements exhibited a stable plateau at the end of the washout. For the purposes of this study, all 29 subjects were included for analysis to allow comparison of the effect on FRC. FRC increased when reanalysed using ndd v.2.01 (3.06±0.71 L) but remained significantly lower than FRCpleth (p=0.011) and the EM device (p<0.0001); however, it did not differ from the WIMR device.
Discussion
Summary of results
This is the first study to compare in vitro and in vivo FRC measurements in healthy and asthmatic adults, using two currently available commercial devices and one in-house MBNW device. The WIMR device measured in vitro FRC closest to the known lung model volume. The mean overestimation of in vitro FRC and FRCpleth was 3% by the EM device, in comparison to a mean 21% underestimation by the ndd device. However, on reanalysis using ndd v.2.01, underestimation was reduced to 11% both in vitro and in vivo. There were statistically significant differences in FRC measurements between commercial MBNW devices, although this difference was relatively small between EM and ndd v.2.01. Furthermore, the pattern of differences (i.e. over-estimation or under-estimation of FRC) between devices was consistent using both the in vitro physical lung model and the in vivo measurements.
Comparison to other studies
The Consensus recommendations based on expert opinion stated that 95% of the in vitro FRC measurements should be within 5% of the target volume [4]. WIMR and EM devices fulfilled this criterion; however, the ndd device did not achieve 5% accuracy for any measurements (v.2.00). When measurements were reanalysed using ndd v.2.01, 36% fulfilled this criterion. Both commercial devices were reported previously to be highly accurate in measuring FRC using a physical lung model [14]. This lung model was water-based and incorporated BTPS (body temperature, ambient pressure, saturated) correction, whereas ours did not. Despite this, our results showed the same pattern of over/underestimation in vitro and in vivo. This is consistent with their in vivo results in 10 healthy adults, where EM overestimated FRCpleth by 14% and ndd underestimated by 23%. A recent paediatric study with healthy control and cystic fibrosis patients also reported the same pattern, i.e. higher FRC measurements using the EM device compared to the ndd device [15]. Other in vitro and in vivo paediatric studies exist [9, 11], but they involve different tracer gases, which limits comparison.
Impact of N2 estimation method
Differences in FRC across the three devices are probably attributable to device and software variations, including the method used to calculate N2 concentration. The WIMR device measures N2 directly, whereas the EM device uses Dalton's law to compute N2 by simple subtraction of other constituent gases in the expired air. These two methods gave similar FRC results. In contrast, the ndd v.2.00 software uses the concept of a prototype expirogram, derived from the shape of the molar mass versus expired volume curve in the early breaths of the washout. The expired N2 volume for each breath is then determined by scaling the prototype expirogram to match the end-expiratory N2 concentration for that breath. Potential inaccuracies could occur if the expirogram shape changed greatly during the course of the washout, as is typically observed in more severe obstructive airways disease. However, ndd v.2.01 uses a combination of molar mass measurements and Dalton's law to compute N2, on a point-by-point basis for the entire expirogram. This could partly explain the improved accuracy of ndd v.2.01 and the reduced FRC discrepancy between the new software and the WIMR and EM devices.
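The subtraction approach based on Dalton's law can be illustrated with a minimal sketch. It assumes dry gas fractions that sum to one and treats argon and any other gases as a single optional term; how each device actually handles argon and water vapour is not described here, so the `f_other` parameter is a hypothetical placeholder.

```python
def n2_fraction_by_subtraction(f_o2, f_co2, f_other=0.0):
    """Estimate the dry-gas N2 fraction as the remainder after subtracting the
    measured O2 and CO2 fractions (and an optional term for other gases)."""
    for name, value in (("f_o2", f_o2), ("f_co2", f_co2), ("f_other", f_other)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a fraction between 0 and 1")
    f_n2 = 1.0 - f_o2 - f_co2 - f_other
    if f_n2 < -1e-9:
        raise ValueError("gas fractions sum to more than 1")
    return max(f_n2, 0.0)


# Approximate dry room air: 20.95% O2, 0.04% CO2, 0.93% argon.
print(f"{n2_fraction_by_subtraction(0.2095, 0.0004, f_other=0.0093):.4f}")
```

Because N2 is obtained as a remainder, errors in the O2 or CO2 signals (or in their alignment) carry straight through to the N2 estimate.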
Impact of other software changes
The other major change in ndd v.2.01 involves modifications to the estimation of delay between the flow and respective gas measurement points, which are more robust to variation in breathing patterns, particularly very brief pauses in flow (personal communication with the manufacturer). Differences in delay time potentially have a large effect on FRC calculations [20, 21].
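To illustrate why the delay matters, the sketch below integrates flow × N2 fraction over a synthetic breath with two different assumed delays. It is not the manufacturer's algorithm: the signals, sampling interval and delay values are invented, and the function name is hypothetical.

```python
def expired_n2_volume(flow_l_s, n2_fraction, dt_s, delay_samples=0):
    """Integrate flow x N2 fraction (rectangular rule) after shifting the gas
    signal earlier by `delay_samples`, i.e. assuming the gas sample recorded at
    index i + delay_samples belongs to the flow recorded at index i."""
    if len(flow_l_s) != len(n2_fraction):
        raise ValueError("flow and gas signals must have equal length")
    if not 0 <= delay_samples < len(flow_l_s):
        raise ValueError("delay must be within the signal length")
    usable = len(flow_l_s) - delay_samples
    total = sum(flow_l_s[i] * n2_fraction[i + delay_samples] for i in range(usable))
    return total * dt_s


# Synthetic expiration: flow falls from 2 to 1 L/s as N2 steps from 0 to 0.8.
flow = [2.0] * 5 + [1.0] * 5
n2 = [0.0] * 5 + [0.8] * 5
no_shift = expired_n2_volume(flow, n2, dt_s=0.1, delay_samples=0)
shifted = expired_n2_volume(flow, n2, dt_s=0.1, delay_samples=2)
print(f"{no_shift:.2f} L vs {shifted:.2f} L")
```

In this toy case a two-sample change in the assumed delay alters the integrated N2 volume substantially, because the concentration step coincides with a change in flow.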
Newer software versions were available at the time of writing (v.2.2.0.15 onwards). These involve major changes to the user interface; however, they fundamentally employ the same indirect N2-based FRC calculation and delay estimation method. Our results thus have significant implications for the reanalysis of old data and are relevant to ndd v.2.01 as well as any newer versions. Furthermore, they illustrate the importance of analysis methods and any preprocessing algorithms that may be used, as well as the need for software transparency [22]. They also highlight the importance of ongoing validation and suggest that standardisation efforts are gradually working.
It should be noted that this study was unable to confirm whether collecting data using ndd v.2.01 would have further improved the FRC accuracy for the ndd device. A large proportion of the in vivo tests reanalysed using the new software failed to meet end-of-test criteria, possibly due to the underestimation of N2 in the old software. One could speculate that if acceptable end-tidal N2 concentration had been met, in vitro FRC may have been slightly closer to the true lung volume and in vivo FRC closer to FRCpleth.
Impact of patient factors
Other factors that may affect FRC measurements include the effect of breathing patterns during the washout. Different mouthpieces, resistances, the nature of the real-time displays of volume and patient breathing incentives, and even different open bypass systems delivering oxygen [15], may have affected breathing patterns. However, dead space correction was applied in each device and the same 1 L breathing protocol was used. More importantly, the direction of over/underestimation was preserved in both the in vitro and in vivo FRC comparisons, i.e. independent of BTPS correction, breathing pattern and the presence of CO2. Thus, it is unlikely that patient factors contributed significantly to the differences between devices. In addition, the differences between MBNW FRC and FRCpleth on each device were of similar magnitude between healthy controls and subjects with asthma with well-preserved spirometry. It is unknown whether the same results would hold in subjects with more airway obstruction, or in other patient groups such as those with chronic obstructive pulmonary disease (COPD).
Limitations
Our study had a number of limitations. First, we compared differences only in FRC between devices; however, LCI, Sacin and Scond are the more clinically relevant MBNW indices. There are no known lung models to evaluate these other indices and a formal comparison of these indices in vivo was not performed. Nevertheless, high-quality FRC measurement is necessary for evaluation of these indices. Second, between-session variability for FRC has not been defined, so it is not known whether differences between in vivo measurements are within the limits of normal test variability. Third, the applicability of our results to the paediatric population is unknown. However, the simple design of the lung model could be easily adapted to a smaller syringe. Also, we only evaluated the standardised 1 L breathing protocol [23], so the effect of the free breathing protocol used by other groups [24], and especially in infants and children, is unknown. Furthermore, the syringe is an ATPD model rather than a BTPS model, which is less representative of the actual physiological situation but more practical for routine laboratory use. Despite this, the same pattern of accuracy was seen in vitro and in vivo, suggesting that incorporation of BTPS correction has a minimal effect on the relative errors reported. Finally, as discussed above, we did not evaluate the accuracy of data collected using ndd v.2.01, which remains a subject for further work.
Conclusion
We have shown differences in the measurements of FRC between three MBNW devices in a physical syringe model of the lung. The syringe lung model used in this study was simple, portable and relatively easy to produce compared to that used in other studies [6, 25, 26]. It would allow a simple and practical way to calibrate or check the MBNW setup, because it tests the combined accuracy of volume and N2 concentration measurement during a more realistic expirogram and the correct alignment of these signals. The use of such a syringe model may be beneficial for comparison between devices and laboratories and for quality assurance monitoring. FRC differences were also reflected in vivo in healthy and asthmatic subjects, in relation to plethysmographic FRC. Further work is required to improve accuracy, and the clinical significance of the in vivo differences observed remains to be determined. Differences are likely to reflect the method of calculating N2 concentration and other software factors. How these FRC errors translate to the accuracy of other indices has been explored for LCI [15], but not Sacin and Scond so far. Nevertheless, our results show that the state of the art is closer to achieving better comparability and standardisation for FRC accuracy across existing MBNW devices.
Supplementary material
Description of MBNW devices 00011-2017_supp
Disclosures
G.G. King 00011-2017_King
Acknowledgements
The authors acknowledge the contribution of Gunnar Unger, Martin Turner, Aaron Skelsey and Sunny Ye, who were involved with the development of the syringe lung model. Both manufacturers for the Eco Medics and ndd devices were consulted for the technical accuracy of the manuscript. The authors acknowledge the contribution of Christian Buess, who provided additional technical information regarding the ndd software where indicated in the manuscript. Neither manufacturer influenced the study design, results or interpretation.
Footnotes
This article has supplementary material available from openres.ersjournals.com
Conflict of interest: Disclosures can be found alongside this article at openres.ersjournals.com
- Received January 29, 2017.
- Accepted June 23, 2017.
- Copyright ©ERS 2017
This article is open access and distributed under the terms of the Creative Commons Attribution Non-Commercial Licence 4.0.