Abstract
In Europe, two commercial devices are available to measure combined single-breath diffusing capacity of the lung for nitric oxide (DLNO) and carbon monoxide (DLCO) in one manoeuvre. Reference values were derived by pooling datasets from both devices, but agreement between devices has not been established.
We conducted a randomised crossover trial in 35 healthy adults (age 40.0±15.5 years, 51% female) to compare DLNO (primary end-point) between MasterScreen™ (Vyaire Medical, Mettawa, IL, USA) and HypAir (Medisoft, Dinant, Belgium) devices during a single visit under controlled conditions. Linear mixed models were used, with device and period as fixed effects and a random intercept for each participant.
Difference in DLNO between HypAir and MasterScreen was 24.0 mL·min−1·mmHg−1 (95% CI 21.7–26.3). There was no difference in DLCO (−0.03 mL·min−1·mmHg−1, 95% CI −0.57–0.12) between devices while alveolar volume (VA) was higher on HypAir compared to MasterScreen™ (0.48 L, 95% CI 0.45–0.52). Disparity in the estimation of VA and the rate of NO uptake (KNO=DLNO/VA) could explain the discrepancy in DLNO between devices. Disparity in the estimation of VA and the rate of CO uptake (KCO=DLCO/VA) per unit of VA offset each other resulting in negligible discrepancy in DLCO between devices. Differences in methods of expiratory gas sampling and sensor specifications between devices likely explain these observations.
These findings have important implications for derivation of DLNO reference values and comparison of results across studies. Until this issue is resolved, reference values, established on the respective devices, should be used for test interpretation.
Large discrepancies between commercial devices to measure single-breath diffusing capacity of the lung for nitric oxide in healthy subjects caution against pooling or direct comparison of measurements obtained using different protocols and devices https://bit.ly/3vKyF7U
Introduction
Lung diffusing capacity (DL) measures the conductance of gas transfer from alveolar air to capillary haemoglobin. The combined measurement of diffusing capacity for nitric oxide (DLNO) and carbon monoxide (DLCO) has recently been summarised in a technical standards document by a Task Force of the European Respiratory Society (ERS) [1]. DLNO has been mainly used in research settings in healthy people and in various cardiopulmonary diseases; however, its additional value for use in clinical practice has yet to be determined. To date, two devices are commercially available in Europe: the MasterScreen™ PFT Pro, by Vyaire Medical (Mettawa, IL, USA) (hereafter referred to as MasterScreen); and the HypAir by Medisoft (Dinant, Belgium) (hereafter referred to as HypAir).
Large sets of normal values have been published for adults using prospectively collected data on the MasterScreen [2], and by pooling existing data [1] from three studies on healthy subjects [3–5] collected on MasterScreen and HypAir devices and a modified Jaeger DLCO device to allow additional measurement of DLNO [3]. Recent data [6] suggest substantial differences in predicted DLNO values between the two reference equations by Munkholm et al. [2] and Zavorsky et al. [1]. Different study populations, devices, testing protocols and analysis methods are likely contributing factors. Interesting observations were made by Munkholm et al. [2] showing that the study by Aguilaniu et al. [4] produced the highest DLNO predicted values when compared to each individual study [3, 5] contributing data to the official ERS reference equations [1]. Since Aguilaniu et al. [4] contributed a large dataset to the official ERS reference equation [1] (about 54% of total) and used the HypAir device, we speculated that the observed difference between the two equations [6] may be partly due to differences between devices used to measure DLNO. To examine this possibility, we designed a randomised crossover study to compare the MasterScreen and HypAir devices in healthy subjects under controlled conditions and using measurement protocols recommended by the respective manufacturer. A secondary aim was to investigate the intrasession variability of both devices.
Methods
We conducted a single-centre randomised crossover trial at the University Hospital of Zurich, Switzerland between October 2019 and January 2020. Study participants were invited to the research laboratory (430 m above sea level) to perform spirometry and combined DLNO–DLCO measurements on the MasterScreen and HypAir devices in random order during a single study visit. Participants were advised not to perform vigorous physical activities for at least 24 h prior to testing. Coffee and meal consumption were restricted 3 h before the study visit. At the beginning of each visit, height and weight were measured to the nearest 0.1 cm and 0.1 kg, respectively. Body mass index was calculated based on height and weight.
Ethics
This study does not fall under the scope of the Human Research Act (HRA) in Switzerland. The study was designed to compare the output of two different pulmonary function devices, and the results were not used for diagnostic purposes or to provide any treatment advice. The ethical committee of the canton of Zurich confirmed with a declaration of responsibility that ethical approval was not necessary for this study (2019-02026). All participants provided written consent to participate in this study. The study was registered with Clinicaltrials.gov (NCT04016597).
Participants
Healthy male and female adults were recruited at the Epidemiology, Biostatistics and Prevention Institute at the University of Zurich, the physiotherapy master's programme at the Zurich University of Applied Sciences (ZHAW) and the University Hospital Zurich, Switzerland. Inclusion criteria were healthy subjects aged 18 years and older. Exclusion criteria were current smoking or smoking within the last 12 months, chronic lung disease (e.g. asthma, COPD), a forced expiratory volume in 1 s (FEV1) and/or forced vital capacity (FVC) below the lower limit of normal [7], acute respiratory infection, previous thoracic surgery, a body mass index >30 kg·m−2 and pregnancy.
Randomisation
We used simple randomisation (1:1 ratio) to allocate study participants to start the measurements either with the MasterScreen or HypAir device. A computer-generated list of random numbers was created by an independent person not involved in the study using the online randomisation tool accessible at https://www.randomizer.org. The generated list contained the numbers 1 or 2, where 1 implied the participant starts with the MasterScreen device and 2 implied the participant starts with the HypAir device. Access to the list was restricted to two independent persons not involved in this study. Allocation concealment was ensured using central randomisation, by ad hoc request of the allocation sequence via phone. This was done after the participant provided verbal consent to participate in the study and inclusion and exclusion criteria were verified. Masking of participants and outcome assessors performing pulmonary function tests was not possible in this study design.
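The allocation scheme described above can be illustrated with a short sketch. The study list was generated with the online tool named in the text; the Python code below is only a stand-in using the standard `random` module, and the function name and seed are hypothetical. It shows why simple randomisation does not guarantee equal group sizes: each entry is drawn independently.

```python
import random


def simple_randomisation(n_participants, seed=None):
    """Return a list of 1s and 2s, one entry per participant.

    1 = start with MasterScreen, 2 = start with HypAir (coding as in the text).
    Each entry is drawn independently with probability 0.5, so the two
    sequence groups are not forced to be the same size.
    """
    rng = random.Random(seed)
    return [rng.choice([1, 2]) for _ in range(n_participants)]


# The seed is arbitrary and only makes this illustration reproducible.
allocation = simple_randomisation(35, seed=2019)
n_masterscreen_first = allocation.count(1)
n_hypair_first = allocation.count(2)
print(n_masterscreen_first, n_hypair_first)
```

A blocked scheme would instead shuffle fixed-size blocks containing equal numbers of 1s and 2s.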
Quality control
Prior to the start of the study, both devices underwent technical check-ups and rigorous quality control by technicians of the respective companies or service providers. Each day the devices were manually calibrated using the three-flow method and a calibrated 3-L syringe. Besides volume calibration, a gas calibration was performed using automated procedures for helium (He), carbon monoxide (CO), nitric oxide (NO) and oxygen (O2). To ensure high-quality measurements, two members of the study team (QdG, MM) served as biological controls allowing us to detect any relevant fluctuations in DLNO values during the course of the study. Both team members (male: age 26 years and female: age 54 years, nonsmokers and free of any chronic disease impacting pulmonary function tests) performed weekly diffusing capacity measurements on both devices.
Measurement protocol
All measurements were performed on the MasterScreen™ PFT Pro (Vyaire Medical) and the HypAir (Medisoft) devices. Technical specifications for both devices are given in table S1. During spirometry and diffusing capacity measurements, participants were asked to stay seated to avoid any influence of changes in cardiac output on diffusing capacity measurements. They were allowed to drink water between test manoeuvres. Tests were done in the following order on each device: 1) slow spirometry, 2) forced spirometry, and 3) DLNO–DLCO. At least three technically acceptable manoeuvres were performed for both slow and forced spirometry following established standards [8]. The test with the highest value of the two best tests (i.e. two tests within 150-mL difference) was used in analysis. For DLNO, at least three technically correct tests (e.g. no Valsalva or Müller manoeuvre, inspired volume ≥90% of vital capacity) were performed on each device following technical standards [9]. Additional tests (maximum of five) were done if the two best tests were not within 17 mL·min−1·mmHg−1 [9]. Between DLNO–DLCO tests, a 5-min break was allowed for complete wash-out of test gases before the new test started. After all tests had been completed on one device, a 5-min rest was ensured before starting measurements on the other device.
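The DLNO repeatability rule in this protocol can be sketched as a simple decision function. This is an illustration only, not the devices' software: the function names and example values are hypothetical, "two best" is taken to mean the two highest values, and the maximum of five is assumed to refer to the total number of tests per device.

```python
def two_best_within(values, tolerance=17.0):
    """True if the two highest values differ by no more than `tolerance`
    (same units as the values, here mL/min/mmHg)."""
    if len(values) < 2:
        return False
    best, second = sorted(values, reverse=True)[:2]
    return best - second <= tolerance


def needs_additional_test(values, min_tests=3, max_tests=5, tolerance=17.0):
    """Decide whether another DLNO-DLCO manoeuvre should be attempted."""
    if len(values) < min_tests:
        return True   # fewer than three technically correct tests so far
    if two_best_within(values, tolerance):
        return False  # repeatability criterion met
    return len(values) < max_tests  # stop once the maximum is reached


# Hypothetical DLNO values from technically correct tests on one device.
session = [150.2, 141.0, 168.9]
print(needs_additional_test(session))  # two highest differ by more than 17
session.append(160.1)
print(needs_additional_test(session))  # two highest are now within 17
```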
Study end-points
The primary end-point was DLNO (in mL·min−1·mmHg−1). Secondary outcomes were DLCO, rate constant for NO or CO removal from alveolar gas (i.e. permeability factor, κNO or κCO), physiological rate of NO or CO uptake from alveolar gas (transfer coefficient of the lung for nitric oxide (KNO) or carbon monoxide (KCO)) where K=κ/(barometric pressure–water vapour pressure) and numerically equals the corresponding DL/VA for NO or CO, alveolar volume (VA), change in alveolar NO fraction (ΔFANO=expired−inspired NO concentration); change in alveolar CO fraction (ΔFACO=expired−inspired CO concentration); breath-hold time, inspired and expired concentrations for NO, CO, He and O2.
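The relationships among these end-points, as defined above, can be written compactly. The symbols P_B (barometric pressure) and P_H2O (water vapour pressure) and the superscripts "exp" and "insp" are notation introduced here for illustration; the equalities are numerical, as stated in the text, and unit conversions are omitted.

```latex
\begin{align}
K_{NO} &= \frac{\kappa_{NO}}{P_B - P_{H_2O}}, &
K_{CO} &= \frac{\kappa_{CO}}{P_B - P_{H_2O}} \\
D_{LNO} &= K_{NO} \times V_A, &
D_{LCO} &= K_{CO} \times V_A \\
\Delta F_{A,NO} &= F_{NO}^{\mathrm{exp}} - F_{NO}^{\mathrm{insp}}, &
\Delta F_{A,CO} &= F_{CO}^{\mathrm{exp}} - F_{CO}^{\mathrm{insp}}
\end{align}
```

Written this way, any between-device difference in DL can be traced to a difference in K, in VA, or in both.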
Statistical analyses and sample size calculation
Since there were no data available on which we could base our power calculations, we used data from our own pilot study (n=6) during which we measured team members on the two different devices. The intraclass correlation coefficient (ICC) for DLNO values measured with the MasterScreen and HypAir devices was 0.96 with a 95% confidence interval (95% CI) of 0.85–1.0. Since all participants of the pilot study were experienced in performing spirometry and DLNO–DLCO measurements, we decided to follow a more conservative approach. Therefore, with an estimated ICC of 0.85 (95% CI 0.75–0.95), 31 participants were required (ICCest Calculation; calculated with nQuery Advisor 7.0) for primary end-point analysis. To account for possible dropout, we aimed to recruit 35 participants.
Descriptive data are presented as number (per cent) or means±sd. Diffusing capacity outcomes from HypAir and MasterScreen were analysed with a linear model adjusted for repeated measurements and reported as mean (95% confidence interval). Comparisons of primary (DLNO) and secondary end-points between devices were calculated using a linear mixed model [10] with device (MasterScreen versus HypAir, coded as 0, 1) and period (i.e. 1st device used or 2nd device) as fixed effects and a random intercept for each participant. Intrasession variability of the HypAir and MasterScreen devices was quantified by the ICC from a two-way mixed model. Precision of DLNO values was quantified by the within-subject standard deviation (SDws=root mean square error), calculated by the root mean square (RMS) method, and the coefficient of variation (CV) [11, 12]. Intra-device repeatability was calculated as 1.96×√2×SDws (95% level of confidence). ICCs and their 95% confidence intervals were calculated for DLNO, DLCO and VA using a two-way mixed model (consistency, single measurement; ICC(3,1)) [13].
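The precision measures named above can be illustrated with a minimal sketch. The data are invented, the function names are hypothetical, and the CV is computed here as SDws divided by the overall mean, which is one common definition and an assumption rather than the formula confirmed by the source.

```python
import math
from statistics import mean, variance


def within_subject_sd(repeats):
    """Within-subject SD by the root mean square method.

    repeats: one list of repeated measurements per participant (>= 2 values each).
    The per-participant sample variances are averaged, then square-rooted.
    """
    return math.sqrt(mean([variance(values) for values in repeats]))


def repeatability_coefficient(sd_ws):
    """1.96 * sqrt(2) * SDws, the repeatability formula given in the text."""
    return 1.96 * math.sqrt(2) * sd_ws


# Hypothetical DLNO values (mL/min/mmHg), three tests per participant.
dlno = [[140.0, 144.0, 142.0], [120.0, 118.0, 122.0], [160.0, 165.0, 161.0]]
sd_ws = within_subject_sd(dlno)
grand_mean = mean([v for values in dlno for v in values])
cv_percent = 100 * sd_ws / grand_mean  # assumed CV definition, see lead-in
repeatability = repeatability_coefficient(sd_ws)
print(round(sd_ws, 2), round(cv_percent, 2), round(repeatability, 2))
```

The repeatability coefficient is the difference below which two repeated measurements in the same person are expected to fall 95% of the time.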
Results
35 participants were recruited and completed all measurements without experiencing any adverse events (figure 1). Characteristics of the study population stratified by test period are given in table 1.
Table 2 provides an overview of between-device differences in DLNO (primary end-point) and all secondary end-points including all individual tests and adjusted for repeated measures. Mean raw values for DLNO, DLCO, VA, κNO and κCO from all individuals’ tests performed on each of the two devices are shown in figure 2. Individual mean raw data for inspired and expired gas concentrations from MasterScreen and HypAir are summarised in table S2. Individual mean raw data for breath-hold time, inspiratory volume, KNO, KCO, κCO, κNO, Δ alveolar carbon monoxide fraction (FACO) and Δ alveolar nitric oxide fraction (FANO) are provided in figures S1 and S2.
In mixed linear models adjusted for period and device, the difference in DLNO between HypAir and MasterScreen was 24.0 mL·min−1·mmHg−1 (95% CI 21.7 to 26.3), see table 3. Similarly, large differences were noticed in VA (8%), κNO (15%) and κCO (16%), while DLCO was not different between HypAir and MasterScreen (table 3).
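How a device difference is separated from a period effect in a two-period crossover can be sketched as follows. This is not the linear mixed model used in the study: it is the simpler classical estimator based on within-participant differences by sequence, applied to invented per-participant mean values, and the function name is hypothetical.

```python
from statistics import mean


def crossover_device_effect(records):
    """Period-adjusted HypAir-minus-MasterScreen difference for a 2x2 crossover.

    records: list of (first_device, masterscreen_value, hypair_value) tuples,
    where first_device is "MasterScreen" or "HypAir".
    Returns (device_effect, period_effect).
    """
    # Period 2 minus period 1 for each participant, grouped by sequence.
    ms_first = [hyp - ms for first, ms, hyp in records if first == "MasterScreen"]
    hyp_first = [ms - hyp for first, ms, hyp in records if first == "HypAir"]
    # MasterScreen-first sequence: difference = device effect + period effect.
    # HypAir-first sequence: difference = -device effect + period effect.
    device_effect = (mean(ms_first) - mean(hyp_first)) / 2
    period_effect = (mean(ms_first) + mean(hyp_first)) / 2
    return device_effect, period_effect


# Hypothetical per-participant mean DLNO values (mL/min/mmHg).
records = [
    ("MasterScreen", 140.0, 165.0),
    ("MasterScreen", 120.0, 143.0),
    ("HypAir", 150.0, 173.0),
    ("HypAir", 130.0, 155.0),
]
device_effect, period_effect = crossover_device_effect(records)
print(device_effect, period_effect)
```

Because each sequence contributes its own mean difference, unequal sequence group sizes do not bias the device estimate.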
Intrasession variability
Intrasession variability characteristics for MasterScreen and HypAir are displayed in table 4. All participants fulfilled the test quality criteria for DLNO (i.e. the two highest tests were within 17 mL·min−1·mmHg−1), except one participant who performed five tests on the HypAir device without reaching the quality criterion.
Biological monitoring
During the study, two team members completed a total of 21 DLNO–DLCO measurements. Both individuals showed slight variation in DLNO with a mean variation of 6.13 mL·min−1·mmHg−1 (CV 3.17%) and 2.60 mL·min−1·mmHg−1 (CV 2.72%) on the HypAir device, respectively. On the MasterScreen device, mean variation of DLNO was 4.56 mL·min−1·mmHg−1 (CV 3.07%) and 2.64 mL·min−1·mmHg−1 (CV 3.06%), respectively. Additional diffusing capacity outcomes (i.e. DLCO, VA, KCO, KNO, κCO, and κNO) are given in table S3. These two well-trained team members showed the same systematic between-device differences as in our study population.
Discussion
This randomised crossover study was designed to directly compare DLNO (primary end-point) measurements in healthy, non-smoking adults using two devices commercially available in Europe. The intrasession variability characteristics for DLNO and DLCO were comparable between the two devices and similar to previous studies in healthy people [9, 14] and those with chronic lung disease [15], indicating that both devices are internally consistent in measuring lung diffusing capacity outcomes. However, there are substantial differences in DLNO between the MasterScreen and HypAir devices, with values on average 17% higher by HypAir than MasterScreen. In contrast, the simultaneously measured DLCO was similar (1% difference) between the two devices. Published studies reporting DLNO reference values also showed a similar discrepancy between HypAir and MasterScreen devices. Munkholm et al. [2] measured DLNO using the Jaeger MasterScreen Pro and obtained values significantly lower than those obtained by Aguilaniu et al. [4] using HypAir with an initial 14% He and selecting DLNO values from the manoeuvre yielding the highest DLCO. Other studies, including Zavorsky et al. [5] using HypAir and 9.47% initial He concentration and van der Lee et al. [3] using a Jaeger DLCO device with substantial modifications and an added chemiluminescence NO cell, as well as a recent ERS Task Force document [1], reported DLNO values intermediate between HypAir (14% initial He) and MasterScreen (∼10% initial He). Despite the variations in subject populations, measurement methods and analysis, comparisons of prior studies are consistent with our current finding of a higher DLNO using HypAir than using MasterScreen.
Multiple factors likely contributed to the selective discrepancy in DLNO and are systematically discussed below. Between-device differences in diffusing capacity outcomes should be interpreted based on the magnitude of effects, i.e. statistical significance versus clinical/physiological relevance.
Breath-hold time
DLNO and DLCO decrease while VA increases with increasing breath-hold time [2, 16], and NO is taken up from the inspired gas mixture faster than CO. In healthy adults, a difference of 2 s in breath-hold time (i.e. 4 s versus 6 s) resulted in a mean difference in DLNO and VA of ∼6 mL·min−1·mmHg−1 and <100 mL, respectively [16]. In both HypAir and MasterScreen devices, breath-hold times are calculated using the same equations [17], but they operate with different software interfaces. The small difference in breath-hold times between measurements on the two devices (0.33 s shorter on HypAir; difference −0.33 s, 95% CI −0.42 to −0.24; table 2) should not contribute significantly to the differences in DLNO and VA between the two devices.
Anatomical dead space
The two devices use the same equation to calculate total dead space based on body weight, with a 50 mL difference in the apparatus dead space (table S1). However, the anatomical dead space varies with age, sex, height, body size and lung volume [18]; this factor may introduce inaccuracy in the assumed anatomical dead space but is expected to similarly affect results obtained on both devices.
He dilution
The recommended initial He concentration is 10% [1]. The initial He concentration by MasterScreen was 9.9%. The manufacturer of HypAir recommended a gas mixture containing 14% He (default setting). The thermal conductivity He analysers in the two devices have similar resolution (table S1). The accuracy of MasterScreen is ±0.05% or 2%, whichever is greater; the accuracy of HypAir is <1%, which could range from 50% lower to 20 times higher than that of MasterScreen. The response time of HypAir is 25 to 50 times slower than that of MasterScreen. He concentrations measured from a reservoir (HypAir) may be more constant than those measured in real time (MasterScreen). The starting He concentrations were also different (14% HypAir versus 9.9% MasterScreen). Each factor may cause minimal disparity, but cumulatively they could potentially contribute to the 10% difference in He dilution (figure S2), which in turn could account for an 8% higher VA estimated by HypAir relative to MasterScreen. However, any difference in VA is expected to similarly affect the estimates of both DLNO and DLCO, suggesting there are additional causes for the larger discrepancy in DLNO estimation.
Methods of expiratory gas sampling
With HypAir, the average inspired and expired gas concentrations are measured before and after breath-hold in separate collection bags. With MasterScreen, there is a single bag for inspiratory gas while expiratory gas concentrations are measured in real time within the device; the details of sampling and calculations are not disclosed. We were unable to extract the raw gas disappearance curves to directly verify the linearity of NO or CO uptake or the constancy of He concentration during breath-hold. Differences in the two measurement approaches may also have affected the observed rates of gas uptake (KNO, KCO).
Measurement of NO concentration and uptake
While average inspired NO was higher by HypAir than MasterScreen (47.6 versus 43.4 ppm, respectively), expired NO concentration was slightly lower (4.8 versus 5.1 ppm, respectively). The expired/inspired NO ratio was on average 14.5% higher by MasterScreen than HypAir; the corresponding KNO was 11% higher by HypAir than MasterScreen (23.1 versus 20.8 mL·min−1·mmHg−1·L−1, respectively). Since DLNO is the product of KNO and VA, the 11% higher KNO combined with the 8% higher estimates of VA by HypAir than MasterScreen could potentially explain the observed 20% net difference in DLNO between devices.
The two devices use different models of electrochemical NO cells from the same manufacturer (CiTicel®, City Technology Ltd, Portsmouth, UK). Specifications (table S1) show that while the two sensors have comparable sensitivity and similar resolution and repeatability, the model used by HypAir (3MNT) has a larger measurement range and different accuracy (0 to 1000 ppm, relative accuracy <1%) compared to that used by MasterScreen (7NT, range 0 to 100 ppm, absolute accuracy 3 ppm). Without side-by-side performance comparison of the two NO sensors under identical controlled conditions and knowledge of the algorithms used for real time expiratory gas sampling and calculation of NO uptake by MasterScreen, it remains unclear whether the two models behave comparably at the low levels of expiratory NO concentrations (average 4–5 ppm with the lowest values <2.5 ppm) or to what extent differences in sensors contributed to the discrepancy in DLNO estimates. Furthermore, sensor output signal can drift over time, thereby altering the magnitude of discrepancy both within and between devices.
Measurement of CO concentration and uptake
Both inspired and expired CO concentrations were higher by HypAir than MasterScreen (by 11% and 10%, respectively) (table 2), while KCO was 9% lower by HypAir than MasterScreen. Given that VA was 8% higher by HypAir than MasterScreen, the discrepancies in KCO and VA cancelled each other resulting in negligible net discrepancy (1%) in DLCO estimates between the two devices. The electrochemical CO cells used in the two devices exhibit similar accuracy and response time with different ranges; it is unclear whether they have similar resolutions.
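Because DL is the product of K and VA, percentage differences in K and VA combine multiplicatively. The sketch below works through that arithmetic with the rounded percentages quoted in the text for NO and CO. The function name is hypothetical, and results computed from rounded group-level percentages will not exactly match model-based estimates.

```python
def combined_percent_difference(k_percent, va_percent):
    """Percent difference in DL = K x VA implied by percent differences in K and VA."""
    ratio = (1 + k_percent / 100) * (1 + va_percent / 100)
    return (ratio - 1) * 100


# Rounded between-device differences (HypAir relative to MasterScreen) from the text.
dlno_diff = combined_percent_difference(11, 8)  # KNO +11%, VA +8%
dlco_diff = combined_percent_difference(-9, 8)  # KCO -9%, VA +8%
print(f"DLNO: {dlno_diff:+.1f}%  DLCO: {dlco_diff:+.1f}%")
```

With these inputs the two differences reinforce each other for NO (roughly +20%) and largely cancel for CO (roughly −2%).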
The NO electrochemical cell of MasterScreen has a narrower range of detection than that of HypAir, suggesting that the linearity for NO decay during breath-hold may be more strictly maintained in MasterScreen. However, the CO electrochemical cell of HypAir has a narrower range of detection than that of MasterScreen, suggesting that the linearity of CO decay may be more strictly maintained in HypAir. Without the ability to directly examine the concentration curves, in the presence of the other between-device technical differences mentioned above, and in the absence of an established gold standard, we cannot state whether one device is “more accurate” than the other.
Limitations of the study
We studied healthy subjects with a broad age range, under strictly controlled laboratory conditions, and using a randomised crossover design (i.e. each participant acts as their own control) to minimise confounding factors. We used simple randomisation (1:1 ratio) that resulted in an unequal number of participants starting with the HypAir device (i.e. 63% were randomly allocated to start with this device). Stratified block randomisation would have been the preferred randomisation method to ensure similar group sizes. Nevertheless, participant characteristics were well balanced across both sequence groups. We compared the two devices following the respective measurement conditions recommended by each manufacturer, resulting in different initial gas concentrations and breath-hold times. Strictly matching these conditions within and among subjects and/or using a simulation system may allow pinpointing the source(s) of the observed discrepancy. We did not study subjects with lung disease, although it is unlikely that the systematic differences in DLNO between the two devices will disappear when testing subjects with cardiorespiratory diseases; for example, in obstructive airway disease ventilatory inhomogeneity plus a short breath-hold time may further accentuate any discrepancy in VA, KNO and KCO between devices.
Conclusions
The rapid pulmonary uptake of NO relative to that of CO [1] increases the susceptibility of DLNO to methodological variations such as in the initial and final gas concentrations, breath-hold time, and device-related differences in the quantification of helium dilution and NO and CO uptake rates. Disparity in the estimation of VA and KNO could fully explain the observed discrepancy in DLNO measurement between devices. Disparity in the estimation of VA and KCO offset each other resulting in negligible discrepancy in DLCO measurement between devices. These large disparities measured in healthy subjects have important implications for the derivation of DLNO reference values and the comparison of results across studies, which in turn impact the utility of DLNO as a biomarker of lung disease. Further studies will examine whether and how the presence of lung disease alters device-related disparity in DLNO measurements. Given these uncertainties and the need to avoid systematic errors, caution must be exercised regarding the pooling or comparison of DLNO measurements obtained using different protocols and devices.
Supplementary material
Please note: supplementary material is not edited by the Editorial Office, and is uploaded as it has been supplied by the author.
Supplementary material 00193-2021.SUPPLEMENT
Acknowledgement
We thank all participants for their contribution to this study. Furthermore, we want to thank Fabio Barresi and Reka Maria Blazsik (University of Zurich) for their support with the central randomisation. We also want to thank Gerald S. Zavorsky (UC Davis Medical Center) for his intellectual input on the study protocol.
Footnotes
Provenance: Submitted article, peer reviewed.
This article has been revised according to the author correction published in ERJ Open Res 2021; 7: 50193-2021 [https://doi.org/10.1183/23120541.50193-2021]
This article has supplementary material available from openres.ersjournals.com
This study is registered at www.clinicaltrials.gov with identifier number NCT04016597. Individual participant data that underlie the results of this article will be made available after deidentification.
Author contributions: Conception and design: H. Dressel, T. Radtke and Q. de Groot; acquisition of data: M. Maggi and Q. de Groot; statistical analysis: S.R. Haile; interpretation: C.C.W. Hsia, H. Dressel, M. Maggi, T. Radtke, S.R. Haile and Q. de Groot; first draft: T. Radtke, Q. de Groot and C.C.W. Hsia; all authors edited, reviewed, and approved the final version of the manuscript.
Conflict of interest: T. Radtke has nothing to disclose.
Conflict of interest: Q. de Groot has nothing to disclose.
Conflict of interest: S.R. Haile has nothing to disclose.
Conflict of interest: M. Maggi has nothing to disclose.
Conflict of interest: C.C.W. Hsia has nothing to disclose.
Conflict of interest: H. Dressel has nothing to disclose.
- Received March 17, 2021.
- Accepted June 15, 2021.
- Copyright ©The authors 2021
This version is distributed under the terms of the Creative Commons Attribution Non-Commercial Licence 4.0. For commercial reproduction rights and permissions contact permissions{at}ersnet.org