Abstract
Background References from the Global Lung Function Initiative (GLI) are widely used to interpret children's spirometry results. We assessed fit for healthy schoolchildren.
Methods LuftiBus in the School was a population-based cross-sectional study undertaken in 2013–2016 in the canton of Zurich, Switzerland. Parents and their children aged 6–17 years answered questionnaires about respiratory symptoms and lifestyle. Children underwent spirometry in a mobile lung function lab. We calculated GLI-based z-scores for forced expiratory volume in 1 s (FEV1), forced vital capacity (FVC), FEV1/FVC and forced expiratory flow for 25–75% of FVC (FEF25–75) for healthy White participants. We defined appropriate fit to GLI references by mean values between +0.5 and −0.5 z-scores. We assessed whether fit varied by age, body mass index, height and sex using linear regression models.
Results We analysed data from 2036 children with valid FEV1 measurements, of whom 1762 also had valid FVC measurements. The median age was 12.2 years. Fit was appropriate for children aged 6–11 years for all indices. In adolescents aged 12–17 years, fit was appropriate for FEV1/FVC z-scores (mean±sd −0.09±1.02), but not for FEV1 (−0.62±0.98), FVC (−0.60±0.98) and FEF25–75 (−0.54±1.02). Mean FEV1, FVC and FEF25–75 z-scores fitted better in children considered overweight (−0.25, −0.13 and −0.38, respectively) than normal weight (−0.55, −0.50 and −0.55, respectively; p-trend <0.001, 0.014 and <0.001, respectively). FEV1, FVC and FEF25–75 z-scores depended on both age and height (p-interaction 0.033, 0.019 and <0.001, respectively).
Conclusion GLI-based FEV1, FVC, and FEF25–75 z-scores do not fit White Swiss adolescents well. This should be considered when using reference equations for clinical decision-making, research and international comparison.
This study suggests GLI-based FEV1, FVC and FEF25–75% z-scores over-detect abnormal lung function in Swiss adolescents, and more so among slimmer adolescents, which has important implications for clinical care, research and international comparisons https://bit.ly/3sbGtAS
Introduction
Standardised global spirometry reference equations are important for interpreting spirometry results, supporting clinical decision-making and allowing comparison among research findings. To adequately interpret spirometry results and avoid misclassification in clinical care and research, lung function references must fit the local population [1]. The Global Lung Function Initiative (GLI) collected large amounts of spirometry data from healthy nonsmoking populations around the world and developed these reference equations [2]. GLI references are now widely used in clinics and studies worldwide. An ideal fit for GLI references with a local population is defined by mean z-scores of spirometry indices equal to 0 with a standard deviation of 1; however, a shift of up to ±0.5 z-scores of the mean (i.e. mean between +0.5 and −0.5 z-scores) is considered acceptable due to sampling variations [3–5]. GLI references account for ages 3–95 years, ethnicities, height and sex [2]. Thus, z-scores of spirometry indices should not be affected by these factors. GLI references do not adjust for body composition assessed by weight or body mass index (BMI), although these factors may explain some variability in z-score results in children [6, 7], and excessive weight gain is reported to adversely affect lung growth [8, 9].
Standardising lung function references during adolescence is particularly complex [10], because physiological changes in lung growth and body structure may occur at different ages for males and females of the same ethnicity [11–13]. European studies showed that GLI references fitted children from France, Norway, Spain and the United Kingdom (UK) [14–17], yet only partially fitted children from Italy and Germany [18, 19]. At present, we know little about the factors that contribute to the poor fit for some populations. Zapletal references were used in the past in the canton of Zurich (Switzerland), but GLI references for lung function in children have been used in Switzerland since they were published. The fit of GLI references had not been assessed for Swiss children. In this study, we assessed GLI reference equation fit for healthy White schoolchildren in Switzerland and investigated factors that may affect fit.
Methods
Study design and setting
LuftiBus in the School (LUIS) is a population-based cross-sectional study of respiratory health in children aged 6–17 years, conducted from 2013 to 2016 in the canton of Zurich (ClinicalTrials.gov identifier NCT03659838). Zurich is the most populous canton in Switzerland and includes many residents born in other countries. The methodology of LUIS has been described previously [20]. All schools in the canton of Zurich were invited to participate. If the head of a school agreed, trained lung function technicians visited the school in a bus with equipment to measure lung function and anthropometrics. Parents completed a detailed questionnaire at home; children answered a questionnaire via interview with technicians at school and underwent lung function testing. The ethics committee of the canton of Zurich approved the study (KEK-ZH-Nr: 2014–0491) and informed consent was obtained from parents and children prior to participation.
Selection of the study population
We included children with available parent and child questionnaires. To obtain a sample of healthy children, we excluded those with parent-reported wheeze in the past year, use of inhaled corticosteroids in the past year or lifetime doctor's diagnosis of asthma (figure 1). We excluded children who reported smoking at least once per week and those who reported cough or a cold on the day of the measurement. These are standard criteria to define healthy participants (with some variations between studies) [6, 19, 21, 22]. We excluded children with ethnicities other than White, because there were too few to assess fit [1]. We also excluded children with height-for-age or BMI z-scores >4 or <−4, which could have resulted from data-recording errors during height and weight measurements. We did quality control checks of lung function measurements, as described previously [20]. We excluded children with invalid lung function measurements, i.e. those whose flow–volume curves had signs of hesitation at the start of expiration or submaximal effort. Measurements with signs of early termination of expiration, cough or glottis closure were excluded for both forced expiratory volume in 1 s (FEV1) and forced vital capacity (FVC) if these occurred within the first second of expiration, and for FVC only if they occurred after the first second [23].
Study procedures
The questionnaire for parents was paper-based and completed at home [20]. It asked about diagnoses, family and household characteristics and lifestyle, medication and respiratory symptoms. The interview questionnaire for children was shorter and focused on respiratory symptoms, presence of cough or a cold on the day of the measurement, and active smoking. Questions were based on the International Study of Asthma and Allergies in Childhood and the Leicester Respiratory Cohort study questionnaires [24, 25].
For spirometry we used Masterlab (Jaeger, Würzburg, Germany). Trained technicians conducted spirometry according to European Respiratory Society/American Thoracic Society standards and calibrations were done according to recommendations [26]. As main outcomes, we used FEV1, FVC, FEV1/FVC and forced expiratory flow between 25 and 75% of the FVC (FEF25–75). We applied GLI ethnicity-specific reference equations for White people to produce z-scores for spirometry indices using the GLI desktop software [27]. Children's standing height was measured without shoes using a stadiometer and recorded to the closest centimetre, and weight was measured without shoes and wearing light clothing by the technicians in the bus according to standard procedures [28]. We calculated BMI z-scores and height-for-age z-scores based on World Health Organization (WHO) references [28]. We classified BMI z-scores into four categories (underweight: <−2 z-scores; normal weight: ≥−2 to <1 z-scores; overweight: ≥1 to <2 z-scores; obese: ≥2 z-scores) [28].
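To make the two derived quantities in this paragraph concrete, the following is a minimal sketch, not the GLI desktop software or the authors' code. It assumes the usual LMS (Box-Cox) form in which GLI reference equations are published; the L, M and S values in the usage note are invented for illustration and are not GLI coefficients. The BMI cut-offs are those stated above.

```python
import math


def lms_z_score(measured, L, M, S):
    """z-score of a measurement under the LMS (Box-Cox) parameterisation.

    L = skewness, M = predicted median, S = coefficient of variation.
    In practice these come from the reference equations for a given age,
    height, sex and ethnic group; here they are plain arguments.
    """
    if L == 0:
        return math.log(measured / M) / S
    return ((measured / M) ** L - 1) / (L * S)


def bmi_category(bmi_z):
    """Map a WHO BMI z-score to the four categories used in this study."""
    if bmi_z < -2:
        return "underweight"
    if bmi_z < 1:
        return "normal weight"
    if bmi_z < 2:
        return "overweight"
    return "obese"
```

For example, with hypothetical values L=1, M=3.0 L and S=0.12, a measured FEV1 of 2.5 L gives `lms_z_score(2.5, 1.0, 3.0, 0.12)` of about −1.39, and `bmi_category(1.3)` returns "overweight".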
Statistical analysis
We calculated the mean (95% CI) for z-scores of spirometry indices. Deviations of the mean outside the range of −0.5 to +0.5 are generally considered physiologically meaningful [3, 5, 19]. We described the proportion of the population below the lower limit of normal (LLN) (i.e. <−1.645 z-scores) and its 95% confidence interval. By definition, 5% of the population should fall below the LLN. To address systematic deviations in the fit of GLI references by age, we stratified by two age groups: children aged <12 years and adolescents aged ≥12 years [19]. We chose this age cut-off because although puberty begins at different ages for males and females, it has usually begun in both sexes by the age of 12 years [29]. To allow further comparison, the supplementary material shows results stratified by four age groups: 6–9 years, 10–11 years, 12–13 years and 14–17 years [19].
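The summary measures described above can be sketched as follows. This is an illustrative sketch only: the normal-approximation (Wald) confidence intervals and the inclusive ±0.5 boundary are assumptions, since the text does not state which interval method or boundary convention was used, and the function name is hypothetical.

```python
import math
from statistics import mean, stdev

LLN_Z = -1.645    # lower limit of normal, as defined in the text
FIT_LIMIT = 0.5   # acceptable absolute shift of the mean z-score


def summarise_fit(z_scores):
    """Mean z-score, share below the LLN, and approximate 95% CIs."""
    n = len(z_scores)
    m = mean(z_scores)
    se_mean = stdev(z_scores) / math.sqrt(n)
    share = sum(z < LLN_Z for z in z_scores) / n
    se_share = math.sqrt(share * (1 - share) / n)
    return {
        "mean": m,
        "mean_ci": (m - 1.96 * se_mean, m + 1.96 * se_mean),
        # Assumption: boundary values of exactly +/-0.5 count as appropriate.
        "appropriate_fit": -FIT_LIMIT <= m <= FIT_LIMIT,
        "share_below_lln": share,
        "share_ci": (max(0.0, share - 1.96 * se_share),
                     min(1.0, share + 1.96 * se_share)),
    }
```

Applied to each spirometry index within each age group, a result with `appropriate_fit` set to false or a `share_below_lln` clearly above 0.05 would point to a poor fit of the reference equations.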
We studied associations of GLI-based z-scores for FEV1, FVC, FEV1/FVC and FEF25–75 with age, BMI z-scores, height and sex using scatterplots and univariable linear regression models. We tested for age–height, age–sex and height–sex interactions using multivariable regression models for each spirometry parameter and compared models with and without interaction terms using likelihood ratio tests.
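As an illustration of the interaction test described above, the sketch below compares a linear model with and without an age–height interaction term using a likelihood ratio test. It is not the authors' Stata code: the data are simulated, the covariate set (age and height only) and the centring of age and height are assumptions, and the helper names are hypothetical. For Gaussian linear models the likelihood ratio statistic reduces to n·ln(RSS_reduced/RSS_full), which is compared with a chi-squared distribution with one degree of freedom.

```python
import math
import random


def ols_rss(X, y):
    """Residual sum of squares of a least-squares fit (normal equations)."""
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((y[r] - sum(b * x for b, x in zip(beta, X[r]))) ** 2
               for r in range(n))


def lr_test_one_term(rss_reduced, rss_full, n):
    """Likelihood ratio test for one extra term in a Gaussian linear model."""
    stat = max(0.0, n * math.log(rss_reduced / rss_full))
    # Survival function of the chi-squared distribution with 1 df
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def simulate(n=500, seed=1):
    """Simulated (age, height, z-score) rows with a built-in interaction."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.uniform(6, 17)
        height = 80 + 6 * age + rng.gauss(0, 8)
        a, h = age - 12, height - 150  # centring keeps the fit well conditioned
        z = -0.4 - 0.05 * a + 0.01 * h + 0.01 * a * h + rng.gauss(0, 1)
        rows.append((a, h, z))
    return rows


rows = simulate()
y = [z for _, _, z in rows]
X_reduced = [[1.0, a, h] for a, h, _ in rows]        # without interaction
X_full = [[1.0, a, h, a * h] for a, h, _ in rows]    # with age x height term
stat, p = lr_test_one_term(ols_rss(X_reduced, y), ols_rss(X_full, y), len(y))
print(f"LR statistic {stat:.2f}, p-value {p:.3g}")
```

A small p-value indicates that the association of the z-score with age differs by height. The same comparison can be repeated for age–sex and height–sex terms.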
We performed three sensitivity analyses. In the first, we used stricter criteria to define healthy children; in the second, we used looser criteria; and in the third, we excluded children with migration backgrounds (i.e. ethnically White children who were not themselves, nor their parents, born in Switzerland (supplementary table S2)).
We used STATA (version 16; StataCorp, College Station, TX, USA) for statistical analysis and graphs. We followed Strengthening the Reporting of Observational Studies in Epidemiology reporting guidelines [30].
Results
Study population and sample selection
The LUIS study included 3870 children from 37 schools in the canton of Zurich. From these 3870 children, we analysed data from 2213 children who had parent-completed questionnaires and child-answered questionnaires, spirometry and anthropometric measurements, were ethnically White, and healthy (figure 1). After quality control of flow–volume curves, we included data from 2036 children. All had valid FEV1 and 1762 had valid FVC measurements. The median age was 12.2 years (range 6.3–17.0 years, interquartile range 9.6–14.0 years) and 49% were male (table 1). Most children were born in Switzerland (n=1814, 90%), but only two-thirds of their parents were (table 1). Compared to WHO references, our population had a similar BMI distribution (mean BMI z-score 0.04), but was slightly taller (mean height-for-age z-score 0.56).
Fit of GLI references
For children aged 6–11 years, fit was appropriate for FEV1 (mean±sd −0.37±0.94), FVC (−0.26±0.98) and FEF25–75 (−0.49±0.94). For adolescents aged 12–17 years, the fit for all three outcomes was poorer: mean±sd FEV1 was −0.62±0.98, FVC was −0.60±0.98 and FEF25–75 was −0.54±1.02 (figures 2 and 3, supplementary table S1). Fit of FEV1/FVC z-scores was appropriate for both age groups (mean±sd z-scores 6–11 years: −0.14±1.06; 12–17 years: −0.09±1.02). The proportion (95% CI) of participants with z-scores below the LLN was above the expected 5% for the 6–11-year age group for FEV1 (9%, 7–11%), FVC (8%, 6–10%) and FEF25–75 (11%, 9–13%). Among adolescents, this was even more pronounced for FEV1 (15%, 13–17%), FVC (14%, 12–16%) and FEF25–75 (14%, 11–16%). For FEV1/FVC, proportions below LLN were close to 5% for males aged 6–11 years (4%, 3–7%) and for females (7%, 5–10%) and males (7%, 5–10%) aged 12–17 years, yet not for females aged 6–11 years (9%, 7–12%). Findings were very similar in the three sensitivity analyses (supplementary tables S2 and S3).
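The link between a shifted mean z-score and the proportion below the LLN can be checked with a short calculation. The sketch below assumes that z-scores in each group are normally distributed with the reported mean and standard deviation; this is a simplifying assumption for illustration, not an analysis performed in the study.

```python
from statistics import NormalDist

LLN_Z = -1.645  # lower limit of normal


def expected_share_below_lln(mean_z, sd_z):
    """Share of a normal(mean_z, sd_z) population with z-scores below the LLN."""
    return NormalDist(mu=mean_z, sigma=sd_z).cdf(LLN_Z)


# Reported mean and sd of z-scores, taken from the paragraph above
reported = {
    "FEV1, 6-11 years": (-0.37, 0.94),
    "FVC, 6-11 years": (-0.26, 0.98),
    "FEF25-75, 6-11 years": (-0.49, 0.94),
    "FEV1, 12-17 years": (-0.62, 0.98),
    "FVC, 12-17 years": (-0.60, 0.98),
    "FEF25-75, 12-17 years": (-0.54, 1.02),
}

for label, (m, sd) in reported.items():
    print(f"{label}: {100 * expected_share_below_lln(m, sd):.1f}% below LLN")
```

Under this assumption, the expected shares round to about 9%, 8% and 11% for children aged 6–11 years and about 15%, 14% and 14% for adolescents, in line with the observed proportions. This suggests that the excess below the LLN is largely explained by the downward shift of the mean rather than by a heavier lower tail.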
Associations of GLI-based spirometry z-scores with age, BMI, height and sex
FEV1, FVC and FEF25–75 z-scores were lower for older than for younger children (figure 4). For ages 10–11 years, FEV1, FVC and FEF25–75 z-scores were lower for females than males (figure 4, supplementary table S1). FEV1/FVC z-scores did not vary with age (figure 4) or height (figure 5).
There was an inverse association of FEV1, FVC and FEF25–75 z-scores with age in multivariable linear models (supplementary table S4) and this association with age varied by height (p-values for interaction 0.033, 0.019 and <0.001, respectively) (figure 6). For those aged 6–11 years, FEV1 and FEF25–75 z-scores decreased with height, while z-scores increased with height for those aged 12–17 years. FVC z-scores increased with height in all age groups, but more so for adolescents. We found no interactions with sex (data not shown).
Children considered overweight or obese had higher FEV1, FVC and FEF25–75 z-scores than children considered normal weight or underweight. Fit was poor for children classified as underweight (e.g. mean±sd FEV1 −1.13±0.92) or normal weight (−0.55±0.96), but good for children considered overweight (−0.25±0.92) or obese (−0.08±0.93) (figure 7, supplementary table S5). FEV1, FVC and FEF25–75 z-scores also increased with BMI in linear regression models (supplementary table S4).
Discussion
Our large population-based study of healthy schoolchildren found that FEV1, FVC and FEF25–75 z-scores were lower than expected from GLI reference values for Swiss schoolchildren, particularly adolescents. The association of FEV1, FVC and FEF25–75 z-scores with height differed by age. Although FEV1 z-scores decreased with height for younger children, they increased in older children. The fit of references for FEV1, FVC and FEF25–75 z-scores was much better for children with higher BMI. This suggests that it is worthwhile considering whether weight or BMI data could be used to improve future GLI reference equations, resulting in a potentially better fit for samples who are slimmer or more overweight than the average.
Strengths and limitations
Our study had several strengths. The large sample size allowed us to assess the fit of GLI-based lung function z-scores in two age groups of White schoolchildren, each with well over 150 males and 150 females, according to GLI recommendations [1]. We performed spirometry using a state-of-the-art set-up instead of hand-held portable spirometers and did careful quality control of spirometry flow–volume curves, which increases the reliability of our results and makes measurement bias due to poor quality unlikely [20]. We obtained information about respiratory symptoms from both parents and children, which allowed us to perform sensitivity analyses to define a healthy population, adding robustness to our findings. However, our study also had some limitations. First, we cannot exclude sampling bias, because not all schools took part in the study. However, participating schools were similar to the whole canton [20]. Selection bias due to inclusion of children who are not healthy, e.g. children with chest wall abnormalities, neuromuscular, cardiological or other diseases not assessed in our questionnaires, cannot be ruled out. However, these diseases have such a low prevalence in the general population that only one or two cases could have been included in our study population by chance, and this could not have substantially influenced results. More prevalent conditions such as wheezing disorders and chronic cough have been excluded in our main and sensitivity analyses. Second, we lacked information about the onset of puberty; this information would have allowed us to better explore whether pubertal stage influences fit of GLI references in adolescents. Third, minor inaccuracies in the measurement of height could have introduced some misclassification bias, but this would have been nondifferential across age groups and is therefore unlikely to explain the lack of fit we observed among adolescents. 
Last, although the canton of Zurich is the most populated and diverse region of Switzerland, we cannot assess whether our findings apply to all Swiss schoolchildren.
Comparison with other studies
Other studies have assessed the fit of national data with GLI reference values. The fit was good in studies from France, Norway, Spain and the UK [14–17]; however, fit was poorer in studies from Germany and Italy [18, 19]. Our findings are comparable to those of Hüls et al. [19]. They assessed the applicability of GLI references for White children from several German cities using a school-based study of 1943 children aged 4–19 years and a population-based birth cohort of 1042 adolescents aged 15 years. They found sufficient fit for children younger than 10 years, but systematically lower mean z-scores for FEV1 and FVC for children older than 10 years. In the German study, mean FEV1 and FVC z-scores were higher for those aged 6–9 years (e.g. mean FEV1 z-scores 0.01 for females and 0.02 for males), than in our study (−0.41 for females, −0.42 for males). In contrast to our study where females had lower z-scores than males (14–17 years −0.70 for females, −0.53 for males), older males had lower FEV1 and FVC z-scores than older females in the two German cohorts (e.g. mean FEV1 z-scores for those aged 15–18 years −0.24 for females, −0.41 for males). In their study and in ours, fit for FEV1/FVC ratio was appropriate for all age groups. Fasola et al. [18] found that GLI references produced mean z-scores >0.5 for FEV1/FVC for 1243 males (mean 0.75) and females (0.81) aged 7–16 years from southern Italy, which led to an underestimation of children with reduced FEV1/FVC ratio. In contrast, we found a good fit of FEV1/FVC z-scores. Because mean z-scores were reduced both for FEV1 and FVC, the ratio of these two remained normal.
The poor fit of GLI references for FEV1, FVC and FEF25–75 for adolescents from our study may be related to differences in body size or timing of pubertal growth spurts that affect lung growth. Age of puberty onset, height and timing of growth spurt vary across European countries [31, 32]. Children in northern countries generally have higher median height and later onset of puberty compared to southern countries [31, 32]. The height range in our study population was 113–194 cm for males and 105–180 cm for females aged 6–17 years, whereas the height range in the GLI reference population was 96–199 cm for males and 100–188 cm for females aged 6–18 years [33]. The mean height-for-age of our participants was higher (mean±sd z-score 0.56±1.01) than the WHO references [28]. This confirms that children from the canton of Zurich are on average taller than the WHO reference population [34]. Previous studies reported a slightly higher age of peak height growth in Switzerland (13.9 years for males, 12.2 years for females) than the UK (13.6 years for males, 11.7 years for females), Canada (13.4 years for males, 11.8 years for females) and the United States (13 years for males, 11 years for females) [35–37].
Another determinant of lung function in children is body composition [6]. Children considered overweight or obese have higher values for both FEV1 and FVC, particularly for FVC, so that the FEV1/FVC ratio is lower than for normal-weight children [38, 39]. This aligns with our findings and may be due to asymmetric lung growth (i.e. increased dysanapsis) for children considered overweight and obese [9]. In line with our findings, previous studies have described associations between BMI and GLI-based lung function z-scores in children [40, 41].
Implications
GLI references are too high for the lung function of healthy adolescents from our study. The GLI-based LLN over-detects abnormal results in dynamic lung volumes and forced flows. This misclassification affects interpretations of lung function data in research and clinical practice [42]. In research, this affects cross-sectional studies when the lung function of children with a disease is compared to normal values. It also affects longitudinal analyses, when using GLI references might make lung function in older children at follow-up appear poorer than at baseline. Clinicians commonly use spirometry to diagnose respiratory disease in children with respiratory symptoms. Over-detection of pulmonary restriction and obstruction may induce unnecessary treatments and surveillance burdens on patients, as well as unnecessary costs on healthcare systems [42]. For example, underestimated FEV1 in adolescents may lead to misdiagnosis of asthma [43]. Many clinical trials use FEV1 to enrol or assign individuals to an intervention. Use of GLI-based FEV1 z-scores would classify more children below LLN and may lead to inclusion of children with lower severity. People with chronic respiratory disease are often included in national registries [44–46]. GLI-based FEV1 z-scores of Swiss registry participants may wrongly seem to decline from early school-age through adolescence only because of the age-related change in fit with GLI z-score, not because their disease becomes worse. This can affect international comparisons of lung health. Clinicians and researchers interpreting GLI-based FEV1, FVC and FEF25–75 z-scores should consider the possibility of finding results below LLN in slimmer, yet otherwise healthy children.
The GLI developed lung function equations based on data from a very large number of subjects. This was a major strength, but also a challenge due to the heterogeneity introduced by the use of datasets from different centres and lung function laboratories, with differences in measurements of spirometry and height, for example, or in definitions of healthy participants. Ideally, future references could be redone using carefully validated and controlled data and common inclusion criteria. Future studies should assess ways to compensate for BMI on spirometry indices.
In conclusion, we found evidence that GLI-based spirometry reference equations fit data for White children aged 6–11 years, but not for adolescents aged 12–17 years in Switzerland. BMI is a possible driver of deviations in FEV1, FVC and FEF25–75 z-scores and future longitudinal studies should assess the impact of timing of pubertal growth spurt on these indices. We also advise caution when interpreting borderline spirometry results in slimmer adolescents, and encourage further studies to assess fit of GLI in paediatric populations.
Supplementary material
Please note: supplementary material is not edited by the Editorial Office, and is uploaded as it has been supplied by the author.
Acknowledgements
We thank the staff from the schools, the children and their families for taking part in the study, as well as LUIS study fieldworkers for their technical support during the study. We thank Marcel Zwahlen (Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland) for his statistical advice. We thank Johanna M. Kurz, Andras Soti, Marc-Alexander Oestreich and Corin Willers (Division of Paediatric Respiratory Medicine and Allergology, Dept of Paediatrics, Inselspital, Bern University Hospital, University of Bern) and Léonie Hüsler, Eugénie Collaud and Carmen C.M. de Jong (Institute of Social and Preventive Medicine, University of Bern) for their help in the assessment of the quality of the spirometry flow–volume curves. We thank Kristin Marie Bivens (Institute of Social and Preventive Medicine, University of Bern) for her editorial contributions.
Footnotes
Provenance: Submitted article, peer reviewed.
The LuftiBus in the School (LUIS) study group: Alexander Moeller, Jakob Usemann (Division of Respiratory Medicine, University Children's Hospital Zurich and Childhood Research Centre, University of Zurich, Switzerland); Philipp Latzin, Florian Singer and Johanna M. Kurz (Division of Paediatric Respiratory Medicine and Allergology, Dept of Paediatrics, Inselspital, Bern University Hospital, University of Bern, Switzerland); Claudia E. Kuehni, Rebeca Mozun, Cristina Ardura-Garcia, Myrofora Goutaki, Eva S.L. Pedersen and Maria Christina Mallet (Institute of Social and Preventive Medicine, University of Bern, Switzerland); and Kees de Hoogh (Swiss Tropical and Public Health Institute, Basel, Switzerland).
Author contributions: C.E. Kuehni, A. Moeller and P. Latzin conceptualised and designed the study. A. Moeller supervised data collection. R. Mozun analysed the data and drafted the manuscript. C. Ardura-Garcia and E.S.L. Pedersen supported the statistical analysis. All authors gave input for interpretation of the data. All authors critically revised and approved the manuscript.
Conflict of interest: A. Moeller reports receiving consulting fees from Vertex; payments or honoraria for lectures, presentations, speaker bureaus, manuscript writing or educational events received from Vertex and Vifor; participation on a data safety monitoring or advisory board for Vertex; and leadership or fiduciary roles in other boards, societies, committees or advocacy groups, paid or unpaid, held for ERS Assembly 7 (Secretary), SGP board, SGPP board, SWGCF (co-president) and SSSCS (vice-president). All disclosures made outside the submitted work. F. Singer reports support for the present manuscript received from the Bern Lung League foundation and Kinderinsel foundation; and payment or honoraria for lectures, presentations, speaker bureaus, manuscript writing or educational events received from Vertex and Novartis, outside the submitted work. P. Latzin reports receiving grants or contracts from Vertex and Vifor, outside the submitted work; payment or honoraria for lectures, presentations, speaker bureaus, manuscript writing or educational events received from Vertex, Vifor and OM Pharma, outside the submitted work; and participation on a data safety monitoring or advisory board for Polyphor, Santhera (DMC), Vertex, OM Pharma and Vifor, outside the submitted work. J. Usemann reports receiving grants or contracts from the Swiss Lung Foundation and Palatin Foundation, Basel, Switzerland, outside the submitted work; and payment or honoraria for lectures, presentations, speaker bureaus, manuscript writing or educational events received from Vertex, outside the submitted work. R. Mozun reports support for the present manuscript received from the Institute of Social and Preventive Medicine, University of Bern, at which she is employed. The remaining authors have nothing to disclose.
Support statement: Lunge Zürich, Switzerland, funded the study set-up, development and data collection with a grant to A. Moeller. Lunge Zürich, and University Children's Hospital Zurich and Children's Research Center, University of Zurich, Switzerland, fund LUIS data management, data analysis and publications. Analysis was supported by a grant from the Swiss National Science Foundation (320030_182628) to C.E. Kuehni. J. Usemann and F. Singer received grants from the Swiss lung foundation and the Bern lung foundation. Funding information for this article has been deposited with the Crossref Funder Registry.
- Received November 2, 2021.
- Accepted February 17, 2022.
- Copyright ©The authors 2022
This version is distributed under the terms of the Creative Commons Attribution Non-Commercial Licence 4.0. For commercial reproduction rights and permissions contact permissions@ersnet.org