Abstract
Background Asthma and COPD are among the most common respiratory diseases. To improve the early detection of exacerbations and the clinical course of asthma and COPD, new biomarkers are needed. The development of noninvasive metabolomics of exhaled air into a point-of-care tool is an appealing option. However, risk factors for obstructive pulmonary diseases can potentially introduce confounding markers, because altered volatile organic compound (VOC) patterns may be linked to these risk factors instead of the disease. We conducted a systematic review and present a comprehensive list of VOCs associated with these risk factors.
Methods A PRISMA-oriented systematic search was conducted across PubMed, Embase and Cochrane Libraries between 2000 and 2022. Full-length studies evaluating VOCs in exhaled breath were included. A narrative synthesis of the data was conducted, and the Newcastle–Ottawa Scale was used to assess the quality of included studies.
Results The search yielded 2209 records and, based on the inclusion/exclusion criteria, 24 articles were included in the qualitative synthesis. In total, 232 individual VOCs associated with risk factors for obstructive pulmonary diseases were found; 58 compounds were reported more than once and 12 were reported as potential markers of asthma and/or COPD in other studies. Critical appraisal found that the identified studies were methodologically heterogeneous and had a variable risk of bias.
Conclusion We identified a series of exhaled VOCs associated with risk factors for asthma and/or COPD. Identification of these VOCs is necessary for the further development of exhaled metabolites-based point-of-care tests in these obstructive pulmonary diseases.
Tweetable abstract
Several risk factors for obstructive pulmonary diseases, including smoking, diet and BMI, may affect volatile organic compound (VOC) profiles. VOCs linked to risk (confounding) factors should be carefully considered when interpreting exhaled breath profiles. https://bit.ly/3HhJ1EV
Introduction
Asthma and COPD are among the most common respiratory diseases, affecting millions of people, and are among the top 20 causes of disability worldwide [1]. The complex pathogenesis and pathophysiology of asthma and COPD complicate improvements in the early detection of exacerbations and clinical outcomes.
To provide insight into the nature of these diseases, and help to better diagnose, phenotype and treat patients, several biomarkers have been studied, such as fractional exhaled nitric oxide (FeNO) [2], sputum eosinophils [3], blood eosinophils [4], blood neutrophils [5], fibrinogen [6] and soluble receptor of advanced glycation end products (sRAGE) [7]. Even with the currently available diagnostic and prognostic tools and well-recognised guidelines, the personal and societal burden of asthma and COPD persists, which calls for continued optimisation of care for these conditions [8, 9]. Therefore, the search continues for new tools and techniques for the improved detection, stratification and monitoring of asthma and COPD.
The lung's ability to provide noninvasive biological samples in relatively large amounts directly from the organ (exhaled air) makes it an appealing source for potential biomarkers. Volatile organic compounds (VOCs) can be detected in exhaled breath and they originate from systemic, as well as local, metabolic, inflammatory and oxidative processes [10]. Comprehensive analysis of these VOCs provides opportunities for noninvasive biomarker development.
Over the past few years, exhaled VOCs and breath signatures have gained increasing attention and there is growing evidence for their use in diagnosing, phenotyping and monitoring pulmonary diseases [11–18]. Hundreds of VOCs, which are carbon-based, low molecular weight compounds that are volatile at room temperature, have been reported [19]. Typically, exhaled VOC analysis is conducted using mass spectrometry-driven techniques and/or chemical sensor-driven devices, which are often referred to as electronic noses or, in short, eNoses [20]. Mass spectrometry enables the detection of individual VOCs, which is particularly helpful in understanding the underlying biological mechanisms and pathophysiology [21]. In contrast, eNose systems work by combining multiple cross-reactive chemical sensors to generate a composite signal [22]. Subsequently, pattern recognition-based analysis strategies are applied to provide probabilistic classifications based on differences between those composite signals. Detection of those differences could be useful for diagnostic purposes in cross-sectional settings. In addition, eNoses can detect changes in VOC patterns over time, which may be indicative of disease progression or response to treatment [23, 24]. High technical expertise is required for mass spectrometry analysis, and the data produced must be pre-processed before statistical analysis can be undertaken. eNoses require fewer data processing steps, which, combined with their ability to provide real-time responses, makes them very suitable for point-of-care diagnostics.
Integrating both VOC detection techniques in one (research) pipeline is very attractive. During a biomarker discovery phase, mass spectrometry techniques enable the identification of new VOC-based markers. When the target compounds are defined, a selection of the most suitable chemical sensors can be made and the development of a fit-for-purpose eNose, and further translation into a clinical setting, can start. The continuous advancement of chemical sensors enables increased translation of this technology into clinical settings [25].
The noninvasive, and (virtually) inexhaustible, source of exhaled air has a drawback. An air sample does not only contain endogenous VOCs, but also exogenous VOCs that arise from the environment and diet or daily life activities. These exogenous VOCs could be potential confounders when studying underlying disease processes and may have a significant impact on exhaled VOC-driven biomarkers. Therefore, one must understand how possible confounding factors, i.e. risk factors for the development of obstructive pulmonary diseases, affect the VOC profiles in the exhaled breath. Several studies have established the effects of smoking [26, 27], a significant factor affecting breath content. Other possible confounding factors include environmental factors [28], occupational exposure [29], gender [30], age [31], exposome [32] and breathing routes [33]. In addition, Krilaviciute et al. [34] demonstrated the significance of diet and lifestyle as confounders in breath VOC analyses.
In this systematic review, we aim to compile a comprehensive list of exhaled VOCs detected by mass spectrometry-based methods linked to risk factors for asthma and/or COPD. These results may facilitate the discrimination of identified VOCs that are the result of underlying asthma and COPD disease processes, from VOCs that are markers of risk factors for these conditions.
Methods
Search strategy and information sources
A systematic search in PubMed, Embase and Cochrane Libraries using the PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) approach was conducted, covering 1 January 2000 until 1 February 2022 (CRD42022311655) (the full protocol is available on www.crd.york.ac.uk/PROSPERO/). The primary review question was to find VOCs associated with risk factors for obstructive pulmonary diseases and clarify whether VOCs can discriminate between these risk factors. The included risk factors were based on a systematic umbrella review by Holtjer et al. [35], which studied risk factors for chronic respiratory diseases, ranging from previous disease history to environmental, occupational or lifestyle factors. That review examined 14 risk factors: (e-cigarette) smoking, environmental exposure (air pollution, dust, cleaning and disinfection agents, pets), occupational exposure, diet (soft drink consumption, red meat intake), allergy history, body mass index (BMI), farming, oral breathing, socioeconomic status, early menarche, Staphylococcus aureus enterotoxins, acetaminophen intake, gender and age. We extracted data on how these results were verified in different studies and the techniques used for sample collection, sample processing and statistical analysis.
The search was conducted with a combination of the following terms: breathomics, VOCs, exhaled breath, breath test and risk factors for obstructive pulmonary diseases. Further details on the methodology and search terms can be found in the supplementary materials.
Eligibility
Studies were included if they: 1) were full-length studies published in English using primary data; and 2) measured VOCs in human exhaled breath (by any collection or mass spectrometry method). Editorials, abstracts, reviews, studies using secondary data and studies of exhaled breath condensate were excluded.
Study selection
Study selection followed the PRISMA 2020 statement [36] and is depicted in a flow chart (figure 1). After the removal of duplicates, two reviewers (S. S. Khamas and A. Bahmani) independently assessed the titles and abstracts of all records retrieved by the literature search in order to identify possibly relevant research. The reviewers screened the records using Rayyan, a web-based software platform for systematic reviews (https://rayyan.ai/). Any disagreement was resolved through discussion with a third reviewer (P. Brinkman).
Data extraction
Data were extracted from the full texts and included information on study design, methodology, participant demographics, types and prevalence of VOCs, statistical analysis, p-values and description of VOC identification features.
Results
Search results
The search strategy returned 2865 records; following the removal of duplicate entries, 2209 articles were screened further. After title and abstract review, 2124 were excluded, leaving 85 for full-text screening (figure 1). Reference list searching yielded one paper which was further screened. Based on the inclusion/exclusion criteria, we included 24 articles in the qualitative synthesis. Based on the original 14 risk factors for asthma and/or COPD reported by Holtjer et al. [35], eight relevant topics were described in the remaining papers and included in this study: (e-cigarette) smoking, gender, occupational exposure, diet (food), oral breathing, age, BMI and environmental exposure.
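The record counts reported above can be cross-checked with simple arithmetic. The following is a minimal sketch: the variable names are ours, and the numbers of duplicates and of full-text exclusions are derived from the reported figures rather than stated in the text. The last two derived values assume that the paper found through reference list searching was assessed in full text alongside the 85 records retained after title and abstract review.

```python
# PRISMA flow counts as reported in the text.
records_identified = 2865
records_after_deduplication = 2209
excluded_on_title_abstract = 2124
found_via_reference_lists = 1
studies_included = 24

# Derived quantities (not stated explicitly in the text).
duplicates_removed = records_identified - records_after_deduplication
full_text_from_search = records_after_deduplication - excluded_on_title_abstract

# Assumption: the reference-list paper joined the full-text assessment.
full_text_assessed = full_text_from_search + found_via_reference_lists
excluded_at_full_text = full_text_assessed - studies_included

print("duplicates removed:", duplicates_removed)
print("full texts from database search:", full_text_from_search)
print("full texts assessed (assumed):", full_text_assessed)
print("excluded at full text (assumed):", excluded_at_full_text)
```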
Study characteristics
In the 24 articles included in the study, 4672 subjects were studied, of whom 4504 were healthy individuals. Most subjects (n=4490) were adults; two studies included paediatric subjects (n=182). The largest and smallest study populations comprised 1447 and 7 subjects, respectively. Most studies (71%) had a cross-sectional design, and 29% were case–control studies. Included studies were published between 2002 and 2021, and 52% were published in the last 5 years. The studies included [26–31, 33, 34, 39–43, 46, 48–51, 66–71] are summarised in table 1 and table 2.
In the 24 included studies, the methodologies used were heterogeneous, from sample collection to statistical analysis. Given this heterogeneity and the variety of compounds in exhaled breath, we opted for a narrative synthesis.
Reported VOCs
Table 1 of the online supplementary material provides an extensive list of discriminant VOCs, categorised by chemical family, that were found to be relevant in any of the studies included in the qualitative synthesis. Table 3 shows the VOCs that are common between different risk factors or reported in two or more independent studies. A total of 232 individual VOCs were found in the included studies; 58 compounds were reported in more than one study. Nineteen VOCs were reported in five studies as markers of gender. BMI and occupational exposure were each examined in two studies, with 11 and 15 reported VOCs, respectively. Smoking was the risk factor that was most commonly linked to VOCs; 156 VOCs were identified as indicators of smoking. The following VOCs linked to smoking were reported in three or more independent studies: p-xylene, 1,3-butadiene, toluene, pyridine, 2,4-dimethylfuran, 1,3-cyclopentadiene, 2,5-dimethylfuran, benzene, 2-butanone, 2-ethylfuran, 2,4-hexadiene and o-xylene. Also, 34 e-cigarette-related VOCs were reported in two studies, of which isoprene, trichloroethylene, d-Limonene, 3-methylheptane, toluene and benzaldehyde were common with smoking. Likewise, 26 VOCs were related to diet, of which the following were the most commonly reported: 2-ethylhexanol, α-pinene, cyclopentane, undecane, menthol and dodecane. One study included the effects of ageing and environmental exposure, whereby three VOCs were reported for each risk factor. Similarly, one study examined oral breathing and reported the effect of this factor on acetone, hydrogen sulphide, allylmethyl sulphide and 1-methylthiopropene. The most frequent compound families were hydrocarbons (39%), aromatic hydrocarbons (16%), nitrogen compounds (12%) and ketones (9.7%).
Breath sampling
Included studies differed in sampling portion and breath collectors (table 4). Mixed expiratory breath was used in 46% of the studies included in this systematic review. Many studies used an inert polymer, usually a Tedlar® bag followed by an aluminised bag (Mylar®), to collect breath samples, while one used a glass syringe. The BIO-VOC® and QuinTron® collectors were used to a lesser degree. Chen et al. [39], Filipiak et al. [27] and Gaida et al. [40] implemented their own devices to selectively sample air from the lower respiratory tract. The most frequent setup among the included studies was employing off-line pre-concentration sampling methods in conjunction with gas chromatography-mass spectrometry (GC-MS).
Data handling
As presented in table 1 and table 2, approaches to data processing and pre-processing varied, both in the techniques used and the extent to which they were reported. As a result of the high dimensionality of breathomics datasets (more peaks measured than the number of subjects in the study), statistical analyses can be challenging. Most articles conducted untargeted analyses in which no substances were identified a priori. With large data sets, these types of analyses are prone to overfitting. The resultant VOC models then require validation, without which the model's performance cannot be considered accurate. Cross-validation within a study may impart rigour; however, the small sample size of many of the studies included may have limited the degree of rigour this process imparts. Six studies mentioned undertaking some form of validation; only three used a new subject group as a validation group [41–43].
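The overfitting risk described above can be illustrated with a toy simulation. The sketch below is not the method of any included study: it assumes pure-noise "peaks", arbitrary group labels and a simple nearest-neighbour classifier, all chosen by us for illustration. With more peaks than subjects, the classifier scores perfectly when tested on the subjects it was built from, whereas leave-one-out cross-validation typically gives a result close to chance.

```python
import random


def one_nn_predict(train_x, train_y, query):
    """Return the label of the training sample closest to `query`."""
    best_label, best_dist = None, float("inf")
    for x, y in zip(train_x, train_y):
        dist = sum((a - b) ** 2 for a, b in zip(x, query))
        if dist < best_dist:
            best_label, best_dist = y, dist
    return best_label


def resubstitution_accuracy(xs, ys):
    """Accuracy when the model is tested on the subjects it was built from."""
    hits = sum(one_nn_predict(xs, ys, x) == y for x, y in zip(xs, ys))
    return hits / len(xs)


def leave_one_out_accuracy(xs, ys):
    """Accuracy when each subject is predicted from all other subjects."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += one_nn_predict(train_x, train_y, xs[i]) == ys[i]
    return hits / len(xs)


# Hypothetical data set: 40 subjects, 200 noise peaks, labels unrelated to the data.
rng = random.Random(0)
n_subjects, n_peaks = 40, 200
xs = [[rng.gauss(0.0, 1.0) for _ in range(n_peaks)] for _ in range(n_subjects)]
ys = [i % 2 for i in range(n_subjects)]

print("tested on training subjects:", resubstitution_accuracy(xs, ys))
print("leave-one-out cross-validation:", leave_one_out_accuracy(xs, ys))
```

Even cross-validation of this kind only reuses the original subjects, which is why validation in a new subject group, as undertaken in three of the included studies, gives stronger evidence.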
Critical appraisal
Table 5 summarises the results of the NOS scoring system for the risk of bias assessment of the included studies. Analysing the risk of bias revealed concerns in the comparability and selection domains, with patient selection posing the greatest risk of bias. The majority of studies did not clearly indicate whether a random sampling of patients was used. There was minimal risk of bias in the ascertainment of exposure, even though, to date, there is no validated measurement tool for VOC analysis. All studies described their tools and methods. No major concerns regarding outcome/exposure were highlighted; outcome assessment was subjective (self-reported) and the nonresponse rate was not applicable in the majority of studies. Statistical analysis of VOC data is also a potential source of bias, because of a lack of consensus regarding the best statistical approach to analyse multivariate data and their dependence on internal validation [44]; however, it is challenging to improve this, since analytical procedures (e.g. GC-MS) and statistical approaches (e.g. principal component analysis (PCA)) often require study-specific optimisation.
Discussion
The application of breathomics is accompanied by the presence of confounding factors. In this systematic review, we summarised and discussed individual VOCs associated with risk factors for obstructive pulmonary diseases, as well as the various methods and technologies reported in recent years for exhaled breath analysis. The papers included in this review reported 232 individual VOCs associated with risk factors for asthma and/or COPD in total.
Out of 24 included studies, 14 evaluated the effect of smoking on the contents of exhaled breath. The most frequently reported VOCs were 2,5-dimethylfuran, benzene, toluene and xylenes. Benzaldehyde, toluene, 3-methylheptane, trichloroethylene and d-Limonene were common with e-cigarettes. The production of hydrocarbons in the human body is mainly affected by the mechanism of oxidative stress. Since smoking induces oxidative stress in the body [45], it is not surprising that hydrocarbons and aromatic hydrocarbons, such as benzene, toluene and xylenes, were the most common compound families for smoking-related VOCs. Alonso et al. [46] reported that 2,5-dimethylfuran was a specific breath biomarker of smoking status independent of smoking habits. Benzene was a marker for lower cigarette consumption, and toluene and xylenes were able to detect recent heavy smoking. According to Filipiak et al., measurements of these aromatic hydrocarbons revealed lower amounts in exhaled breath from nonsmokers than in indoor air, showing that the consumption of these compounds is primarily tied to the smoking habit [27].
Recent food consumption, and diet in general, prior to breath testing affect exhaled VOC composition [47]. Five studies examined the effect of diet (food) on VOCs and showed that diet could change concentrations of exhaled VOCs when switching to ketogenic [48], whole grain [49], gluten/gluten-free [50] and high-fibre/low-fibre diets [51]. These changes may be related to changes in the gut microbiome since this is one of the sources of bodily-generated VOCs [52]. α-Pinene, menthol, undecane and toluene were the most commonly reported VOCs. Hydrocarbons and alcohols were the dominant compound families among VOCs related to diet. Alcohols may enter the blood through diffusion after consumption of food and beverages [53], but they can also be generated through hydrocarbon metabolism [54]. Krilaviciute et al. [34] detected α-pinene in individuals who frequently ate beef, leeks, garlic, legumes, fish, meat, porridge and pickled products in their diet (based on the median of yearly food consumption frequency).
Five studies evaluated the association between gender and VOCs. Toluene was reported in two studies as a marker of gender, where Blanchet et al. [31] reported that its presence is likely to be related to occupational exposure. Since metabolism differs according to gender, variances in breath contents might be anticipated. In this group of studies, hydrocarbons and aldehydes were the most common compound families. Aldehydes are considered essential components of certain physiological processes in the body [53].
Environmental or occupational exposures based on direct inhalation or dermal exposure may have an effect on the exhaled breath. The upper airways will retain a portion of inhaled substances, while the lung will receive another portion for further mass transfer, distribution and metabolism. Three included studies evaluated the effect of such exposure on VOCs. Based on the results of this group, hydrocarbons were the most common compound family. Decane was reported in two studies as a marker of occupational exposure (firefighting and metal-casting workshops). This is presumably because the elevation in straight-chain alkanes, such as decane, may be linked to oxidative stress brought on by exposure and lipid peroxidation in the lung cell membranes [27, 54].
The effect of confounding factors on VOCs should be carefully considered before linking them to the disease of interest. We found overlap between VOCs linked to risk factors and VOCs that were also reported as potentially relevant clinical markers (figure 2). In this systematic review, p-xylene and o-xylene were reported as markers of smoking and high BMI, but in another study, they were discriminative VOCs between controls and patients with COPD [40]. Similarly, in one study, higher amounts of undecane were found in subjects with paucigranulocytic asthma and it was reported as a marker of this phenotype [15]. Van Berkel et al. [55] reported a model containing undecane and 12 other VOCs that was able to classify controls and patients with COPD. In another report, 2-pentanone and 2-butanone were reported as markers of COPD [40]; however, 2-butanone was also related to asthma [11]. In Cazzola et al.’s [56] study, decane was positively correlated with COPD and was increased in frequent COPD exacerbators. However, in two other studies, it was reported as a marker of asthma [57, 58]. Two VOCs related to smoking, toluene and benzene, were reported in several studies as markers of asthma [11, 59–61] or COPD [40, 62]. Also, among VOCs related to smoking, octane and heptane were found to be markers of COPD/asthma [63, 64] and COPD [17], respectively. Two VOCs related to food consumption, α-pinene and acetaldehyde, were positively correlated with COPD [63]. The presented results will support the further development of exhaled markers and will help to make a better distinction between VOCs related to risk factors (contaminants) and VOCs that can be used as future biomarkers.
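One way to apply this overlap in practice is to screen a candidate marker panel against VOCs already linked to risk factors. The following is a minimal sketch under stated assumptions: the lookup table contains only an illustrative subset of the compounds named in this review (not the full list in table 3), and the function name `flag_confounded`, the candidate panel and the placeholder `compound_X` are hypothetical.

```python
# Illustrative subset of risk factor-linked VOCs named in the text of this review.
RISK_FACTOR_VOCS = {
    "smoking": {"p-xylene", "o-xylene", "toluene", "benzene",
                "2-butanone", "octane", "heptane"},
    "BMI": {"p-xylene", "o-xylene"},
    "diet": {"undecane", "alpha-pinene", "acetaldehyde", "toluene"},
    "occupational exposure": {"decane"},
    "gender": {"toluene"},
}


def flag_confounded(candidate_vocs):
    """Map each candidate VOC to the sorted risk factors it has also been linked to."""
    flagged = {}
    for voc in candidate_vocs:
        factors = sorted(
            factor for factor, vocs in RISK_FACTOR_VOCS.items() if voc in vocs
        )
        if factors:
            flagged[voc] = factors
    return flagged


# Hypothetical candidate panel; "compound_X" stands for a VOC with no known link.
panel = ["toluene", "decane", "compound_X", "p-xylene"]
for voc, factors in flag_confounded(panel).items():
    print(f"{voc}: also linked to {', '.join(factors)}")
```

A flagged compound is not necessarily unusable as a disease marker; the flag only indicates that the corresponding risk factors should be recorded and accounted for in the analysis.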
Alongside the positive results of all included studies, there is considerable heterogeneity among them. The studies used a variety of collection methods and breathing manoeuvres. Expiratory flow rate and breath-holding duration were found to have a substantial impact on the exhaled breath [65]. Most studies recorded the alveolar breath after filtering out the dead space air. Additional investigation is necessary to decide whether it is essential to filter the dead space air. Unless undertaking online analysis, the storage media should also be taken into account once breath has been captured. Furthermore, increased utilisation of external validation strategies and reporting according to internationally accepted guidelines could help to develop VOC-based biomarkers towards clinical application.
Points for clinical practice
The noninvasive nature, low patient burden and ability to directly sample from the target organ make the adoption of breathomics in real-world clinical practice attractive. However, clinical implementation has proved to be challenging. Our study has shed light on different factors affecting the VOC content of exhaled breath that are relevant for obstructive pulmonary diseases. In future studies, the VOCs in table 3 should be taken into account when interpreting results from exhaled breath analysis. Our results suggest that a more detailed analysis of potential covariates in breath analysis is required before moving on to more advanced analyses for biomarker detection. Further research should target the differences in exhaled breath profiles to prevent risk factor-specific compounds from being mistakenly identified as future biomarkers. To determine whether these putative VOCs have causal evidence as risk factors for obstructive pulmonary diseases, a follow-up sensitivity analysis should be performed. Relevant sensitivity analyses are able to establish whether results are robust and not due to other factors.
External validation of breath biomarkers in independent populations is important and helps to produce reliable predictions that can be reproduced in other clinical settings. Our review only shows a small overlap between specific biomarkers reported by various groups, which could be partially explained by the differences in methodology and reporting tools. The important step towards establishing a breathomics-based biomarker in clinical settings is to regulate practices, including agreed common standardised operating procedures for breath collection, storage and analysis, which is common practice when developing biomarkers (ISO 13485 and the United States Food and Drug Administration regulations).
Questions for future research
Methods based on exhaled breath analysis are promising for biomarker detection. Study designs must consider the best way to assess VOCs. Future studies (trials) should focus on studying the effect of risk factors on exhaled breath VOCs. It is necessary to further investigate the pathways of putative biomarkers in order to provide exhaled breath markers with high sensitivity and specificity. Studying the origin and mechanisms of the generation of VOCs would help justify the clinical importance of potential breath biomarkers found in the future. Targeted analysis can reduce the possibility of falsely detecting associations and potentially validate these exhaled biomarkers. Also, in order to further implement breath analysis in clinical practice, the reliability of VOCs must be demonstrated in large-scale studies.
Limitations
To find all research that included breath VOC analysis, we used a comprehensive search technique. The risk of bias may be negatively impacted by this method, particularly for cross-sectional studies due to the limitations of NOS. Most studies included a small number of participants and only two included children. The majority of the studies did not include external validation of the results, further limiting the generalisability of the findings. Moreover, the number of published studies including breath VOC analysis of risk factors for obstructive pulmonary diseases was relatively low, limiting the comparison between different risk factors.
Conclusion
Here, we presented a systematic review on exhaled VOCs associated with risk factors for asthma and/or COPD. Although this review's findings are promising, they still require independent validation. Larger study samples, recognition of important confounders, and methodological and analytical standardisation will reduce variability between studies (decreasing the chance of noncomparable data) and produce robust results. Identification of these confounding VOCs will support further development of exhaled metabolite-based point-of-care tests, smoothing the transition to clinical implementation.
Supplementary material
Please note: supplementary material is not edited by the Editorial Office, and is uploaded as it has been supplied by the author.
Supplementary material 00143-2023.SUPPLEMENT
Acknowledgements
We would like to thank Reza Shahbazi Khamas for helping us illustrate the main figure of this paper.
Footnotes
Provenance: Submitted article, peer reviewed.
Conflict of interest: S. Shahbazi Khamas and A.H. Alizadeh Bahmani have nothing to disclose. S.J.H. Vijverberg has received research funding from the Dutch Lung Foundation, she is the chair of Young Investigators of Netherlands Respiratory Society (unpaid) and she is early career member representative of the Pediatrics Assembly of the European Respiratory Society (unpaid). P. Brinkman has received funding from Amsterdam UMC, Vertex, Stichting Astma Bestrijding, Boehringer Ingelheim, Eurostars and the Horizon Europe Framework Program. A.H. Maitland-van der Zee received unrestricted research grants from Vertex and Boehringer Ingelheim, received funding from the Dutch Lung Foundation and Stichting Astma Bestrijding, and an Innovative Medicine Initiative 3TR research grant; received consulting fees (paid to institution) from AstraZeneca and Boehringer Ingelheim, and received honoraria (paid to institution) for lectures by GSK. She is also the principal investigator of the P4O2 (Precision Medicine for more Oxygen) public–private partnership sponsored by Health Holland involving many private partners who contribute in cash and/or in kind. Partners in the P4O2 consortium are the Amsterdam UMC, Leiden University Medical Center, Maastricht UMC+, Maastricht University, UMC Groningen, UMC Utrecht, Utrecht University, TNO, Aparito, Boehringer Ingelheim, Breathomix, Clear, Danone Nutricia Research, Fluidda, MonitAir, Ncardia, Ortec Logiqcare, Philips, Proefdiervrij, Quantib-U, RespiQ, Roche, Smartfish, SODAQ, Thirona, TopMD, Lung Alliance Netherlands and the Lung Foundation Netherlands (Longfonds). 
The consortium is additionally funded by the PPP Allowance made available by Health~Holland, Top Sector Life Sciences and Health (LSHM20104 and LSHM20068), to stimulate public–private partnerships, and by Novartis. She is also the president of the Federation of Innovative Drug Research in the Netherlands (FIGON) (unpaid) and president of the European Association of Systems Medicine (EASYM).
- Received March 6, 2023.
- Accepted April 21, 2023.
- Copyright ©The authors 2023
This version is distributed under the terms of the Creative Commons Attribution Non-Commercial Licence 4.0. For commercial reproduction rights and permissions contact permissions{at}ersnet.org