Abstract
Rationale: Acute respiratory distress syndrome (ARDS) is currently diagnosed by the Berlin definition, which does not include a direct measure of pulmonary oedema, endothelial permeability or pulmonary inflammation. We hypothesised that biomarkers of these processes have good diagnostic accuracy for ARDS.
Methods: Medline and Scopus were searched for original diagnostic studies using minimally invasive testing. Primary outcome was the diagnostic accuracy per test and was categorised by control group. The methodological quality was assessed with the QUADAS-2 tool. Biomarkers that had an area under the receiver operating characteristic curve (AUROCC) of >0.75 and were studied with minimal bias against an unselected control group were considered to be promising.
Results: Forty-four articles were included. The median AUROCC for all evaluated tests was 0.80 (25th to 75th percentile: 0.72–0.88). The type of control group influenced the diagnostic accuracy (p=0.0095). Higher risk of bias was associated with higher diagnostic accuracy (AUROCC 0.75 for low-bias, 0.77 for intermediate-bias and 0.84 for high-bias studies; p=0.0023). Club cell protein 16 and soluble receptor for advanced glycation end-products in plasma and two panels with biomarkers of oxidative stress in breath showed good diagnostic accuracy in low-bias studies that compared ARDS patients to an unselected intensive care unit (ICU) population.
Conclusion: This systematic review revealed only four diagnostic tests fulfilling stringent criteria for a promising biomarker in a low-bias setting. For implementation into the clinical setting, prospective studies in a general unselected ICU population with good methodological quality are needed.
Accuracy of diagnosis of acute respiratory distress syndrome (ARDS) is associated with risk of bias. There is a lack of validated diagnostic tests in an unbiased setting, emphasising the need for quality driven diagnostic research in ARDS. https://bit.ly/2GfPAvf
Introduction
Acute respiratory distress syndrome (ARDS) is characterised by the acute onset of non-cardiogenic pulmonary oedema and hypoxaemia and is associated with high mortality and morbidity [1, 2]. The combination of increased permeability of the endothelium and injury to the alveolar epithelium results in protein-rich alveolar fluid [3]. Procoagulatory and inflammatory proteins and metabolites of oxidative stress are abundant in the alveoli of ARDS patients [1]. Translational evidence suggests that lung injury can be initiated via alveolar inflammation as well as endothelial injury [3].
ARDS is currently diagnosed by means of the Berlin definition [2, 4]. The criteria utilise information that is commonly available at the bedside: they capture hypoxaemia via arterial oxygen tension (PaO2)/inspiratory oxygen fraction (FIO2) and alveolar oedema via bilateral opacities on chest radiography. Interpretation of chest images gives inconsistent results, which makes diagnosing ARDS subjective and challenging [5]. Even without this limitation, chest radiography can only detect alveolar oedema after onset, and it cannot identify the molecular mechanism resulting in alveolar oedema.
A diagnostic test that captures extravascular lung water earlier or that identifies the pathophysiological mechanisms resulting in lung injury is likely to lead to better diagnostic accuracy for the diagnosis of ARDS. Biomarkers are objective, can be derived from plasma, urine, bronchoalveolar lavage fluid or breath with minimally invasive methods, and can be reflective of key biological pathways known to be involved in ARDS development [3]. A clear diagnostic biomarker could help clinical decisions in several matters: 1) earlier recognition of pulmonary oedema can help to prevent fluid overload; and 2) identification of the biological pathways resulting in lung injury can inform targeted treatments in personalised medicine randomised controlled trials. Up to now, a clear diagnostic biomarker has not been identified [6], and because the accuracy of a biomarker may be biased by the quality of the performed study, it is important to evaluate potential bias when reviewing the currently available evidence.
The aim of this review is to give an overview of the minimally invasive diagnostic tests available for assessing the pathogenesis of ARDS in order to achieve an early and objective diagnosis of ARDS in patients on the intensive care unit (ICU). We hypothesised that tests for 1) pulmonary oedema, 2) endothelial permeability, 3) pulmonary inflammation, 4) coagulation and 5) oxidative stress have good diagnostic accuracy and that the diagnostic accuracy is lower in well-conducted studies than in biased studies.
Methods
Search
A systematic review was performed following the preferred reporting items for systematic reviews [7]. We searched Medline and Scopus for potentially relevant articles up to June 9, 2020. The following terms were used: ARDS, acute lung injury (ALI), inflammation, biomarker, cytokine, breath, oedema, lung water, diagnosis, diagnostic, human and adult. The exact search can be found in the supplementary material. Two researchers (LB and LH) independently reviewed the abstracts and/or full-text manuscripts and selected relevant articles. Disagreements were resolved in a consensus meeting. The review protocol is registered at PROSPERO (www.crd.york.ac.uk/prospero, CRD42020186974).
Selection criteria
Inclusion criteria were 1) original research with a diagnostic purpose that 2) reported the diagnostic accuracy of 3) a minimally invasive test for 4) pathophysiological mechanisms of ARDS 5) comparing patients having ARDS with relevant other patients. A relevant control group was defined as patients at risk for ARDS, for example receiving mechanical ventilation or respiratory support. Studies with a focus on treatment or prediction of ARDS were excluded. Other exclusion criteria were articles not available in English, animal or preclinical studies, studies in children, and unclear reference or index test. Finally, studies were excluded if the primary endpoint, diagnostic accuracy of the index test, was not available and could not be inferred from the data. Details are given below.
Reference test
The first American–European consensus criteria (AECC) date back to 1994 [8]. Studies from 1994 until 2012 were included if they used the applicable AECC definition or criteria that were closely related. Patients within the category of “acute lung injury” were included in this review as having ARDS, since in 2012 the newly introduced Berlin definition included this group as mild ARDS. For studies from 2012 onward the Berlin ARDS definition was used as the reference test [2].
Index test
The index tests were categorised into the following domains, regarding the pathophysiological mechanisms: 1) endothelial permeability, 2) pulmonary oedema, 3) inflammation, 4) coagulation or 5) oxidative stress. The index test should assess one of the pathophysiological mechanisms of ARDS, so studies looking into diagnostic tools based on cardiac function or, for example, terms in the electronic health record were excluded. Second, the tests were categorised based on the sample material: plasma, breath, alveolar fluid or other. The limit for invasiveness of the test was set at performing a bronchoalveolar lavage procedure; all tests more invasive than this method were excluded. Effectively, this excludes any type of biopsy. Finally, index tests were categorised on diagnostic accuracy: a potentially clinically relevant diagnostic accuracy was defined as an area under the receiver operating characteristic curve (AUROCC) of >0.75.
Outcome and data extraction
The primary outcome was the AUROCC of the diagnostic test. If not available, sensitivity and specificity were used in a secondary analysis. In case both results were not reported, and the paper included a figure with individual data points, we extracted the data from the figure and recalculated the AUROCC, sensitivity and specificity. If this was unsuccessful, the study was excluded. The study population was categorised into: 1) general ICU patients, 2) cardiopulmonary surgery or cardiac ICU patients, 3) sepsis patients or 4) highly selected populations (such as only trauma patients or organ transplant patients). The control group was categorised into: 1) unselected ICU patients, 2) patients with cardiopulmonary oedema (CPE) and 3) patients with (suspected) pneumonia.
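The recalculation from individual data points described above can be illustrated with a minimal Python sketch. It is not the authors' code (the analyses in this review were run in R); it assumes that higher biomarker values indicate ARDS, and the function names and the values in the example are hypothetical. The AUROCC is computed as the proportion of case-control pairs in which the case has the higher value, with ties counted as one half.

```python
from typing import Sequence, Tuple


def auroc(cases: Sequence[float], controls: Sequence[float]) -> float:
    """AUROCC as the probability that a random case scores higher than a
    random control; ties count as one half. Assumes higher values = ARDS."""
    if not cases or not controls:
        raise ValueError("both groups need at least one value")
    wins = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                wins += 1.0
            elif case_value == control_value:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def sens_spec(cases: Sequence[float], controls: Sequence[float],
              cutoff: float) -> Tuple[float, float]:
    """Sensitivity and specificity when values >= cutoff are called positive."""
    true_pos = sum(1 for value in cases if value >= cutoff)
    true_neg = sum(1 for value in controls if value < cutoff)
    return true_pos / len(cases), true_neg / len(controls)


# Hypothetical values read off a scatter plot (not data from any included study).
ards = [12.0, 15.5, 9.0, 20.1, 14.2]
controls = [8.0, 9.0, 11.0, 7.5]

print(auroc(ards, controls))            # 0.925
print(sens_spec(ards, controls, 10.0))  # (0.8, 0.75)
```

The pairwise formulation is equivalent to the Mann-Whitney U statistic divided by the number of case-control pairs, so it needs no chosen cut-off, whereas sensitivity and specificity depend on the single cut-off that is supplied.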
Methodological assessment and categorisation
The methodological quality of each article was assessed with the QUADAS-2 tool [9]. Risk of bias was assessed concerning patient selection, blinding and use of index test, blinding and use of reference test, and regarding patient flow. Timing of the test was considered to have a low risk of bias when the index test and reference test were performed on the same day or subsequent day. All tests that were performed later were classified as having a high risk of bias. For the assessment of the overall methodological quality of the papers, a cumulative score was calculated. The risk and concern scores were classified as follows: “High” 1 point; “Unclear” 0.5 points; “Low” 0 points, resulting in a cumulative score between 0 and 6. Based on the cumulative score, studies were categorised into tertiles: “Low”, “Intermediate” and “High” biased studies with the following cut-offs: Low: ≤1.5; Intermediate: >1.5 and ≤2.5; High: >2.5 points.
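The scoring rule above can be written out as a short Python sketch. The point values and cut-offs are taken from the text; the function names and the example ratings are hypothetical, and the example uses six rated items only because the text gives a score range of 0 to 6.

```python
from typing import Iterable

# Points per QUADAS-2 judgement, as stated in the text.
POINTS = {"high": 1.0, "unclear": 0.5, "low": 0.0}


def cumulative_score(ratings: Iterable[str]) -> float:
    """Sum the points for all risk-of-bias and concern judgements of one study."""
    return sum(POINTS[rating.lower()] for rating in ratings)


def bias_category(score: float) -> str:
    """Map a cumulative score to the bias category using the stated cut-offs."""
    if score <= 1.5:
        return "Low"
    if score <= 2.5:
        return "Intermediate"
    return "High"


# Hypothetical ratings for one study (not taken from table 2).
ratings = ["High", "Unclear", "Low", "Low", "Unclear", "Low"]
score = cumulative_score(ratings)
print(score, bias_category(score))  # 2.0 Intermediate
```

Because the cut-offs are inclusive at the upper bound, a score of exactly 1.5 falls in the "Low" category and a score of exactly 2.5 in the "Intermediate" category.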
Statistical analysis
The AUROCC was summarised for each index test (so one study investigating multiple tests would provide multiple AUROCCs) and stratified for the following domains:
- pathophysiological processes: endothelial permeability, pulmonary oedema, inflammation, coagulation or oxidative stress
- population: general ICU, sepsis, cardiac care unit (CCU) or a specific group
- control group: unselected ICU, CPE or pneumonia
- sample material: plasma, breath, alveolar fluid or other
- quality of the study: low, intermediate or high risk of bias
Subsequently, the AUROCC was compared between the groups with one-way ANOVA. Significant results, defined as a p-value <0.05, were further studied using post hoc analysis with pairwise t-tests. The influence of the processes resulting in changes in biomarker concentration, such as tested material, pathophysiological mechanism, population and control group, on the association between bias and diagnostic accuracy was evaluated using two-way ANOVA. For studies that reported sensitivity and specificity, meta-analysis of diagnostic accuracy was performed using the mada package to visually confirm any association that was found for AUROCC in the primary analysis [10]. All analyses were performed in R version 3.6.1 using the RStudio interface.
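As an illustration of the group comparison, the following Python sketch computes the one-way ANOVA F statistic for AUROCC values grouped by a stratification variable. It is a sketch under stated assumptions and not the authors' R code: the function name and the example values are hypothetical, and the p-value is left out because it requires the F distribution (for example `scipy.stats.f.sf(F, df_between, df_within)` if SciPy is available).

```python
from statistics import mean
from typing import List, Sequence, Tuple


def one_way_anova_f(groups: List[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(group) for group in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more values than groups")
    grand_mean = sum(sum(group) for group in groups) / n
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical AUROCC values per risk-of-bias category (not data from this review).
low_bias = [0.70, 0.75, 0.80]
intermediate_bias = [0.72, 0.77, 0.82]
high_bias = [0.80, 0.85, 0.90]

f_stat, df_b, df_w = one_way_anova_f([low_bias, intermediate_bias, high_bias])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F relative to the F distribution with the returned degrees of freedom indicates that at least one group mean differs, which is the point at which the pairwise post hoc tests would be applied.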
Results
The Medline and Scopus searches were last updated on June 9, 2020 and revealed 1096 articles, of which 958 remained after removing duplicates (figure 1). Title screening resulted in 143 eligible articles, of which 52 remained after reading the abstracts. After reading the full texts, 44 articles were included (figure 1 and table 1). Assessment of the included articles yielded a total of 84 index tests, including 68 different types of tests. Plasma biomarkers were most frequently studied (48 out of 84 (57%)). Categorisation based on pathophysiological mechanisms led to the following numbers: 39 tests for inflammation, 20 for endothelial permeability, 15 for pulmonary oedema, eight for oxidative stress and two for coagulation (supplementary table S1). The following populations were included in the studies: 29 studies with general ICU population (66%), nine studies with a specific population (20%), five studies with sepsis patients (11%) and one study with a CCU population (2%). The control group consisted of patients with CPE in 11 studies (25%) and pneumonia in two studies (5%). The other studies included a cohort of ICU patients that did not have CPE or pneumonia specifically (70%).
Figure 1. Flowchart of article selection.
Table 1. Included studies.
Accuracy for diagnosing ARDS
For 74 of the 84 tests (88%) the AUROCC was available. The median AUROCC was 0.80 with an interquartile range (IQR) from 0.72 to 0.88. A good diagnostic accuracy (AUROCC >0.75) was shown in 47 of the 74 tests (64%), spread over all different processes associated with ARDS development.
Diagnostic accuracy was higher in tests comparing ARDS patients with CPE patients (median AUROCC: 0.89, IQR: 0.81–0.93) than in ARDS patients compared to unselected ICU patients (median AUROCC: 0.78, IQR: 0.71–0.84, p=0.0095). The AUROCC was not different between studies with a control group of pneumonia patients compared with unselected ICU patients (p=0.82) or between pneumonia patients compared with CPE patients (p=0.14; figure 2). No differences in AUROCC were found for the type of studied pathophysiological mechanism (p=0.76), the studied biological material (p=0.51) and the population (p=0.60).
Figure 2. Association between diagnostic accuracy and risk of bias, stratified for the control group. Each dot represents a diagnostic test; multiple tests could be evaluated per study. Area under the receiver operating characteristic curve (AUROCC) was higher in tests comparing acute respiratory distress syndrome (ARDS) patients with cardiopulmonary oedema (CPE) patients than in tests comparing ARDS patients with general intensive care unit (ICU) patients (p=0.0095). The AUROCC was not different between studies with a control group of pneumonia patients compared to unselected ICU patients (p=0.82) or between pneumonia patients compared to CPE patients (p=0.14). a) ICU all; b) CPE; c) pneumonia.
Sensitivity and specificity were available or could be calculated for 46 out of 84 (55%) tests. Similar patterns regarding the influence of the type of test, the studied biological material, the population and the control group were found when these tests were evaluated based on a single cut-off (supplementary figures S1–S4).
Assessment of bias in study methodology
The methodological quality as assessed by the QUADAS-2 tool is shown in table 2. The cumulative score varied among studies, with a median of 2.0 (IQR: 1.5 to 3.0). Categorisation into tertiles based on the cumulative score led to 14 studies in the “Low” bias category, 17 studies in the “Intermediate” category and 13 studies in the “High” category (table 2). Risk of bias was most frequently observed in patient selection, blinding of the index test and timing of the index test.
Table 2. Bias assessment.
Association between bias and diagnostic accuracy
The risk of bias of the study was associated with the diagnostic accuracy of the index test (p=0.0023). The median AUROCC was 0.75 (IQR: 0.69 to 0.82) for low-bias studies, 0.77 (IQR: 0.72 to 0.88) for intermediate-bias studies and 0.84 (IQR: 0.79 to 0.90) for high-bias studies (figure 3). Based on the pairwise comparison, the AUROCC was significantly higher for studies in the high-bias category, compared to the intermediate- and low-bias category (p=0.020 and p=0.00077, respectively). Two-way ANOVA showed that this association was consistent after correction for the type of test (p=0.0027), sample material (p=0.0026), population (p=0.0026) and control group (p=0.0011). Figure 3 shows the diagnostic accuracy per test after stratification for risk of bias. The other comparisons are visualised in supplementary figures S5–S7. The same trend was visible in the analysis of sensitivity and specificity with respect to the risk of bias (supplementary figures S1–S4).
Figure 3. The risk of bias of a study was associated with the diagnostic accuracy of the evaluated test. Each dot represents a diagnostic test; multiple tests could be evaluated per study. The area under the receiver operating characteristic curve was significantly higher in the group with high risk of bias, compared to the intermediate-bias and low-bias groups (p=0.020 and p=0.00077, respectively). AUROCC: area under the receiver operating characteristic curve.
Low-bias studies with good diagnostic accuracy
Nine tests showed a good diagnostic accuracy in the low-bias group. Of these, five compared ARDS versus CPE and four compared ARDS versus the general ICU population. The studies comparing ARDS versus ICU patients measured biomarkers in plasma and metabolites in exhaled breath. The plasma biomarkers assessed were Club cell protein 16 (CC16) and soluble receptor for advanced glycation end-products (sRAGE), assessing inflammation and permeability. In exhaled breath a panel with three metabolites and a panel with nine metabolites were assessed. The three metabolites describe oxidative stress and the nine metabolites most likely do too, but no clear reporting on this topic was available. The first three studies performed the test on the day of ARDS diagnosis or the day after, providing early information on diagnosis of ARDS. For the last test it was unclear at what time it was performed.
Discussion
When comparing patients with ARDS to patients who are also admitted to the ICU, only four studies yielded a good diagnostic accuracy with a limited risk of potential bias. We identified CC16 and sRAGE in plasma and two exhaled breath tests for biomarkers of oxidative stress as tests that currently have the strongest rationale for further validation. This review provides strong evidence that the diagnostic accuracy of minimally invasive tests for the diagnosis of ARDS is highly dependent on the potential bias of the study and the type of control group that is included.
Diagnostic accuracy varied widely between tests and studies included in this review. We identified that the inclusion of CPE patients as a control group consistently resulted in a higher diagnostic accuracy, suggesting that ARDS can be distinguished more easily from CPE than from ICU patients at risk for ARDS or from pneumonia patients. An attractive explanation could be that the test differentiates between protein-rich and hydrostatic oedema, but this explanation was rejected because most tests did not evaluate this phenomenon directly and could still separate these groups. For example, cardiac injury markers are also able to distinguish between CPE and ARDS, but rather than making them a relevant test for ARDS, this signifies the homogeneity of the CPE population [55]. Importantly, ARDS patients differ from CPE patients in many more aspects than the type of pulmonary oedema alone. For example, ARDS patients showed increased levels of inflammation parameters compared to CPE patients; this is not necessarily related to ARDS but may be due to an underlying syndrome such as sepsis, pneumonia or pancreatitis. Indeed, when compared to an unselected ICU population with similar risk factors as the ARDS patients, these markers had a lower diagnostic accuracy.
The risk of bias assessed by the QUADAS-2 tool was strongly associated with the diagnostic accuracy of the study. A large proportion of the studies showed risk of bias due to the method of patient selection, the performance and interpretation of the index test, and the timing of the index test. Unfortunately, this relationship, together with the fact that biased studies are known to overestimate diagnostic accuracy [56], makes it hard to rely on results from studies with a considerable risk of bias. Tests showing good diagnostic accuracy will need to be re-evaluated in a low-bias setting before any firm conclusions can be drawn.
Focusing on studies with good diagnostic accuracy and low risk of bias, nine tests remained. Only four of them compared ARDS to an unselected ICU population. One test assessed the plasma concentration of CC16 [39]. This protein is suggested to protect the lungs against oxidative stress as well as inflammation [57]. However, CC16 is also a marker of increased permeability of the epithelial barrier and therefore seems to be involved in multiple processes of ARDS development [39, 57]. Another test assessed sRAGE in plasma, which is released by lung inflammation and leads to epithelial injury and therefore is a marker of increased permeability [32, 58]. The other two were exhaled breath tests [18, 54], with the major advantage that the sample can be obtained noninvasively. One test assessed a panel of three biomarkers, octane, acetaldehyde and 2/3-methylheptane, that reflect oxidative stress [18]. Of these three compounds, octane explained most of the diagnostic accuracy. Octane is generated through oxygenation of oleic acid and previous data suggest that ARDS is associated with an increased concentration of oleic acid in the circulation [59]. The other breath test assessed a larger panel of nine exhaled breath biomarkers that is not clearly reflective of one pathophysiological mechanism [54]. A drawback of exhaled breath tests is that they are experimental and are therefore not directly suitable for clinical implementation. All tests seem to relate to oxidative stress, inflammation and increased permeability in the lungs, which are all known to be important in the early course of ARDS and are directly related to pulmonary pathophysiology.
This is the first review to systematically assess the diagnostic accuracy of minimally invasive techniques for ARDS while considering potential biases of each study. Our analyses show that it is pivotal to evaluate the methodological quality of the study to reveal the confounding factors while interpreting the results. This approach is one of the most important strengths of this study. Furthermore, papers not reporting diagnostic accuracy directly were not excluded when we could deduce the accuracy from figures showing individual data points. To our knowledge, no other study in critical care has utilised this approach up to now. Finally, we did not limit the definition of ARDS to those patients with a PaO2/FIO2 <200 mmHg by including patients who were labelled as “acute lung injury” according to the 1994 AECC definition. Since ARDS nowadays involves a heterogeneous population, of which patients with mild ARDS are a large part, it is important to recognise this group also [60]. Another strength is the exclusion of studies that used healthy volunteers as control group leaving only more relevant control groups and hopefully resulting in a more accurate comparison between similarly ill patients.
The main limitation of this review is the small number of studies that is left in each category after stratification. For example, only one study assessed CCU patients and only two studies compared ARDS patients with pneumonia patients. With regard to pneumonia, it is questionable if unilateral pneumonia is the appropriate control group for ARDS, as many patients with ARDS have pneumonia and because unilateral and bilateral pneumonia in the ICU have similar outcomes [61]. Furthermore, both studies that compared ARDS to pneumonia scored high on the risk of bias. A second limitation is that the AUROCC was not reported for all studies. Therefore, the analysis was performed in two parts with two different approaches, which yielded similar results. We also acknowledge that the diagnostic tests cannot be categorised into completely distinct groups: for example, there is considerable overlap between markers of oxidative stress and inflammation, and our attempted separation of the two is arbitrary. Another limitation of this study is that the definition of ARDS has changed over the years and therefore the “case-definition” is slightly different between studies, which might have confounded the diagnostic accuracy of specific tests. Finally, it should be noted that we assessed multiple diagnostic tests described in a single paper as independent tests, which they potentially are not. To our knowledge, there is no adequate multilevel alternative to study this phenomenon otherwise.
Results of this review show that there is no validated minimally invasive method to diagnose ARDS in an unselected ICU population. Four promising tests were identified in a low-bias setting and these warrant validation. New diagnostic studies should better attempt to minimise bias and should be reported according to Standards for Reporting Diagnostic Accuracy (STARD) guidelines [62].
A diagnostic test does not have to separate ARDS patients perfectly. This is likely impossible due to the biological heterogeneity observed in ARDS patients. Indeed, another way to evaluate these results is to appreciate the heterogeneity that is shown and advocate a personalised approach based on pathophysiological characteristics of each patient shown through the diagnostic tests that are described here [63]. A biomarker may have value when it identifies a phenotype that consistently responds to a specific type of treatment, a so-called treatable trait [64].
Conclusion
There is no minimally invasive diagnostic test for ARDS that is validated in a low-bias setting against an adequate control group. Many studies that evaluated diagnostic tests for ARDS showed risk of bias, which makes it hard to rely on the reported diagnostic accuracy. The plasma concentration of CC16, sRAGE and two panels of oxidative stress biomarkers in exhaled breath did show high diagnostic accuracy in a low-bias setting and warrant external validation. For implementation into the clinical setting, prospective studies in a general unselected ICU population with good methodological quality are needed.
Supplementary material
Please note: supplementary material is not edited by the Editorial Office, and is uploaded as it has been supplied by the author.
Footnotes
This article has supplementary material available from openres.ersjournals.com.
Members of the DARTS consortium: Amsterdam UMC, Amsterdam, the Netherlands (Laura A. Hagens, Marry R. Smit, Marcus J. Schultz and Lieuwe D.J. Bos); Maastricht UMC, Maastricht, the Netherlands (Nanon F.L. Heijnen, Dennis C.J.J. Bergmans and Ronny M. Schnabel); and Philips Research, Eindhoven, the Netherlands (Alwin R.M. Verschueren, Tamara M.E. Nijsen and Inge Geven).
Conflict of interest: L.A. Hagens has nothing to disclose.
Conflict of interest: N.F.L. Heijnen has nothing to disclose.
Conflict of interest: M.R. Smit has nothing to disclose.
Conflict of interest: M.J. Schultz has nothing to disclose.
Conflict of interest: D.C.J.J. Bergmans has nothing to disclose.
Conflict of interest: R.M. Schnabel has nothing to disclose.
Conflict of interest: L.D.J. Bos reports grants from the Dutch Lung Foundation (young investigator grant, public–private partnership grant and Dirkje Postma Award) outside the submitted work.
Support statement: Lieuwe D.J. Bos is supported by Health Holland via the Dutch Lung Foundation (longfonds) industry–academia partnership and via the Dirkje Postma Award. The funders had no role in the design, conduct or interpretation of this review.
- Received July 17, 2020.
- Accepted September 18, 2020.
- Copyright ©ERS 2021
This article is open access and distributed under the terms of the Creative Commons Attribution Non-Commercial Licence 4.0.