Abstract
Introduction Continuing inhaled corticosteroid (ICS) use does not benefit all patients with COPD, yet it is difficult to determine which patients may safely sustain ICS withdrawal. Although eosinophil levels can facilitate this decision, better biomarkers could improve personalised treatment decisions.
Methods We performed transcriptional profiling of sputum to explore the molecular biology and compared the predictive value of an unbiased gene signature versus sputum eosinophils for exacerbations after ICS withdrawal in COPD patients. RNA-sequencing data of induced sputum samples from 43 COPD patients were associated with the time to exacerbation after ICS withdrawal. Expression profiles of differentially expressed genes were summarised to create gene signatures. In addition, we built a Bayesian network model to determine coregulatory networks related to the onset of COPD exacerbations after ICS withdrawal.
Results In multivariate analyses, we identified a gene signature (LGALS12, ALOX15, CLC, IL1RL1, CD24, EMR4P) associated with the time to first exacerbation after ICS withdrawal. The addition of this gene signature to a multiple Cox regression model explained more variance of time to exacerbations compared to a model using sputum eosinophils. The gene signature correlated with sputum eosinophil as well as macrophage cell counts. The Bayesian network model identified three coregulatory gene networks as well as sex to be related to an early versus late/nonexacerbation phenotype.
Conclusion We identified a sputum gene expression signature that exhibited a higher predictive value for predicting COPD exacerbations after ICS withdrawal than sputum eosinophilia. Future studies should investigate the utility of this signature, which might enhance personalised ICS treatment in COPD patients.
Abstract
Sputum gene expression may have utility in biomarker development for identifying subjects who are at higher risk of exacerbation after ICS withdrawal https://bit.ly/3gYl2OL
Introduction
Inhaled corticosteroids (ICSs) are commonly used in the treatment of patients with COPD. However, only a subset of COPD patients benefits clinically from ICS, whereas they are ineffective and associated with adverse effects in others. Nevertheless, many patients use them and so far, it has been difficult to determine which patients may safely sustain ICS withdrawal [1, 2]. The Global Initiative for Chronic Obstructive Lung Disease (GOLD) guidelines recommends initiation of ICS treatment in combination with long-acting bronchodilators (LABAs) in COPD patients with a history of recurrent or severe exacerbations, a history of concomitant asthma, or blood eosinophil levels >300 cells·µL−1 [2]. Whether blood eosinophil counts are sufficient to guide ICS positioning in COPD patients is, however, debated [1–4]. Associations between blood eosinophils and clinical outcomes, including exacerbations, are not congruent across all observational cohort studies [1]. Hastie et al. [3] have shown in the SPIROMICS cohort (N=827) that sputum eosinophils are poorly correlated with blood eosinophils but are a better predictor of COPD exacerbations.
Applying RNA-sequencing (RNA-seq) allows us to assess genome-wide expression profiles related to airway inflammation, which could be a useful tool to enhance personalised treatment decisions regarding ICS treatment in COPD. Whole-transcriptomic profiling in sputum represents a promising approach that might provide further insight into the molecular biology related to ICS treatment responses in COPD [5, 6]. Using this approach, Baines et al. [6] identified a sputum gene signature that discriminated inflammatory phenotypes and predicted ICS treatment responses in asthma.
We previously showed in the SYMBEXCO trial that the likelihood of exacerbations after stopping ICS in COPD patients could be predicted by assessing sputum inflammation, particularly sputum eosinophil levels [7]. The SYMBEXCO trial was a randomised study comparing the efficacy of budesonide/formoterol versus prednisolone versus placebo during 2 weeks in COPD patients once they had an acute exacerbation [8]. In the current manuscript, we applied sputum transcriptome profiling (RNA-seq) in the same cohort to explore molecular networks related to exacerbations after ICS withdrawal. Further, we investigated whether an unbiased gene signature could be identified to further improve the prediction of exacerbations after ICS withdrawal, compared to sputum eosinophils. In addition, we performed Bayesian network modelling to explore coregulatory networks related to the onset of COPD exacerbations after ICS withdrawal.
Methods
Study design and sample collection
The study design of the SYMBEXCO trial (ClinicalTrials.gov Identifier: NCT00259779) has been described in detail elsewhere [7]. Briefly, COPD subjects were enrolled in a prospective cohort and followed for multiple consecutive visits until they developed an exacerbation and were randomised to treatment with prednisolone, budesonide/formoterol, or placebo [8]. When subjects were on ICS treatment at the time of enrolment, this was discontinued after their initial study visit. This manuscript presents the analysis of the period from the baseline visit to the time of the first exacerbation of those subjects in whom ICS treatment was withdrawn and who were able to produce an adequate sputum sample at baseline. Subjects who did not experience an exacerbation were followed until the end of their study participation. All participants were current or former smokers without a history of asthma or other significant respiratory diseases. The use of oral corticosteroids was not allowed for at least 4 weeks before inclusion. At each study visit, subjects underwent lung function testing followed by sputum induction with nebulised 4.5% sodium chloride solution for 3×5 min, with an adapted protocol for safety reasons when forced expiratory volume in 1 s (FEV1) was below 1.5 L [9]. An exacerbation was defined by increased breathlessness and at least two of the following symptoms for 24 h or more: increased cough frequency or severity, increased sputum volume or purulence, and increased wheeze, requiring extra prednisolone and/or antibiotics after ICS discontinuation as judged by a medical doctor [10]. The local medical ethics committee approved the study.
Sample processing and sequencing
RNA was isolated using a RNeasy Mini kit (QIAGEN, Venlo, the Netherlands) and cDNA was synthesised and stored at −80°C as described before [8, 11]. A NEBNext Ultra II Library Prep Kit (New England Biolabs, Ipswich, MA, USA) was used to prepare RNA-seq libraries, including fragmentation, end repair, adaptor ligation, and barcoding. DASH was employed to deplete unwanted rRNA sequences. Pre-amplification library preparation was completed in a single batch using an Echo liquid handling robot (Labcyte). RNA libraries were sequenced on a NovaSeq sequencer (Illumina, San Diego, CA, USA) by 150-bp paired-end sequencing. As a quality control step, every sputum sample was sequenced twice. Quality control was conducted on the raw sequence data using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and RNASeqC [12]. The Spliced Transcripts Alignment to a Reference (STAR) version 2.4.2a was used to align and identify all reads that belong to the human genome [13]. Samples were excluded if they had fewer than 10 000 transcripts with a minimum of eight copies.
Data analysis
Discovery of the gene signature
Data were analysed with R statistical software version 3.6.1 (figure S1). Sputum RNA-seq data were analysed for the association between gene expression level and time to exacerbation after ICS withdrawal, using the likelihood ratio testing method in the edgeR package (R-package version 3.26.6) [14]. The variable ‘time to exacerbation’ (days) was log2-transformed to normalise the distribution and contained the time to first exacerbation or the time of monitoring for participants who did not experience an exacerbation before the end of their study participation (figure S2). We adjusted for smoking status, age, and sex and maintained a genome-wide false discovery rate (FDR) below 0.1 to control for multiple testing [15]. Genes significantly associated with time to exacerbation were assigned to signatures of relatively upregulated and downregulated genes. These signatures were separately summarised by using gene set variation analysis (GSVA) [16], and a gene set enrichment score was calculated per subject. The obtained gene signature enrichment scores (ESs) were compared to each other, time to exacerbation, absolute counts (log10 transformed) of sputum inflammatory cells (eosinophils, lymphocytes, macrophages and neutrophils), and supernatant proteins (eosinophilic cationic protein and leukotriene-B4), using Spearman correlation testing.
Survival analyses
Next, univariate Cox regression analyses were performed to determine the hazards of experiencing an exacerbation after ICS withdrawal, for the ES of a derived gene signature (stratified into tertiles), sputum eosinophil percentages (stratified into tertiles), sex, smoking status, and history of exacerbations. Further, the clinical covariates pack-years of smoking (dichotomised by ≥ or < median), sputum eosinophil percentages (≥ or <3%), and the season of ICS withdrawal (dichotomised by November, December, January as opposed to outside those months) were investigated, which were previously identified as significant hazards in the SYMBEXCO trial [7]. Subsequently, significant hazards (p<0.05) were included in a multiple Cox regression model to adjust for potential confounders.
Bayesian network modelling
To determine coregulatory networks related to the onset of COPD exacerbations after ICS withdrawal, we integrated the normalised (FPKM method) sputum RNA-seq data in a Bayesian network model. The variable “time to exacerbation” was dichotomised (> or < mean) into an early (N=20) and late/nonexacerbation phenotype (n=23). We pre-selected the top 600 genes that were found significant (nominal p<0.05) in our likelihood ratio analysis. Their expression profiles were used as input together with the exacerbation phenotype as well as the demographic covariates age, sex, and smoking status. We built a Bayesian network using CGBayesnet version 7.14.14 in MATLAB version R2017b with the calculation of the exacerbation phenotype as binary primary phenotype and using the exhaustive search option [17].
Evaluation of the predictive value of the obtained gene signature for ICS-induced improvement in lung function in an independent RNA-seq dataset
The predictive value of the obtained gene signature was assessed in an independent RNA-seq dataset including airway epithelial brushings from 16 subjects with stable asthma, before and after treatment with inhaled budesonide, 180 µg twice daily for 8 weeks. Additional details on this study are reported elsewhere [18]. The expression of the signature genes was summarised by using GSVA and the obtained gene signature ESs from samples before treatment with ICS were compared to improvement in FEV1 after treatment, using Spearman correlation testing. Also, the ESs of the gene signatures were compared to each other as well as to blood eosinophil levels. Further, the ICS sensitivity of the expression of the signature genes was assessed, both in SYMBEXCO (before and after 8 weeks of ICS withdrawal) as well as in the independent RNA-seq dataset (before and after 8 weeks of ICS treatment).
Results
RNA-seq expression data from induced sputum samples at baseline were available for 45 participants. Two subjects restarted ICS treatment during study participation on request of their general physician, but without meeting the criteria of an exacerbation and were excluded from further analysis. Table 1 summarises the baseline characteristics of the 43 included participants, who were monitored for a median time of 144 days (interquartile range: 71–244). 38 participants experienced an exacerbation after a median time of 120 days (interquartile range: 69–203).
Association of baseline gene expression with time to exacerbation after ICS withdrawal
In total, 16 094 genes were present with sufficiently large counts (filtering according to the default method in edgeR) to be analysed in association with time to exacerbation after ICS withdrawal. We identified six genes (LGALS12, ALOX15, CLC, IL1RL1, CD24, EMR4P) of which a higher expression prior to ICS withdrawal was associated with a relatively short time to exacerbation, and this set of genes was then further analysed as PRISE #1 (coPd exaceRbation Ics-withdrawal Sputum gEne signature) (genome-wide FDR<0.1) (figure 1 and table S1). In addition, we identified three genes (SEPP1, CCDC152, ALG3) of which a higher baseline expression was associated with a relatively long time to exacerbation, and this set of genes was then further analysed as PRISE #2 (FDR<0.1) (figure 1 and table S1). Subsequently, ESs were calculated for each gene signature.
PRISE #1 and #2 were highly correlated with each other as with time to exacerbation, sputum eosinophil and macrophage cell counts
The ES of PRISE #1 was highly correlated with time to exacerbation (rho= −0.54; p=0.00019) as well as sputum eosinophil (rho=0.39; p=0.01), and macrophage cell counts (rho= −0.39; p=0.0098) (figure 1 and table S2). Further, the ES of PRISE #2 was highly correlated with PRISE #1 (rho= −1.0; p<2.2×10−16) as well as with sputum eosinophil (rho= −0.37; p=0.014), and macrophage cell counts (rho=0.4; p=0.0071) (table S2). Both gene signatures did not correlate with lymphocyte and neutrophil cell counts or eosinophilic cationic protein and leukotriene-B4 levels (table S2).
Monovariate Cox regression analyses
Since both gene signatures were so highly correlated, the hazards of experiencing an exacerbation after ICS withdrawal were only determined for PRISE #1. Cox regression results, regarding PRISE #2, can be found in the online supplementary material. The ESs of PRISE #1 were divided into three equal tertiles and tested in a univariate Cox regression model, where the first tertile served as a reference which was compared to the second and third tertile. The patients in the third tertile (ES ≥0.69) had a higher risk for developing a COPD exacerbation after ICS withdrawal compared with patients in the first tertile (hazard ratio (HR) 3.7, p=0.003, the proportion of the variance explained (R2)=0.202; table 2 and figure 2). Also, the covariates sex (male: HR 0.31, p=0.012, R2=0.12), history of exacerbations (HR 1.5, p=0.046, R2=0.08), season of ICS withdrawal (HR 0.3, p=0.004, R2=0.152), and pack-years of smoking ≥38 (HR 0.49, p=0.038, R2=0.095) were significant predictors, whereas smoking status was not (current smoking: HR 0.68, p=0.024, R2=0.03). Further, sputum eosinophil percentages were tested with two independent cut-offs. Sputum eosinophils ≥3% at the time of ICS withdrawal was associated with a significantly increased hazard (HR 2.3, p=0.021, R2=0.11) of experiencing a COPD exacerbation after ICS withdrawal, representing the cut-off that was previously identified in the SYMBEXCO trial [7]. In addition, sputum eosinophils percentages were divided into three equal tertiles, and patients in the third tertile had a higher risk for developing a COPD exacerbation after ICS withdrawal compared to patients in the first tertile (HR 2.8, p=0.02, R2=0.124).
Multiple Cox regression analyses
Next, we performed a multiple Cox regression analysis, including PRISE #1, sputum eosinophil percentages (dichotomised by < or ≥3%) as well as the covariates sex, history of exacerbations, the season of ICS withdrawal, and pack-years of smoking. In this model, PRISE #1, but not sputum eosinophil level, remained statistically significant, as did sex, history of exacerbation, and the season of ICS withdrawal (table S3). When sputum eosinophil percentages were stratified by tertiles, instead of dichotomised by 3% similar results were obtained. Further, we determined the difference in explained variance (generalised R2) between PRISE #1, and sputum eosinophil percentages in two separate multiple linear models, including either sputum eosinophil percentages or PRISE #1 (table 3). The multiple linear model, including PRISE #1, explained a higher percentage of the variance for experiencing an exacerbation after ICS withdrawal than sputum eosinophils percentages, i.e. 49.5% versus 37.7%.
Bayesian network modelling
The gene network associated with the binary exacerbation phenotype consisted of 86 nodes and 112 edges (figure 3). The exacerbation phenotype was linked to three subnetworks directly connected through IL1RL1, SPRR2E, and TATDN3, respectively. Sex was the only demographic covariate related to binary exacerbation phenotype that was identified in the network analysis, next to gene expression profiles. Here, six out of seven female subjects belonged to the early exacerbation phenotype (figure S5).
Evaluating PRISE #1 and #2 in an independent RNA-seq dataset
The predictive value of PRISE #1 and #2 was assessed in an independent RNA-seq dataset. There was no information on the expression of CD24 as part of PRISE #1 in this dataset. The ESs of PRISE #1 and #2 from samples before ICS treatment were not associated with a change in FEV1 (PRISE #1: rho= −0.28; p=0.47; PRISE #2: rho=0.28; p=0.47) after ICS treatment for 8 weeks (figure S6). However, the strong correlations between the ESs of PRISE #1 and PRISE #2 (rho= −1.0; p<2.2×10−16) as well as with eosinophils levels in blood samples (PRISE #1: rho=0.58; p=0.00054) could be replicated in this independent dataset, which were also identified in the SYMBEXCO dataset (figure S7).
Next, we assessed the ICS sensitivity of the expression profiles of PRISE #1 and #2, both in SYMBEXCO (before and after 8 weeks of ICS withdrawal) as well as in the independent RNA-seq dataset (before and after 8 weeks of ICS treatment). In SYMBEXCO, transcriptional profiles of four out six PRISE #1 genes increased after 8 weeks of ICS withdrawal (CLC, LGALS12, CD24, ALOX15) (table 4). EMR4P and IL1RL1 showed the same trend, but missed statistical significance. The transcriptional profiles of PRISE #2 genes did not change. In the independent RNA-seq dataset, transcriptional profiles of four out six PRISE #1 genes decreased after 8 weeks of ICS treatment (CLC, LGALS12, ALOX15, IL1RL1). EMR4P showed the same trend, but missed statistical significance. The change of transcriptional profiles of PRISE #2 genes was inconsistent, since the expression of ALG3 was decreased, whereas CCDC152 and SEPP1 did not change. In addition, the ESs of PRISE #1 decreased significantly after 8 weeks of ICS treatment (p=0.0042), whereas PRISE #2 showed the opposite trend, but missed statistical significance (p=0.013) (figure S8).
Discussion
We identified nine genes predictive of an early exacerbation after ICS withdrawal. The addition of an unbiased gene signature to a multiple Cox regression model explained more variance in time to exacerbations compared to a model using sputum eosinophils. These findings suggest that sputum gene expression may have utility in biomarker development for identifying subjects who are at higher risk of exacerbation after ICS withdrawal.
PRISE #1 exhibited a high baseline expression, which was associated with a relatively early time to exacerbation. This indicates that a higher expression of the included genes is associated with an early exacerbation risk after ICS withdrawal. Several genes included in this signature have been previously identified to be involved in airway type 2 (T2) eosinophilic inflammation (IL1RL1, LGALS12, EMR4P, and CLC [19–21]). Other pathways of interest represented in the signature include macrophage-related inflammation (IL1RL1, ALOX15 [22]), CD4+ (CLC [23]), and CD8+ T-cell differentiation (IL1RL1, CD24 [24]).
IL1RL1 (interleukin (IL)-1 receptor-like 1) encodes the ST2 receptor for IL-33 cytokine signalling, a key mechanism in airway T2 inflammation [21, 25]. Importantly, the IL1RL1–IL-33 axis does not only reflect T2 inflammation but is also involved in other immune cells and signalling pathways, including regulatory T-cell and macrophage-associated pathways [26]. ALOX15 is expressed in monocyte-derived macrophages and thereby involved in orchestrating the nonimmunogenic removal of apoptotic cells as well as in facilitating inflammation resolution [22, 27]. CD24 is involved in diverse functions of the adaptive immune response, including B-cells, neutrophils, and the regulation of CD8+ T-cell activation through HMGB1-mediated engagement of T-cell RAGE [24, 28]. The CLC protein is a major constituent of eosinophils and is also expressed by basophils and CD4+ CD25+ regulatory T-cells [23, 29]. In asthma patients, its expression was decreased in corticosteroid responders, indicating its potential to predict clinical responses to corticosteroids [30].
Our Bayesian network model identified IL1RL1 to be one of the key hub genes, followed by the remaining genes of PRISE #1. In addition, PRISE #1 correlated positively and negatively with sputum eosinophil and macrophage cell counts, respectively. Therefore, it could be speculated that a cascade, consisting of both eosinophils and macrophages as innate effector cells, is associated with the propensity to exacerbate in COPD patients after ICS withdrawal. Further, it is tempting to speculate that anti-IL1RL1/IL-33 may represent a treatment option for frequent exacerbators with COPD, yet this remains to be investigated in future studies.
Two additional gene subnetworks were identified in the Bayesian network model, which were directly connected to an early versus late/nonexacerbation phenotype. Genes identified in PRISE #2 were connected via a TATDN3 subnetwork, which is a protein-coding gene related to deoxyribonuclease activity. The third gene subnetwork was directly connected through SPRR2E, which encodes the small proline-rich protein 2E, and is involved in epithelial cell-related processes. SPRR2E expression was previously found to be increased in primary bronchial epithelial cells from subjects with COPD compared to healthy controls, suggesting that this gene might be involved in airway epithelial inflammatory or remodelling processes in COPD [31]. As additional genes of the multigene SPRR family, as well as keratin genes, were connected to this subnetwork, our results indicate that the airway epithelial cell function might be associated with the exacerbation phenotype after ICS withdrawal.
Importantly, sex, but not smoking status was another significant demographic predictor for experiencing an exacerbation after ICS withdrawal, which was also identified as a direct subnetwork related to the exacerbation phenotype in the Bayesian model. Sex-based differences in COPD exacerbation frequency have been shown previously, with a higher rate of exacerbations among women compared to men [32]. Our network model indicates that female COPD patients are more prone to early exacerbations after ICS withdrawal as six out of seven female subjects belonged to the early exacerbation phenotype. Importantly, this observation is based on a relatively low number of participants as only seven out of 43 participants were women, and therefore requires further investigation.
The predictive value of PRISE #1 and #2 could not be validated in an independent RNA-seq dataset concerning ICS-induced improvement in lung function; however, this analysis was likely underpowered. In addition, the clinical outcome, as well as airway specimen, differed compared to the SYMBEXCO dataset. Still, the strong associations between the ESs of PRISE #1 and #2 as well as eosinophils counts in blood were validated in this dataset. Also, we showed that the expression of PRISE #1 genes was steroid-sensitive, both in SYBMBEXCO and the independent RNA-seq dataset, indicating that the expression of these genes increases with ICS withdrawal and decreases with ICS treatment, in subjects with COPD and asthma.
There are some limitations related to our study. The identified gene signature should be validated in an independent exacerbation cohort, which would be crucial to investigate its utility as a biomarker. To our knowledge, such a replication cohort is currently not available. Further, our results are based on a COPD study population with predominantly moderate airflow obstruction and it is not clear to what extent our observations also apply to milder or more severe COPD.
In summary, we identified a sputum gene signature that exhibited a higher predictive value for predicting COPD exacerbations after ICS withdrawal than sputum eosinophilia. Future studies should investigate the utility of this signature, which might enhance personalised ICS treatment in COPD patients.
Supplementary material
Supplementary Material
Please note: supplementary material is not edited by the Editorial Office, and is uploaded as it has been supplied by the author.
Supplementary material 00097-2021.SUPPLEMENT
Footnotes
Author contributions: J.J.W. Liesker, E. Bathoorn and H.A.M. Kerstjens conducted the SYMBEXCO study. A. Sarma, S. Caldera, C. Langelier and S.A. Christenson were responsible for the preparation and RNA sequencing of the sputum samples. B. Ditz, A. Sarma, J.M. Vonk, V. Bernal, P. Horvatovich and V. Guryev performed the statistical analysis of the data. B. Ditz wrote the initial draft of the manuscript with additional content provided and critical revisions from all authors.
This article has supplementary material available from openres.ersjournals.com
Conflict of interest: B. Ditz has nothing to disclose.
Conflict of interest: A. Sarma reports grants from National Heart, Lung, and Blood Institute, during the conduct of the study.
Conflict of interest: H.A.M. Kerstjens reports a grant from AstraZeneca as well as grants and fees for consultancy or advisory board participation from GlaxoSmithKline, Boehringer Ingelheim, and Novartis, and a grant from Chiesi, all outside of the submitted work and all paid to his institution.
Conflict of interest: J.J.W. Liesker has nothing to disclose.
Conflict of interest: E. Bathoorn has nothing to disclose.
Conflict of interest: J.M. Vonk has nothing to disclose.
Conflict of interest: V. Bernal has nothing to disclose.
Conflict of interest: P. Horvatovich has nothing to disclose.
Conflict of interest: V. Guryev has nothing to disclose.
Conflict of interest: S. Caldera has nothing to disclose.
Conflict of interest: C. Langelier has nothing to disclose.
Conflict of interest: A. Faiz has nothing to disclose.
Conflict of interest: S.A. Christenson reports consulting fees from AstraZeneca, GlaxoSmithKline, Amgen and Glenmark; personal fees for invited lectures from Sunovion and Genentech; and personal fees for writing for UpToDate, all outside the submitted work.
Conflict of interest: M. van den Berge has nothing to disclose.
Support statement: The submitted work was co-financed by the Ministry of Economic Affairs and Climate Policy by means of the PPP. A. Faiz was supported by a junior Longfond grant (4.2.16.132JO). Funding information for this article has been deposited with the Crossref Funder Registry.
- Received February 8, 2021.
- Accepted April 26, 2021.
- Copyright ©The authors 2021
This version is distributed under the terms of the Creative Commons Attribution Non-Commercial Licence 4.0. For commercial reproduction rights and permissions contact permissions{at}ersnet.org