Abstract
Introduction There is a critical need to understand the optimal treatment regimen in patients with potentially resectable stage III-N2 nonsmall cell lung cancer (NSCLC).
Methods A systematic review of randomised controlled trials was carried out using a literature search including the CDSR, CENTRAL, DARE, HTA, EMBASE and MEDLINE bibliographic databases. Selected trials were used to perform a Bayesian fixed-effects network meta-analysis and economic modelling of treatment regimens relevant to current-day treatment options: chemotherapy plus surgery (CS), chemotherapy plus radiotherapy (CR) and chemoradiotherapy followed by surgery (CRS).
Findings Six trials were prioritised for evidence synthesis. The fixed-effects network meta-analyses demonstrated an improvement in disease-free survival (DFS) for CRS versus CS and CRS versus CR of 0.34 years (95% CI 0.02–0.65) and 0.32 years (95% CI 0.05–0.58) respectively, over a 5-year period. No evidence of effect was observed in overall survival although point estimates favoured CRS. The probabilities that CRS had a greater mean survival time and greater probability of being alive than the reference treatment of CR at 5 years were 89% and 86% respectively. Survival outcomes for CR and CS were essentially equivalent. The economic model calculated that CRS and CS had incremental cost-effectiveness ratios of £19 000/quality-adjusted life-year (QALY) and £78 000/QALY compared to CR. The probability that CRS generated more QALYs than CR and CS was 94%.
Interpretation CRS provides an extended time in a disease-free state leading to improved cost-effectiveness over CR and CS in potentially resectable stage III-N2 NSCLC.
Chemoradiotherapy plus surgery improves disease-free survival compared with both chemotherapy plus surgery and chemoradiotherapy alone. This extended time in a disease-free state translates into improved cost-effectiveness through improved quality of life. https://bit.ly/3jXsWv4
Introduction
Uncertainty exists as to the optimal management strategy for patients with potentially resectable stage III-N2 nonsmall cell lung cancer (NSCLC). While consensus exists that optimal treatment must include both systemic treatment for distant control and local treatment for local control (e.g. surgery, radiotherapy), the optimal combination of treatments has not been established. This results in multiple treatment options being recommended within international lung cancer guidelines without consensus agreement as to the optimal strategy [1–7]. These treatment combinations include chemotherapy plus surgery (CS), chemotherapy plus radiotherapy (CR), and chemotherapy, radiotherapy and surgery (CRS). Numerous randomised controlled trials (RCTs) and meta-analyses have failed to show one treatment combination to be definitively superior to another in overall survival (OS) [8–14], but there are notable findings within these studies that continue to spark debate. The Intergroup 0139 trial of CRS versus CR reported a significant increase in median progression-free survival of 12.8 months for CRS versus 10.5 months for CR as well as the percentage of patients without disease progression at 5 years (22% versus 11%) but did not demonstrate a difference in OS [10]. Concern was raised about a high mortality in patients undergoing pneumonectomy and a post hoc unplanned analysis of only patients who had a lobectomy demonstrated higher median OS (33.6 versus 21.7 months) compared with statistically matched patients who received chemoradiotherapy. The weight that should be placed on this finding continues to be debated. Furthermore, a meta-analysis of CRS versus CR combined the results of the Intergroup 0139 study with a Nordic randomised controlled trial of CRS versus CR which recruited nearly 400 patients before closing early and was only published in abstract form. 
This meta-analysis came very close to statistical significance for improved survival with CRS (HR 0.87, CI 0.75–1.01, p=0.068) [12]. While these findings might represent evidence of benefit from CRS over CR, RCTs and meta-analyses of CS versus CR and CRS versus CS have failed to show any evidence for the superiority of one treatment strategy over another. Given these findings, the ongoing debate as to the optimal treatment strategy and the fact that different multimodality treatments incur substantial yet different healthcare costs, there is an urgent need to synthesise the published evidence and develop an economic model to define the most cost-effective treatment strategy in potentially resectable stage III-N2 NSCLC. This area was identified by the National Institute for Health and Care Excellence (NICE) for network meta-analysis (NMA) and health economic modelling as part of the 2019 update to its guideline on “Lung Cancer: Diagnosis and Management”, and this paper reports the results. The views expressed in this manuscript are those of the authors and not necessarily those of NICE.
Methods
We conducted a systematic review of RCTs comparing curative-intent multimodality treatments (CS, CR or CRS) in people with stage III-N2 NSCLC who were suitable for surgical resection. The literature search included the Cochrane Database of Systematic Reviews (CDSR), the Cochrane Central Register of Controlled Trials (CENTRAL), the Database of Abstracts of Reviews of Effects (DARE), the Health Technology Assessment (HTA) Database, the Excerpta Medica database (EMBASE) and the Medical Literature Analysis and Retrieval System Online (MEDLINE) bibliographic databases and identified 4241 studies for title and abstract screening. A similar search with economic filters found 956 titles and abstracts. A Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA)-based checklist is available in supplementary data 1. Following further review, six trials were prioritised for evidence synthesis. Other trials were excluded from the analysis because the pairwise comparisons they contained were not relevant to current practice. This included trials in which CRS was given as chemotherapy followed by surgery followed by radiotherapy [15, 16]. No economic studies were available to be included in the review (figure 1). The included studies are listed in table 1. Based on these data we, in consultation with the NICE Guideline Committee, concluded that the patients and interventions were reflective of those seen in current practice and that the trials were appropriate to pool.
Study selection for network meta-analysis and economic modelling.
Summary of trials included in the network meta-analysis: study settings, patients and interventions
NMA
NMA is a technique for quantitatively synthesising direct and indirect evidence of relative treatment effects. It is frequently used by NICE to aid guideline committee decision-making where more than two treatment options exist. As is common in cancer studies, we specified the two most important outcomes as OS and disease-free survival (DFS). Upon inspection of the Kaplan–Meier (KM) plots for these outcomes in the included trials, it was clear that the proportional hazards assumption seldom held because the survival curves frequently crossed or diverged. An NMA of published hazard ratios was therefore deemed inappropriate. Instead, we calculated and synthesised the area under each KM curve at the longest common follow-up time among studies. This is equivalent to the mean time patients spent alive (OS) or alive and disease-free (DFS) within the restricted time period. Time spent in a health state is also an important measure from a patient perspective and a key input for health economic models. The longest common follow-up time among all studies was 4 years but we had 5-year follow-up data for five of the six studies, with the sixth study being the smallest, lowest quality and least applicable [17]. We decided that the primary analysis would be conducted using the 5-year follow-up data with 4-year data being used in a sensitivity analysis because the extra information gained from the longer follow-up outweighed that from the small, low-quality trial. DFS and OS were jointly synthesised in an NMA to account for the correlation between these outcomes, and a separate NMA was specified for the probability of survival at 5 years to inform the economic model. All NMAs were conducted in a Bayesian framework; full methodology for data extraction and evidence synthesis, including the programming code, is available online [18, 19]. 
The fit of fixed- and random-effects NMA models was assessed and compared using the posterior mean of the residual deviance and deviance information criterion; lower values are preferred and differences of at least three points were considered meaningful [20]. To assess the consistency assumption of NMA, i.e. no conflict between the direct and indirect evidence, the fit of an unrelated mean effects model was similarly compared to that of the selected NMA model [21]. The NMA input data are shown in table 2.
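The area-under-the-curve calculation described above can be illustrated with a minimal sketch. It assumes the KM curve is available as a step function (drop times and the survival proportion after each drop); the function name and the example curve are hypothetical and are not the trial data or the programming code referenced in [18, 19].

```python
def restricted_mean_survival(times, survival, tau):
    """Area under a Kaplan-Meier step function from 0 to tau.

    times:    increasing times (years) at which the curve drops
    survival: survival proportion immediately after each drop
    tau:      restriction time (e.g. 5 years of follow-up)
    """
    area = 0.0
    prev_t, prev_s = 0.0, 1.0  # the curve starts at S(0) = 1
    for t, s in zip(times, survival):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)  # rectangle up to the next drop
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)  # final rectangle up to the horizon
    return area


# Hypothetical curve for illustration only (not data from the included trials)
km_times = [1.0, 2.0, 3.0, 4.0]
km_surv = [0.8, 0.6, 0.5, 0.4]
rmst_5y = restricted_mean_survival(km_times, km_surv, tau=5.0)
rmst_4y = restricted_mean_survival(km_times, km_surv, tau=4.0)
```

Applied to an OS curve, the result is the mean time alive within the restricted period; applied to a DFS curve, it is the mean time alive and disease-free. Changing `tau` from 5 to 4 corresponds to the choice between the primary and sensitivity analyses.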
Network meta-analysis input data; trial data for evidence synthesis
Health economics
We built a health economic model that accrued healthcare costs and quality-adjusted life-years (QALYs) for each intervention over a lifetime time horizon. We used the results from the NMAs to inform the first 5 years of the economic model. The NMAs dictated the time patients in each model arm spent in the disease-free and post-recurrence states as well as their probability of survival beyond 5 years. Patients surviving beyond 5 years were assumed to be disease-free and effectively cured of their NSCLC, and hence no further time was spent in the post-recurrence state after 5 years. The DFS and OS curves in the underpinning RCTs lent some support to this assumption by being well converged and plateauing at 5 years.
To inform the disease-free state beyond 5 years in the economic model, the proportion surviving at 5 years and an external estimate of mean time spent disease-free beyond 5 years were required. The absolute proportion surviving at 5 years in each model arm was calculated by adding the log-odds ratio of each treatment versus CR from the NMA to the baseline log-odds of survival for those receiving CR, which was informed by the CR arm reported in van Meerbeeck et al. [9], the largest trial. As this was also the oldest trial and as OS has improved in this patient population over time, these data may not be reflective of current practice, and so we tested this assumption in a sensitivity analysis. The mean time spent disease-free beyond 5 years was calculated based on a post 5-year survival curve fitted to individual patient survival data in the Surveillance, Epidemiology and End Results (SEER) database (supplementary figure S1). We matched the patient population in our trials to 2865 similar patients with NSCLC stage IIIA-N2 conditional on having survived for 5 years post diagnosis (3703 patients in the 4-year sensitivity analysis [22]).
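The conversion from a relative effect to an absolute 5-year survival probability can be sketched as follows. The function names and the numerical inputs are hypothetical and chosen only for illustration; they are not the baseline or the log-odds ratios estimated in the NMA.

```python
import math


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Probability corresponding to a log-odds value."""
    return 1.0 / (1.0 + math.exp(-x))


def absolute_survival(baseline_p, log_odds_ratio):
    """Absolute probability for a treatment, given the reference-arm
    probability and the treatment's log-odds ratio versus the reference."""
    return inv_logit(logit(baseline_p) + log_odds_ratio)


# Hypothetical inputs for illustration only
baseline_cr = 0.15      # assumed 5-year survival on the reference treatment
lor_crs_vs_cr = 0.40    # assumed log-odds ratio for a comparator
p_crs = absolute_survival(baseline_cr, lor_crs_vs_cr)
```

Because the relative effect is applied on the log-odds scale, changing the baseline (as in the sensitivity analysis using a more recent trial) changes the absolute difference between arms even when the log-odds ratio is held fixed.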
Adverse events were not reported in all trials. Where possible, we obtained the number of grade 3+ adverse events and multiplied the area under the curve (AUC) by the sample size in each arm to obtain the population years at risk and used these data to calculate the relevant incidence rate for CRS. We then fit an NMA model [23] to these data and used the resulting hazard ratios to calculate the mean number of events experienced by patients in each arm, which were costed as an inpatient stay and had no quality-of-life decrement attached. Given the small differences between the interventions and the short-term nature of the events, on average, these simplifying assumptions were assessed as minor. The results favoured CRS over the other two interventions, which was unexpected, given that it is the most intensive intervention. These parameters were therefore omitted in sensitivity analyses.
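The adverse event calculation can be sketched under simplifying assumptions: a constant event rate over time and a hazard ratio applied multiplicatively to the reference rate. The function names and all numbers below are hypothetical; the actual NMA model is documented in [23].

```python
def person_years_at_risk(mean_years_at_risk, n_patients):
    """Population time at risk: area under the survival curve times arm size."""
    return mean_years_at_risk * n_patients


def incidence_rate(n_events, person_years):
    """Events per person-year."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return n_events / person_years


def expected_events_per_patient(reference_rate, hazard_ratio, mean_years_at_risk):
    """Mean events per patient, assuming a constant rate scaled by the hazard ratio."""
    return reference_rate * hazard_ratio * mean_years_at_risk


# Hypothetical inputs for illustration only (not values from the trials)
py = person_years_at_risk(mean_years_at_risk=2.5, n_patients=100)
rate = incidence_rate(n_events=50, person_years=py)
events = expected_events_per_patient(rate, hazard_ratio=1.25, mean_years_at_risk=2.0)
```

In an economic model of this kind, the expected number of events per patient would then be multiplied by the unit cost of an inpatient stay to give the adverse event cost for each arm.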
As the DFS and OS curves were assumed to be fully converged by 5 years, we multiplied 1 minus the proportion of people alive by the proportion of disease recurrences that were deaths (fit using another NMA model [23] applied to pooled data from CS arms) to calculate the total number of patients whose disease had recurred by 5 years. We costed these recurrences as being treated with platinum doublet chemotherapy, having no data on further lines of treatment or whether the probability that patients received further lines of treatment could reasonably be expected to differ between the arms. We did not cost downstream use of newer targeted and immunotherapies for NSCLC, firstly because it would have been impossible to determine what proportion of patients that generated the survival data used in our model would have received these treatments (due to either the age of the studies or individual ineligibility) and secondly because these treatments are often priced at society's maximum willingness to pay for one QALY and therefore do not affect the overall cost-effectiveness of the treatment pathway. Consequently, any related survival improvement in patients in current clinical practice over those in the trials that underpin our analysis is unlikely to have a big effect on the cost-effectiveness results.
Economic discounting within the first 5 years was resolved via a separate NMA, documented elsewhere [23], which apportioned events across those years. No directly applicable health-related quality of life (HRQoL) values were available at the time of analysis, so we assigned well-established values for pre- and post-progression advanced NSCLC [24] to these health states within the model. This may underestimate the HRQoL of patients within our model. We also obtained data on the temporary QALY decrement from surgery [25] and applied this to the surgical arms of the model. Full tables of input parameters for the economic model are available online [23]. The model's structure, input data and assumptions were validated by the NICE guideline committee and all analyses were performed in line with the NICE reference case [26].
Results
Network meta-analysis
The fixed-effects model was preferred on the basis of model fit and because there were insufficient data for the random-effects model to be reliably estimated. The fixed-effects network meta-analyses demonstrated an improvement in mean DFS time for CRS versus CS and CRS versus CR of 0.34 years (95% CI 0.02–0.65) and 0.32 years (95% CI 0.05–0.58) respectively within the first 5 years after treatment, equating to ∼4 months in each case (table 3). There was no evidence of a difference between the interventions in terms of OS or probability of being alive at 5 years, although point estimates favoured CRS. The probability that CRS had a greater mean survival time than CR was 89%, and there was an 86% chance that CRS patients had a greater probability of being alive at 5 years compared to CR. CS had similar point estimates and confidence intervals to CR for all three outcomes. The broad conclusions of the 5-year analysis were replicated in the 4-year sensitivity analysis (figure 2). Inconsistency checks were performed using unrelated mean effects models [21], and no evidence of inconsistency was found. Overall, the NMA showed that CRS is associated with greater DFS than both CS and CR, and there was no evidence that any intervention was more effective than the others for any other outcome.
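Probabilities of this kind (for example, the probability that one treatment has a greater mean survival time than another) are typically read off a Bayesian analysis as the proportion of posterior draws in which the difference is positive. The sketch below assumes posterior samples of the treatment difference are available; here they are simulated from a hypothetical normal distribution and do not reproduce the posterior from this analysis.

```python
import random


def prob_greater_than_zero(samples):
    """Proportion of posterior draws in which the difference favours the treatment."""
    return sum(1 for d in samples if d > 0) / len(samples)


# Simulated draws for illustration only: a hypothetical posterior for the
# difference in mean survival time (years), not the fitted NMA posterior
random.seed(0)
draws = [random.gauss(0.3, 0.25) for _ in range(20000)]
p_better = prob_greater_than_zero(draws)
```

A high probability of benefit can coexist with an interval that includes zero, which is why the text reports both a lack of evidence of effect and point estimates favouring CRS.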
Network meta-analysis results (chemotherapy+surgery and chemoradiotherapy+surgery versus chemoradiotherapy)
a–d) Difference in interventions for four key outcomes, fixed- and random-effects models for 5- and 4-year data. The only outcomes that are statistically significant are for progression-free survival (part a). CS: chemotherapy plus surgery; CR: chemotherapy plus radiotherapy; CRS: chemoradiotherapy followed by surgery; FE: fixed effects; RE: random effects.
Economic model
The economic model calculated that CRS and CS had incremental cost-effectiveness ratios (ICERs) of £19 000/QALY and £78 000/QALY compared to CR (table 4). The probability that CRS generated more QALYs than CR was 94%, and the probability that CRS generated more QALYs than CS was 85%. Sensitivity analyses varying the economic model's input parameters within plausible ranges did not alter these conclusions (supplementary data 2). The one notable exception was setting the probability of being alive at 5 years equal among all three interventions (there was no evidence of improvement between the interventions in terms of OS, though point estimates favoured CRS, and the probability that CRS had a greater mean survival time than CR was 89%), which increased the ICER for CRS versus CR to £41 000/QALY gained, although the probability that CRS generated more QALYs than CR was still very high at 89%. The ICERs were also much more favourable for the surgical options when the baseline probability of survival at 5 years was taken from the more modern ESPATUE trial [8]. The very high uncertainty in the ICER for CS versus CR, as evidenced by its highly variable results across the sensitivity analyses, is due to the very small and uncertain differences in QALYs between the two strategies.
Economic model results (absolute costs and QALYs)
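For readers less familiar with the ICER, the calculation is the difference in mean cost divided by the difference in mean QALYs between a treatment and its comparator. The sketch below uses hypothetical costs and QALYs chosen only to show the arithmetic; they are not the values in table 4.

```python
def icer(cost_new, qaly_new, cost_ref, qaly_ref):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    delta_cost = cost_new - cost_ref
    delta_qaly = qaly_new - qaly_ref
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return delta_cost / delta_qaly


# Hypothetical values for illustration only (not the model outputs)
icer_example = icer(cost_new=30000.0, qaly_new=2.4, cost_ref=20000.0, qaly_ref=2.0)

# With a very small QALY difference, modest changes in inputs move the ratio a lot
icer_small_a = icer(cost_new=21000.0, qaly_new=2.02, cost_ref=20000.0, qaly_ref=2.0)
icer_small_b = icer(cost_new=21000.0, qaly_new=2.01, cost_ref=20000.0, qaly_ref=2.0)
```

The last two calls illustrate why a ratio with a small and uncertain denominator, as for CS versus CR, is unstable across sensitivity analyses.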
Discussion
Key findings
Of the three interventions for potentially resectable stage III-N2 NSCLC examined in this analysis, CRS was demonstrated to be the superior treatment in efficacy and cost-effectiveness. CRS was cost-effective at NICE's commonly accepted decision threshold of £20 000–£30 000 per QALY gained and extendedly dominated the cost-effectiveness of CS and CR. This dominance of CRS over CS and CR in cost-effectiveness was driven by the extended time patients spend in a disease-free state following CRS compared to the alternative treatment strategies and the improved quality of life associated with this. The trials included in this NMA incorporated all levels of disease burden under the umbrella of “potentially resectable” stage III-N2. For example, in one of the largest trials (Intergroup 0139), 76% of patients had a single N2 nodal station metastasis. The conclusions are, therefore, not restricted to patients with higher disease burden, where the role of chemoradiotherapy has traditionally been placed.
Results in context of published literature
Other studies have also synthesised trial data in this area through meta-analysis [12–14, 27, 28] and did not find any statistically significant differences between interventions. However, the analyses in these studies are confined to conventional pairwise meta-analysis of hazard ratios and dichotomous outcomes. Three points distinguish our analysis: firstly, these studies did not include the same trials (i.e. they pooled interventions that were not of interest or included studies that would not have met our protocol, such as interventions unrelated to current practice and conference abstracts); secondly, we drew a distinction between CS and CRS as separate interventions rather than pooling them; and thirdly, the proportional hazards assumption does not hold for the vast majority of the OS and DFS KM data in the included trials. Hazard ratios could, therefore, be considered inappropriate to pool and may not fully capture treatment differences that are seen in the differences between survival curves. It is quite common for survival curves to exhibit non-proportional hazards in trials of surgical versus non-surgical treatment because mortality can be initially higher (if the invasiveness of the surgery influences survival for some people) and subsequently lower (e.g. if the surgery provides a cure) in the surgical arms. For this reason, we felt it more appropriate to pool data using the area-under-the-curve method rather than hazard ratios. While this method is well known in the field of health economics because the amount of time patients spend in a particular health state is crucial for QALY calculations, it is less common in clinical evidence synthesis. To illustrate these differences with a specific example we compare our study to that of Zhao et al. [27], given this was also an NMA. Our results are likely to differ because: Zhao et al. 
used hazard ratios; the interventions are disaggregated to the extent that the majority of the network is simply the same pairwise data as reported in the trials but with extra statistical uncertainty stemming from a shared random effects term; there are a lot of trials that included single modality therapies that would not have met our protocol; and there is no analysis of DFS, which is the outcome where we identified benefits of CRS.
Strengths and limitations
An NMA requires consistency across the included studies in terms of trial setting, patient characteristics and treatment delivery. The only impact upon outcomes is therefore the type of treatment used, and all patients within the selected studies would be eligible for any of the treatments being studied within the NMA. The studies included in this NMA were well balanced for patient characteristics and conducted across similar multi-national western healthcare services. The NMA found no statistical evidence of inconsistency across the included studies providing strength to the findings and conclusions. Furthermore, this study is the first non-hazard ratio-based meta-analysis of outcomes for radical treatments for potentially resectable stage III-N2 NSCLC. It included a wide range of network meta-analyses of treatment outcomes relevant to this population and restricted itself only to treatment options that are relevant to current practice. This is the first economic analysis in this patient population, and both the statistical and economic work have benefited from the agreement of underlying assumptions and input parameters by a committee of experts and from examination at public consultation through the NICE Guidelines process. The conclusions of this study were robust to sensitivity and scenario analyses.
However, it is important to acknowledge that the included studies were conducted over different time periods with recruitment periods extending from 1994 to 2013. Lung cancer staging has changed significantly in this time period with the introduction of positron emission tomography imaging [29] and endobronchial ultrasound [30] as well as modernisation of peri-operative care, surgical techniques and radiotherapy techniques. OS estimates differed somewhat between the studies, with patients typically surviving longer in the more recent trials, which may reflect these improvements in staging and treatment as well as treatment options for distant disease recurrence in the last decade. As a matter of theory, higher baseline OS might provide more scope for similar relative treatment effects to achieve a greater overall magnitude of benefit. However, it is unlikely that this would have biased our analysis in favour of CRS, because the study that contributed the most weight towards the positive finding for progression-free survival, Albain et al. [10], was also the second oldest in the NMA. It should be noted that while Albain et al. is the only study with a statistically significant DFS benefit for CRS, the point estimates for DFS at 5 years in all the other studies in the NMA favour CRS over its comparator. Additionally, the point estimates for OS time at 5 years favoured CRS over its comparator in each study included in the NMA, although neither these studies nor the NMA demonstrated statistical evidence of a more effective treatment for this outcome. We also make strong reference to the post hoc analysis in the Intergroup 0139 trial showing a highly significant improvement in survival for patients undergoing lobectomy in the CRS arm matched to patients suitable for lobectomy undergoing CR. A further potential limitation of our study is that health economic data including costs and HRQoL were not collected in the underpinning RCTs. 
While it is not unusual for health economic models to combine data from disparate sources, it would have been preferable to have used direct evidence. Perhaps the most important limitation in terms of cost-effectiveness is the lack of evidence of effect for the proportion of patients surviving into the long-term model, which, when set equal, raised the ICER substantially.
Future implications
Although an established treatment for stage III-N2, CRS uptake in the UK is very low [31]. The findings of this study, therefore, have significant impacts on lung cancer treatment pathways and warrant careful consideration on patient selection, efficient transition from chemoradiotherapy to surgery and minimising the risk of failure to complete all aspects of CRS. However, the most important limitation to the findings of this study in relation to future care is how rapidly the treatment paradigm for stage III-N2 lung cancer is evolving. There are recently published RCTs that have created new standards of care, but that would not have been included in our NMA even if available at the time. These include adjuvant tyrosine kinase inhibitor (TKI) treatment in patients with epidermal growth factor receptor mutation positive NSCLC that has been completely resected, which significantly improves DFS (HR 0.17, 95% CI 0.11–0.26, p<0.0001 [32]). However, this and other trials of adjuvant treatments would not have been included in the NMA as patients were selected for this treatment after complete surgical resection, based on pathological staging (including stage II and IIIA and therefore not specific to stage III-N2), following adequate recovery from surgery and after molecular profiling of the resected tumour, i.e. randomisation occurred post-operatively. This could not be applied to an NMA of upfront treatment decisions based on clinical staging. Furthermore, the recently published PACIFIC trial has demonstrated improved DFS and OS with the addition of maintenance immunotherapy following CR in patients with stage III-N2 deemed unsuitable for surgical resection [33, 34]. This trial would have also been excluded from this NMA as it selects patients unsuitable for surgery. 
We note there is concern about the wide variation in the definition of resectability across a global trial and that some patients with potentially resectable N2 disease could have been included, but ultimately the trial could not have been included in this NMA. In real-life clinical practice, though, it is possible that chemoradiotherapy followed by adjuvant immunotherapy is being recommended in patients who have “potentially resectable” stage III-N2 NSCLC. The one very recently published trial that might have been included in this NMA and that has potential to change practice in stage III-N2 NSCLC is Checkmate 0816, which has demonstrated that the addition of neoadjuvant immunotherapy to neoadjuvant chemotherapy prior to surgical resection improves DFS significantly (HR 0.63, 95% CI 0.43–0.91; p=0.005) [35]. While this treatment has not been compared to CRS, the hazard ratios in this study suggest this could represent an optimal treatment regimen in stage III-N2, while once again noting the trial included stages IB-IIIA and was not specific to stage III-N2. The eligibility criteria in clinical practice for this new treatment regimen have yet to be established, nor is it clear whether they will be based on predictive marker testing such as programmed cell death ligand-1 (PD-L1) expression, as was the case for adjuvant immunotherapy after concurrent chemoradiotherapy in the UK. All of these new data highlight the need for expert tumour board discussions in this complex and rapidly changing field of lung cancer management. However, we strongly believe this study provides important information to support treatment decisions, particularly if there are specific scenarios in the future in which patients are not eligible for adjuvant/neoadjuvant TKI/IO therapies.
Conclusions
Overall, the results of this NMA and health economic analysis provide evidence that in patients with potentially resectable stage IIIA-N2 NSCLC, CRS provides improved DFS. Living within a disease-free state is known to be associated with improved quality of life compared to a post-disease recurrence state. It is the extended period within a disease-free state, and the assumed improvement in quality of life, that drives the improved cost-effectiveness of CRS over CR and CS in our economic model. While lacking evidence of effect, there are also indications towards improved OS with CRS. The trials included in this NMA enrolled patients at least 10 years ago, and there have been practice-changing RCTs published in the last few years relevant to this area of lung cancer treatment. The results of this NMA, however, may still provide useful insights into the treatment of patients deemed ineligible for newer systemic agents such as TKIs and immunotherapy.
Supplementary material
Please note: supplementary material is not edited by the Editorial Office, and is uploaded as it has been supplied by the author.
Supplementary material 00299-2022.SUPPLEMENT
Footnotes
Provenance: Submitted article, peer reviewed.
Author contributions: S. Aslam, D. West and N. Navani were members of the NICE Guideline Expert Committee for the 2019 update to the Diagnosis and Management of Lung Cancer. R. Maconachie, T. Mercer, C.H. Daly and N.J. Welton led the NMA and economic analysis for the NICE Lung Cancer Guideline Committee, which this paper reports, and had direct access to the data to verify the results. M. Evison led the writing of the manuscript, and all authors reviewed and agreed the final version. NICE provided permission for manuscript submission. The views expressed in this manuscript are those of the authors and not necessarily those of NICE.
Conflicts of interest: R. Maconachie currently works as Associate Director, Value, Access and Devolved Nations, Merck, Sharp and Dohme (UK) Ltd (MSD). During the time of this work, his role was Technical Adviser, Centre for Guidelines, National Institute for Health and Care Excellence (NICE). MSD market treatments for lung cancer but this work was completed entirely while in employment with NICE and there are no obvious conflicts of interest related to MSD's activities. NICE funds the technical support unit at the University of Bristol which supported C.H. Daly and N.J. Welton for the work on this manuscript. N. Navani is supported by a Medical Research Council Academic Research Partnership (MR/T02481X/1). This work was partly undertaken at The University College London Hospitals/University College London that received a proportion of funding from the Department of Health's National Institute for Health Research (NIHR) Biomedical Research Centre's funding scheme. N. Navani reports honoraria for non-promotional educational talks or advisory boards from Amgen, AstraZeneca, Boehringer Ingelheim, Bristol Myers Squibb, Guardant, Janssen, Lilly, Merck Sharp & Dohme, Olympus, OncLive, PeerVoice, Pfizer and Takeda. M. Evison reports honoraria for non-promotional educational talks or advisory boards from AstraZeneca, Boehringer Ingelheim, Bristol Myers Squibb, Lilly, Merck Sharp & Dohme and Pfizer. N.J. Welton has received honoraria for delivering masterclasses/workshops/courses on behalf of Association of the British Pharmaceutical Industry (ABPI), Takeda, Cochrane Ireland, NICE International and NICE Scientific Advice, and Centre for Global Development, all outside the submitted work. The remaining authors have nothing to disclose.
- Received June 21, 2022.
- Accepted December 20, 2022.
- Copyright ©The authors 2023
This version is distributed under the terms of the Creative Commons Attribution Non-Commercial Licence 4.0. For commercial reproduction rights and permissions contact permissions{at}ersnet.org