Introduction

Due to their high internal validity, classical randomized controlled trials (cRCTs) unequivocally sit atop the medical evidence hierarchy for examining management and therapeutic interventions [1–3]. To limit organizational constraints, costs, and patient dropouts, many cRCTs are of short duration, although 1-year cRCTs have been carried out [4–6]. Of particular importance, they are designed to include tightly controlled, well-characterized patient populations so as to minimize confounders and avoid loose causality relationships between an intervention and an outcome, seeking instead to identify clear cause and effect. For example, patients typically recruited to asthma cRCTs tend to have a clear-cut diagnosis; frequent use of rescue medication; substantial, concurrent reversibility to short-acting β2-agonists; high adherence to study drugs and procedures; good inhaler technique; and no (or negligible) comorbid illness. The resultant highly characterized trial population represents but a small subgroup of the broadly heterogeneous asthma population treated in everyday clinical practice.

For common chronic conditions (eg, asthma) that affect a wide, varied patient population, concerns exist regarding the external validity of cRCT data and the ability to broadly extrapolate these data to the real world patient population [7–9, 10•]. This paper examines the extent to which the findings of cRCTs in respiratory medicine, particularly with respect to asthma, can be generalized to the broader population. In addition, we consider complementary study designs and the role they play in supplementing the cRCT evidence base.

Gaps in the Evidence Base

Do Classical Randomized Controlled Trial Asthma Populations Represent Real Life Patients with Asthma?

Randomized controlled trials (RCTs) can be designed to show superiority, noninferiority, or equivalence and are used to evaluate the safety profile and efficacy of emerging therapies. Classical RCT design aims to maximize internal validity and to establish an unequivocal cause-and-effect relationship between an intervention and an outcome, and only a limited number of outcomes are evaluated. To establish causality, any potential confounding factor that may compromise results and their interpretation must be eliminated (as far as is realistically possible).

However, the strict patient selection criteria used in most cRCTs tend to result in a highly restricted (normally idealized) study population. The inclusion criteria of asthma cRCTs generally demand excellent inhaler technique and compliance with the designated treatment, prespecified levels of airflow obstruction and reversibility, absence of comorbidities (including obesity) and polypharmacy, and specific smoking status (eg, current nonsmoker with smoking history of ≤10 pack-years). In contrast, patients with asthma treated in clinical practice frequently have comorbidities, variable compliance, and questionable inhaler technique (Table 1) [11–15]. In addition, in the real world, a significant proportion are current smokers and/or overweight [16, 17]. The net result is that cRCT study populations tend to represent only very limited populations within the real world asthma population [18–20].

Table 1 Comorbid and lifestyle factors present in real world patients with asthma who are frequently excluded from classical randomized controlled trial populations

Global Initiative for Asthma–Based Asthma Diagnostic Criteria Used in Classical Randomized Controlled Trials: Relevance in the Real World

The Global Initiative for Asthma (GINA) publishes management and therapeutic recommendations that draw heavily on cRCT evidence [3]. To be included in cRCTs, patients must fulfill strict GINA-based diagnostic criteria to ensure that observed effects are not obscured by other factors (“confounders”) introduced, for example, by inclusion of patients with obstructive lung disease other than asthma (eg, chronic obstructive pulmonary disease [COPD]). GINA stipulates that when evaluating a patient’s response to asthma therapy, the following characteristics are consistent with an asthma diagnosis: an increase in forced expiratory volume in 1 s (FEV1) of ≥12% (and 200 mL) after administration of a bronchodilator (ie, indicative of reversible airflow limitation), an improvement in peak expiratory flow (PEF) of 60 L/min (or ≥20% of the prebronchodilator PEF) after inhalation of a bronchodilator, or diurnal variation in PEF of more than 20% (with twice-daily readings, >10%) [3].
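The criteria quoted above can be written as a simple eligibility check. The following Python sketch is illustrative only: the class and function names are hypothetical, the thresholds are taken from the text as quoted, and it assumes that "(and 200 mL)" and "60 L/min" are minimum absolute gains. It is not a diagnostic tool.

```python
from dataclasses import dataclass


@dataclass
class LungFunction:
    """Hypothetical container for one patient's measurements."""
    fev1_pre_ml: float            # FEV1 before bronchodilator (mL)
    fev1_post_ml: float           # FEV1 after bronchodilator (mL)
    pef_pre_l_min: float          # PEF before bronchodilator (L/min)
    pef_post_l_min: float         # PEF after bronchodilator (L/min)
    pef_diurnal_variation_pct: float
    twice_daily_readings: bool = False


def fev1_reversible(m: LungFunction) -> bool:
    # Assumption: both the relative (>=12%) and absolute (>=200 mL) gains are required.
    gain = m.fev1_post_ml - m.fev1_pre_ml
    return gain >= 200 and gain / m.fev1_pre_ml * 100 >= 12


def pef_reversible(m: LungFunction) -> bool:
    # Either an absolute gain of 60 L/min or >=20% of the prebronchodilator PEF.
    gain = m.pef_post_l_min - m.pef_pre_l_min
    return gain >= 60 or gain / m.pef_pre_l_min * 100 >= 20


def pef_variable(m: LungFunction) -> bool:
    # Diurnal variation above 20%, or above 10% with twice-daily readings.
    threshold = 10 if m.twice_daily_readings else 20
    return m.pef_diurnal_variation_pct > threshold


def consistent_with_asthma(m: LungFunction) -> bool:
    # Any one of the three characteristics is treated as sufficient.
    return fev1_reversible(m) or pef_reversible(m) or pef_variable(m)
```

A patient whose FEV1 rises by 250 mL from a baseline of 3,000 mL (about 8%) would fail the FEV1 criterion in this sketch despite the absolute gain, which illustrates how readily a real world patient can fall outside trial eligibility.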

The generalizability of criteria used to inform the GINA therapeutic recommendations was evaluated through a survey of respiratory health carried out in Wellington, New Zealand [18]. The New Zealand researchers identified the eligibility criteria common to all cRCTs published in the past 30 years that were cited by GINA and were designed to evaluate asthma drug efficacy in a minimum of 400 adult patients (n = 17 of the 215 cRCTs cited as level A or B evidence by GINA). Common eligibility criteria in these trials included a diagnosis of asthma, age older than a lower age limit, and bronchodilator reversibility. Other inclusion criteria were a specified FEV1 range, inhaled corticosteroid (ICS) use, specified symptoms or use of rescue drugs, an upper age limit, and peak flow variability. The proportion of Wellington survey participants with current asthma and full questionnaire responses and pulmonary function testing (n = 127) who met the eligibility criteria is summarized in Table 2; only 29% met the GINA-recommended reversibility diagnostic criterion, and 44% met the PEF variability criterion [18].

Table 2 Representative nature of cRCTs in asthma: percentage of real world patients with an asthma diagnosis who meet typical cRCT inclusion criteria

A similar study was conducted in Norway to evaluate the extent to which a real life obstructive lung disease population met criteria commonly used in cRCTs [20]. A minority of patients met the following commonly used asthma eligibility criteria: FEV1 50% to 85% of predicted (37.1% eligible), reversibility of 12% within the past year (14.9% eligible), absence of comorbidity (9.6% eligible), and nonsmoker (or ex-smoker with a nicotine burden <10 pack-years; 5.4% eligible). If patients were also required to be symptomatic and to have regular ICS usage, the percentage of eligible patients fell further to 3.3% (Table 2) [20]. Thus, in asthma, the typical cRCT population represents a small minority of patients with asthma treated in real world everyday clinical practice.

This finding holds true in other areas of respiratory disease. For example, a recent prospective cohort study found that of patients treated for allergic rhinitis in everyday practice, only 7.4% of the 311 patients examined would have been eligible for major placebo-controlled RCTs of persistent and intermittent allergic rhinitis [21]. The most common reasons for which the real world patients with allergic rhinitis would have been excluded from a cRCT were the absence of an allergy diagnosis based on skin testing and/or serum-specific IgE testing, insufficient disease severity, and the presence of comorbidities.

Applicability of Classical Randomized Controlled Trial Results in Specific Subgroups

Concerns surrounding the limited representative nature of cRCTs arise where there are data to suggest that cRCT findings may not hold true in particular subgroups of the population. Subgroups of particular note within the larger asthma population include those with poor inhaler technique, current smokers, obese patients, those with comorbid conditions (eg, rhinitis), and those with low adherence to therapy [11–14, 22–28]. When considering inhaler handling, for example, data suggest a link between poor technique and poor asthma control, as improper use of pressurized metered-dose inhalers (pMDIs) for the delivery of ICS is associated with decreased asthma control [13].

Smoking and severity of rhinitis are also important determinants of asthma control, as patients with severe rhinitis and/or higher average cigarette use exhibit poorer control [11]. Cigarette smoking is known to reduce the effect of ICS therapy [27–30]. Other factors that have been shown to affect patients’ response to asthma therapy are obesity, likely through inflammatory mechanisms [23], and presence of comorbidities such as COPD and heart failure [15]. It is often difficult in practice to differentiate between asthma and COPD, whereas only patients with a clear-cut diagnosis of asthma are included in cRCTs.

Therefore, the importance of real world factors in asthma (eg, smoking, comorbidities, patient adherence, and inhaler technique) should not be overlooked, as they may explain the wide gap between the level of asthma control that can be achieved in cRCTs [6] and the frequently disappointing results observed in observational studies carried out among less selected populations [31]. However, data on the comparative effectiveness of therapies in these subgroups remain lacking. Appropriately designed studies to address these unanswered research questions may reveal differential effectiveness of therapeutic options in particular patient subgroups and could be used to help guide more tailored, individualized asthma management.

Other Gaps in the Evidence Base: Limited Outcome Evaluation, Duration of Trials, and Ethical Considerations

Another limitation of potential concern resulting from the high cost of cRCTs is that with few exceptions [4–6], they tend to be limited in the number of outcomes evaluated and short in duration, yet the resultant data are used to inform guidelines for asthma, a chronic disease requiring long-term management.

Gaps in the evidence base can also arise if ethical considerations prohibit the completion of a cRCT. The MASCOT trial, for example, was withdrawn because of an inability to identify eligible patients [32]. MASCOT was designed to evaluate the use of combination ICS/long-acting β2-agonist (LABA) therapy in children of school age, but most patients who were potentially eligible were already receiving the study medication. Altering patients’ routine therapy when it equates to best standard of care is unethical and can present problems for trial recruitment.

Plugging the Gaps in the Evidence Base: Complementary Trial Designs

The need to look beyond asthma cRCTs when faced with gaps or limitations in the existing evidence base was recognized in the 2009 European Respiratory Society/American Thoracic Society Taskforce paper on asthma control and exacerbations [2]. The Taskforce proposed the use of composite measures when evaluating asthma control and called for the measurement properties to be validated in clinical trials and in “large, prospective studies in ‘real-world’ settings (eg, trials designed pragmatically to reflect everyday clinical practice) to ensure they provide content validity as well as reflect clinically meaningful outcomes” [2]. Similarly, the Cochrane Collaboration, when reviewing the effects of ICS use on linear growth in children, recognized the need for longer outcome periods than offered by typical cRCTs and specifically advised that “research efforts should concentrate on evaluating the long-term effects of inhaled steroids” [33]. The 2008 Brussels Declaration on Asthma echoed this sentiment by stating in 1 of its 10 key points that there is a need to “include evidence from real world studies in treatment guidelines” [34]. The Declaration’s rationale was that “asthma treatment guidelines are primarily based on evidence from large clinical trials that frequently assess lung function as the primary outcome. However, inflammatory biomarker levels, asthma exacerbations, and other outcomes may worsen regardless of lung function status” [34].

Sir Michael Rawlins [35••], chairman of the United Kingdom’s National Institute for Health and Clinical Excellence, added his voice to the debate in 2008 when he suggested that cRCTs should be complemented by a diversity of approaches that involve analyzing the totality of the evidence base. He argued that cRCTs “often miss the value of a therapeutic intervention and tend to be carried out in specific types of patients for relatively short periods of time” [35••]. He contrasted this approach with clinical practice “where treatments tend to be used on a long-term basis in a broad variety of patients who often have comorbid conditions” [35••].

Thus, although cRCTs are the cornerstone of medical evidence, there are specific areas in which their design (eg, strict inclusion/exclusion criteria, brevity of duration, limited outcome evaluation, interventional nature, control arm requirement) results in gaps in the full evidence base. Therefore, a role may exist for other study designs, such as pragmatic trials and observational studies, to provide data on the effectiveness and comparative effectiveness of therapies (ie, efficacy as evaluated in nonidealized patients in more naturalistic, real world settings).

The Role of Pragmatic Clinical Trials

Evaluating Real World Effectiveness

The term pragmatic trials was first used in 1967 by Schwartz and Lellouch [36] to describe trials designed to help choose between care options. They contrasted this with explanatory trials that were designed to test causal research hypotheses (eg, that an intervention causes a particular biological change). Schwartz and Lellouch [36] considered there to be a continuum rather than a dichotomy between explanatory and pragmatic trials, and characterized pragmatism as an attitude to trial design rather than a characteristic of the trial itself. This attitude has come to be understood as one that maximizes applicability of trial results to usual care settings, relies on unequivocally important outcomes (eg, mortality and severe morbidity), and is tested in a wide range of participants [37–40].

A trial using this approach for allergic rhinitis found that guideline-based treatment was more effective than free treatment choice [41]. Thus, pragmatic clinical trial designs offer a means of testing a hypothesis in a more naturalistic, real world setting than cRCTs by modeling and reflecting everyday clinical practice in their scheduling (eg, longer treatment exposure) and in their approach to patient recruitment (eg, including patients with relevant comorbidities). While consenting patients are still assigned randomly to predefined study arms, pragmatic trials have broader inclusion criteria [42] than cRCTs and tend to be longer in duration [43, 44].

The international UPLIFT trial was a pragmatic trial designed to evaluate the long-term effect of tiotropium compared with placebo on lung function in patients with COPD [43]. The UPLIFT design was more naturalistic than that used in cRCTs, as it allowed patients to continue taking their standard background therapy (any respiratory medications except anticholinergic drugs) throughout the 4-year trial. Moreover, UPLIFT used a true intention-to-treat approach when evaluating the mortality end point; thus, if patients discontinued study medication during the trial, they were still included in the final mortality analysis. It is unfortunate that the other trial end points were not treated in the same manner [43]. Interestingly, UPLIFT found the rate of lung function decline in the “placebo” group (in which two thirds of patients were receiving an ICS and/or LABA through continuation of their standard background therapy) to be similar to that observed in the ICS, LABA, or ICS/LABA groups of the TORCH study, another long-term trial comparing these treatments with placebo [45].

Another pragmatic trial of note, commissioned by the United Kingdom government, is ELEVATE, an equivalence trial evaluating leukotriene receptor antagonists in primary care at steps 2 and 3 of the national asthma guidelines [44]. Broad inclusion criteria were defined and effectiveness outcomes were measured over a 2-year outcome period; the primary outcome measure was asthma-related quality of life (QoL), a patient-oriented measure of effectiveness. Furthermore, the pragmatic trial design ensured continued patient participation in the study even if patients did not receive and complete the full prescribed regimen (a true intention-to-treat approach). As a result, the dropout rate was only 4% over 2 years in ELEVATE [44], which compares favorably with cRCT rates (eg, 25% in GOAL [46] and 16% in IMPACT [47]). Because patients continued to receive care at their usual practices, ELEVATE achieved high levels of complete data for the primary end point and high levels of clinical data from routine practice: more than 90% of patients supplied data at 2 years for the primary end point, and more than 95% for health care resource and asthma exacerbations. Previous cRCTs comparing step 2 and 3 asthma therapies have yielded inconsistent results on the relative efficacy of the treatment options available [48–54]. In ELEVATE, leukotriene receptor antagonists were equivalent to the comparators at 2 months with regard to QoL, and although equivalence in QoL was not shown at 2 years, there were no significant differences in secondary measures at either time point [44].

For seasonal allergic rhinitis, a cluster randomized trial in primary care showed that patients receiving a treatment following the international consensus on rhinitis demonstrated a large improvement compared with those receiving free treatment choice [55]. In order to be closer to patients’ needs, the primary end point was QoL.

Evaluating Real World Adherence

Pragmatic trials also offer the possibility of evaluating adherence in a more naturalistic setting than cRCTs. Not only do cRCTs often demand unrepresentatively high levels of treatment adherence, but their inherently interventional nature can also artificially drive adherence.

One approach to more naturalistic adherence data capture is illustrated by a pragmatic trial designed to evaluate the impact of dosing regimen on adherence to asthma medication. Price et al. [56] compared adherence rates to once-daily and twice-daily mometasone furoate by capturing adherence data using patient self-report and dose counters. Patients with poor adherence remained in the study and thus were included in the final analysis. Using this approach, a discernible difference in adherence rates was measured between treatment arms, with greater adherence recorded for the once-daily regimen [56].

Evaluating Real World Cost-Effectiveness

Evaluating cost-effectiveness of health care interventions is complicated by the difficulties in placing a monetary value on improved QoL and reduced morbidity and mortality. Indeed, drug treatment to prevent morbid events is rarely cost-saving or cost-neutral, and the ultimate questions that have to be addressed are as follows:

  1. To what extent a patient will benefit from the treatment and at what cost?
  2. How much is the health care system willing to spend to prevent one morbid event [57]?
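One common way to frame these two questions numerically is the incremental cost per morbid event prevented. The short Python sketch below is an illustration only and is not drawn from the cited studies; the function name and all figures are hypothetical.

```python
def cost_per_event_prevented(cost_new: float, cost_old: float,
                             risk_new: float, risk_old: float) -> float:
    """Incremental cost divided by absolute risk reduction.

    Costs are per patient over the evaluation period; risks are the
    probabilities of a morbid event over the same period.
    """
    risk_reduction = risk_old - risk_new
    if risk_reduction <= 0:
        raise ValueError("new treatment does not reduce event risk")
    return (cost_new - cost_old) / risk_reduction


# Hypothetical figures: the new treatment costs 200 more per patient and
# lowers event risk from 25% to 12.5%.
print(cost_per_event_prevented(500.0, 300.0, 0.125, 0.25))  # 1600.0
```

Whether a figure such as this is acceptable depends on the threshold the health care system sets, and the event risks entered into the calculation are precisely where cRCT and real world populations can diverge.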

Understanding the most appropriate and meaningful data on which to base cost-effectiveness evaluations can also be challenging. A study designed to evaluate the external validity of published cost-effectiveness studies compared the data used in the published studies (typically based on cRCTs) with observational data from actual clinical practice. The authors concluded that cost-effectiveness evaluations based solely on cRCT data lack external validity and, as they do not represent patients in actual clinical practice, should not be used to inform prescribing policies [58]. In light of this, treatment and health technology assessments should move away from analyses in carefully screened populations toward actual cost-effectiveness trials using real world clinical data [57].

Remaining Limitations of Pragmatic Trials

Nonetheless, however naturalistic the design of a pragmatic trial, it still requires patient consent and the involvement of “trial-minded” physicians. As such, pragmatic trials will still deal with a defined subgroup of the overall patient and physician populations, and their protocols still require closer monitoring of clinical and biological parameters and more frequent contact with health care professionals than occurs in standard clinical practice [9, 59–61].

As pragmatic trials are designed to study real world practice, they are less tightly controlled than efficacy trials, sacrificing internal validity to achieve generalizability [62]. Poor design and/or execution (true also of some cRCTs) can bias results toward similar efficacy across treatment arms and therefore bias the study toward a finding of equivalence. Better designed pragmatic trials often include objective outcome measures (eg, survival, test results) and subjective measures (eg, QoL surveys), which, if broadly consistent, can diminish concerns about potential bias [62].

Additional challenges arise from the fact that the very characteristics of real world practice that pragmatic trials are designed to capture (including variable adherence, use of concomitant therapies, presence of comorbidities, and changeable symptoms over time) tend to reduce measurable differences between therapies. Such concerns highlight the benefit of including both an intention-to-treat and a per-protocol analysis to allow for regression to equivalence [63, 64]. Some of these shortcomings can be addressed, to varying degrees, by observational studies.
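The difference between the two analyses can be illustrated with a minimal Python sketch. The patient records, field names, and outcome values are entirely hypothetical and are chosen only to show how an intention-to-treat analysis can point toward equivalence while a per-protocol analysis does not.

```python
def response_rates(patients, per_protocol=False):
    """Proportion of patients with controlled asthma in each arm.

    Intention-to-treat (default) counts every randomized patient;
    per-protocol counts only those who completed the assigned regimen.
    """
    totals, responders = {}, {}
    for p in patients:
        if per_protocol and not p["completed"]:
            continue
        totals[p["arm"]] = totals.get(p["arm"], 0) + 1
        responders[p["arm"]] = responders.get(p["arm"], 0) + int(p["controlled"])
    return {arm: responders[arm] / totals[arm] for arm in totals}


# Hypothetical trial data: (arm, completed regimen, asthma controlled)
records = [
    ("A", True, True), ("A", True, True), ("A", False, False), ("A", False, False),
    ("B", True, True), ("B", True, False), ("B", True, False), ("B", False, True),
]
patients = [{"arm": a, "completed": c, "controlled": o} for a, c, o in records]

itt = response_rates(patients)                     # both arms: 0.5
pp = response_rates(patients, per_protocol=True)   # arm A: 1.0, arm B: 1/3
```

In this contrived example, poor adherence in arm A hides a treatment difference in the intention-to-treat result; reporting both analyses side by side makes such dilution visible.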

The Role of Observational Studies

Their Contribution

As defined by the National Center for Biotechnology Information, an observational study is a “type of nonrandomized study in which the investigators do not seek to intervene, instead simply observing the course of events” [65]. As such, observational studies using clinical databases offer another method of studying the comparative effectiveness of outcomes as evaluated in real world patients in a noninterventional, naturalistic setting. They also provide a means to study, characterize, and better understand real world prescribing practices and adherence to guidelines in clinical practice. Observational studies involve accessing, collating, and analyzing information held in patient records and can be cross-sectional or longitudinal in design [9]. Although they are limited by the lack of treatment randomization and potential bias through subjectivity of treatment choice, their use of routine clinical data gives them high external validity.

Prospective cohort studies provide important information, but they can be expensive to conduct and take many years to generate results; moreover, for logistical reasons, only a relatively small number of patients can be observed. This limits their power to detect differences in outcomes between subgroups, especially when considering relatively rare outcomes (eg, exacerbations, death). Conversely, retrospective studies look at events in the past (as recorded in patients’ clinical records), allowing the generation of more immediate results. Retrospective studies are also less restricted by patient numbers than prospective studies, as cohort definition can ensure sufficient numbers to demonstrate differences in treatment response (where such differences really exist). Well-designed database studies, while inherently retrospective, define patient cohorts and outcomes a priori based on a prespecified index event, such as a recorded treatment change (see later study examples), which can result in useful hypothesis testing. This contrasts with more skeptical views of databases being used inversely to suggest rather than answer questions.

The General Practice Research Database and the Doctors Independent Network (DIN-Link) Database in the United Kingdom are clinical databases that have been extensively used for research [66–69]. They comprise patient records collected over years, not months, thereby allowing investigation of longitudinal treatment effects. These features are invaluable in hypothesis generation and testing and can also help refine the design and powering of RCTs [9].

Evaluating Guideline Implementation

Observational studies can help evaluate guideline implementation in everyday clinical practice. In the United States, an observational study was carried out using an integrated managed care database to characterize the patterns of care observed in patients prior to emergency department (ED) treatment of acute asthma [70]. The study was motivated by the limited data on resource use prior to ED attendance and the recognition that better understanding of treatment patterns prior to an ED visit may help identify opportunities for improved interventions. The study explored adherence to guideline recommendations through evaluation of ICS therapy in the year prior to the ED visit and through quantification of short-acting β2-agonists and oral corticosteroids and rescue medications in the year before and in the month after the ED visit. Also investigated was the impact of the acute care intervention in the ED on altering the prescription of ICS and other asthma medications in the 2 months after the ED event. The study demonstrated a high dependence on rescue medications—short-acting β2-agonists and oral corticosteroids—in this population prior to ED attendance, and that the ED event resulted in only an incremental short-term improvement in ICS-containing controller treatment. Such characterization of prescribing patterns in real world clinical practice requires observational, noninterventional methods and cannot be achieved through cRCTs [14].

Evaluating Real World Influence of Inhaler Device Type

Another area of asthma management in which study alternatives to cRCTs can provide useful information is in the evaluation of inhaler type. Asthma cRCTs typically train recruited patients in inhaler technique and often require trial participants to be able to demonstrate and maintain proper inhalation technique throughout. However, in real world practice, many patients use their inhaler devices incorrectly, and proper inhaler technique is infrequently reinforced [14, 71, 72].

The REALITY study used the General Practice Research Database to evaluate the comparative effectiveness of different inhaler types as used in everyday asthma management [73–75]. Participating patients commenced or increased ICS therapy via a range of different inhaler types (pMDIs, breath-actuated metered-dose inhalers [BAIs], and dry powder inhalers). No requirements were placed on patients’ inhaler training beyond routine standard care. Significant differences in the odds of achieving successful asthma control were found for both BAI- and dry powder inhaler–treated patients compared with patients using pMDIs [74]. Moreover, these differences had significant health economic implications, with BAIs being on average more cost-effective than pMDIs [73].

Evaluating Real World Effectiveness

Although well-designed observational studies have good external validity, they lack the internal validity of cRCTs. One of the main challenges of observational studies is ensuring adequate treatment group comparability, but expertise is growing and methodologies are continually evolving that assess baseline comparability of the study cohorts and also validate outcomes for consistency across multiple subgroups [76]. When study cohorts are similar at baseline, outcome analysis can proceed with suitable statistical adjustments being made for any characteristics that are statistically or clinically significantly different between cohorts or are strongly predictive of the outcome [77, 78]. Where cross-sectional baseline data reveal substantial differences between groups, a matched cohort analysis (or other suitable methodology, such as propensity-based matching) should be used. Patients are matched on key demographic and clinical baseline characteristics to minimize any differences in baseline disease severity and to ensure that strongly confounding baseline effects are comparable across treatment groups, thus allowing study outcomes to be appropriately interpreted [79].
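As a rough illustration of the matching step, the Python sketch below pairs each treated patient with one comparator patient who shares the same values for a set of baseline characteristics (1:1 exact matching). The function name, field names, and characteristics are hypothetical; the cited studies may have used different criteria and matching procedures.

```python
from collections import defaultdict


def match_cohorts(treated, comparator, keys):
    """Pair each treated patient with an unused comparator patient
    who has identical values for every characteristic in `keys`."""
    pool = defaultdict(list)
    for patient in comparator:
        pool[tuple(patient[k] for k in keys)].append(patient)

    pairs = []
    for patient in treated:
        candidates = pool[tuple(patient[k] for k in keys)]
        if candidates:
            # Each comparator patient is used at most once.
            pairs.append((patient, candidates.pop()))
    return pairs


# Hypothetical baseline data
treated = [
    {"id": "t1", "age_band": "18-40", "smoker": False},
    {"id": "t2", "age_band": "41-60", "smoker": True},
]
comparator = [
    {"id": "c1", "age_band": "18-40", "smoker": False},
    {"id": "c2", "age_band": "18-40", "smoker": False},
    {"id": "c3", "age_band": "61+", "smoker": True},
]

pairs = match_cohorts(treated, comparator, keys=("age_band", "smoker"))
```

Treated patients without a match (here, `t2`) drop out of the matched analysis, which is one reason results from matched and unmatched cohorts are usefully compared.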

In this regard, Price et al. [77, 78] employed a matched cohort approach to an effectiveness and cost-effectiveness comparison of extra-fine hydrofluoroalkane beclomethasone dipropionate (EF HFA-BDP; QVAR [Teva Respiratory, Horsham, PA]) pMDI, fluticasone propionate (FP) pMDI, and chlorofluorocarbon-BDP pMDI therapies. To ensure similarity of asthma severity at baseline, patients in the EF HFA-BDP and FP [77], and in the EF HFA-BDP and BDP [78] groups were matched on important demographic and asthma-related baseline characteristics prior to outcome evaluation (Table 3). The two separate matched cohort analyses found that in a real world setting, patients receiving EF HFA-BDP had a similar or better chance of achieving asthma control at lower prescribed doses than with FP [77] or chlorofluorocarbon-BDP (Table 4, Table 5, and Fig. 1) [78]. These findings were reinforced by similar outcomes in the unmatched cohort analyses and the consistency of subanalysis results for age group and smoking status.

Table 3 Summary of demographic and clinically important matching criteria used by Price et al. [77] and Barnes et al. [78] to ensure baseline similarity of patients in the different treatment arms
Table 4 Summary of co-primary outcomes: OR for achieving asthma control and rate ratio of exacerbations
Table 5 Distribution of prescribed doses at the index date in the Price et al. [77] and Barnes et al. [78] 2-way matched analyses of EF HFA-BDP vs FP, and EF HFA-BDP vs CFC-BDP
Fig. 1 a and b Illustration of the distribution of prescribed doses at the index date in the Barnes et al. [78] two-way matched analysis of extra-fine hydrofluoroalkane beclomethasone dipropionate (EF HFA-BDP) versus chlorofluorocarbon beclomethasone dipropionate (CFC-BDP). (Reprinted from J Clin Exp Allergy, 14 July 2011, Barnes N, Price D, Colice G, et al.: Asthma control with extrafine-particle hydrofluoroalkane–beclometasone vs. large-particle chlorofluorocarbon–beclometasone: a real-world observational study, doi: 10.1111/j.1365-2222.2011.03820.x. [Epub ahead of print], copyright 2011, with permission from John Wiley and Sons.) c and d Illustration of the distribution in the Price et al. [77] two-way matched analysis of EF HFA-BDP versus fluticasone propionate (FP). Prescribed doses were significantly different between treatment cohorts in both the initiation and step-up populations for both the EF HFA-BDP versus CFC-BDP and the EF HFA-BDP versus FP matched analyses (P < 0.001). (Reprinted from J Allergy Clin Immunol vol. 126, Price D, Martin RJ, Barnes N, et al.: Prescribing practices and asthma control with hydrofluoroalkane-beclomethasone and fluticasone: a real-world observational study, pages 511–518 e511-510, copyright 2010, with permission from Elsevier.)

Remaining Limitations of Observational Studies

In addition to the difficulties in achieving comparability of treatment arms, another past criticism of observational studies has been their purported tendency to overestimate treatment effects [79]. More recent reviews suggest that such concerns are largely unfounded. A pooled analysis of observational studies and cRCTs (taken from the Medline, Abridged Index Medicus, and Cochrane databases between 1985 and 1998), in which two or more treatments or interventions for the same condition were evaluated, found little evidence to suggest that treatment effects reported in the observational studies were consistently larger than or qualitatively different from those reported in the cRCTs. The analysis involved 136 reports across 19 diverse treatments. In only 2 of the 19 analyses of treatment effects did the combined magnitude of the effect in the observational studies lie outside the 95% CI for the combined magnitude reported in the cRCTs [80].

The knowledge of how to work with clinical databases and the quality of practice-based patient data are continually improving as researchers collaborate and work with contributing practices to ensure that relevant, high-quality data are recorded. The matched cohort approach and validation of outcomes across subgroups that have emerged in recent observational research should help mitigate concerns around confounding of findings through differences in study populations. However, the real strength of observational studies will continue to lie in hypothesis generation and testing, helping to identify areas in which further rigorous clinical trials are required.

To increase confidence in the results of database analyses, it is of the utmost importance to describe (a priori) in a detailed study protocol all planned analyses before they begin, exactly as for a cRCT. In that respect, the move toward greater ethical transparency in medical research, which requires increasing numbers of publicly and privately funded clinical trials to be registered and published in online study databases and centralized repositories (eg, http://www.clinicaltrials.gov), should also help improve communication of quality methodologies and gradually drive out poorly designed studies that can undermine the field of observational research.

The reporting of observational studies should follow the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) guidelines [81], while reporting of real life trials should follow the CONSORT (Consolidated Standards of Reporting Trials) guidelines for cluster randomized trials [82] and pragmatic trials [40].

Conclusions

Data from cRCTs represent the gold standard for evaluating treatment safety and efficacy, owing to rigorous trial design and strong internal validity. However, no study design is without its limitations, and questions exist concerning the generalizability of cRCT findings to the widely heterogeneous asthma population, and their accuracy over the longer term. Observational studies have high external validity and may assist in answering some of the questions that cRCTs have not yet answered or cannot answer. In contrast, their internal validity is often poor, but it can be improved by detailed a priori analysis planning and by grounding database analyses on rational hypotheses. Pragmatic clinical trials with appropriate quality checks are positioned between the two. As recently proposed by ARIA (Allergic Rhinitis and its Impact on Asthma) and GA2LEN (Global Allergy and Asthma European Network), a combination of all these approaches is probably needed because all have advantages and drawbacks and do not answer the same question [83].

The evidence base in asthma and respiratory medicine must be reviewed openly and broadly, and while cRCTs unequivocally lie at the core, quality contributions from well-designed pragmatic trials and observational studies also should be recognized and considered for the complementary role that they play.