Abstract
Background Alpha-1 antitrypsin deficiency (AATD) is an under-recognised genetic cause of chronic obstructive lung disease, and many fewer cases than estimated have been identified. Can a reported respiratory and hepatic disease history from a large AATD testing database be used to stratify a person's risk of severe AATD?
Methods We analysed data extracted from the AATD National Detection Program. Demographics and medical history were evaluated to predict AATD PI*ZZ genotype. Logistic regression and integer programming models identified predictors and obtained risk scores. These were internally validated on a subset of the data.
Results Out of 301 343 subjects, 1529 (0.5%) had PI*ZZ genotype. Predictors of severe AATD were asthma, bronchitis, emphysema, allergies, bronchiectasis, family history of AATD, cirrhosis, hepatitis and history of abnormal liver function tests. The derived model establishes a subject's risk of severe AATD; scores ≥0 had an estimated risk of 0.41%, sensitivity of 84.62% and specificity of 24.32%. A model simulating guideline recommendations had an estimated risk of 0.51%, with a sensitivity of 58.17% and specificity of 46.60%. By recommending screening for scores ≥0, we estimate that more subjects would be screened (75.7% versus 53.4%) and more cases detected (84.6% versus 58.2%) compared to a guideline-simulated model.
Conclusion This medical history risk model is a useful predictive tool to detect subjects at greater risk of having severe AATD and improves sensitivity of detection. Subjects with scores <0 are at lower risk and may not need to be screened; testing is recommended for scores ≥0, consistent with current guidelines.
Tweetable abstract
Alpha-1 antitrypsin deficiency is an underrecognised genetic disease. Using the National Detection Program database, a medical-history-only prediction model was constructed to risk stratify subjects, and increase testing and detection. https://bit.ly/3r4Rgih
Introduction
Alpha-1 antitrypsin deficiency (AATD) is a common yet underdiagnosed genetic disease due to an inherited mutation in the SERPINA1 gene. This mutation produces the misfolded alpha-1 antitrypsin (AAT) protease inhibitor (PI) and predisposes to COPD and liver cirrhosis [1, 2]. This disease is inherited in an autosomal codominant pattern, and the normal AAT allele is termed “M” while the most common abnormal alleles are “Z” and then “S”. The allelic homozygous PI*ZZ combination accounts for 95% of persons with severe AATD, and it makes up 0.6–2% of all COPD cases [2–5]. Epidemiological data estimate that there are ∼100 000 individuals with severe AATD in the USA, although only ~10 000 have been identified [6]. People ultimately diagnosed with AATD see multiple physicians and experience long delays from symptom onset to diagnosis, which is associated with worsened clinical symptoms and overall survival, underscoring the importance of identifying persons with AATD earlier [7, 8].
While the World Health Organization (WHO), American Thoracic Society (ATS) and European Respiratory Society (ERS) recommend testing all individuals with COPD for AATD, there is low uptake of this recommendation in clinical practice [9–12]. Failure to fully implement these guidelines is due to inadequate awareness of AATD, unclear test results, cost, testing taking too much time and the belief that testing will not impact clinical care [13, 14]. In clinical practice, some physicians subscribe to a traditional understanding of AATD characteristics, such as younger age and family history of AATD, to guide testing [13]. Strategies have been developed to enhance detection and increase testing prevalence, but there has been minimal improvement in the rate of detection of AATD [5, 15–17].
Clinical prediction models are tools that estimate the prognosis, specific outcomes or an individual's risk for a certain diagnosis; for example, the widely used Wells score predicts the risk of a subject having a pulmonary embolism. These models are most frequently utilised by general practitioners and are viewed favourably when they guide testing and avoid additional costs [18]. At this time, there are no models based only on medical history designed and implemented for the diagnostic prediction of AATD.
The University of Florida Alpha-1 Antitrypsin Genetic Laboratory, through the National Detection Program (NDP), has tested >300 000 people throughout the USA and its territories for AATD. Testing is offered free to patients and demographic and clinical information are collected. We hypothesise that evaluating a large database of tested patients may identify certain patient characteristics associated with AATD as well as lead to development of a diagnostic prediction model.
Study design and methods
National Detection Program
The NDP was a study approved by the University of Florida (IRB201801599) to collect data and test individuals for AATD. The study was based at the University of Florida from 2003 until 2014, when it was transferred to GeneAidyx. The programme is available to all healthcare providers throughout the USA and its territories [19]. Healthcare providers are given information about AATD and a blood collection card. An attached questionnaire collects demographic and clinical information including gender, race, date of birth, smoking history (recorded as never, past, current, passive smoke as a child and passive smoke as an adult), hepatic disease history (including cirrhosis, jaundice, hepatitis, liver transplant or history of abnormal liver function test), respiratory disease history (asthma, chronic bronchitis, emphysema, COPD, allergies, bronchiectasis, AATD, family history of AATD or family history of COPD), and AAT augmentation therapy status. The laboratory testing and protocol performed by the NDP have been described previously [20, 21].
Outcome measures
The primary outcome of the study was a subject having severe AATD, predicted from medical history. Alpha-1 antitrypsin deficiency is defined by the ATS and ERS as a serum AAT level <100 mg·dL−1 (<20 µmol·L−1), although a serum level of 57 mg·dL−1 (11 µmol·L−1) is regarded as a “protective threshold” value and lower values represent severe deficiency [11]. The genotype PI*ZZ accounts for the majority of severely deficient serum AAT levels; therefore, for the purpose of this study we defined “severe AATD” as PI*ZZ genotype, although we also compared models of PI*ZZ or PI*SZ as a composite outcome [21, 22].
Modelling approach
We set out to construct a simple, interpretable clinical decision support tool to predict severe AATD based on variables collected by the NDP and to internally validate this tool on a subset of the data. We separately considered solutions using two sets of possible predictors: medical history only (excluding liver transplantation and self-reported AATD) and medical history together with smoking history. We compared a variety of predictive model families in order to establish performance benchmarks, but we considered only four from which an interpretable tool could be derived: decision rules, decision trees, logistic regression and integer programming. Ultimately, we decided to construct a risk score using either logistic regression or integer programming.
Risk scores
A risk score is a sparse solution to an integer optimisation problem. Given binary predictors $x_1, \ldots, x_p$, the goal is to obtain point values (an intercept $\beta_0$ and linear coefficients $\beta_1, \ldots, \beta_p$) that yield the log-odds of the genotype class $y$. Where $i$ indexes one case with predictor values $x_{i1}, \ldots, x_{ip}$ and $y_i = 1$ encodes an abnormal genotype, the risk score produces the probability estimate $\Pr(y_i = 1) = 1 / (1 + \exp(-(\beta_0 + \sum_{j=1}^{p} \beta_j x_{ij})))$ that case $i$ has an abnormal genotype. The problem is to find point values that maximise some measure of predictive performance subject to two constraints: that they take integer values within pre-specified bounds, and that the number of nonzero values satisfies a pre-specified trade-off against the marginal performance gain. We considered multiple choices of constraints in order to obtain several candidate risk scores.
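As an illustration of how such a score turns reported items into a probability estimate, the following minimal Python sketch sums integer points and applies the logistic function to the intercept plus the score. The item names, point values and intercept are hypothetical placeholders chosen for readability; they are not the published values from table 1.

```python
import math

# Hypothetical intercept and point values, for illustration only.
# The published point values are those reported in table 1.
INTERCEPT = -5
POINTS = {"emphysema": 2, "asthma": -1, "family_history_aatd": 3}


def risk_score(reported):
    """Add the integer points of the items a subject reports."""
    return sum(POINTS[item] for item in reported)


def predicted_risk(reported):
    """Treat intercept + score as log-odds and convert to a probability."""
    log_odds = INTERCEPT + risk_score(reported)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Example: a subject reporting emphysema and asthma scores 2 + (-1) = 1.
example_score = risk_score(["emphysema", "asthma"])
example_risk = predicted_risk(["emphysema", "asthma"])
```

Because the points are integers, the score can be tallied by hand, and the probability for each possible total can be tabulated once in advance.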
We used two methods to obtain candidate risk scores. One method uses penalisation to obtain a logistic regression model with a desired number of terms, scales its coefficients to desired bounds and rounds the scaled coefficients to integer point values. This procedure is described in detail by Sullivan et al. [23], and we refer to it as rounded regularised logistic regression. The other method is an integer programming solution that takes bounds on the number of terms and on their point values as inputs and returns a pool of integer-valued risk scores [24]. The procedure, called FasterRisk, improves upon previous state-of-the-art integer programming solutions.
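The scale-and-round step of the first method can be sketched as below. This is a simplified illustration under stated assumptions, not the exact procedure of Sullivan et al. [23]: the coefficients are made up, the function name `round_to_points` is hypothetical, and the scaling rule (mapping the largest absolute coefficient onto the chosen bound) is one plausible choice among several.

```python
def round_to_points(coefs, max_points=5):
    """Scale coefficients so the largest magnitude equals max_points, then round.

    The scaling rule used here is an assumption made for illustration.
    """
    largest = max(abs(c) for c in coefs.values())
    if largest == 0:
        return {name: 0 for name in coefs}
    scale = max_points / largest
    return {name: int(round(c * scale)) for name, c in coefs.items()}


# Made-up penalised logistic regression coefficients (log-odds scale).
coefs = {"emphysema": 0.9, "asthma": -0.3, "cirrhosis": 0.5}
points = round_to_points(coefs, max_points=5)
```

Rounding discards information, which is one reason a method that searches directly over integer solutions may find better-performing scores of the same size.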
Data pre-processing
Reported values of passive smoke exposure are infrequent in the data (roughly 8000) and were combined with “never” for the analysis. In cases where multiple samples were submitted for the same patient, only the first sample was used in the analysis, and the repeat tests were removed (n=7745). For comparison, we modelled current screening guidelines from WHO, ATS and ERS as a risk score with one composite item for COPD, chronic bronchitis or emphysema (“guidelines”). Medical and smoking history models were built using the 237 882 (79%) of cases with nonmissing smoking responses.
Model selection
We followed a three-phase machine learning workflow to optimise, evaluate and select our models. In phase 1, we used cross-validation on a one-sixth sample of the data to compare all predictive model families. In phase 2, we optimised model hyperparameters and compared some of the best-performing families on a four-fifths training set containing the one-sixth sample, again using cross-validation. Finally, in phase 3, we modified the best-performing interpretable families to yield simple decision support tools, fit these to the entire training set and evaluated them on the testing set comprising the remaining fifth of the data. Fitted models were compared using several criteria described in the next section. All samples were stratified by AAT genotype.
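The stratified partition described above can be sketched in plain Python as follows. The function name, the seed and the toy labels are assumptions for illustration; the sketch only shows how a split can preserve genotype proportions, not the software actually used.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), sampling each label group proportionally."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    train, test = [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy data: 10 cases of one genotype and 90 of another.
labels = ["ZZ"] * 10 + ["MM"] * 90
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)
```

Stratification matters here because the outcome is rare: an unstratified split could leave the testing set with too few PI*ZZ cases to estimate sensitivity reliably.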
Statistical analysis
Key criteria for the risk scores were overall performance, trade-offs with respect to current screening guidelines, discriminability and usability. We assessed overall performance at detecting severe AATD visually using receiver operating characteristic (ROC) curves and quantitatively using the area under these curves as well as sensitivity, specificity, risk and screening burden for each score. We report contingency tables for each candidate score based on two cut-offs, namely the scores that obtained specificity just below and just above that of guidelines. We assessed discriminability visually using histograms and quantitatively as the standard deviation of the values of a score divided by its theoretical range. We assessed usability primarily by the number of items in a score and secondarily by the largest point value of any item in a score, both of which count against usability. Through discussion based on these criteria, we selected one risk score for each of the two predictor sets.
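The quantitative discriminability measure is simple enough to state in code. The sketch below assumes the population standard deviation and uses made-up scores; whether the population or sample form was used in the analysis is not specified, so that choice is an assumption.

```python
import statistics


def discriminability(scores, min_score, max_score):
    """Standard deviation of observed scores divided by the theoretical range.

    Uses the population standard deviation (an assumption).
    """
    return statistics.pstdev(scores) / (max_score - min_score)


# Made-up scores on a scale whose theoretical range is -2 to 6.
value = discriminability([0, 0, 4, 4], min_score=-2, max_score=6)
```

A larger value indicates that observed scores spread across more of the available scale, so that different subjects are more readily separated.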
Results
Characteristics of study participants
A total of 301 343 separate individuals submitted a sample and underwent testing (figure 1). Supplementary table S1 describes the subjects' characteristics. In this cohort, the median age was 56.7 years (interquartile range 43.2–66.5 years) and 57.9% were female. The most common comorbidities were COPD (46.3%), asthma (25.4%) and emphysema (15.2%); however, 90.6% and 33.2% had no liver or lung disease, respectively. Smoking status was reported as past (48.6%), current (27.7%) or never (23.7%).
Predicting severe AATD
The testing data comprised 47 000 cases, of which 208 subjects had severe AATD. The models consistently performed better at predicting the PI*ZZ genotype than the composite PI*ZZ/PI*SZ outcome. Throughout model selection, logistic regression and integer programming were the best-performing model families. The final risk scores selected for both sets of predictors had been obtained using FasterRisk. Table 1 compares their point values. A subject's score is calculated by adding the point values for those items reported by the subject.
Figure 2 compares the models' ROC curves along with that of guidelines. For each model, the values with specificity on either side of that of guidelines (46.6%) are outlined. Contingency tables show the estimates for the number screened and detected using a model simulating the guidelines, medical history only, and medical history with tobacco history (table 2). Unlike guidelines, neither model achieves roughly equal sensitivity and specificity at some cut-off. Instead, as shown in table 3, any cut-off significantly improves one at some cost to the other, in comparison to guidelines.
Roughly half of the PI*ZZ cases were flagged by the guidelines and final models, medical history only (score ≥0) and medical history with tobacco history (score ≥−1), while most of the remaining PI*ZZ cases were flagged by both final models, but missed by guidelines (supplementary figure S1). The most significant pattern among those remaining is that several were flagged by guidelines and by the medical history only model, but not by the medical and smoking history model, whereas none were missed by the medical history only model.
The medical history only model, with a cut-off score of ≥0, was estimated to flag 75.7% (35 586 out of 47 000) of subjects for testing and detect 84.6% (176 out of 208) of severe AATD cases, while missing 15.4% (32 out of 208) of cases. The guideline model flagged 53.4% (25 106 out of 47 000) of subjects, detected 58.2% (121 out of 208) of cases and missed 41.8% (87 out of 208) of cases. By increasing the cut-off to a score of ≥1, the testing burden would be reduced to 20.3% (9564 out of 47 000) of subjects at the cost of detecting only 38.0% (79 out of 208) of cases. Figure 3 is an alluvial plot of how PI*ZZ cases were treated by guidelines and the medical history model scored at either ≥0 or ≥1.
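The percentages in this paragraph follow directly from the reported counts. The short sketch below recomputes screening burden, sensitivity and specificity from the numbers of flagged subjects and detected cases; the function and variable names are illustrative, and only the counts come from the text.

```python
def screening_metrics(flagged, detected, total, cases):
    """Derive screening burden, sensitivity and specificity from counts."""
    non_cases = total - cases
    false_positives = flagged - detected
    return {
        "burden": flagged / total,
        "sensitivity": detected / cases,
        "specificity": 1 - false_positives / non_cases,
    }


# Counts reported for the testing set (47 000 subjects, 208 PI*ZZ cases).
history_ge0 = screening_metrics(flagged=35586, detected=176, total=47000, cases=208)
guideline = screening_metrics(flagged=25106, detected=121, total=47000, cases=208)
history_ge1 = screening_metrics(flagged=9564, detected=79, total=47000, cases=208)
```

Laid out this way, the trade-off is explicit: moving the cut-off from ≥0 to ≥1 lowers the screening burden substantially but also lowers sensitivity below that of the guideline model.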
The medical history only model with a cut-off score of ≥0 was ultimately selected because it had similar sensitivity and specificity to the medical and smoking history model without missing any additional subjects with severe AATD (table 4). Point totals range from −2 to 13; table 5 gives the relative risks and the proportions of people with each score or higher who have severe AATD. The fourth column is the estimated share of people who would be screened at each threshold, a measure of screening burden.
Discussion
This is the first study to describe a diagnostic prediction model for severe AATD using only a subject's clinical history. It is derived and internally validated on >300 000 patients tested for AATD, making this the largest assessment of AATD testing reported to date. The proposed risk score suggests three clinically relevant risk categories: individuals with scores <0 are at low risk and may not be indicated for screening; patients with scores of 0 are at moderate relative risk and would be indicated for screening; and patients with scores >0 are at a greater relative risk and would be strongly indicated for screening. Adding the model to current guidelines of testing all persons with COPD has the potential to increase testing of subjects more likely to have severe deficiency and decrease the burden of testing subjects with a lower risk. In addition, this approach could address some of the barriers experienced by healthcare providers in clinical practice.
Greulich et al. [13] demonstrated from a survey of healthcare providers that cost and time resources limit AAT testing. The described AATD diagnostic prediction model provides a minimum-cost, evidence-based approach to stratify subjects for testing, which is viewed favourably by primary care providers [18]. In addition, it was shown that providers believe patients with AATD are only seen by specialists [13]. Primary care providers are chiefly responsible for diagnosing and managing subjects with COPD and are subsequently a source of under-diagnosis and delayed diagnosis of AATD [27–30]. A strength of this model is that it provides an actionable plan to screen subjects, which primary-care providers value from such models [18]. Lastly, the prediction model lacks age, race, gender and tobacco use variables, which are factors known to introduce bias into the selection of subjects for AAT testing [10, 13].
A number of detection strategies have been implemented, although the proposed prediction model differs from them and has certain strengths [31, 32]. These strategies focus on identifying a single entity, such as a diagnosis of COPD, use of bronchodilators or airflow obstruction on pulmonary function tests (PFTs), to establish a subject's risk [5, 15, 33, 34]. In contrast, the prediction model allows for the inclusion of multiple factors to individualise risk of a severe genotype. Additionally, the prediction model is portable and may be paired with PFT software, medical documentation, electronic clinical reminders, a questionnaire linked to appointment reminders or appointment check-in, mobile applications or an interactive web-based calculator [35].
In addition, the model demonstrated novel findings. The presence of asthma and allergies alone reduced the estimated probability of severe AATD, which diverges from our current understanding. Eden et al. [36] reviewed >1000 subjects from the AATD Registry, and found that asthma and allergies were reported in 38% and 29%, respectively. Similarly, Campos et al. [27] showed not only that a prior diagnosis of asthma was reported in 43.3% of subjects registered with AlphaNet, but also that asthma was an increasingly common reason for AATD testing. It is important to recognise that people with AATD have a higher prevalence of asthma and asthma-like features, yet the pathomechanisms connecting these two processes are unclear and continue to be investigated [37]. The negative contributions of these factors, amid additional, stronger predictors, may reflect that their co-occurrence with AATD is largely due to their having comorbidities in common. Similarly, bronchiectasis has been considered to be associated with AATD, and current guidelines recommend testing for AATD in persons with unexplained bronchiectasis. In this study, bronchiectasis was associated with an increased risk of severe AATD, although Carreto et al. [38] reported a low detection rate of AATD when screening persons with bronchiectasis at two centres in the UK. This also raises the question of the co-occurrence of AATD with bronchiectasis and the impact of geography and AATD prevalence on detection rates.
Also of note, active and past history of smoking reduced the probability of severe AATD. AATD is classically considered a cause of emphysema that is out of proportion to a subject's tobacco use, and a model that includes smoking history and favours nonsmokers would align with this. However, we caution against including a smoking history in a clinical decision support tool, for two reasons: its inclusion introduces a bias against testing persons who use(d) tobacco; and persons who use tobacco who are ultimately diagnosed with AATD have higher cessation rates [39, 40]. Moreover, the medical history only model performed similarly overall without missing any additional severe AATD cases.
This study and the prediction model have a number of limitations. First, we recognise that a prediction model that includes additional medical history variables (e.g. recurrent respiratory infections and oxygen use), PFT measurements or chest imaging findings could produce more accurate and precise results. However, we believe that it would come at the cost of an additional cognitive burden, time to complete the questions and further testing with additional expenses [18]. Second, while the area under the ROC curve of 0.606 is poor, it is a substantial improvement over our model of current guidelines. Moreover, if used as in table 5, our model would detect more severe AATD cases than guidelines while providing more specific expectations and risk estimates in each case. Third, some populations may be missed, such as clinically unaffected subjects who would not produce a score high enough to recommend AAT testing. We note that the same would be true for the ATS, ERS and WHO recommendations. Furthermore, the model does not address issues with heterozygotes, such as PI*MZ, who are also predisposed to disease [41], although heterozygotes with symptomatic disease, e.g. bronchitis or emphysema, would score high enough to warrant testing, which matches guideline recommendations. Fourth, the data are not without potential biases. Most conspicuously, predictors were self-reported and subject to recall bias and interpretation. Also, the selection of subjects for AATD testing was at the discretion of the healthcare provider, thereby introducing selection bias. Regrettably, there currently exists no other large AATD database that avoids these biases, nor would it be readily feasible to construct a large enough database of prospectively and consecutively tested subjects to meet this need.
Fifth, our comparison model, which approximates guidelines put forth by leading organisations, does not reflect the real-world processes that currently lead some, but far from all, at-risk patients to be tested for AATD. Besides professional guidelines, these processes are subject to access to care, confounding comorbidities and clinician judgement, among other factors. Lastly, the model was internally validated on a portion of the NDP's data. Validation on an external dataset would provide a better estimate of the model's performance, stronger demonstration of its usefulness and indications for further improvement.
Interpretation
This study describes a novel diagnostic prediction model that uses only a subject's medical history in order to identify those with a higher risk of having severe AATD. Compared to a model simulating guideline recommendations, the diagnostic model would improve sensitivity, testing and detection. This modelling approach could augment current guidelines and help increase the diagnosis of severe AATD.
Supplementary material
Table S1 00302-2023.TABLES1
Figure S1 00302-2023.FIGURES1
Acknowledgement
We thank Jiachang Liu (Duke University, Durham, NC, USA) for insight and guidance on FasterRisk.
Footnotes
Provenance: Submitted article, peer reviewed.
This study was approved by the University of Florida (IRB201801599) to collect data and test individuals for AATD.
Conflict of interest: None declared.
- Received May 11, 2023.
- Accepted June 20, 2023.
- Copyright ©The authors 2023
This version is distributed under the terms of the Creative Commons Attribution Non-Commercial Licence 4.0. For commercial reproduction rights and permissions contact permissions{at}ersnet.org