A comparative analysis of heterogeneity in lung cancer screening effectiveness in two randomised controlled trials

admin

8 months ago

Heterogeneity in lung cancer (LC) screening effectiveness was evaluated in individual-level data from the two largest screening trials through traditional sub-group analyses, predictive modelling and machine learning. Estimates of the reduction in LC mortality (LCM) were similar across methodologies for both NELSON (median LCM reduction across methods = 27.2%, 95% CI: 10.8–40.9%) and NLST (median = 15.3%, 95% CI: 3.7–25.5%) after accounting for differences in participant characteristics between trials. The greater reduction in LCM in NELSON compared to NLST may reflect differences in trial designs, such as differences in control arms (NELSON: no screening; NLST: chest radiography screening) and number of screening rounds (NELSON: four screening rounds; NLST: three screening rounds). Screening effectiveness diminished with increasing pack-years (LCM reductions across trials = 26.8–50.9% in the lowest pack-year groups compared to 5.5–9.5% in the highest pack-year groups) and was greater in former smokers than in current smokers (LCM reductions = 37.8–39.1% versus 16.1–22.7%) and in women than in men (LCM reductions = 24.6–25.3% versus 8.3–24.9%). LC risks estimated by risk-prediction models were lower for groups with higher estimated relative screening effectiveness. However, histology was identified as the main effect modifier of LC screening effectiveness. Screening was effective for adenocarcinoma and OTH, reducing mortality by 17.8–23.0% and 24.5–35.5%, respectively (medians for NELSON and NLST). In contrast, screening was less effective for small-cell cancers, reducing mortality by 9.7–11.3% (medians for NELSON and NLST), and discordant results were found for squamous-cell carcinoma (NELSON: median reduction of 52.2% (95% CI: 25.7–69.1% decrease); NLST: median increase of 27.9% (95% CI: 69.8% increase to 4.5% decrease)).
Our findings are consistent with natural-history models that estimate longer preclinical durations and greater screen-detectability for the histologies for which we find greater screening effectiveness^9,19. In particular, we find greater screening effectiveness for histologies that predominantly develop in locations that allow for easier detection through screening. For example, adenocarcinomas develop predominantly in the periphery of the lungs, whereas small-cell cancers tend to be centrally located^20,21. Furthermore, our findings are consistent with observed relations between histology and smoking behaviour, and with variations in survival by histology and smoking behaviour^22,23,24,25,26,27. Consequently, these mechanisms may drive the heterogeneity in screening effectiveness found in our study.

Although squamous-cell carcinoma incidence has decreased, it still represents over 20% of LC²⁸. While analyses based on NLST have suggested screening may not be beneficial for squamous-cell carcinoma, our analyses suggest it was beneficial in NELSON³. This may be due in part to differences in nodule management protocols. Semi-automated measurements of nodule volume and volume doubling time, as applied in NELSON, have been shown to be more accurate in detecting nodule growth than the manual measurements of nodule diameter used in NLST²⁹. This is supported by a recent review demonstrating that there may not be a significant change in volume at a three-month follow-up scan, even when the volume doubling time is less than 400 days³⁰. Consequently, future studies should evaluate the impact of the differences in nodule management protocols between the trials on histology-specific screening effectiveness.
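
The point about three-month follow-up scans can be made concrete with the standard exponential-growth definition of volume doubling time, VDT = Δt · ln 2 / ln(V2/V1). The following is a minimal sketch, not the measurement software used in either trial. It assumes exponential growth, takes "three months" as 90 days, and treats the nodule as a sphere when converting volume growth to diameter growth. The function names are illustrative.

```python
import math


def volume_doubling_time(v1, v2, days):
    """Volume doubling time (days) from two volumes measured `days` apart.

    Assumes exponential growth. Returns math.inf when the volume is unchanged;
    the result is negative when the nodule has shrunk.
    """
    if v1 <= 0 or v2 <= 0 or days <= 0:
        raise ValueError("volumes and interval must be positive")
    if v1 == v2:
        return math.inf
    return days * math.log(2) / math.log(v2 / v1)


def growth_factor(vdt_days, days):
    """Multiplicative change in volume after `days` for a given doubling time."""
    return 2 ** (days / vdt_days)


# A nodule with a doubling time of 400 days, rescanned after 90 days
# (90 days for "three months" is an assumption made for this illustration).
volume_factor = growth_factor(400, 90)
# Spherical-nodule assumption: diameter scales with the cube root of volume.
diameter_factor = volume_factor ** (1 / 3)

print(f"volume change:   {100 * (volume_factor - 1):.1f}%")
print(f"diameter change: {100 * (diameter_factor - 1):.1f}%")
```

Under these assumptions the volume grows by roughly 17% over the interval while the diameter grows by roughly 5%, which illustrates why growth of this speed can be hard to detect at a short follow-up interval, particularly with manual diameter measurements.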

Screening effectiveness was greater for women, those with fewer accumulated pack-years and former smokers, due to the higher prevalence of histologies for which screening effectiveness was greater. Thus, the 2021 USPSTF recommendation to lower the minimum pack-years for screening eligibility and the ACS recommendation to relax restrictions on the number of years since smoking cessation will improve eligibility among individuals in whom histologies with greater screening effectiveness are more prevalent^10,11,31. These relaxations have also been shown to improve eligibility among individuals of African-American ancestry, who are more likely to develop squamous-cell carcinoma than individuals of European ancestry^28,32,33. Consequently, lung cancer screening effectiveness for African-Americans should be further evaluated. Our analyses suggest a potential relation between family history of LC and adenocarcinoma (Supplementary Tables S11 and S15). This is of particular importance for regions with high LC incidence in never-smokers, who predominantly develop adenocarcinomas. Consequently, studies investigating screening in never-smokers should further evaluate the impact of heterogeneity in screening effectiveness by histology³⁴.

Integrating smoking cessation support has been shown to enhance the effectiveness of LC screening in reducing LCM by reducing the risk of developing LC. Our analyses suggest that integrating smoking cessation support further enhances the effectiveness of LC screening in reducing LCM through two additional pathways. Firstly, our analyses suggest that screening effectiveness is greater for former smokers than for current smokers, particularly for long-term former smokers. Secondly, successful smoking cessation prevents the further accumulation of pack-years, which our analyses suggest is associated with reduced screening effectiveness. Consequently, the findings of our study may be used to further improve the uptake of smoking cessation services in LC screening programs.

Our study suggests that relaxing eligibility criteria improves selection of individuals in whom histologies with greater screening effectiveness are more prevalent. However, this also expands eligibility to lower-risk individuals. Consequently, criteria relaxations may yield diminishing returns in additional deaths prevented and reduce screening efficiency (screens required per LC detected). Ongoing implementation efforts, like the United Kingdom’s targeted lung health checks, are predominantly focused on regions with high LC rates to optimize screening efficiency and the use of available healthcare resources^35,36. However, as these areas are more likely to be populated with high-risk (heavier smoking) individuals, LC screening effectiveness could be lower than anticipated. While studies indicate that extending screening to lower-risk individuals may be cost-effective in the U.S., this requires additional health-care resources³⁷. Consequently, full evaluations of the trade-offs between screening effectiveness, efficiency, health inequities and required health-care resources are essential to guide implementation efforts. Furthermore, studies should evaluate how information on heterogeneity in screening effectiveness can be incorporated into, and affect, shared decision-making processes.
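
The trade-off described above, between higher relative effectiveness and lower baseline risk, can be sketched with simple arithmetic: the absolute number of deaths prevented is the baseline LCM risk multiplied by the relative reduction. The example below is an illustration only. The two groups, their baseline risks and their relative reductions are hypothetical values chosen for the sketch, not estimates from NELSON or NLST, and the metric shown (people screened per death prevented) is a related quantity rather than the screens-per-LC-detected measure named in the text.

```python
import math


def deaths_prevented_per_1000(baseline_risk, relative_reduction):
    """Absolute LCM deaths prevented per 1,000 people screened.

    baseline_risk: probability of LC death without screening over follow-up.
    relative_reduction: relative LCM reduction from screening (0.30 means 30%).
    """
    return 1000 * baseline_risk * relative_reduction


def number_needed_to_screen(baseline_risk, relative_reduction):
    """People screened per LC death prevented (infinite if there is no benefit)."""
    absolute_reduction = baseline_risk * relative_reduction
    if absolute_reduction <= 0:
        return math.inf
    return 1 / absolute_reduction


# Hypothetical groups: all values below are assumptions for illustration.
groups = {
    "lower exposure (hypothetical)": {"baseline_risk": 0.005, "relative_reduction": 0.30},
    "higher exposure (hypothetical)": {"baseline_risk": 0.020, "relative_reduction": 0.08},
}

for name, g in groups.items():
    prevented = deaths_prevented_per_1000(**g)
    nns = number_needed_to_screen(**g)
    print(f"{name}: {prevented:.1f} deaths prevented per 1,000, NNS = {nns:.0f}")
```

With these hypothetical inputs the two groups end up with similar absolute benefit despite a large difference in relative effectiveness, which is the kind of trade-off a full evaluation would need to quantify with real risk and effectiveness estimates.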

Previous studies evaluated heterogeneity in LC screening effectiveness^3,5,6,7,8. Wille et al. indicated greater effectiveness for individuals with Chronic Obstructive Pulmonary Disease who smoked ≥35 pack-years⁵. Infante et al. suggested greater effectiveness for those with <40 pack-years, current smokers and those with a Forced Expiratory Volume in 1 s (FEV1)% ≥ 80⁸. Nevertheless, these studies evaluated trials with non-significant results and should be interpreted with caution¹³. Pinsky et al. found heterogeneity in screening effectiveness by histology and potential heterogeneity by sex in NLST³. However, their analysis considered one variable at a time rather than predictive approaches, precluding the identification of effect modification by patient characteristics. Our study confirms heterogeneity in screening effectiveness between histologies, but also accounts for confounding and effect modification by patient characteristics. For example, Pinsky et al. found greater screening effectiveness for current smokers compared to former smokers. Our results are consistent when smoking status alone is considered (Supplementary Fig. S16). However, our results indicate greater benefits for long-term former smokers when time since smoking cessation is also taken into consideration, demonstrating the importance of including sufficient granularity in former smoking behaviour. Furthermore, our study expands on these findings by identifying groups in which histologies with greater screening effectiveness are more prevalent, providing important guidance to clinicians. Kovalchik et al. evaluated screening effectiveness across risk-groups in NLST⁶. Similar to our findings, they found that relative screening effectiveness was constant across risk-groups and that absolute effectiveness increased with risk; however, their approach did not evaluate screening effectiveness across different risk-factors or consider histology. Kumar et al. evaluated the cost-effectiveness of risk-based screening in the NLST through a multistate model⁷. Similar to Kovalchik et al., they found that absolute screening effectiveness increased with risk, but did not consider histology⁶. Our results indicate that while screening effectiveness does not vary by overall risk, it does vary across individual components of risk. Therefore, future studies should not only consider overall LC(M) risk, but also consider individual components of risk with sufficient granularity.

In contrast to previous studies, we evaluated heterogeneity in LC screening effectiveness in two trials with statistically significant results. We performed a comprehensive analysis that accounted for participant risk-factors, LC risk, and histology. The definitions of the risk-factors were aligned between trials and showed similar effect sizes in both trials. We considered both relative and absolute effectiveness through various methods. We equalized post-screening follow-up and accounted for differences in participant characteristics between trials.

We evaluated different methods with different strengths and limitations, as outlined in Supplementary Table S2. These strengths range from straightforward interpretation (one-variable-at-a-time, risk-prediction models) and explicitly accounting for different covariates (predictive modelling and machine-learning approaches) to not requiring assumptions of linearity (machine learning). However, the methods are also subject to limitations such as no or limited accounting for other covariates (one-variable-at-a-time, risk-prediction models), not allowing interactions between screening effectiveness and covariates (risk-modelling), overfitting (effect-modelling) or non-straightforward interpretation (machine learning). Despite the differences in their underlying assumptions, strengths and limitations, the results were consistent across methodologies, demonstrating the robustness of our findings. Furthermore, we evaluated calibration and discrimination, which are often poorly reported for both predictive-modelling and machine-learning approaches^13,15.

Our findings are based on two trials. While overall screening effectiveness was greater in NELSON than in NLST after accounting for participant characteristics and post-screening follow-up, the trials also differed in number of screening rounds and screening interval lengths^1,2. Nodules detected at incidence screening rounds differ in LC risk from those detected at baseline, which may affect screening effectiveness across screening rounds³⁸. Furthermore, it is uncertain whether there were differences in LC treatment patterns between the trials. Finally, while NELSON compared CT screening to no screening, NLST compared CT screening to chest-radiography screening^1,2. The Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial found a non-significant LCM reduction of 9% for chest-radiography screening³⁹. Consequently, CT screening effectiveness in NLST may be underestimated. However, smoking exposure in PLCO was lower (20% of PLCO participants were NLST-eligible; 45% were never-smokers), which our investigation suggests would lead to greater prevalence of LC with favourable screening effectiveness. This is supported by comparing the LC histology distributions of the NLST and PLCO chest-radiography arms (Supplementary Table S20). Thus, chest-radiography screening in PLCO may have been more effective than in NLST due to a greater prevalence of LC with favourable screening effectiveness. Hence, future studies should evaluate whether and how differences in trial designs, nodule growth patterns and LC treatments affected screening effectiveness across histologies.
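
To illustrate the size of the possible underestimation in NLST, the sketch below combines the two relative reductions quoted in this discussion (15.3% for CT versus chest radiography in NLST, 9% for chest radiography versus no screening in PLCO). It rests on strong assumptions that the text itself questions: that mortality rate ratios multiply, and that the PLCO point estimate, which was non-significant and came from a lower-exposure population, applies fully to NLST participants. The result is therefore best read as an upper-end illustration rather than an estimate, and the function name is hypothetical.

```python
def combined_reduction(ct_vs_cxr, cxr_vs_none):
    """Relative LCM reduction of CT versus no screening.

    Assumes the two rate ratios multiply:
    RR(CT vs none) = RR(CT vs CXR) * RR(CXR vs none).
    This is an assumption of the sketch, not an analysis from the study.
    """
    return 1 - (1 - ct_vs_cxr) * (1 - cxr_vs_none)


# Point estimates quoted in the text; applying the PLCO estimate to NLST
# participants in full is an assumption, not a finding.
nlst_ct_vs_cxr = 0.153
plco_cxr_vs_none = 0.09

implied = combined_reduction(nlst_ct_vs_cxr, plco_cxr_vs_none)
print(f"implied CT vs no screening reduction: {100 * implied:.1f}%")
```

If chest radiography was less effective in NLST participants than in PLCO, as the histology distributions suggest, the implied reduction would lie closer to the observed 15.3%.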

Participants of both trials were more likely to be younger and have ceased smoking, but were generally representative of the general population meeting their inclusion criteria^40,41. Still, it is well known that the individuals eligible for lung cancer screening are more likely to have comorbidities such as COPD than those included in the trials^42,43. These comorbidities increase the overall risk of lung cancer, reduce life-expectancy and may affect both treatment effectiveness and the histological type of lung cancer that develops^18,22,23,44,45,46,47,48,49. Thus, future research should further evaluate the interplay between comorbid conditions and screening effectiveness.

Our analysis was limited to 4–4.5 years post-screening to equalize follow-up between trials. Consequently, our estimates should be interpreted as representative of the evaluated follow-up period in both trials. While the number of life-years gained through screening may increase with prolonged post-screening follow-up, the effect on LCM may be diluted. Indeed, this was shown in extended follow-up analyses of NLST, although dilution was modest⁵⁰. However, limited information on histology was available for cancers detected during the extended follow-up process (<8%), precluding evaluation of heterogeneity in screening effectiveness by histology. The analyses considered population characteristics at baseline. However, 10–24% of current smokers at baseline in the trials ceased smoking post-enrollment^51,52,53. Consequently, this may have affected the estimated effect of smoking cessation on LC screening effectiveness.

Misclassification of histology can occur and recommendations for the pathological classification of LC have changed over time⁵⁴. However, a sensitivity analysis for LC misclassification, similar to that of Pinsky et al., did not affect our findings regarding variations in histology-specific screening effectiveness and the prevalence of histologies across risk-factors³. Furthermore, we applied penalised estimation to mitigate the potential for overfitting, and also applied non-parametric methods. Given the consistency of our findings across different methodologies, the consistency between the risk-factors included in our models and those included in well-validated risk-prediction models for LC(M), and the good calibration performance of the models, we believe the potential for model misspecification to be modest. Targeted therapy and immunotherapy use was limited during the trials, but their uptake has increased considerably since then. In addition, histology-specific incidence has changed in past decades and may change further²⁸. Thus, the effects of the increased uptake of novel therapies and changes in histology-specific incidence on LC screening effectiveness should be monitored.

Overall, our study shows that heterogeneity in LC screening effectiveness is primarily driven by histology. The 2021 USPSTF and 2023 ACS guidelines are more likely to include individuals with higher prevalence of histologies with high screening effectiveness compared to their previous guidelines, due to relaxation of smoking-related eligibility criteria. Integrating risk-reduction interventions in LC screening programs may further enhance screening effectiveness.
