Cumulative Incidence of False-Positive Test Results in Lung Cancer Screening

A Randomized Trial. Annals of Internal Medicine, April 20, 2010, vol. 152, no. 8, pp. 505-512

Background: Direct-to-consumer promotion of lung cancer screening has increased, especially for low-dose computed tomography (CT). However, screening exposes healthy persons to potential harms, and cumulative false-positive rates for low-dose CT have never been formally reported. Objective: To quantify the cumulative risk that a person who participated in a 1- or 2-year lung cancer screening examination would receive at least 1 false-positive result, as well as rates of unnecessary diagnostic procedures.

Design: Randomized, controlled trial of low-dose CT versus chest radiography. Setting: Feasibility study for the ongoing National Lung Screening Trial. Patients: Current or former smokers, aged 55 to 74 years, with a smoking history of 30 pack-years or more and no history of lung cancer (n = 3190).

Intervention: Random assignment to low-dose CT or chest radiography with baseline and 1 repeated annual screening; 1-year follow-up after the final screening. Randomization was centralized and stratified by age, sex, and study center. Measurements: False-positive screenings, defined as a positive screening with a completed negative work-up or 12 months or more of follow-up with no lung cancer diagnosis.

Results: By using a Kaplan–Meier analysis, a person's cumulative probability of 1 or more false-positive low-dose CT examinations was 21% after 1 screening and 33% after 2. The rates for chest radiography were 9% and 15%, respectively. A total of 7% of participants with a false-positive low-dose CT examination and 4% with a false-positive chest radiography had a resulting invasive procedure. Limitations: Screening was limited to 2 rounds. Follow-up after the second screening was limited to 12 months. The false-negative rate is probably an underestimate.

Conclusion: Risks for false-positive results on lung cancer screening tests are substantial after only 2 annual examinations, particularly for low-dose CT. Further study of resulting economic, psychosocial, and physical burdens of these methods is warranted.

The ongoing National Lung Screening Trial aims to define the effectiveness of screening for lung cancer. However, imaging studies to screen for lung cancer are currently marketed to patients.

Contribution

These data from a pilot study for the National Lung Screening Trial show a 33% cumulative incidence of false-positive results after 2 computed tomography examinations and 15% after 2 chest radiography examinations. Substantial proportions of patients (7% for computed tomography and 4% for chest radiography) with false-positive results required invasive testing to determine that the screening-detected lesion was not cancer.

Implication

Physicians and patients should bear in mind high false-positive rates when considering screening for lung cancer with computed tomography or chest radiography.

—The Editors

Despite the lack of a completed randomized, controlled trial demonstrating the efficacy of low-dose computed tomography (CT) in reducing mortality from lung cancer, its use as a screening tool has gained increasing attention in the past several years. Some hospitals and advocacy organizations have actively promoted CT screening to the public. A 2007 New York Times article quotes the director of surgical oncology at Greenwich Hospital, Greenwich, Connecticut, as predicting that “within the next five years, lung cancer screening will be routine, like mammography and colonoscopy”. One advocacy group mounted a national “Demand a CAT Scan” billboard campaign in 2008. The Lung Cancer Mortality Reduction Act of 2009 Senate bill states that “significant and rapid improvements in lung cancer mortality can be expected through greater use and access to lung cancer screening tests”.

Utilization rates of chest radiography or CT screening in the community setting have not been well studied. The Dutch–Belgian randomized lung cancer (NELSON) screening trial reported that 3.1% of participants received a screening chest radiography or CT examination outside the trial by 24 months after randomization (this may not be representative of U.S. rates). Surveys of U.S. community physicians have demonstrated high enthusiasm for screening. One study found that two thirds of family practitioners, internists, and gynecologists and 82% of general surgeons recommended chest radiography for lung cancer screening every 1 to 2 years.

However, as is the case with all medical interventions, screening tests may generate both benefits and harms. If lung cancer screening becomes national health policy, we must have solid evidence not only about the benefits but also the harms of testing. This is particularly important because asymptomatic persons are the target population. Major harms associated with screening include the risk for overdiagnosis (the discovery of indolent lung cancer that would not lead to a person's death, or of cancer in a person who would die of a competing cause first) and the risk for false-positive results. False-positive results are important because they may have negative psychological effects, affect future adherence to other preventive health measures, and generate physical harms and economic costs from surveillance visits and confirmatory procedures.

Although the Lung Screening Study has reported the total positivity rate, we have not previously examined cumulative false-positivity rates by using formal statistical methods, and the false-positive component most accurately represents a clinically important burden of screening. We focus on the probability of false-positive test results and resulting diagnostic procedures when chest radiography and CT are used as early detection strategies for lung cancer.

Design

The Lung Screening Study was a 2-year study conducted by 6 centers participating in the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial. It was a feasibility study for the ongoing National Lung Screening Trial.

Setting and Participants

The Lung Screening Study had a goal of randomly assigning 3000 participants at elevated risk for lung cancer. Enrollment was achieved through mass mailings of recruitment materials, along with public service announcements, posters, and physician recruitment efforts. A total of 3318 persons were randomly assigned from September 2000 to January 2001 to have chest radiography or low-dose CT. All participants signed a consent form approved by the institutional review board before randomization.

Eligible participants were aged 55 to 74 years, had a cigarette smoking history of 30 pack-years or more, and were current smokers or had quit in the past 10 years. Exclusion criteria included chest CT within 24 months of enrollment, previous lung cancer, removal of part or all of a lung, current treatment of any cancer except nonmelanoma skin cancer, and ongoing participation in a cancer prevention or screening trial other than a smoking cessation study.

Randomization and Interventions

Once eligibility was established and consent was obtained by a study center, participants were randomly assigned to a treatment group through a single centralized, secure, Web-based system (which generated random code) operated by the trial coordinating center. This process ensured allocation concealment for study site investigators. Randomization was stratified by age group (in 5-year categories), sex, and study center by using variable block sizes. Once randomization occurred, participants and study investigators were not blinded to the screening method received.
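The allocation scheme is described here only in outline. As an illustration, a minimal Python sketch of stratified permuted-block randomization with variable block sizes could look like the following. The block sizes (2, 4, 6), the arm labels "CT" and "CXR", and every function and class name are assumptions made for this sketch, not details of the trial's Web-based system.

```python
import random
from collections import defaultdict

ARMS = ("CT", "CXR")  # hypothetical arm labels


def make_block(rng, block_sizes=(2, 4, 6)):
    """Return one shuffled block holding equal numbers of each arm.

    The block sizes are assumed values; the source states only that
    block sizes were variable.
    """
    size = rng.choice(block_sizes)
    block = list(ARMS) * (size // len(ARMS))
    rng.shuffle(block)
    return block


class StratifiedBlockRandomizer:
    """Keeps an independent queue of pending assignments per stratum."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._pending = defaultdict(list)

    def assign(self, age_group, sex, center):
        """Assign the next participant in the (age group, sex, center) stratum."""
        stratum = (age_group, sex, center)
        if not self._pending[stratum]:
            # Current block is used up: draw a new block of random size.
            self._pending[stratum] = make_block(self._rng)
        return self._pending[stratum].pop()
```

Because every completed block is balanced, the difference between arm counts within a stratum can never exceed half the largest block size, which is the property that keeps the groups comparable on the stratification factors.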

Two screening examinations were possible: baseline (T0) and repeated examination (T1) 1 year later. Participants were eligible for the second screening if they did not receive a diagnosis of lung cancer after the first examination. For inclusion in this analysis of false-positive results, participants had to adhere to at least 1 screening.

Low-dose CT scans were obtained with the following technical parameters: 120 to 140 kV peak, 60 mA, scan time of 1 s, 5-mm collimation, pitch of 2 or equivalent, and contiguous reconstructions. Chest radiography consisted of single posteroanterior views and was obtained by using high-kilovolt equipment at a tube-to-receiver distance of 6 to 10 feet.

Each study center had 1 or more (range, 1 to 14) board-certified radiologists interpreting the examinations. As a quality-control measure, a second radiologist blinded to the initial interpretation reread a small sample of films (n = 20) at each center.

Outcomes and Follow-up

For CT, the definition of a positive screening result changed slightly between the T0 and T1 scans to better match accumulating prognostic evidence: Noncalcified nodules larger than 3 mm at T0 scan or 4 mm or larger at T1 scan were considered suspicious for cancer. Other abnormalities (including spiculated noncalcified nodules of any size; focal parenchymal opacification; endobronchial lesions; hilar, mediastinal, bony, or pleural masses; and major atelectasis) could also be deemed positive according to the radiologist's judgment. For chest radiography, nodules with circular opacity of 3.0 cm or less in diameter, masses greater than 3.0 cm, hilar or mediastinal lymph node enlargement (excepting calcified nodes), major atelectasis, infiltrates or consolidation, and pleural masses were considered suggestive of cancer.
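The size thresholds for noncalcified nodules on CT can be written as a small rule. The sketch below is illustrative only: the function name and the round labels "T0" and "T1" as string arguments are assumptions, and it covers only the size criterion, not the other abnormalities that a radiologist could deem positive.

```python
def ct_nodule_is_positive(diameter_mm: float, screening_round: str) -> bool:
    """Apply the size threshold for a noncalcified nodule on CT.

    T0 (baseline): larger than 3 mm counts as suspicious.
    T1 (repeated examination): 4 mm or larger counts as suspicious.
    """
    if screening_round == "T0":
        return diameter_mm > 3
    if screening_round == "T1":
        return diameter_mm >= 4
    raise ValueError(f"unknown screening round: {screening_round!r}")
```

Under this reading, a 3.5-mm nodule would be positive at baseline but not at the repeated examination.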

We defined a false-positive screening result as a positive screening with a completed negative work-up or follow-up of at least 12 months with no diagnosis of lung cancer. Because performing biopsy of all screening-detected lung abnormalities was impractical and undesirable (because of the potential for harm), we had to choose a definition of a false-positive test result that relied on diagnostic work-ups for suspicious examinations. For persons who did not receive definitive testing, given that lung cancer as a general rule is one of the more aggressive tumors, we felt that a 12-month monitoring period was a reasonable cutoff for a false-positive result.
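The definition above amounts to a simple decision rule for each positive screening. A minimal sketch follows; the function name, argument names, and the "indeterminate" label for positive screenings that meet neither criterion are assumptions made for illustration.

```python
def classify_positive_screen(cancer_diagnosed: bool,
                             negative_workup_completed: bool,
                             followup_months: float) -> str:
    """Classify a positive screening result.

    A false positive is a positive screening with either a completed
    negative work-up or at least 12 months of follow-up without a
    lung cancer diagnosis.
    """
    if cancer_diagnosed:
        return "true positive"
    if negative_workup_completed or followup_months >= 12:
        return "false positive"
    # Neither criterion met: hypothetical label for incomplete follow-up.
    return "indeterminate"
```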

We defined a false-negative examination result as a negative screening associated with a diagnosis of lung cancer within 1 year of the examination. This definition is limited in its ability to discern between types of cancer truly missed by screening and aggressive interval tumors that may develop between tests. Furthermore, because the Lung Screening Study was a feasibility trial, follow-up of negative examination findings was not done in the same systematic manner as for positive test results. Persons in whom screening was negative at T1 did not continue to have follow-up in the trial; reported false-negative rates are limited to the period between the T0 and T1 examinations.

All positive results were communicated by telephone and mailed to participants and their designated physician within 3 weeks. The Lung Screening Study did not specify a diagnostic algorithm for follow-up of positive results; centers would provide recommendations for diagnostic action if requested. Center personnel abstracted medical records relating to follow-up of positive screening results. This process began after a positive screening result and continued until a conclusive diagnosis was made or 12 months had passed. In addition, study participants completed a study update form at the T1 screening to identify any interval cases of lung cancer.

Classifications for diagnostic follow-up were divided into categories by author consensus: imaging examinations (noninvasive), minimally invasive procedures (bronchoscopy), moderately invasive procedures (for example, biopsy, thoracentesis, video-assisted thoracoscopy), and major surgical procedures (thoracotomy or lung resections).

Statistical Analysis

Our study attempts to answer the question, “What is the probability that a person entering a lung cancer screening program involving 1 or 2 screening tests will have at least 1 false-positive CT or chest radiography?” A person could contribute to the cumulative risk curve only once, after a first false-positive test result; this avoided double-counting of suspicious nodules and artificial inflation of the curve. Kaplan–Meier analysis generated cumulative incidence curves based on estimated probability of a first false-positive result at baseline or second screening received. For the base-case analysis, we considered persons with incomplete follow-up (<12 months) after a positive screening to have received a false-positive result. As a sensitivity analysis, we assumed that a proportion of persons with insufficient follow-up after a positive examination result did have cancer; we estimated that proportion from published positive predictive values (from other trials) of CT (7%) and chest radiography (2%) for screening detection of lung cancer.
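With only 2 screening rounds, the Kaplan-Meier estimate reduces to a product over rounds: the cumulative probability of at least 1 false-positive result is 1 minus the product of the per-round probabilities of not having a first false-positive result among those still at risk. The sketch below illustrates that arithmetic. The function names are assumptions, and the at-risk and event counts in the usage note are hypothetical, not trial data.

```python
def cumulative_false_positive_risk(rounds):
    """Discrete-time Kaplan-Meier cumulative incidence.

    rounds: list of (n_at_risk, n_first_false_positive), one per round.
    People who skip a round or already had a false positive are simply
    absent from the at-risk count of later rounds.
    Returns the cumulative probability after each round.
    """
    still_free = 1.0
    cumulative = []
    for n_at_risk, n_events in rounds:
        still_free *= 1 - n_events / n_at_risk
        cumulative.append(1 - still_free)
    return cumulative


def implied_conditional_risk(cum_before, cum_after):
    """Per-round risk among those still at risk, implied by two
    consecutive cumulative probabilities."""
    return 1 - (1 - cum_after) / (1 - cum_before)


# Hypothetical counts chosen only to show the calculation.
example = cumulative_false_positive_risk([(1000, 210), (700, 105)])
```

Applied to the point estimates reported in the abstract for CT (21% after 1 screening and 33% after 2), `implied_conditional_risk(0.21, 0.33)` gives roughly 0.15, that is, about a 15% risk of a first false-positive result at the second screening for someone whose first screening was not a false positive.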

Logistic regression was done through 2 models to identify potential participant characteristics associated with increased odds of a false-positive examination result after the first screening or the second screening (if the first screening was negative). Variables included age, current versus former smoking status, and smoking history of 60 pack-years or more versus 30 to 59 pack-years. We adjusted logistic regression analyses for screening center. We calculated odds ratios (ORs) separately for CT and chest radiography. We also examined variance in the false-positive rate by study center.
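For a single binary characteristic, the odds ratio from a logistic model without other covariates equals the cross-product ratio of the 2-by-2 table. The sketch below computes that unadjusted odds ratio with a Wald (Woolf) 95% confidence interval. It is a simplification: unlike the models described above, it does not adjust for screening center or the other variables, and the function name and the counts in the example are hypothetical.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval.

    a: exposed, false positive      b: exposed, no false positive
    c: unexposed, false positive    d: unexposed, no false positive
    All four cells must be nonzero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not trial data).
or_, lo, hi = odds_ratio_with_ci(60, 140, 40, 160)
```

An interval that excludes 1 would suggest an association between the characteristic and the odds of a false-positive result; an adjusted analysis would require fitting the full logistic model.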

Results

Demographic Characteristics

Participants were more likely to be men (59%) and current smokers (57%). As expected, randomization achieved balance in baseline patient characteristics.

Adherence Rates

A total of 1610 participants underwent at least 1 CT, and 1580 participants underwent at least 1 chest radiography. A total of 1374 participants in the CT group (85%) and 1287 participants in the chest radiography group (81%) received both examinations. Adherence was lower in both groups for the second screening than for the baseline test.

False-Positive and False-Negative Results

A total of 31% (n = 506) of participants in the CT group and 14% (n = 216) of participants in the chest radiography group received at least 1 false-positive result. In comparison, screening CT was true-positive in 38 instances (2% of participants) and chest radiography was true-positive in 16 instances (1% of participants).
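These crude percentages can be reproduced from the reported counts. The short check below assumes the denominators are the numbers of participants who underwent at least 1 examination (1610 for CT and 1580 for chest radiography); the helper name is hypothetical.

```python
def pct(count, total):
    """Percentage of `total` represented by `count`."""
    return 100 * count / total


CT_N, CXR_N = 1610, 1580  # assumed denominators: at least 1 examination

ct_false_pos = pct(506, CT_N)
cxr_false_pos = pct(216, CXR_N)
ct_true_pos = pct(38, CT_N)
cxr_true_pos = pct(16, CXR_N)
```

Rounded to whole percentages, these values agree with the 31%, 14%, 2%, and 1% given in the text. They are crude proportions and differ slightly from the Kaplan-Meier cumulative estimates, which account for participants who did not receive both screenings.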

At baseline screening, the risk for a false-positive result was 21% (95% CI, 19% to 23%) for CT and 9% (CI, 8% to 11%) for chest radiography. These risks increased to 33% (CI, 31% to 35%) and 15% (CI, 13% to 16%), respectively, at second examination. Sensitivity analysis, which assumed that 7% of participants in the CT group and 2% of participants in the chest radiography group with insufficient follow-up after a positive screening had true-positive results, yielded identical rates. Four examinations with false-negative results were reported between the baseline and T1 examinations. All were in the chest radiography group (0.2% of participants in this group).