Commentary

CAC score controversies: How the 2018 USPSTF recommendation statement misinterpreted the data

In a recent controversial statement, the U.S. Preventive Services Task Force (USPSTF) concluded that current evidence is insufficient to assess the balance of benefits and harms of adding the coronary artery calcium (CAC) score to traditional CV risk assessment in asymptomatic adults for the prevention of atherosclerotic CVD.

This position from the USPSTF is at odds with recommendations from the American College of Cardiology, American Heart Association, European Society of Cardiology and Society of Cardiovascular Computed Tomography, which advise consideration of CAC testing in select populations. Herein, we summarize the conclusions from the USPSTF report along with its supporting evidence, review the data on CAC for improving risk assessment in primary prevention, and aim to reconcile the USPSTF recommendations with the more selective indications for appropriate CAC testing.

The CAC score is a quantification of calcium in the epicardial coronary arteries, ranging from zero (no CAC) to thousands of Agatston units (a product of area and density of calcification). It differs from traditional risk factors in that it is a direct measurement of atherosclerosis. A CAC score is an independent predictor of clinical atherosclerotic CVD outcomes. Its use in risk assessment has focused on intermediate-risk populations, in whom further risk stratification is warranted to determine the need for preventive pharmacologic therapies.

Rhanderson Cardoso, MD
Roger S. Blumenthal, MD

The decision by the USPSTF was derived from an assessment of test accuracy, reclassification, potential harms of testing and treatment, and effectiveness of risk assessment. Remarks from the USPSTF are summarized below, with additional discussion of its methods and conclusions from the data.

Test accuracy

The USPSTF authors report that adding CAC to traditional risk factors and models such as the Pooled Cohort Equation improves calibration (agreement between observed and predicted outcomes) and improves discrimination (ability to distinguish those who will and will not have an event), although only modestly.

The authors reference the work of Joseph Yeboah, MD, MS, and colleagues on the discrimination properties of CAC testing. In an analysis of nearly 7,000 patients from the MESA study, the cohort was stratified into those with less than 7.5% or at least 7.5% 10-year atherosclerotic CVD risk by a “calibrated” Pooled Cohort Equation. Adding CAC to the model improved the area under the receiver operating characteristic (ROC) curve by 0.02 to 0.04, which was considered only a modest improvement by the USPSTF.

However, there are two key limitations to this interpretation. First, because the Pooled Cohort Equation was known to overestimate atherosclerotic CVD risk, the authors “calibrated” the Pooled Cohort Equation to the MESA data. This was done to prevent an “unfair advantage” of CAC over the Pooled Cohort Equation alone, given that CAC was obtained in the population being studied and was therefore, by definition, appropriately calibrated. However, this is not how clinicians use CAC in practice. A calibrated Pooled Cohort Equation is not available for clinical practice, so the discrimination that CAC adds to a calibrated Pooled Cohort Equation is neither easy to interpret nor clinically applicable. For real-world practice, the relevant question is the incremental value of CAC over the standard Pooled Cohort Equation, which would be much more substantial and clinically meaningful.

Second, as discussed below, CAC testing is primarily indicated in intermediate-risk patients, where it can reclassify individuals and guide pharmacologic prevention strategies. The study by Yeboah and colleagues should not be used as evidence of limited discrimination among all risk groups. In fact, in the same MESA population, a separate study showed that among 1,330 individuals with a baseline Framingham Risk Score of 5% to 20%, adding CAC to the Framingham Risk Score improved the area under the ROC curve by 0.16, from 0.62 to 0.78. Clearly, the USPSTF did not understand the study by Yeboah and colleagues and did not look at the pertinent data from MESA and other cohorts.
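The discrimination measure discussed in this section, the area under the ROC curve, can be read as the probability that a randomly chosen person who has an event was assigned a higher predicted risk than a randomly chosen person who does not. The short Python sketch below computes it that way. The six predicted risks are hypothetical values chosen for illustration and are not MESA data, and the function name is our own.

```python
from itertools import product


def auc(scores, events):
    """Area under the ROC curve as a concordance probability.

    Compares every person with an event against every person without one;
    a pair counts 1 if the person with the event has the higher score,
    0.5 if the scores are tied, and 0 otherwise.
    """
    with_event = [s for s, e in zip(scores, events) if e]
    without_event = [s for s, e in zip(scores, events) if not e]
    if not with_event or not without_event:
        raise ValueError("need at least one event and one nonevent")
    concordant = 0.0
    for s_event, s_nonevent in product(with_event, without_event):
        if s_event > s_nonevent:
            concordant += 1.0
        elif s_event == s_nonevent:
            concordant += 0.5
    return concordant / (len(with_event) * len(without_event))


# Hypothetical predicted 10-year risks for six people (illustration only).
events = [1, 1, 1, 0, 0, 0]
base_model = [0.10, 0.06, 0.12, 0.08, 0.05, 0.11]
with_cac = [0.15, 0.09, 0.14, 0.07, 0.04, 0.10]

auc_base = auc(base_model, events)
auc_cac = auc(with_cac, events)
print(f"AUC without CAC: {auc_base:.2f}, with CAC: {auc_cac:.2f}")
```

Because the measure depends only on how people are ranked, the same marker can add little in a cohort that spans all risk levels and much more in a subgroup where the base model ranks people poorly.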

Reclassification

The USPSTF authors reported that CAC tended to have a negative nonevent net reclassification (more persons without a clinical event reclassified into a higher-risk category than those correctly reclassified into a lower-risk category). Because cardiac events are overall infrequent over a decade, the authors concluded that more persons would be inappropriately than appropriately reclassified.

These data are also derived from the study by Yeboah and colleagues. Notwithstanding the previously mentioned limitation of using a calibrated risk tool, the authors found that among patients who had an event, CAC had a positive net reclassification improvement. Among those who did not have an event, however, there was a negative net reclassification, meaning that more patients were incorrectly than correctly reclassified. Yet these results do not reflect the reclassification properties of CAC scoring, because the analysis considered all patients: those at low, intermediate and high risk. In fact, a study by the same group of authors of 1,330 individuals in the MESA population with a baseline Framingham Risk Score of 5% to 20% showed a very high, positive net reclassification index in both the population with and without events.

A study by Khurram Nasir, MD, MPH, and colleagues very nicely outlined the reclassification potential of CAC testing in the intermediate-risk population. The authors divided the nearly 7,000 patients from MESA according to the baseline 10-year atherosclerotic CVD risk predicted by the Pooled Cohort Equation: less than 5% (n = 1,792), 5% to 7.5% (n = 589), 7.5% to 20% (n = 1,381), and more than 20% (n = 441). In the lower- and higher-risk categories, approximately 25% and 75% of patients had a CAC greater than zero, respectively. In these populations, the event rate was not substantially different between groups with CAC of zero and CAC greater than zero, and thus, knowledge of the CAC score would be unlikely to change preventive management.

In the intermediate-risk groups, however, half of patients had a CAC score of zero. More importantly, in the groups of patients with baseline risk of 5% to 7.5% and 7.5% to 20%, patients with a CAC score greater than zero had an event rate that was five- and twofold higher, respectively, than those with zero CAC. In addition, the event rates in the population with estimated 10-year atherosclerotic CVD risk between 5% and 7.5% with CAC greater than zero achieved a threshold where primary prevention lipid-lowering therapies are advised. On the other hand, half of the population with an estimated risk between 5% and 20% had a CAC score of zero. These patients had a very low event rate, below currently accepted thresholds for statin therapy.

The large proportion of intermediate-risk patients who have a CAC score of zero and the power of a CAC score of zero to predict a low rate of atherosclerotic CVD outcomes, and thereby to avoid unnecessary preventive pharmacotherapy, have been consistently shown in other populations. In the Framingham Risk Score cohort, those with a CAC score of zero had a 10-year incidence of atherosclerotic CVD events of 1.6%. In the Jackson Heart Study, 40% of statin-eligible patients according to ACC/AHA Guidelines had a CAC score of zero, and the atherosclerotic CVD event rate in this population was 3%. Altogether, these data indicate the very real ability of CAC testing to guide management in this large intermediate-risk population.

Similarly, CAC scoring has been shown to result in a high net reclassification rate in the intermediate-risk subgroup of the Heinz Nixdorf Recall study, where the net reclassification improvement (ie, the difference between the percentages of patients who were correctly and incorrectly reclassified) was 30%.
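The net reclassification measures discussed in this section can be written out in a few lines. The Python sketch below follows the definition given above: among people with events, moving to a higher-risk category is correct and moving to a lower one is incorrect; among people without events, the reverse holds. All counts are hypothetical and are not taken from MESA or the Heinz Nixdorf Recall study, and the function and parameter names are our own.

```python
def net_reclassification(events_up, events_down, n_events,
                         nonevents_up, nonevents_down, n_nonevents):
    """Return (event NRI, nonevent NRI, overall NRI) as proportions.

    Event NRI: share of people with events moved up minus share moved down.
    Nonevent NRI: share of people without events moved down minus share moved up.
    Overall NRI is the sum of the two components.
    """
    if n_events <= 0 or n_nonevents <= 0:
        raise ValueError("both groups must contain at least one person")
    event_nri = (events_up - events_down) / n_events
    nonevent_nri = (nonevents_down - nonevents_up) / n_nonevents
    return event_nri, nonevent_nri, event_nri + nonevent_nri


# Hypothetical all-comers cohort: people without events are moved up more
# often than down, so the nonevent component is negative.
all_comers = net_reclassification(25, 5, 100, 200, 100, 1000)

# Hypothetical intermediate-risk subgroup: many people without events have a
# CAC score of zero and are moved down, so both components are positive.
intermediate = net_reclassification(25, 5, 100, 120, 210, 1000)

for label, (ev, nonev, total) in (("all comers", all_comers),
                                  ("intermediate risk", intermediate)):
    print(f"{label}: event NRI {ev:+.2f}, nonevent NRI {nonev:+.2f}, "
          f"overall {total:+.2f}")
```

Reporting the two components separately matters because people without events far outnumber those with events in primary prevention, so the sign of the nonevent component largely determines how many people are reclassified correctly.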

Harms of testing and treatment

The USPSTF authors acknowledge that radiation exposure from CAC is low (approximately 1 mSv) and that the potential harms from false-positive results, incidental findings (eg, pulmonary nodules), and further testing (eg, cardiac catheterization) are overall low. Similarly, the harms of preventive therapies such as aspirin and statins are low and outweighed by CVD risk reduction in populations found to be at higher risk.

In fact, there are more data corroborating the safety of CAC testing. The 1-mSv radiation dose is approximately one-third of average annual exposure from natural sources. Also, there are no well-established reasons for repeating CAC testing, so the overwhelming majority of individuals who undergo testing will only have it done once in their life span. The test is also done quickly and without contrast. Furthermore, the EISNER randomized trial showed that a strategy of CAC testing did not increase downstream medical testing and that the higher resource utilization in those with CAC greater than 400 was balanced by lower resource use in the patients with a CAC score of zero.

Effectiveness of CAC testing

The USPSTF authors reported evidence that CAC testing does not ultimately improve CVD outcomes and that CAC testing is not superior to traditional risk factor-based CVD risk assessment for behavioral modification, risk factor management, and the use of preventive medications.

In one of the studies cited in the USPSTF report, investigators randomly assigned 1,005 asymptomatic patients with CAC greater than 80th percentile to atorvastatin 20 mg daily, vitamin C and vitamin E vs. placebo. After follow-up of approximately 4 years, patients receiving atorvastatin had a 40% reduction in LDL (baseline 146 mg/dL). The incidence of pooled CVD endpoints was 6.9% in the atorvastatin group vs. 9.9% in the placebo group. Although this did not quite reach statistical significance (P = .08), the 30% RR reduction with statin therapy is fully consistent with the expected benefit of statins on the basis of risk reduction per millimole per liter of LDL reduction, indicating that the study simply did not have enough power to reach statistical significance.
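The arithmetic behind this argument can be checked with a short Python sketch. The event rates, baseline LDL and percentage LDL reduction come from the paragraph above. Two inputs are our assumptions rather than figures from the trial: the standard cholesterol conversion factor of 38.67 mg/dL per mmol/L, and a proportional risk reduction of roughly 22% per 1 mmol/L of LDL lowering, the figure commonly quoted from pooled statin trial data. The function names are hypothetical.

```python
MG_DL_PER_MMOL_L = 38.67  # conversion factor for LDL cholesterol (assumption)


def relative_risk_reduction(treated_rate, control_rate):
    """Relative risk reduction from two event proportions."""
    return 1.0 - treated_rate / control_rate


def expected_rrr(ldl_drop_mmol_l, rrr_per_mmol_l=0.22):
    """Expected relative risk reduction for a given LDL drop.

    Assumes a constant proportional reduction per mmol/L that compounds
    with each additional mmol/L lowered (assumed default: 22% per mmol/L).
    """
    return 1.0 - (1.0 - rrr_per_mmol_l) ** ldl_drop_mmol_l


# Observed: 6.9% of the atorvastatin group vs. 9.9% of the placebo group.
observed = relative_risk_reduction(0.069, 0.099)

# Expected: a 40% reduction from a baseline LDL of 146 mg/dL.
ldl_drop_mg_dl = 0.40 * 146
ldl_drop_mmol_l = ldl_drop_mg_dl / MG_DL_PER_MMOL_L
expected = expected_rrr(ldl_drop_mmol_l)

print(f"Observed RR reduction: {observed:.1%}")
print(f"LDL drop: {ldl_drop_mg_dl:.1f} mg/dL = {ldl_drop_mmol_l:.2f} mmol/L")
print(f"Expected RR reduction: {expected:.1%}")
```

Under these assumptions the observed and expected reductions are both close to 30%, which is the sense in which the trial result is consistent with the known effect of statins despite not reaching statistical significance.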

Furthermore, there is evidence that CAC testing and knowledge of the CAC score improve medication adherence and lifestyle modification. In the EISNER trial, CAC testing was associated with favorable changes in BP, LDL, waist circumference and weight over a 4-year follow-up compared with no CAC testing.

Ultimately, the USPSTF negative recommendation likely stems from the absence of randomized, high-quality data showing a reduction in CVD outcomes such as mortality and nonfatal strokes or MI by CAC testing compared with no CAC evaluation. Although randomized trials would certainly be welcomed by the CV community and are likely to be conducted in upcoming years, several points regarding a randomized trial must be considered beforehand.

No. 1, most importantly, the population would need to be carefully chosen to include intermediate-risk patients only, as discussed. No. 2, even in the right population, CAC testing would only lead to escalation of prevention therapies and reduction of atherosclerotic CVD outcomes in half the population, at best. A substantial proportion may still derive benefit from testing by reclassification as low risk and de-escalation of therapy, but this would not be expected to lower event rates, as primary prevention pharmacotherapies are overall safe. Ultimately, this creates an issue for study power and sample size, which is compounded by the low event rates in a primary prevention population. In fact, to conduct such a trial, it has been estimated that a sample size of 30,000 patients would be required to have 90% power.

No. 3, CAC testing is a diagnostic study, and the need to prove a reduction in endpoints to recommend clinical use of a diagnostic test seems overly burdensome. In the current era of widespread use of prevention strategies and declining rates of clinical atherosclerotic CVD events, even pharmacologic and interventional therapies rarely meet this target. It is unprecedented to set such a threshold for a diagnostic or risk assessment test in CV medicine. No such evidence is available for other widely used risk assessment tools such as the Framingham Risk Score or the Pooled Cohort Equation. The CAC score is not meant for widespread screening use, as discussed in the USPSTF recommendations. Rather, it is a powerful risk assessment test, to be used in selective patient populations, where it can reliably reclassify patients into higher- and lower-risk categories.
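The power and sample-size concern raised in the second point can be illustrated with the standard normal-approximation formula for comparing two event proportions. The Python sketch below is a simplified illustration only: every input is hypothetical, the function names are our own, and it does not attempt to reproduce the published estimate of 30,000 patients. It shows two directions of effect described above: lower event rates and a benefit confined to only part of the tested population both increase the number of patients required.

```python
from math import ceil
from statistics import NormalDist


def patients_per_arm(control_rate, treated_rate, alpha=0.05, power=0.90):
    """Approximate patients per arm to detect a difference in event proportions.

    Two-sided test, normal approximation, equal allocation between arms.
    """
    if control_rate == treated_rate:
        raise ValueError("event rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = (control_rate * (1 - control_rate)
                + treated_rate * (1 - treated_rate))
    return ceil((z_alpha + z_power) ** 2 * variance
                / (control_rate - treated_rate) ** 2)


def diluted_rate(control_rate, rrr_if_escalated, fraction_escalated):
    """Event rate in the testing arm when only a fraction of patients have
    therapy escalated and therefore benefit from the risk reduction."""
    return control_rate * (1 - rrr_if_escalated * fraction_escalated)


# Hypothetical inputs: 5% event rate without testing, 25% relative risk
# reduction in patients whose therapy is escalated.
control = 0.05
everyone_benefits = patients_per_arm(control, diluted_rate(control, 0.25, 1.0))
half_benefit = patients_per_arm(control, diluted_rate(control, 0.25, 0.5))

# Same relative effect, but a higher-risk population with a 10% event rate.
higher_risk = patients_per_arm(0.10, diluted_rate(0.10, 0.25, 1.0))

print(f"Per arm, all patients benefit:   {everyone_benefits}")
print(f"Per arm, half of patients benefit: {half_benefit}")
print(f"Per arm, higher event rate:      {higher_risk}")
```

In this simplified model, halving the share of patients who benefit roughly quadruples the required sample size, because the required number scales with the inverse square of the absolute difference in event rates.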

In conclusion, the absence of randomized data showing a reduction in endpoints with CAC testing should not deter clinicians, professional societies, government agencies and health care payers from recognizing the well-established role of selective use of CAC testing in intermediate-risk groups for improved risk stratification and guidance of escalation or de-escalation of prevention therapies.

Disclosures: The authors report no relevant financial disclosures.
