Predicting artery clogging with a machine learning model is superior to current risk-based methods

A machine learning algorithm based on 12 patient data points predicts the likelihood of clogged arteries in patients without symptoms far better than the risk scoring systems that doctors currently use, according to an article published in the Journal of the American College of Cardiology.

Artery clogging is a progressive disease that ultimately can cause heart attacks, strokes, or dementia. Doctors call the disease atherosclerosis.

Researchers developed a machine learning approach that they found to be superior to the risk scoring methods for cardiovascular disease currently used in the U.S. and Europe. For example, one risk scoring method used in the U.S., when tested against the data set on which the researchers trained their model, assigned an intermediate or high risk score to only 15% of individuals who in fact had early-stage artery clogging at more than one site.

In contrast, the machine learning algorithm, when validated using a data set of patients separate from the one on which it was trained, was far more successful at identifying individuals with artery clogging. Its “area under the curve” (a measure of how well a model separates individuals who have a condition from those who do not) was 0.83, where 1.00 would be perfect and 0.50 would be no better than chance.
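To make the “area under the curve” figure more concrete, here is a minimal Python sketch of how such a value can be computed. It is not the researchers' code: the function name `auc` and the six toy patients are made up for illustration. The sketch relies on a standard interpretation of the measure, namely the probability that a randomly chosen individual with artery clogging receives a higher model score than a randomly chosen individual without it.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed by comparing every
    positive case with every negative case.

    labels: 1 = has the condition, 0 = does not
    scores: the model's predicted risk for each individual
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the study).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
print(round(auc(labels, scores), 2))  # 8 of 9 pairs ranked correctly -> 0.89
```

On this reading, the reported 0.83 would mean that the model ranks an affected individual above an unaffected one in roughly 83% of such pairs.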

The researchers trained their machine learning model using data from 3,515 middle-aged individuals, for whom data were available from several sources: blood tests; two-dimensional ultrasound scans of the carotid arteries, the abdominal aorta, and the femoral arteries; and coronary artery calcium scoring.

The top five data points found by the machine learning system to be most predictive of artery clogging were, in order: age, average blood sugar over the past three months (“HbA1c”), the ratio of total cholesterol to “good” HDL cholesterol, leukocyte count (a measure of inflammation), and hemoglobin. Also among the top 10 predictors were “bad” LDL cholesterol and systolic blood pressure (the first number in a blood pressure reading). The effects of LDL cholesterol on artery clogging would also be captured by one of the top three data points, the ratio of total cholesterol (including LDL) to HDL cholesterol.
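The point that the cholesterol ratio also reflects LDL can be shown with a small worked example. The sketch below is an illustration rather than anything taken from the study: it assumes the widely used Friedewald relationship (values in mg/dL), under which total cholesterol is approximately LDL plus HDL plus one fifth of triglycerides. The function name and the patient values are hypothetical.

```python
def total_to_hdl_ratio(ldl, hdl, triglycerides):
    """Ratio of total cholesterol to HDL cholesterol (all values in mg/dL).

    Assumption: total cholesterol is estimated with the Friedewald
    relationship, total = LDL + HDL + triglycerides / 5.
    """
    total = ldl + hdl + triglycerides / 5
    return total / hdl


# Hypothetical values: same HDL and triglycerides, different LDL.
lower_ldl = total_to_hdl_ratio(ldl=100, hdl=50, triglycerides=150)
higher_ldl = total_to_hdl_ratio(ldl=160, hdl=50, triglycerides=150)
print(lower_ldl, higher_ldl)  # the ratio rises as LDL rises
```

With everything else held constant, raising LDL raises the ratio, which is why a model that uses the ratio is indirectly using LDL as well.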

An editorial accompanying the study highlighted the superiority of the researchers’ machine learning model over risk scoring methods used in the U.S. and Europe. It noted, however, that doctors would find that some patient data elements toward the bottom of the list of 12 predictors are not intuitively related to artery clogging. While those predictors may have contributed nothing to the model’s success when it was validated against the separate data set, the editorial left open the possibility that there could be “strong import” to those predictors that “we may simply not understand.”

The study, available free online, is titled “Machine Learning Improves Cardiovascular Risk Definition for Young, Asymptomatic Individuals.” The accompanying editorial, also available free, is titled “Transforming Data Into Diagnosis: Exercises for a Computer to Perform and a Physician to Interpret.”
