Predicting Cardiovascular Risk in Athletes: Resampling Improves Classification Performance

Barbieri, Davide (first author); Zaccagni, Luciana
2020

Abstract

Cardiovascular diseases are the main cause of death worldwide. The aim of the present study is to assess the performance of a data mining methodology in the evaluation of cardiovascular risk in athletes, and to determine whether the results may be used to support clinical decision making. Anthropometric (height and weight), demographic (age and sex) and biomedical (blood pressure and pulse rate) data of 26,002 athletes were collected in 2012 during routine sports medical examinations, which included electrocardiography at rest. Subjects were involved in competitive sport practice, for which medical clearance was needed. Outcomes were negative for the large majority, as expected in an active population. Resampling was applied to balance the positive/negative class ratio. A decision tree and logistic regression were used to classify individuals as either at risk or not. The receiver operating characteristic curve was used to assess classification performance. Data mining and resampling improved cardiovascular risk assessment in terms of increased area under the curve. The proposed methodology can be effectively applied to biomedical data in order to optimize clinical decision making and, at the same time, minimize the number of unnecessary examinations.
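The two ideas the abstract relies on, rebalancing a skewed class ratio and scoring a classifier by area under the ROC curve, can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say which resampling scheme was used, so plain random oversampling of the minority class is assumed here, and the function names `oversample_minority` and `roc_auc` and the toy data are hypothetical.

```python
import random


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until both classes have equal size.

    Assumption: random oversampling stands in for the unspecified
    resampling method mentioned in the abstract.
    """
    rng = random.Random(seed)
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    if len(pos) < len(neg):
        minority, majority, minority_label = pos, neg, 1
    else:
        minority, majority, minority_label = neg, pos, 0
    # Draw extra minority rows with replacement to close the gap.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    balanced = [(r, minority_label) for r in minority + extra]
    balanced += [(r, 1 - minority_label) for r in majority]
    rng.shuffle(balanced)
    return [r for r, _ in balanced], [y for _, y in balanced]


def roc_auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a random
    positive receives a higher score than a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 18 negative and 2 positive outcomes.
toy_rows = list(range(20))
toy_labels = [0] * 18 + [1] * 2
balanced_rows, balanced_labels = oversample_minority(toy_rows, toy_labels)
print(balanced_labels.count(0), balanced_labels.count(1))

# Hypothetical classifier scores for four subjects.
print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

In practice, resampling is normally applied to the training split only, so that the area under the curve is measured on data that still has the original class ratio.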
Barbieri, Davide; Chawla, Nitesh; Zaccagni, Luciana; Grgurinović, Tonći; Šarac, Jelena; Čoklo, Miran; Missoni, Saša...
Files in this item:
File: CVR ijerph-17-07923-v2.pdf
Access: open access
Description: Publisher's full text
Type: Full text (publisher's version)
License: Creative Commons
Size: 495.19 kB
Format: Adobe PDF

Documents in SFERA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11392/2425418
Citations
  • PubMed Central (PMC): 5
  • Scopus: 14
  • Web of Science (ISI): 12