Prediction of hematocrit through imbalanced dataset of blood spectra

Decaro, Cristoforo (first author; Formal Analysis); Bellanca, Gaetano (last author; Writing – Review & Editing)
2021

Abstract

Although machine learning has been successfully used in a wide range of healthcare applications, several factors can influence the performance of a machine learning system. One of the major issues for a machine learning algorithm is an imbalanced dataset, that is, a dataset in which the data are not uniformly distributed. Imbalance makes it harder to build accurate models. In this paper, intelligent models are implemented to predict the hematocrit level of blood from visible spectral data. The aim of this work is to show the effects of two balancing techniques (SMOTE and SMOTE+ENN) on an imbalanced dataset of blood spectra. Four different machine learning systems are fitted with the imbalanced and the balanced datasets, and their performances are compared, showing an improvement in accuracy due to balancing.
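As a rough illustration of the balancing idea named in the abstract, the sketch below shows the core step of SMOTE: each synthetic minority sample is placed at a random point on the segment between an existing minority sample and one of its nearest minority neighbours. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `smote`, its parameters, and the toy two-feature data are all hypothetical, and the ENN cleaning step of SMOTE+ENN (removing samples whose nearest neighbours mostly belong to another class) is not covered.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Hypothetical helper: create n_new synthetic points by interpolating
    between a minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of base among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy two-feature data standing in for spectra (values are made up).
majority = [[0.1 * i, 1.0] for i in range(10)]
minority = [[0.2, 3.0], [0.4, 3.2], [0.6, 2.9]]

# Generate enough synthetic samples to match the majority class size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic point is a convex combination of two real minority samples, the new data stay inside the region already covered by the minority class rather than being exact copies of existing samples.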
Decaro, Cristoforo; Montanari, Giovanni Battista; Bianconi, Marco; Bellanca, Gaetano
Files in this item:
File: htl2.12006.pdf
Access: open access
Description: Publisher's full text
Type: Full text (publisher's version)
License: Creative Commons
Size: 794.56 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/11392/2453918
Citations
  • PMC 2
  • Scopus 2
  • ISI 1