Label noise is a harmful issue that arises when data are erroneously labeled. Several label noise issues can occur but, among them, unit of measure inconsistencies (UMIs) are inexplicably neglected in the literature. Despite its relevance, a general and automated approach for UMI detection suitable to gas turbines (GTs) has not been developed yet; as a result, GT diagnosis, prognosis, and control may be challenged since collected data may not reflect the actual operation. To fill this gap, this paper investigates the capability of three supervised machine learning classifiers, i.e., Support Vector Machine, Naïve Bayes, and K‐Nearest Neighbors, that are tested by means of challenging analyses to infer general guidelines for UMI detection. Classification accuracy and posterior probability of each classifier is evaluated by means of an experimental dataset derived from a large fleet of Siemens gas turbines in operation. Results reveal that Naïve Bayes is the optimal classifier for UMI detection, since 88.5% of data are correctly labeled with 84% of posterior probability when experimental UMIs affect the dataset. In addition, Naïve Bayes proved to be the most robust classifier also if the rate of UMIs increases.

Optimal Classifier to Detect Unit of Measure Inconsistency in Gas Turbine Sensors

Lucrezia Manservigi
Primo
;
Mauro Venturini
Secondo
;
Enzo Losi;
2022

Abstract

Label noise is a harmful issue that arises when data are erroneously labeled. Several label noise issues can occur but, among them, unit of measure inconsistencies (UMIs) are inexplicably neglected in the literature. Despite its relevance, a general and automated approach for UMI detection suitable to gas turbines (GTs) has not been developed yet; as a result, GT diagnosis, prognosis, and control may be challenged since collected data may not reflect the actual operation. To fill this gap, this paper investigates the capability of three supervised machine learning classifiers, i.e., Support Vector Machine, Naïve Bayes, and K‐Nearest Neighbors, that are tested by means of challenging analyses to infer general guidelines for UMI detection. Classification accuracy and posterior probability of each classifier is evaluated by means of an experimental dataset derived from a large fleet of Siemens gas turbines in operation. Results reveal that Naïve Bayes is the optimal classifier for UMI detection, since 88.5% of data are correctly labeled with 84% of posterior probability when experimental UMIs affect the dataset. In addition, Naïve Bayes proved to be the most robust classifier also if the rate of UMIs increases.
2022
Manservigi, Lucrezia; Venturini, Mauro; Losi, Enzo; Bechini, Giovanni; Artal de la Iglesia, Javier
File in questo prodotto:
File Dimensione Formato  
Optimal-Classifier-to-Detect-Unit-of-Measure-Inconsistency-in-Gas-Turbine-SensorsMachines.pdf

accesso aperto

Descrizione: Full text editoriale
Tipologia: Full text (versione editoriale)
Licenza: Creative commons
Dimensione 2.03 MB
Formato Adobe PDF
2.03 MB Adobe PDF Visualizza/Apri

I documenti in SFERA sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11392/2500618
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 6
  • ???jsp.display-item.citation.isi??? 4
social impact