Label noise is a harmful issue that arises when data are erroneously labeled. Several label noise issues can occur but, among them, unit of measure inconsistencies (UMIs) are inexplicably neglected in the literature. Despite its relevance, a general and automated approach for UMI detection suitable to gas turbines (GTs) has not been developed yet; as a result, GT diagnosis, prognosis, and control may be challenged since collected data may not reflect the actual operation. To fill this gap, this paper investigates the capability of three supervised machine learning classifiers, i.e., Support Vector Machine, Naïve Bayes, and K‐Nearest Neighbors, that are tested by means of challenging analyses to infer general guidelines for UMI detection. Classification accuracy and posterior probability of each classifier is evaluated by means of an experimental dataset derived from a large fleet of Siemens gas turbines in operation. Results reveal that Naïve Bayes is the optimal classifier for UMI detection, since 88.5% of data are correctly labeled with 84% of posterior probability when experimental UMIs affect the dataset. In addition, Naïve Bayes proved to be the most robust classifier also if the rate of UMIs increases.

Optimal Classifier to Detect Unit of Measure Inconsistency in Gas Turbine Sensors

Lucrezia Manservigi
;
Mauro Venturini;Enzo Losi;
2022

Abstract

Label noise is a harmful issue that arises when data are erroneously labeled. Several label noise issues can occur but, among them, unit of measure inconsistencies (UMIs) are inexplicably neglected in the literature. Despite its relevance, a general and automated approach for UMI detection suitable to gas turbines (GTs) has not been developed yet; as a result, GT diagnosis, prognosis, and control may be challenged since collected data may not reflect the actual operation. To fill this gap, this paper investigates the capability of three supervised machine learning classifiers, i.e., Support Vector Machine, Naïve Bayes, and K‐Nearest Neighbors, that are tested by means of challenging analyses to infer general guidelines for UMI detection. Classification accuracy and posterior probability of each classifier is evaluated by means of an experimental dataset derived from a large fleet of Siemens gas turbines in operation. Results reveal that Naïve Bayes is the optimal classifier for UMI detection, since 88.5% of data are correctly labeled with 84% of posterior probability when experimental UMIs affect the dataset. In addition, Naïve Bayes proved to be the most robust classifier also if the rate of UMIs increases.
2022
Manservigi, Lucrezia; Venturini, Mauro; Losi, Enzo; Bechini, Giovanni; Artal de la Iglesia, Javier
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11392/2500618
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 5
  • ???jsp.display-item.citation.isi??? 1
social impact