Decision tree pruning via multi-objective evolutionary computation

Sciavicco, Guido
2017

Abstract

Decision trees are among the most widely used classification models. They owe their popularity to their efficiency in both the learning and the classification phases and, above all, to the high interpretability of the learned classifiers. The latter is of primary importance in domains where understanding and validating the decision process matters as much as the accuracy of the prediction. Pruning is a common technique for reducing the size of decision trees, thus improving their interpretability and possibly reducing the risk of overfitting. In this work, we investigate the integration of evolutionary algorithms and decision tree pruning, and present a decision tree post-pruning strategy based on the well-known multi-objective evolutionary algorithm NSGA-II. Our approach is compared with the default pruning strategies of the decision tree learners C4.5 (J48, on which the proposed method is based) and C5.0. We show empirically that evolutionary algorithms can be profitably applied to the classical problem of decision tree pruning: the proposed strategy generates a more varied set of solutions than both J48 and C5.0, and the trees produced by our method tend to be smaller than the best candidates produced by the classical tree learners, while preserving most of their accuracy and sometimes improving it.
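
The abstract does not state the objectives or the encoding used by the method, so the following is only a minimal sketch of the general idea behind multi-objective pruning. It assumes two objectives, tree size (minimised) and accuracy (maximised), and shows the Pareto-dominance filter that underlies the ranking step of algorithms such as NSGA-II. The `Candidate` type, the helper names, and all numbers are hypothetical and are not taken from the paper.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """A hypothetical pruned tree, summarised by two assumed objectives."""
    name: str
    n_nodes: int      # tree size, to be minimised
    accuracy: float   # validation accuracy, to be maximised


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and better on one."""
    no_worse = a.n_nodes <= b.n_nodes and a.accuracy >= b.accuracy
    better = a.n_nodes < b.n_nodes or a.accuracy > b.accuracy
    return no_worse and better


def pareto_front(cands: list[Candidate]) -> list[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Made-up numbers, for illustration only (not results from the paper).
candidates = [
    Candidate("unpruned", 41, 0.86),
    Candidate("prune_a", 23, 0.87),
    Candidate("prune_b", 9, 0.83),
    Candidate("prune_c", 15, 0.82),
    Candidate("prune_d", 3, 0.74),
]

# Sort the surviving trade-offs from smallest to largest tree.
front = sorted(pareto_front(candidates), key=lambda c: c.n_nodes)
for c in front:
    print(c.name, c.n_nodes, c.accuracy)
```

A single-solution pruner returns one tree, whereas a filter of this kind returns every non-dominated size/accuracy trade-off, which is one way to read the abstract's claim about a more varied set of solutions.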
Brunello, Andrea; Marzano, Enrico; Montanari, Angelo; Sciavicco, Guido
Use this identifier to cite or link to this document: https://hdl.handle.net/11392/2380308