Line Search Stochastic Gradient Algorithm with A-priori Rule for Monitoring the Control of the Variance

Ruggiero V.; Trombini I.
2025

Abstract

The aim of this paper is to analyze a different practical implementation of the LIne Search based stochastic gradient Algorithm (LISA) recently proposed by Franchini et al. The LISA scheme belongs to the class of stochastic gradient methods and relies in practice on a line search strategy to select the learning rate and on a dynamic technique to increase the mini-batch size of the stochastic directions. Despite the promising performance of LISA on optimization problems from machine learning applications, its mini-batch increasing strategy involves checking a condition which may be computationally expensive and memory demanding, especially in the presence of both deep neural networks and very large-scale datasets. In this work we investigate an a-priori procedure for selecting the size of the current mini-batch, which reduces the execution time of the LISA method and keeps its memory requirements under control, especially when dealing with hardware accelerators. Numerical experiments on training both statistical models for binary classification and deep neural networks for multi-class image classification confirm the effectiveness of the proposal.
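
Although the record only summarizes the method, the following minimal Python sketch may help illustrate the idea described in the abstract: a stochastic gradient step with an Armijo-type backtracking line search for the learning rate, where the mini-batch size is grown by an a-priori (here, geometric) schedule rather than by checking a variance condition on the sampled gradients. This is not the authors' LISA implementation; `loss_grad`, `growth`, and all parameter values are hypothetical placeholders, and `data` is assumed to be a NumPy array supporting fancy indexing.

```python
import numpy as np

def sgd_apriori_linesearch(loss_grad, w0, data, n_epochs=10, b0=16,
                           growth=1.2, alpha0=1.0, c=1e-4, shrink=0.5):
    """Illustrative mini-batch SGD with Armijo backtracking line search
    and an a-priori geometric mini-batch growth rule (hypothetical sketch).
    `loss_grad(w, batch)` is assumed to return (loss, gradient)."""
    w = np.asarray(w0, dtype=float).copy()
    n = len(data)
    rng = np.random.default_rng(0)
    batch_size = b0
    for _ in range(n_epochs):
        # A-priori rule: grow the mini-batch geometrically (capped at n)
        # instead of testing a variance condition on the sampled gradients.
        batch_size = min(int(np.ceil(batch_size * growth)), n)
        batch = data[rng.choice(n, size=batch_size, replace=False)]
        loss, g = loss_grad(w, batch)
        # Backtracking (Armijo) line search on the same mini-batch:
        # shrink the step until sufficient decrease holds.
        alpha = alpha0
        while loss_grad(w - alpha * g, batch)[0] > loss - c * alpha * (g @ g):
            alpha *= shrink
            if alpha < 1e-8:  # safeguard against stalling
                break
        w = w - alpha * g
    return w
```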
ISBN: 9783031812408; 9783031812415
adaptive hyper-parameter setting; deep learning; line search strategy; machine learning; stochastic gradient methods; variance condition
Files in this product:
No files are associated with this product.

Documents in SFERA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11392/2580170
Warning: the displayed data have not been validated by the university.

Citations
  • PMC: n/a
  • Scopus: 0
  • Web of Science: n/a