Diagonal Barzilai-Borwein Rules in Stochastic Gradient-Like Methods
Ruggiero V.; Trombini I.
2023
Abstract
Minimization problems whose objective function is a finite sum often arise in machine learning applications. The number of terms in the sum is typically very large, which makes computing the full gradient infeasible. For this reason, stochastic gradient methods are commonly used. The performance of these methods depends strongly on the choice of both the learning rate and the mini-batch size used to compute the stochastic direction. In this paper we combine a recent idea, selecting the learning rate as a diagonal matrix based on stochastic Barzilai-Borwein rules, with an adaptive subsampling technique that sets the mini-batch size. Convergence results for the resulting stochastic gradient algorithm are shown for both convex and non-convex objective functions. Several numerical experiments on binary classification problems compare the proposed method with other state-of-the-art schemes.