Research Article

Automated learning rate search using batch-level cross-validation

Volume: 4 Number: 3 December 31, 2021

Abstract

Deep learning researchers and practitioners have accumulated significant experience in training a wide variety of architectures on various datasets. However, given a network architecture and a dataset, obtaining the best model (i.e., the model giving the smallest test set error) while keeping the training time complexity low is still a challenging task. Hyper-parameters of deep neural networks, especially the learning rate and its (decay) schedule, strongly affect the network's final performance. The general approach is to search for the best learning rate and learning rate decay parameters within a cross-validation framework, a process that usually requires a significant amount of experimentation at extensive time cost. In classical cross-validation (CV), a random part of the dataset is reserved for evaluating model performance on unseen data. This procedure is usually run multiple times, with different random validation sets, to decide the learning rate settings. In this paper, we explore batch-level cross-validation as an alternative to the classical dataset-level, hence macro, CV. The advantage of batch-level, or micro, CV methods is that the gradient computed during training is reused to evaluate several different learning rates. We propose an algorithm based on micro CV and stochastic gradient descent with momentum, which automatically produces a learning rate schedule during training by selecting a learning rate per epoch. In our algorithm, a random half of the current batch (of examples) is used for training and the other half is used for validating several different step sizes or learning rates. We conducted comprehensive experiments on three datasets (CIFAR10, SVHN and Adience) using three different network architectures (a custom CNN, ResNet and VGG) to compare the performance of our micro-CV algorithm with the widely used stochastic gradient descent with momentum in an early-stopping macro-CV setup. The results show that our micro-CV algorithm achieves comparable test accuracy to macro-CV at a much lower computational cost.
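As a rough illustration of the batch-level (micro) CV idea described in the abstract, the sketch below applies it to a toy linear least-squares model in plain NumPy: the gradient is computed once on the training half of each batch and then reused to score several candidate learning rates on the held-out half. This is only a minimal sketch, not the authors' implementation; the function name `micro_cv_step`, the toy model, and the candidate learning-rate set are illustrative, and the paper's full algorithm additionally uses momentum and selects one learning rate per epoch rather than per batch.

```python
import numpy as np

def micro_cv_step(w, X_batch, y_batch, candidate_lrs, rng):
    """One batch-level (micro) CV update for a linear least-squares model.

    A random half of the batch trains; the other half scores each
    candidate learning rate, and the best-scoring step is kept.
    """
    n = len(X_batch)
    idx = rng.permutation(n)
    tr, va = idx[: n // 2], idx[n // 2:]

    # Gradient of the mean squared error on the training half; it is
    # computed once and reused for every candidate learning rate --
    # the key computational saving of micro CV.
    residual = X_batch[tr] @ w - y_batch[tr]
    grad = 2.0 * X_batch[tr].T @ residual / len(tr)

    # Score each candidate step size on the held-out half of the batch.
    def val_loss(w_new):
        return float(np.mean((X_batch[va] @ w_new - y_batch[va]) ** 2))

    best_w, best_lr = min(
        ((w - lr * grad, lr) for lr in candidate_lrs),
        key=lambda pair: val_loss(pair[0]),
    )
    return best_w, best_lr
```

On a synthetic regression problem, repeatedly calling `micro_cv_step` drives the loss down while adapting the step size per batch, without a separate validation set.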


Details

Primary Language

English

Subjects

Artificial Intelligence

Journal Section

Research Article

Authors

Duygu Kabakçı
0000-0001-6636-813X
Türkiye

Emre Akbaş
Türkiye

Publication Date

December 31, 2021

Submission Date

May 10, 2021

Acceptance Date

November 4, 2021

Published in Issue

Year 2021 Volume: 4 Number: 3

APA
Kabakçı, D., & Akbaş, E. (2021). Automated learning rate search using batch-level cross-validation. Sakarya University Journal of Computer and Information Sciences, 4(3), 312-325. https://doi.org/10.35377/saucis.935353
AMA
1. Kabakçı D, Akbaş E. Automated learning rate search using batch-level cross-validation. SAUCIS. 2021;4(3):312-325. doi:10.35377/saucis.935353
Chicago
Kabakçı, Duygu, and Emre Akbaş. 2021. “Automated Learning Rate Search Using Batch-Level Cross-Validation”. Sakarya University Journal of Computer and Information Sciences 4 (3): 312-25. https://doi.org/10.35377/saucis.935353.
EndNote
Kabakçı D, Akbaş E (December 1, 2021) Automated learning rate search using batch-level cross-validation. Sakarya University Journal of Computer and Information Sciences 4 3 312–325.
IEEE
[1] D. Kabakçı and E. Akbaş, “Automated learning rate search using batch-level cross-validation”, SAUCIS, vol. 4, no. 3, pp. 312–325, Dec. 2021, doi: 10.35377/saucis.935353.
ISNAD
Kabakçı, Duygu - Akbaş, Emre. “Automated Learning Rate Search Using Batch-Level Cross-Validation”. Sakarya University Journal of Computer and Information Sciences 4/3 (December 1, 2021): 312-325. https://doi.org/10.35377/saucis.935353.
JAMA
1. Kabakçı D, Akbaş E. Automated learning rate search using batch-level cross-validation. SAUCIS. 2021;4:312–325.
MLA
Kabakçı, Duygu, and Emre Akbaş. “Automated Learning Rate Search Using Batch-Level Cross-Validation”. Sakarya University Journal of Computer and Information Sciences, vol. 4, no. 3, Dec. 2021, pp. 312-25, doi:10.35377/saucis.935353.
Vancouver
1. Kabakçı D, Akbaş E. Automated learning rate search using batch-level cross-validation. SAUCIS. 2021 Dec. 1;4(3):312-25. doi:10.35377/saucis.935353

The papers in this journal are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.