On stochastic optimization and the Adam optimizer: Divergence, convergence rates, and acceleration techniques
Venue: ZOOM + Room 0.016 (Institut für Informatik, Campus Poppelsdorf, Universität Bonn)

Abstract: Stochastic gradient descent (SGD) optimization methods are nowadays the method of choice for the training of …
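As background for the optimizer named in the title, here is a minimal sketch of the standard Adam update rule (Kingma & Ba, 2015). The step size alpha, decay rates beta1/beta2, and stability constant eps are the usual default values and are assumptions chosen for illustration, not parameters taken from the talk.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One step of the standard Adam update (illustrative sketch, default hyperparameters)."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad**2     # second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1**t)                # bias correction for m
    v_hat = v / (1 - beta2**t)                # bias correction for v
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Illustrative use on a toy quadratic objective f(theta) = ||theta||^2 / 2
theta = np.array([1.0, -2.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 1001):
    grad = theta                              # gradient of the toy objective
    theta, m, v = adam_step(theta, grad, m, v, t)
print(theta)                                  # approaches the minimizer [0, 0]
```

On this toy deterministic objective the iterates approach the minimizer at the origin; whether and how fast such convergence holds in the stochastic setting is the kind of question the title refers to.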