On stochastic optimization and the Adam optimizer: Divergence, convergence rates, and acceleration techniques