Venue: Tencent Meeting (Meeting ID: 688 912 696)
Title: Accelerating stochastic gradient methods
Abstract: Stochastic gradient methods have been extensively used to train machine learning models, in particular deep learning models. Various techniques, such as momentum acceleration and adaptive learning rates, have been applied to accelerate stochastic gradient methods, both numerically and theoretically. In this talk, I will present two ways to accelerate stochastic gradient methods. The first is to accelerate the popular adaptive (Adam-type) stochastic gradient method by asynchronous (async) parallel computing. Numerically, async-parallel computing can achieve significantly higher parallelization speed-up than its sync-parallel counterpart. Several previous works have studied async-parallel non-adaptive stochastic gradient methods; however, a non-adaptive stochastic gradient method often converges significantly more slowly than an adaptive one. I will show that our async-parallel adaptive stochastic gradient method can achieve near-linear speed-up on top of the fast convergence of an adaptive stochastic gradient method. In the second part, I will present a momentum-accelerated proximal stochastic gradient method, which has provably faster convergence than the standard proximal stochastic gradient method. I will also show experimental results demonstrating its superiority in training a sparse deep learning model.
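For readers unfamiliar with the "adaptive (Adam-type)" updates the abstract refers to, the following is a minimal sketch of the standard serial Adam update of Kingma and Ba, not the speaker's async-parallel variant; the function name and hyperparameter defaults are illustrative only.

```python
import numpy as np

def adam_step(x, m, v, g, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Adam update (t is the 1-based iteration counter).

    The learning rate is adapted per coordinate using running estimates
    of the first and second moments of the stochastic gradient g.
    """
    m = beta1 * m + (1 - beta1) * g        # first-moment (momentum) estimate
    v = beta2 * v + (1 - beta2) * g * g    # second-moment estimate
    m_hat = m / (1 - beta1 ** t)           # bias corrections for the
    v_hat = v / (1 - beta2 ** t)           # zero-initialized buffers
    x = x - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-coordinate step size
    return x, m, v
```

Similarly, a momentum-accelerated proximal stochastic gradient step can be sketched as below, assuming an l1 regularizer (whose proximal operator is soft-thresholding) to match the sparse-training experiments mentioned in the abstract; the heavy-ball-style averaging and all parameter names are assumptions for illustration, not the specific scheme of the talk.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def momentum_prox_sgd_step(x, u, g, lr=0.01, beta=0.9, lam=1e-3):
    """One momentum-accelerated proximal stochastic gradient step for
    min_x f(x) + lam * ||x||_1, where g is a stochastic gradient of the
    smooth part f at x.  (Illustrative heavy-ball-style variant.)
    """
    u = beta * u + g                          # momentum buffer (assumed form)
    x = soft_threshold(x - lr * u, lr * lam)  # gradient step, then prox
    return x, u
```

In both sketches, the nonsmooth regularizer is handled through its proximal operator rather than subgradients, which is what keeps the iterates exactly sparse during training.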
Dr. Yangyang Xu (徐扬扬) is a tenure-track assistant professor in the Department of Mathematical Sciences at Rensselaer Polytechnic Institute. He received his B.S. in Computational Mathematics from Nanjing University in 2007, his M.S. in Operations Research from the Chinese Academy of Sciences in 2010, and his Ph.D. from the Department of Computational and Applied Mathematics at Rice University in 2014. His research interests include optimization theory and methods and their applications in machine learning, statistics, and signal processing. He has developed optimization algorithms for compressed sensing, matrix completion, and tensor factorization and learning. Recently, his research has focused on first-order methods, operator splitting, stochastic optimization methods, and high-performance parallel computing. He has published over 30 papers in prestigious journals and conference proceedings. He was awarded a gold medal at the 2017 International Consortium of Chinese Mathematicians.