Convergence of Langevin-Simulated Annealing algorithms with multiplicative noise

Abstract

We study the convergence of Langevin-Simulated Annealing type algorithms with multiplicative noise, i.e. for V:Rd→R a potential function to minimize, we consider the stochastic equation dYt=−σσ⊤∇V(Yt)dt+a(t)σ(Yt)dWt+a(t)2Υ(Yt)dt, where (Wt) is a Brownian motion, where σ:Rd→Md(R) is an adaptive (multiplicative) noise, where a:R+→R+ is a function decreasing to 0 and where Υ is a correction term. This setting can be applied to optimization problems arising in Machine Learning. The case where σ is a constant matrix has been extensively studied however little attention has been paid to the general case. We prove the convergence for the L1-Wasserstein distance of Yt and of the associated Euler-scheme Y¯t to some measure ν⋆ which is supported by argmin(V) and give rates of convergence to the instantaneous Gibbs measure νa(t) of density ∝exp(−2V(x)/a(t)2). To do so, we first consider the case where a is a piecewise constant function. We find again the classical schedule a(t)=Alog−1/2(t). We then prove the convergence for the general case by giving bounds for the Wasserstein distance to the stepwise constant case using ergodicity properties.

Publication
In arXiv e-prints

Gilles Pagès’ website

Pierre Bras
Pierre Bras
PhD Student in Applied Mathematics

I am a PhD student under the direction of Gilles Pagès, interested in Numerical Probability, Stochastic Optimization, Machine Learning and Stochastic Calculus

Related