On the infinite-depth limit of finite-width neural networks
In this talk, I will discuss the infinite-depth limit of finite-width residual neural networks. The infinite-width limit of deep neural networks has been extensively studied in the literature. The converse regime, in which the depth goes to infinity while the width stays fixed, remains poorly understood. We show that, with proper scaling, fixing the width and taking the depth to infinity makes the vector of pre-activations converge in distribution to a zero-drift diffusion process that is essentially governed by the activation function. Unlike the infinite-width limit, where the neurons exhibit Gaussian behaviour, the infinite-depth limit (with finite width) yields different distributions depending on the choice of activation function. We further discuss the sequential limit infinite-depth-then-infinite-width and show some key differences from the converse infinite-width-then-infinite-depth limit.
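As a rough numerical illustration of the setting described in the abstract, the sketch below simulates a fixed-width residual network at random initialization and looks at the distribution of one pre-activation at the final layer. The specific recursion, y <- y + W phi(y) / sqrt(depth) with i.i.d. N(0, 1/width) weights, is an assumption chosen as one plausible "proper scaling"; the abstract does not spell out the exact form. The function names `simulate_preactivations` and `excess_kurtosis` are hypothetical helpers introduced only for this example.

```python
import numpy as np


def simulate_preactivations(width, depth, activation, y0, n_samples, seed=0):
    """Sample final-layer pre-activations of a random residual network.

    Assumed recursion (not taken from the abstract):
        y <- y + W @ activation(y) / sqrt(depth),  W_ij ~ N(0, 1 / width).
    Each sample uses its own independent set of weights.
    """
    rng = np.random.default_rng(seed)
    # One copy of the input per independent network: shape (n_samples, width).
    y = np.tile(np.asarray(y0, dtype=float), (n_samples, 1))
    for _ in range(depth):
        w = rng.normal(0.0, 1.0 / np.sqrt(width), size=(n_samples, width, width))
        # Residual update; the increment has zero conditional mean (no drift).
        y = y + np.einsum("sij,sj->si", w, activation(y)) / np.sqrt(depth)
    return y


def excess_kurtosis(x):
    """Excess kurtosis of a 1-D sample (0 for a Gaussian)."""
    x = x - x.mean()
    return float(np.mean(x**4) / np.mean(x**2) ** 2 - 3.0)


if __name__ == "__main__":
    activations = {
        "identity": lambda z: z,
        "relu": lambda z: np.maximum(z, 0.0),
        "tanh": np.tanh,
    }
    for name, phi in activations.items():
        y = simulate_preactivations(
            width=4, depth=200, activation=phi, y0=np.ones(4), n_samples=5000
        )
        # Compare the first coordinate's mean and tail behaviour across activations.
        print(name, "mean:", y[:, 0].mean(), "excess kurtosis:", excess_kurtosis(y[:, 0]))
```

Under these assumptions, the sample mean should stay close to the input value, consistent with a zero-drift process, while the shape of the distribution (summarized here by excess kurtosis) may vary with the activation. This is only a finite-depth Monte Carlo experiment and does not by itself establish the limiting behaviour.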
Soufiane Hayou obtained his PhD in statistics in 2021 from Oxford, where he was advised by Arnaud Doucet and Judith Rousseau. He graduated from Ecole Polytechnique in Paris before joining Oxford. During his PhD, he worked mainly on the theory of randomly initialized infinite-width neural networks, on topics including the impact of the hyperparameters (variance of the weights, activation function) and the architecture (fully-connected, convolutional, skip connections) on how the 'geometric' information propagates inside the network. He is currently a Peng Tsu Ann Assistant Professor of mathematics at the National University of Singapore.
To become a member of the Rough Path Interest Group, register here for free.