Generating data with complex patterns, such as images, audio, and molecular structures, requires fitting very flexible statistical models to the data distribution. Even in the age of deep neural networks, building such models is difficult because they typically require an intractable normalisation procedure to represent a probability distribution. To address this challenge, I propose to model the vector field of gradients of the log data density (known as the score function), which does not require normalisation and can therefore take full advantage of the flexibility of deep neural networks. I will show how to (1) estimate the score function from data with flexible deep neural networks and efficient statistical methods, (2) generate new data using stochastic differential equations and Markov chain Monte Carlo, and even (3) evaluate probability values accurately, as in a traditional statistical model. The resulting method, called score-based generative modelling, achieves record-breaking performance in applications including image synthesis, text-to-speech generation, time series prediction, and point cloud generation, challenging the long-standing dominance of generative adversarial networks (GANs) on many of these tasks. Furthermore, unlike GANs, score-based generative models are suitable for Bayesian reasoning tasks such as solving ill-posed inverse problems, and I have demonstrated their superior performance on sparse-view computed tomography and accelerated magnetic resonance imaging.
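As a rough illustration of step (2) above, the following toy sketch draws samples using only a score function, via unadjusted Langevin dynamics (one simple Markov chain Monte Carlo scheme). It is not the speaker's implementation: the target is assumed to be a one-dimensional Gaussian whose score is known in closed form, whereas in score-based generative modelling the score would be estimated from data by a neural network. The names `score` and `langevin_sample`, and the step size and step count, are illustrative choices.

```python
import random
import statistics


def score(x, mean=2.0, std=0.5):
    """Score of a 1-D Gaussian: d/dx log p(x) = -(x - mean) / std**2.

    The normalising constant of p vanishes when log p is differentiated,
    which is why the score can be modelled without normalisation.
    """
    return -(x - mean) / std**2


def langevin_sample(score_fn, n_steps=1000, step_size=0.01, rng=None):
    """Run one unadjusted Langevin chain and return its final state.

    Update rule: x <- x + (step_size / 2) * score(x) + sqrt(step_size) * z,
    with z drawn from a standard normal at every step.
    """
    rng = rng or random.Random()
    x = rng.gauss(0.0, 1.0)  # arbitrary starting point (assumption)
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + 0.5 * step_size * score_fn(x) + (step_size ** 0.5) * noise
    return x


# Draw independent samples by running many short chains with a fixed seed.
rng = random.Random(0)
samples = [langevin_sample(score, rng=rng) for _ in range(500)]
sample_mean = statistics.fmean(samples)
sample_std = statistics.stdev(samples)
print(f"mean ~ {sample_mean:.2f}, std ~ {sample_std:.2f}")
```

With a small step size, the sample mean and standard deviation should land close to the target values of 2.0 and 0.5. The discretisation introduces a small bias that shrinks as the step size decreases.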
Yang Song is a final-year PhD student at Stanford University. His research interests are in deep generative models and their applications to inverse problem solving and AI safety.
His first-author papers have been recognised with an Outstanding Paper Award at ICLR 2021 and an oral presentation at NeurIPS 2019. He is a recipient of the Apple PhD Fellowship in AI/ML and the JP Morgan PhD Fellowship.