Anastasis Kratsios

Abstract

We build universal approximators of continuous maps between arbitrary Polish metric spaces X and Y using universal approximators between Euclidean spaces as building blocks. Earlier results assume that the output space Y is a topological vector space. We overcome this limitation by "randomization": our approximators output discrete probability measures over Y. When X and Y are Polish without additional structure, we prove very general qualitative guarantees; when they have suitable combinatorial structure, we prove quantitative guarantees for Hölder-like maps, including maps between finite graphs, solution operators to rough differential equations between certain Carnot groups, and continuous non-linear operators between Banach spaces arising in inverse problems. In particular, we show that the required number of Dirac measures is determined by the combinatorial structure of X and Y. For barycentric Y, including Banach spaces, R-trees, Hadamard manifolds, or Wasserstein spaces on Polish metric spaces, our approximators reduce to Y-valued functions. When the Euclidean approximators are neural networks, our constructions generalize transformer networks, providing a new probabilistic viewpoint of geometric deep learning.

 

As an application, we show that the solution operator to a rough differential equation (RDE) can be approximated within our framework.
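To make the abstract's architecture concrete, here is a minimal sketch (not taken from the talk materials) of the measure-valued construction it describes, under illustrative assumptions: a Euclidean universal approximator (a small MLP) maps an input x to softmax weights over N fixed anchor points y_1, ..., y_N in Y, producing the discrete measure sum_i w_i(x) δ_{y_i}; when Y is barycentric (here, a linear space), the measure is collapsed to the weighted barycenter sum_i w_i(x) y_i. All names, shapes, and the choice of NumPy are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: X = R^d, Y = R^m viewed as a barycentric metric space;
# N anchor points y_1, ..., y_N in Y play the role of the Dirac atoms.
d, m, N, hidden = 4, 3, 8, 16
anchors = rng.normal(size=(N, m))           # y_1, ..., y_N in Y

# A one-hidden-layer MLP acting as the Euclidean "building block".
W1, b1 = rng.normal(size=(hidden, d)), np.zeros(hidden)
W2, b2 = rng.normal(size=(N, hidden)), np.zeros(N)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def measure_valued_model(x):
    """Map x in X to the weights of a discrete measure sum_i w_i(x) * delta_{y_i}."""
    h = np.tanh(W1 @ x + b1)
    return softmax(W2 @ h + b2)             # probability weights over the N atoms

def barycentric_readout(weights):
    """For barycentric Y (here a linear space), collapse the measure to its barycenter."""
    return weights @ anchors                # sum_i w_i(x) * y_i, a point of Y

x = rng.normal(size=d)
w = measure_valued_model(x)                 # discrete probability measure over Y
y_hat = barycentric_readout(w)              # Y-valued prediction
```

The softmax head over the anchor atoms is the attention-like mechanism that links this construction to transformer networks; the quantitative results mentioned above concern, in particular, how many Dirac atoms N are needed as a function of the combinatorial structure of X and Y.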

 

Based on the following articles:

  • An Approximation Theory for Metric Space-Valued Functions With A View Towards Deep Learning (2023) - C. Liu, M. Lassas, M. V. de Hoop, and I. Dokmanić. arXiv:2304.12231.
  • Designing universal causal deep learning models: The geometric (Hyper)transformer (2023) - B. Acciaio, A. Kratsios, and G. Pammer. Mathematical Finance. https://onlinelibrary.wiley.com/doi/full/10.1111/mafi.12389
  • Universal Approximation Under Constraints is Possible with Transformers (2022) - A. Kratsios, B. Zamanlooy, T. Liu, and I. Dokmanić. ICLR (Spotlight).

Our speaker

Anastasis Kratsios is an assistant professor at McMaster University in Canada and a Faculty Affiliate at the Vector Institute in Toronto. His research focuses on the theory of geometric deep learning, from approximation to statistical learning guarantees, with an emphasis on financial applications. His work has appeared in top machine learning venues, from JMLR to NeurIPS. He was previously a postdoctoral fellow at ETH Zürich and the University of Basel.