We discuss linear convolutional neural networks (LCNs) and their critical points. We observe that the function space (that is, the set of functions represented by LCNs) can be identified with polynomials that admit certain factorizations, and we use this perspective to describe the impact of the network's architecture on the geometry of the function space.
For instance, for LCNs with one-dimensional convolutions having stride one and arbitrary filter sizes, we provide a full description of the boundary of the function space. We further study the optimization of an objective function over such LCNs: we characterize the relations between critical points in function space and in parameter space, and show that spurious critical points exist. We compute an upper bound on the number of critical points in function space using Euclidean distance degrees, and describe dynamical invariants for gradient descent.
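The identification of LCNs with polynomial factorizations can be illustrated concretely for the stride-one, one-dimensional case: each filter corresponds to a polynomial whose coefficients are the filter entries, and composing layers corresponds to multiplying these polynomials. The following is a minimal sketch (not from the talk; the filters are hypothetical) using NumPy, where `np.convolve` computes exactly this polynomial product.

```python
import numpy as np

# Hypothetical filters of an LCN with two 1D, stride-one convolutional layers.
w1 = np.array([1.0, 2.0])        # layer 1 filter  <->  polynomial 1 + 2x
w2 = np.array([3.0, 0.0, -1.0])  # layer 2 filter  <->  polynomial 3 - x^2

# The end-to-end map of the network is a single convolution whose filter is
# the coefficient vector of the product polynomial (1 + 2x)(3 - x^2).
end_to_end = np.convolve(w1, w2)
print(end_to_end)  # [ 3.  6. -1. -2.]  i.e.  3 + 6x - x^2 - 2x^3

# Applying the two layers in sequence to an input signal agrees with applying
# the single end-to-end filter, since convolution is associative.
x = np.random.default_rng(0).standard_normal(8)
y_two_layers = np.convolve(np.convolve(x, w1), w2)
y_one_layer = np.convolve(x, end_to_end)
assert np.allclose(y_two_layers, y_one_layer)
```

The functions an architecture can represent are thus the polynomials admitting a factorization into factors of the prescribed degrees (filter sizes), which is what makes the algebro-geometric analysis of the function space and its boundary possible.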
This talk is based on joint work with Thomas Merkh, Guido Montúfar, and Matthew Trager.
Kathlén Kohn has been an assistant professor in Mathematics of Data and AI at KTH Royal Institute of Technology since September 2019. She obtained her PhD from the Technical University of Berlin in 2018. Afterwards, she was a postdoctoral researcher at the Institute for Computational and Experimental Research in Mathematics (ICERM) at Brown University and at the University of Oslo. Kathlén’s goal is to understand the intrinsic geometric structures behind machine learning and AI systems in general, and to provide a rigorous and well-understood theory explaining them. Her areas of expertise are algebraic, differential, and tropical geometry, as well as invariant theory.