Presenter(s)
ISIT 2024 Plenary Lecture
Date
Abstract
Neural networks are increasingly prevalent and transformative across domains. Understanding how these networks operate in settings where mistakes can be costly (such as transportation, finance, healthcare, and law) is essential to uncovering potential failure modes. Many of these networks operate in the “overparameterized regime,” in which there are far more parameters than training samples, allowing the training data to be fit perfectly. What does this imply about the predictions the network will make on new samples? That is, if we train a neural network to interpolate training samples, what can we say about the interpolant, and how does this depend on the network architecture? In this talk, I will describe insights into the role of network depth using the notion of representation costs, i.e., how much it “costs” for a neural network to represent various functions. Understanding representation costs helps reveal how network depth shapes the types of functions learned, relating them to Barron and mixed-variation function spaces, such as single- and multi-index models.
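As a rough sketch (the notation below is illustrative and not taken from the abstract), the representation cost of a function $f$ for networks of depth $L$ is often formalized as the smallest parameter norm needed to represent $f$ exactly:

\[
R_L(f) \;=\; \min_{\theta \,:\, f_\theta = f} \;\|\theta\|_2^2,
\]

where $f_\theta$ denotes the function computed by a depth-$L$ network with parameters $\theta$. Under this sketch, training an overparameterized network to interpolate the data with norm regularization amounts to selecting, among all functions that fit the training samples, one with small representation cost; how $R_L$ behaves as the depth $L$ varies is then one way to connect architecture to the kinds of interpolants (for example, Barron-type or mixed-variation functions) that are learned.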