Economics and Computation Series

The Complexity of Gradient Descent

2nd June 2021, 13:00
Rahul Savani
University of Liverpool

Abstract

This talk is about the computational complexity of Gradient Descent, one of the oldest and most widely used algorithmic approaches to optimisation, used for example to train neural networks. The approach dates all the way back to an 1847 paper of Cauchy.

When Gradient Descent is constrained to a bounded domain, there are not one but two reasons why it must terminate.

Reason 1: since we are always going downhill, the altitude must eventually "bottom out". This puts the search for a solution in the complexity class PLS (Polynomial Local Search).
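Reason 1 can be illustrated with a minimal sketch (not from the talk, and with an arbitrarily chosen objective and step size): along the iterates of gradient descent on a smooth function with a small enough step, the sequence of function values never increases.

```python
# Illustration of Reason 1: the "altitude" f(x) along gradient
# descent iterates is non-increasing, so it must bottom out.
# f(x) = x^2 and eta = 0.1 are illustrative choices.

def grad_descent(grad, x0, eta=0.1, steps=50):
    """Return the iterates of plain gradient descent."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] - eta * grad(xs[-1]))
    return xs

f = lambda x: x * x          # objective
grad_f = lambda x: 2 * x     # its gradient

xs = grad_descent(grad_f, x0=1.0)
values = [f(x) for x in xs]

# We only ever go downhill (or stay put):
assert all(a >= b for a, b in zip(values, values[1:]))
```

Because the altitude cannot decrease forever on a bounded domain, the process must reach a point where no further local improvement is possible, which is exactly the kind of solution PLS captures.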

Reason 2: Gradient Descent maps any point to a nearby point in the direction of the negative gradient. Brouwer's Fixed Point Theorem guarantees that such a mapping has a point mapped to itself. This puts the search for a solution in the complexity class PPAD.
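Reason 2 views one step of gradient descent as a continuous self-map of the bounded domain, to which Brouwer's theorem applies. A minimal sketch (not from the talk; the domain [0, 1], objective, and step size are illustrative choices) of this fixed-point view:

```python
# Illustration of Reason 2: projected gradient descent as a
# continuous self-map F of the bounded domain [0, 1]. Brouwer
# guarantees F has a fixed point; here iterating F finds one.

def clamp(x, lo=0.0, hi=1.0):
    """Project x back onto the bounded domain [lo, hi]."""
    return max(lo, min(hi, x))

def gd_map(x, grad, eta=0.1):
    """One gradient descent step, viewed as a self-map of [0, 1]."""
    return clamp(x - eta * grad(x))

grad_f = lambda x: 2 * (x - 0.25)   # gradient of f(x) = (x - 0.25)^2

x = 0.9
for _ in range(200):
    x = gd_map(x, grad_f)

# x is (numerically) mapped to itself, i.e. it is a fixed point:
assert abs(x - gd_map(x, grad_f)) < 1e-9
```

At such a fixed point the gradient step moves the point nowhere, so it is a solution of the Gradient Descent problem; finding a Brouwer fixed point is the canonical PPAD-complete task.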

PPAD and PLS correspond to proof principles that guarantee a solution exists, but only in a computationally inefficient way. Both classes owe their success to the fact that they have been shown to exactly characterise the complexity of important problems, such as finding a Nash equilibrium (PPAD) or finding a locally maximal cut of a graph (PLS). Our main result shows that the Gradient Descent solution-existence principle tastefully combines the PLS principle with the PPAD principle: we show how to efficiently reduce any problem that lies in both PPAD and PLS to a Gradient Descent problem.

Joint work with: John Fearnley (Liverpool), Paul Goldberg (Oxford), and Alexandros Hollender (Oxford). Paper to appear at STOC'21.


Additional Materials