Perspectives on the science of deep learning

Three essays on building theory that matters.

The scientific method in two steps (Jamie Simon)
Towards an atlas of deep learning (Dhruva Karkada)
Science plays the long game (Florentin Guth)

Deep linear networks are a surprisingly useful toy model of weight-space dynamics

Deep linear networks are simple enough to study analytically but rich enough to exhibit key phenomena of neural network training.

On neural scaling and the quanta hypothesis

What is the origin of neural scaling laws? What do they tell us about the structure of data? What are the limits of interpretability?

A visual guide to progressive sharpening and the edge of stability