Transformer Circuit Videos

As an experiment, we recorded a couple of videos discussing our early-stage thinking on trying to reverse engineer neural networks. We made them to share our very informal thoughts with colleagues at other institutions.

Please treat these videos like talks one might give on early results at a research group meeting. Our thinking is very rough and errors are quite possible, so please take all of these videos with a big grain of salt. We expect they're primarily of interest to people actively thinking about how to reverse engineer neural networks.

Our thoughts have evolved a lot since we started recording these videos. We're grateful to colleagues who gave feedback on our early thoughts! The first couple of these videos have since been superseded by our more polished paper, A Mathematical Framework for Transformer Circuits.

See Playlist on YouTube →