What Learning Algorithm Is In-Context Learning? Investigations with Linear Models
By Ekin Akyurek et al.
Published on May 17, 2023
Table of Contents
1. Introduction
2. Preliminaries
3. What Learning Algorithms Can a Transformer Implement?
4. What Computation Does an In-Context Learner Perform?
5. Behavioral Metrics
6. Experimental Setup
Summary
Neural sequence models, especially transformers, exhibit a remarkable capacity for in-context learning: they can construct new predictors from sequences of labeled examples presented in the input, without further parameter updates. This paper investigates the hypothesis that transformer-based in-context learners implement standard learning algorithms implicitly, by encoding smaller models in their activations and updating these implicit models as new examples appear in the context. Using linear regression as a prototypical problem, the study shows by construction that transformers can implement learning algorithms for linear models based on gradient descent and closed-form ridge regression. The results suggest that in-context learning is understandable in algorithmic terms and that, at least in the linear case, learners may rediscover standard estimation algorithms.
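For readers unfamiliar with the two reference algorithms named in the summary, the following is a minimal sketch of gradient descent and closed-form ridge regression on a small linear regression problem. It is not the transformer construction from the paper. It is the textbook form of both estimators, written in plain Python. The function names, the toy data, and the values of `lam`, `lr`, and `steps` are all assumptions chosen for illustration.

```python
import random


def ridge_closed_form(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y with Gauss-Jordan elimination."""
    d = len(X[0])
    # Augmented matrix [X^T X + lam * I | X^T y].
    A = [
        [sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
         for j in range(d)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(d)
    ]
    for col in range(d):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(d):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][d] / A[i][i] for i in range(d)]


def ridge_gradient_descent(X, y, lam, lr, steps):
    """Minimize sum_i (w . x_i - y_i)^2 + lam * ||w||^2 by gradient descent."""
    d = len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        # Gradient of the regularizer, then add the data term per example.
        grad = [2.0 * lam * wj for wj in w]
        for row, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, row)) - yi
            for j in range(d):
                grad[j] += 2.0 * err * row[j]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


# Hypothetical toy data: 8 in-context examples from a noisy 2-D linear function.
random.seed(0)
w_true = [1.5, -0.5]
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(8)]
y = [sum(wj * xj for wj, xj in zip(w_true, row)) + random.gauss(0, 0.1)
     for row in X]

w_ridge = ridge_closed_form(X, y, lam=0.1)
w_gd = ridge_gradient_descent(X, y, lam=0.1, lr=0.01, steps=5000)
print("closed-form ridge:", w_ridge)
print("gradient descent: ", w_gd)
```

With a small enough step size and enough steps, the iterates of gradient descent on the ridge objective approach the closed-form solution, so the two printed weight vectors should nearly agree. The summary's claim is that a transformer's forward pass can implement algorithms of this kind.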