Specializing Smaller Language Models towards Multi-Step Reasoning

By Yao Fu et al.
Published on Jan. 30, 2023

Table of Contents

1. Introduction
2. Background
3. Specializing Multi-Step Reasoning
4. Experiments

Summary

The paper discusses the surprising ability of Large Language Models (LLMs) to perform complex reasoning tasks given only few-shot prompts. It proposes model specialization to distill this ability from large models into smaller ones, focusing on improving smaller models' performance on multi-step math reasoning. The experiments show how concentrating a smaller model's capacity on a target ability can lift its scaling curve. The paper also addresses practical challenges of specialization, such as aligning tokenizers between teacher and student models and the tradeoff between generic and specialized abilities.
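To make the specialization idea concrete, below is a minimal sketch of one way to distill multi-step reasoning into a smaller model: fine-tune a small seq2seq model on chain-of-thought solutions written by a larger teacher model. The model name (`google/flan-t5-base`), the in-memory toy example, and the hyperparameters are illustrative assumptions, not the paper's released pipeline, which also involves steps such as tokenizer alignment between teacher and student.

```python
# Sketch: sequence-level distillation of chain-of-thought (CoT) reasoning
# into a small seq2seq model. Names, data, and hyperparameters are assumptions.
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

# Each record pairs a math word problem with a step-by-step solution
# generated by a larger teacher model (hypothetical toy data here).
examples = [
    {"question": "Tom has 3 boxes of 12 pencils. He gives away 10. How many are left?",
     "cot_answer": "3 * 12 = 36 pencils. 36 - 10 = 26. The answer is 26."},
]

def collate(batch):
    # Encode questions as inputs and teacher CoT solutions as targets.
    inputs = tokenizer([b["question"] for b in batch],
                       padding=True, truncation=True, return_tensors="pt")
    labels = tokenizer([b["cot_answer"] for b in batch],
                       padding=True, truncation=True, return_tensors="pt").input_ids
    labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in the loss
    inputs["labels"] = labels
    return inputs

loader = DataLoader(examples, batch_size=1, shuffle=True, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

model.train()
for epoch in range(1):
    for batch in loader:
        # Standard cross-entropy on the full CoT target: the student learns
        # to reproduce the teacher's intermediate reasoning steps, not just
        # the final answer.
        loss = model(**batch).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```

Training on the entire reasoning chain, rather than only the final answer, is what concentrates the smaller model's capacity on multi-step reasoning; the cost, as the paper notes, is some loss of generic ability on other tasks.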