It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
By Timo Schick and Hinrich Schütze
Table of Contents
1 Introduction
2 Related Work
3 Pattern-Exploiting Training
4 Experiments
Summary
This document summarizes the paper's argument that small language models can also be effective few-shot learners for natural language understanding. It introduces Pattern-Exploiting Training (PET), a method that reformulates tasks as cloze questions and combines them with gradient-based fine-tuning. On the SuperGLUE benchmark with 32 training examples per task, PET (using ALBERT) achieves performance comparable to GPT-3 while requiring roughly 0.1% as many parameters. The document also discusses the implications of PET's 'green' properties for environmental sustainability in NLP.
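To make the cloze reformulation concrete, the sketch below scores a sentiment example with a pretrained masked language model. This is a minimal sketch, assuming the Hugging Face transformers library; the pattern ("It was ___."), the verbalizer words, and the bert-base-uncased checkpoint are illustrative stand-ins rather than the paper's exact choices (the authors use ALBERT), and full PET additionally fine-tunes the model on such reformulated examples.

```python
# Illustrative PET-style cloze scoring (not the authors' implementation).
# A raw input is rewritten as a cloze question, and the masked-LM logits
# of the verbalizer tokens at the mask position are compared.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "bert-base-uncased"  # stand-in; the paper uses ALBERT
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Pattern: map the raw input to a cloze question with one mask token.
text = "The movie was a complete waste of time."
pattern = f"{text} It was {tokenizer.mask_token}."

# Verbalizer: map each label to a single token in the vocabulary.
verbalizer = {"positive": "great", "negative": "terrible"}

inputs = tokenizer(pattern, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the mask position and score each label by the logit of its
# verbalizer token at that position.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
scores = {
    label: logits[0, mask_pos, tokenizer.convert_tokens_to_ids(word)].item()
    for label, word in verbalizer.items()
}
print(max(scores, key=scores.get))  # predicted label
```

Because the prediction is read off the pretrained masked-LM head rather than a newly initialized classifier, this formulation lets even a small model exploit its pretraining knowledge from only a handful of labeled examples.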