Summary
Large pre-trained language models store factual knowledge in their parameters and achieve state-of-the-art results when fine-tuned on downstream NLP tasks, but their ability to access and precisely manipulate that knowledge remains limited. The authors introduce retrieval-augmented generation (RAG) models, which combine pre-trained parametric and non-parametric memory for language generation: a pre-trained seq2seq model serves as the parametric memory, while the non-parametric memory is a dense vector index of Wikipedia accessed with a pre-trained neural retriever. They compare two formulations, RAG-Sequence, which conditions the entire generated sequence on the same retrieved passages, and RAG-Token, which can draw on different passages for each generated token, and evaluate both on a range of knowledge-intensive NLP tasks. The models set state-of-the-art results on three open-domain QA tasks and generate more specific and factual language than a parametric-only seq2seq baseline, demonstrating the benefit of combining parametric and non-parametric memory for knowledge-intensive tasks.
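To make the retrieve-then-generate flow concrete, below is a minimal sketch using the RAG checkpoints published in the Hugging Face transformers library. It assumes transformers, datasets, faiss-cpu, and torch are installed, uses the library's small dummy index in place of the full Wikipedia index so it runs without a multi-GB download, and the exact option names may vary across library versions; the sample question is illustrative only.

```python
# Minimal retrieve-then-generate sketch with the Hugging Face RAG implementation.
# Assumes: pip install transformers datasets faiss-cpu torch
from transformers import RagTokenizer, RagRetriever, RagSequenceForGeneration

# The tokenizer bundles the question-encoder and generator tokenizers.
tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")

# The retriever pairs a dense question encoder with a FAISS index of passages.
# use_dummy_dataset=True swaps the full Wikipedia index for a tiny sample index.
retriever = RagRetriever.from_pretrained(
    "facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True
)

# RAG-Sequence: marginalizes over the retrieved passages for the whole output sequence.
model = RagSequenceForGeneration.from_pretrained(
    "facebook/rag-sequence-nq", retriever=retriever
)

question = "who wrote the novel nineteen eighty-four"  # hypothetical example query
inputs = tokenizer(question, return_tensors="pt")

# generate() internally encodes the question, retrieves the top-k passages,
# and lets the seq2seq generator condition on the question plus each passage.
generated = model.generate(input_ids=inputs["input_ids"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```

Because retrieval happens inside the forward pass, the non-parametric memory can be inspected or swapped (for example, for a different document index) without retraining the generator, which is one of the practical benefits the paper highlights.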