LoRA: Low-Rank Adaptation of Large Language Models

By Edward Hu et al.
Published on Oct. 16, 2021

Table of Contents

1. Introduction
2. Problem Statement
3. Existing Solutions
4. Our Method
5. Empirical Experiments
6. Baselines

Summary

An important paradigm of natural language processing consists of large-scale pre-training on general-domain data followed by adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes less feasible. We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks. Compared to GPT-3 175B fine-tuned with Adam, LoRA can reduce the number of trainable parameters by 10,000 times and the GPU memory requirement by 3 times. LoRA performs on par with or better than fine-tuning in model quality on RoBERTa, DeBERTa, GPT-2, and GPT-3, despite having fewer trainable parameters, a higher training throughput, and no additional inference latency.
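
To make the core idea concrete, here is a rough illustrative sketch in PyTorch (not the authors' reference implementation): a stand-in pretrained weight W0 is kept frozen, and only two small matrices A and B are trained, so the effective weight becomes W0 + BA. The class name LoRALinear, the rank and alpha values, and the initialization choices below are assumptions made for this example.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear map plus a trainable low-rank update: h = W0 x + (B A) x."""

    def __init__(self, in_features: int, out_features: int, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        # W0: stands in for the pretrained weight; frozen during adaptation.
        # In practice it would be copied from the pretrained model, not randomly initialized.
        self.weight = nn.Parameter(torch.randn(out_features, in_features), requires_grad=False)
        # Low-rank factors: only these two small matrices receive gradients.
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)  # A: rank x d_in
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))        # B: d_out x rank, zero init
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus scaled low-rank correction.
        return x @ self.weight.T + (x @ self.lora_A.T) @ self.lora_B.T * self.scaling


# Toy usage: a 768-dimensional layer with rank 4 trains 2 * 768 * 4 = 6,144
# parameters instead of the 768 * 768 = 589,824 of a full weight matrix.
layer = LoRALinear(768, 768, rank=4)
y = layer(torch.randn(2, 768))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(y.shape, trainable)  # torch.Size([2, 768]) 6144

Because B is initialized to zero, the adapted layer starts out identical to the frozen one, and after training the product BA can be merged into the frozen weight, which is why the method adds no inference latency.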