RoFormer: Enhanced Transformer with Rotary Position Embedding

By J. Su et al.
Published on Nov. 9, 2023

Table of Contents

1. Abstract
2. Introduction
3. Background and Related Work
4. RoFormer
5. Proposed Approach
6. Properties of RoPE
7. Theoretical Explanation

Summary

The paper introduces RoFormer, a transformer enhanced with rotary position embedding (RoPE) to better leverage positional information in transformer-based language models. It reviews the importance of position encoding in natural language understanding, compares existing methods for integrating positional information, and proposes RoPE as a novel alternative. RoPE encodes the absolute position with a rotation matrix while naturally incorporating relative position dependency into self-attention; it accommodates sequences of varying length and can equip linear self-attention with relative position encoding. Experimental results demonstrate RoFormer's superior performance on long-text classification tasks. The paper also provides a theoretical analysis and details how the RoPE formulation injects relative position information through rotation matrices.
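To make the rotation-matrix idea concrete, the following is a minimal NumPy sketch of rotary position embedding, not the authors' code: the function name rotary_embedding, the pairing of adjacent dimensions, and the base value 10000 are illustrative assumptions based on the paper's description. Each pair of coordinates is rotated by an angle proportional to the token position, so the dot product between a rotated query and a rotated key depends only on their relative offset.

```python
import numpy as np

def rotary_embedding(x, base=10000.0):
    """Apply rotary position embedding to query/key vectors.

    x: array of shape (seq_len, dim), with dim even.
    Each coordinate pair (2i, 2i+1) of the vector at position m is
    rotated by the angle m * theta_i, where theta_i = base^(-2i/dim).
    """
    seq_len, dim = x.shape
    assert dim % 2 == 0, "embedding dimension must be even"

    # Rotation frequencies theta_i for each coordinate pair.
    theta = base ** (-np.arange(0, dim, 2) / dim)        # (dim/2,)
    # Angle m * theta_i for every position m and pair i.
    angles = np.outer(np.arange(seq_len), theta)          # (seq_len, dim/2)

    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]

    # Standard 2-D rotation applied to each (even, odd) coordinate pair.
    out = np.empty_like(x)
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out

# Usage: attention scores between rotated queries and keys depend only
# on the relative distance between token positions.
q = np.random.randn(8, 16)
k = np.random.randn(8, 16)
scores = rotary_embedding(q) @ rotary_embedding(k).T
```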