Decision Transformer: Reinforcement Learning via Sequence Modeling

By Lili Chen et al.
Published on June 24, 2021

Table of Contents

1 Introduction
2 Preliminaries
2.1 Offline reinforcement learning
2.2 Transformers
3 Method
4 Evaluations on Offline RL Benchmarks
5 Discussion
6 Related Work
7 Conclusion
A Experimental Details
A.1 Atari
A.2 OpenAI Gym
A.2.1 Decision Transformer
A.2.2 Behavior Cloning
A.3 Graph Shortest Path
B Atari Task Scores
1 Introduction

Summary

The document introduces Decision Transformer, a framework that casts reinforcement learning (RL) as a sequence modeling problem. It leverages the Transformer architecture to output actions by conditioning an autoregressive model on the desired return, past states, and past actions. Despite its simplicity, Decision Transformer matches or exceeds the performance of state-of-the-art model-free offline RL baselines on a range of tasks. The method is evaluated on offline RL benchmarks in Atari, OpenAI Gym, and Key-to-Door environments. By connecting RL with sequence modeling and Transformers, Decision Transformer showcases the potential of sequence modeling as a strong algorithmic paradigm for RL.
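To make the conditioning concrete, the sketch below shows how a trajectory can be turned into the (return-to-go, state, action) token triples the autoregressive model consumes. The return-to-go at timestep t is the sum of rewards from t onward. The function names and token encoding are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch, not the official Decision Transformer implementation.

def returns_to_go(rewards):
    """Suffix sums of the reward sequence: R_t = r_t + r_{t+1} + ... + r_T."""
    rtg = []
    total = 0.0
    for r in reversed(rewards):
        total += r
        rtg.append(total)
    return list(reversed(rtg))

def build_tokens(states, actions, rewards):
    """Interleave (return-to-go, state, action) triples in the order
    the autoregressive model sees them at each timestep."""
    rtg = returns_to_go(rewards)
    tokens = []
    for R, s, a in zip(rtg, states, actions):
        tokens.extend([("rtg", R), ("state", s), ("action", a)])
    return tokens
```

At test time, the same interface supports goal conditioning: the first return-to-go token is set to the desired return, and it is decremented by each observed reward as the rollout proceeds.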