How Is ChatGPT’s Behavior Changing Over Time?

By L. Chen et al.
Published on Oct. 31, 2023

Table of Contents

1 Introduction
2 Overview: LLM Services, Tasks and Metrics
3 Monitoring Reveals Substantial LLM Drifts
3.1 Math I (Prime vs Composite): Chain-of-Thought Can Fail
...

Summary

This paper evaluates how the behavior of GPT-3.5 and GPT-4 changes over time on tasks including math problems, opinion surveys, and code generation. The study reports substantial drifts in both performance and behavior, underscoring the need for continuous monitoring of large language models such as ChatGPT.