Sociotechnical Safety Evaluation of Generative AI Systems

By Laura Weidinger et al.
Published on Oct. 31, 2023

Table of Contents

1 Introduction
2 Framework for sociotechnical AI safety evaluation
2.1 Layer 1: Capability
2.2 Layer 2: Human interaction
2.3 Layer 3: Systemic impact
2.4 Summary
3 Current state of sociotechnical safety evaluation
3.1 Taxonomy of harm
3.1.1 Multimodality raises new evaluation challenges
3.2 Mapping the landscape
3.2.1 Limitations
3.3 Evaluation gaps
4 Closing evaluation gaps
4.1 Operationalising risks
4.1.1 Ensuring validity
4.2 Selecting evaluation methods
4.2.1 Capability evaluation methods
4.2.2 Human interaction evaluation methods
4.2.3 Systemic impact evaluation methods
4.3 Practical steps to closing the multimodal evaluation gap
4.3.1 Repurposing evaluations for new modalities
4.3.2 Transcribing non-text output for text-based evaluation
4.3.3 Model-driven evaluation may fill gaps
5 Discussion
5.1 Benefits of a sociotechnical approach
5.2 Roles and responsibilities
5.3 Limits of evaluation
5.3.1 Evaluation is incomplete
5.3.2 Evaluation is never value-neutral
5.4 Steps forward
5.4.1 Evaluations must be developed where they do not yet exist
5.4.2 Evaluations must be done as a matter of course
5.4.3 Evaluation must have real consequences
5.4.4 Evaluations must be done systematically, in standardised ways
5.4.5 Toward a shared framework for AI safety
6 Conclusion
A Appendix
A.1 Taxonomy of harm
A.2 Evaluation methods per layer
A.2.1 Capabilities layer
A.2.2 Human interaction layer
A.2.3 Systemic impact layer
A.3 Case study: Misinformation
Bibliography
Reader's guide

Summary

Generative AI systems produce a range of risks. To ensure the safety of generative AI systems, these risks must be evaluated. In this paper, we make two main contributions toward establishing such evaluations. First, we propose a three-layered framework that takes a structured, sociotechnical approach to evaluating these risks. This framework encompasses capability evaluations, which are the main current approach to safety evaluation. It then reaches further by building on system safety principles, particularly the insight that context determines whether a given capability causes harm. To account for relevant context, our framework adds human interaction and systemic impacts as additional layers of evaluation. Second, we survey the current state of safety evaluation of generative AI systems and create a repository of existing evaluations. Three salient evaluation gaps emerge from this analysis. We propose ways to close these gaps, outlining practical steps as well as roles and responsibilities for different actors. Sociotechnical safety evaluation is a tractable approach to the robust and comprehensive safety evaluation of generative AI systems.
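
To make the layered structure concrete, the sketch below (our illustration, not from the paper) shows how entries in an evaluation repository might be tagged by layer, harm area, and modality, so that coverage gaps of the kind the paper identifies become queryable. All identifiers here are hypothetical.

# A minimal sketch (not from the paper) of an evaluation-repository entry
# organised by the three-layered framework. All names are hypothetical.
from dataclasses import dataclass
from enum import Enum

class EvaluationLayer(Enum):
    CAPABILITY = "capability"                # model outputs in isolation
    HUMAN_INTERACTION = "human_interaction"  # harms arising in use
    SYSTEMIC_IMPACT = "systemic_impact"      # broader societal effects

@dataclass
class Evaluation:
    name: str
    layer: EvaluationLayer
    harm_area: str   # e.g. "misinformation", per the paper's taxonomy of harm
    modality: str    # e.g. "text", "image", "audio"

repo = [
    Evaluation("factual QA benchmark", EvaluationLayer.CAPABILITY,
               "misinformation", "text"),
]

# Querying the repository surfaces gaps, e.g. no systemic-impact entries yet.
systemic = [e for e in repo if e.layer is EvaluationLayer.SYSTEMIC_IMPACT]
print(f"Systemic-impact evaluations: {len(systemic)}")  # prints 0

Filtering such a repository by layer or modality would surface, for example, the paper's observation that evaluations at the human interaction and systemic impact layers, and for non-text modalities, remain comparatively scarce.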