Towards AGI in Computer Vision: Lessons Learned from GPT and Large Language Models

By L. Xie et al.
Published on June 6, 2023

Table of Contents

Abstract
Introduction
Artificial General Intelligence
AGI in Computer Vision
Imaginary Pipeline towards AGI in CV
Conclusion

Summary

The AI community has long pursued artificial general intelligence (AGI): algorithms that apply to any kind of real-world problem. Recently, chat systems powered by large language models (LLMs) have emerged and rapidly become a promising direction for achieving AGI in natural language processing (NLP), but the path towards AGI in computer vision (CV) remains unclear. This paper draws on lessons learned from GPT and LLMs to shed light on the challenges of achieving AGI in CV and on potential solutions. The authors propose an imaginary pipeline towards AGI in CV with three stages: establishing interactive environments, training agents in those environments, and fine-tuning the agents for various tasks. The paper concludes that substantial research and engineering effort is needed to advance this vision.