ImageNet Classification with Deep Convolutional Neural Networks

By Alex Krizhevsky et al.

Table of Contents

1. Abstract
2. Introduction
3. Current Approaches to Object Recognition
4. The Dataset
5. The Architecture
6. ReLU Nonlinearity
7. Training on Multiple GPUs
8. Local Response Normalization
9. Overlapping Pooling
10. Overall Architecture
11. Reducing Overfitting
12. Data Augmentation

Summary

This paper presents the training of a large, deep convolutional neural network to classify the high-resolution images of the ImageNet LSVRC-2010 contest. The network achieved top-1 and top-5 error rates considerably better than the previous state of the art. Its architecture consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers. To make training faster, the authors used non-saturating (ReLU) neurons and an efficient GPU implementation of the convolution operation, and to reduce overfitting in the fully-connected layers they employed a regularization method called 'dropout'. A variant of the model entered in the ILSVRC-2012 competition achieved the winning top-5 test error rate.

The underlying dataset, ImageNet, contains over 15 million labeled high-resolution images, and the paper highlights the importance of such large training sets for object recognition. The architecture's key features include the ReLU nonlinearity, training spread across multiple GPUs, local response normalization, and overlapping pooling. Data augmentation techniques (image translations, horizontal reflections, and alterations of RGB channel intensities) were employed alongside dropout to reduce overfitting.
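The core per-unit techniques named in the summary can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the authors' GPU implementation; the function names are my own, and the dropout variant follows the paper's description (drop units with probability 0.5 during training, multiply outputs by 0.5 at test time):

```python
import random

def relu(x):
    # Non-saturating ReLU nonlinearity: f(x) = max(0, x).
    # The paper reports that ReLUs train several times faster than tanh units.
    return max(0.0, x)

def dropout(activations, p=0.5, training=True):
    # Training: zero each unit independently with probability p.
    # Test: keep all units but scale outputs by (1 - p), so the
    # expected activation matches (for p = 0.5, multiply by 0.5).
    if training:
        return [0.0 if random.random() < p else a for a in activations]
    return [a * (1.0 - p) for a in activations]

def horizontal_flip(image):
    # Data augmentation: mirror each row of a 2-D image
    # (represented here as a list of rows).
    return [row[::-1] for row in image]
```

At test time the dropout scaling makes the network behave like an average over the many "thinned" networks sampled during training, which is how the paper motivates the method.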