MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

By Andrew G. Howard et al.

Table of Contents

1. Introduction
2. Prior Work
3. MobileNet Architecture
3.1. Depthwise Separable Convolution
3.2. Network Structure and Training
3.3. Width Multiplier: Thinner Models
3.4. Resolution Multiplier: Reduced Representation
4. Experiments
4.1. Model Choices
4.2. Model Shrinking Hyperparameters

Summary

MobileNets are efficient models designed for mobile and embedded vision applications, built on depthwise separable convolutions. They expose a trade-off between latency and accuracy, allowing model size to be chosen to fit an application's constraints. By factoring a standard convolution into a depthwise convolution followed by a 1x1 pointwise convolution, the architecture sharply reduces both computation and parameter count while retaining strong accuracy relative to larger models. The paper also introduces two global hyperparameters, a width multiplier and a resolution multiplier, that shrink the network further and yield smooth trade-offs between accuracy, computation, and model size. Experiments demonstrate the effectiveness of MobileNets across a range of recognition tasks.
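The computational saving from depthwise separable convolutions can be seen directly from the mult-add cost formulas in the paper (Section 3.1). The sketch below compares a standard convolution's cost, D_K·D_K·M·N·D_F·D_F, against the depthwise-plus-pointwise cost, D_K·D_K·M·D_F·D_F + M·N·D_F·D_F; the example layer sizes are illustrative, not taken from a specific MobileNet layer.

```python
def standard_conv_cost(dk, m, n, df):
    # D_K x D_K kernel, M input channels, N output channels,
    # applied over a D_F x D_F feature map
    return dk * dk * m * n * df * df

def depthwise_separable_cost(dk, m, n, df):
    depthwise = dk * dk * m * df * df  # one D_K x D_K filter per input channel
    pointwise = m * n * df * df        # 1x1 conv combining the M channels into N
    return depthwise + pointwise

# Illustrative layer: 3x3 kernels, 512 input/output channels, 14x14 feature map
std = standard_conv_cost(3, 512, 512, 14)
sep = depthwise_separable_cost(3, 512, 512, 14)
print(f"reduction factor: {std / sep:.1f}x")
```

The reduction factor works out to 1 / (1/N + 1/D_K^2), so with 3x3 kernels MobileNets use roughly 8 to 9 times less computation than standard convolutions, at a small cost in accuracy.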