IEEE Transactions on Pattern Analysis and Machine Intelligence
By J. Wang et al.
Published on March 10, 2020
Table of Contents
1. Abstract
2. Introduction
3. Deep High-Resolution Representation Learning ...
4. Related Work
5. High-Resolution Networks
6. Representation Head
7. Instantiation
8. Analysis
9. Human Pose Estimation
Summary
The document discusses the importance of high-resolution representations in vision problems and introduces a network named HRNet. HRNet maintains high-resolution representations through the whole network by connecting high-to-low resolution convolution streams in parallel and repeatedly exchanging information across resolutions. The document presents three variants of the representation head: HRNetV1 outputs only the representation from the high-resolution stream, HRNetV2 combines the representations from all resolutions, and HRNetV2p constructs a feature pyramid from HRNetV2's output. The network consists of four stages, each with parallel convolution streams at different resolutions, and a multi-resolution fusion module exchanges information across the parallel representations. The document also covers applications of HRNet to human pose estimation, semantic segmentation, and object detection.
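The multi-resolution fusion described above can be illustrated with a small sketch. It is not the authors' implementation: it assumes single-channel feature maps stored as nested lists, and it swaps the network's learned resampling layers for plain average pooling (downsampling) and nearest-neighbour repetition (upsampling). The names `downsample`, `upsample`, `resize`, and `fuse` are hypothetical helpers introduced only for this example.

```python
from typing import List

Map = List[List[float]]  # one single-channel feature map, rows x cols


def downsample(x: Map, factor: int) -> Map:
    """Average-pool by `factor` (assumed stand-in for learned strided convolutions)."""
    h, w = len(x) // factor, len(x[0]) // factor
    return [
        [
            sum(x[i * factor + di][j * factor + dj]
                for di in range(factor) for dj in range(factor)) / factor ** 2
            for j in range(w)
        ]
        for i in range(h)
    ]


def upsample(x: Map, factor: int) -> Map:
    """Nearest-neighbour upsampling by `factor` (assumed stand-in for learned upsampling)."""
    return [
        [x[i // factor][j // factor] for j in range(len(x[0]) * factor)]
        for i in range(len(x) * factor)
    ]


def resize(x: Map, src: int, dst: int) -> Map:
    """Bring stream `src` to the resolution of stream `dst`.

    Assumption: stream k is 2**k times coarser than stream 0.
    """
    if src == dst:
        return x
    factor = 2 ** abs(src - dst)
    return downsample(x, factor) if src < dst else upsample(x, factor)


def fuse(streams: List[Map]) -> List[Map]:
    """Each output stream is the element-wise sum of all input streams,
    resized to that output stream's resolution."""
    outputs = []
    for dst in range(len(streams)):
        resized = [resize(s, src, dst) for src, s in enumerate(streams)]
        h, w = len(resized[0]), len(resized[0][0])
        outputs.append(
            [[sum(r[i][j] for r in resized) for j in range(w)] for i in range(h)]
        )
    return outputs


# Two parallel streams: a 4x4 high-resolution map and a 2x2 low-resolution map.
high = [[1.0] * 4 for _ in range(4)]
low = [[2.0] * 2 for _ in range(2)]
fused = fuse([high, low])
```

After fusion, every stream keeps its own resolution but carries information from all the others: `fused[0]` is still 4x4 and `fused[1]` is still 2x2. In this picture, an HRNetV1-style head would read only `fused[0]`, while an HRNetV2-style head would upsample the coarser outputs and combine them with the high-resolution one.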