Sparse Online Learning via Truncated Gradient

By John Langford et al.

Table of Contents

Abstract
Introduction
What Others Do
What We Do
Online Learning with GD
Sparse Online Learning
Simple Coefficient Rounding
A Sub-gradient Algorithm for L1 Regularization
Truncated Gradient

Summary

We propose a general method called truncated gradient to induce sparsity in the weights of online learning algorithms with convex loss functions. The method has three essential properties:

1. The degree of sparsity is continuous: a single parameter controls the rate of sparsification, from no sparsification to total sparsification.
2. The approach is theoretically motivated, and an instance of it can be regarded as an online counterpart of the popular L1-regularization method in the batch setting. We prove that small rates of sparsification result in only small additional regret with respect to typical online learning guarantees.
3. The approach works well empirically. We apply it to several datasets and find that, for datasets with large numbers of features, substantial sparsity is discoverable.
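The update itself is easy to state: perform a standard online gradient descent step and, every K steps, shrink each weight toward zero by a small "gravity" amount, truncating at zero, with the shrinkage applied only to weights whose magnitude is below a threshold theta. Below is a minimal NumPy sketch of this idea, assuming squared loss; the parameter names eta (learning rate), g (gravity), K (truncation period), and theta (threshold) follow the paper's notation, but the code is an illustrative sketch rather than the authors' implementation.

```python
import numpy as np

def truncate(w, alpha, theta):
    # Truncation operator: shrink each coordinate of w toward zero by
    # alpha, clipping at zero, but only where |w_i| <= theta.
    out = w.copy()
    small = np.abs(w) <= theta
    out[small] = np.sign(w[small]) * np.maximum(np.abs(w[small]) - alpha, 0.0)
    return out

def truncated_gradient(examples, dim, eta=0.01, g=0.05, K=10, theta=np.inf):
    # Online gradient descent on squared loss, with the truncation step
    # applied every K updates using the accumulated gravity K * eta * g.
    w = np.zeros(dim)
    for t, (x, y) in enumerate(examples, start=1):
        grad = (w @ x - y) * x      # gradient of 0.5 * (w.x - y)^2
        w = w - eta * grad          # standard online GD step
        if t % K == 0:              # sparsify every K steps
            w = truncate(w, K * eta * g, theta)
    return w
```

Setting g = 0 recovers plain online gradient descent, which is what makes the degree of sparsification continuous in g; with K = 1 and theta = infinity, the truncation step reduces to soft-thresholding, the special case that corresponds to the online counterpart of L1 regularization mentioned above.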