Table of Contents
Preface
1 Introduction
1.1 A Taste of Machine Learning
1.1.1 Applications
1.1.2 Data
1.1.3 Problems
1.2 Probability Theory
1.2.1 Random Variables
1.2.2 Distributions
1.2.3 Mean and Variance
1.2.4 Marginalization, Independence, Conditioning, and Bayes Rule
1.3 Basic Algorithms
1.3.1 Naive Bayes
1.3.2 Nearest Neighbor Estimators
1.3.3 A Simple Classifier
1.3.4 Perceptron
1.3.5 K-Means
2 Density Estimation
2.1 Limit Theorems
2.1.1 Fundamental Laws
2.1.2 The Characteristic Function
2.1.3 Tail Bounds
2.1.4 An Example
2.2 Parzen Windows
2.2.1 Discrete Density Estimation
2.2.2 Smoothing Kernel
2.2.3 Parameter Estimation
2.2.4 Silverman's Rule
2.2.5 Watson-Nadaraya Estimator
2.3 Exponential Families
2.3.1 Basics
2.3.2 Examples
2.4 Estimation
2.4.1 Maximum Likelihood Estimation
2.4.2 Bias, Variance and Consistency
2.4.3 A Bayesian Approach
2.4.4 An Example
2.5 Sampling
2.5.1 Inverse Transformation
2.5.2 Rejection Sampler
3 Optimization
3.1 Preliminaries
3.1.1 Convex Sets
3.1.2 Convex Functions
3.1.3 Subgradients
3.1.4 Strongly Convex Functions
3.1.5 Convex Functions with Lipschitz Continuous Gradient
3.1.6 Fenchel Duality
3.1.7 Bregman Divergence
3.2 Unconstrained Smooth Convex Minimization
3.2.1 Minimizing a One-Dimensional Convex Function
3.2.2 Coordinate Descent
3.2.3 Gradient Descent
3.2.4 Mirror Descent
3.2.5 Conjugate Gradient
3.2.6 Higher Order Methods
3.2.7 Bundle Methods
3.3 Constrained Optimization
3.3.1 Projection Based Methods
3.3.2 Lagrange Duality
3.3.3 Linear and Quadratic Programs
3.4 Stochastic Optimization
3.4.1 Stochastic Gradient Descent
3.5 Nonconvex Optimization
3.5.1 Concave-Convex Procedure
3.6 Some Practical Advice
4 Online Learning and Boosting
4.1 Halving Algorithm
4.2 Weighted Majority
5 Conditional Densities
5.1 Logistic Regression
5.2 Regression
5.2.1 Conditionally Normal Models
5.2.2 Posterior Distribution
5.2.3 Heteroscedastic Estimation
5.3 Multiclass Classification
5.3.1 Conditionally Multinomial Models
5.4 What is a CRF?
5.4.1 Linear Chain CRFs
5.4.2 Higher Order CRFs
5.4.3 Kernelized CRFs
5.5 Optimization Strategies
5.5.1 Getting Started
5.5.2 Optimization Algorithms
5.5.3 Handling Higher Order CRFs
5.6 Hidden Markov Models
5.7 Further Reading
5.7.1 Optimization
6 Kernels and Function Spaces
6.1 The Basics
6.1.1 Examples
6.2 Kernels
6.2.1 Feature Maps
6.2.2 The Kernel Trick
6.2.3 Examples of Kernels
6.3 Algorithms
6.3.1 Kernel Perceptron
6.3.2 Trivial Classifier
6.3.3 Kernel Principal Component Analysis
6.4 Reproducing Kernel Hilbert Spaces
6.4.1 Hilbert Spaces
6.4.2 Theoretical Properties
6.4.3 Regularization
6.5 Banach Spaces
6.5.1 Properties
6.5.2 Norms and Convex Sets
7 Linear Models
7.1 Support Vector Classification
Summary
Introduction to Machine Learning by Alex Smola et al. is a comprehensive textbook covering the foundations of machine learning. The book spans machine learning applications, probability theory, basic algorithms, density estimation, optimization techniques, conditional densities, kernels and function spaces, and linear models. It discusses real-world applications such as web page ranking, collaborative filtering, automatic translation, face recognition, and named entity recognition, and emphasizes the importance of machine learning in modern information technology and its role in technological progress. With detailed explanations and examples, the book serves as a valuable resource for students and researchers interested in machine learning.