CS229 Machine Learning Notes

Contents

I Supervised learning 5
1 Linear regression 8
1.1 LMS algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.2 The normal equations . . . . . . . . . . . . . . . . . . . . . . . 13
1.2.1 Matrix derivatives . . . . . . . . . . . . . . . . . . . . . 13
1.2.2 Least squares revisited . . . . . . . . . . . . . . . . . . 14
1.3 Probabilistic interpretation . . . . . . . . . . . . . . . . . . . . 15
1.4 Locally weighted linear regression (optional reading) . . . . . . 17

2 Classification and logistic regression 20


2.1 Logistic regression . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2 Digression: the perceptron learning algorithm . . . . . . . . . 23
2.3 Multi-class classification . . . . . . . . . . . . . . . . . . . . . 24
2.4 Another algorithm for maximizing ℓ(θ) . . . . . . . . . . . . . 27

3 Generalized linear models 29


3.1 The exponential family . . . . . . . . . . . . . . . . . . . . . . 29
3.2 Constructing GLMs . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2.1 Ordinary least squares . . . . . . . . . . . . . . . . . . 32
3.2.2 Logistic regression . . . . . . . . . . . . . . . . . . . . 33

4 Generative learning algorithms 34


4.1 Gaussian discriminant analysis . . . . . . . . . . . . . . . . . . 35
4.1.1 The multivariate normal distribution . . . . . . . . . . 35
4.1.2 The Gaussian discriminant analysis model . . . . . . . 38
4.1.3 Discussion: GDA and logistic regression . . . . . . . . 40
4.2 Naive Bayes (optional reading) . . . . . . . . . . . . . . . . 41
4.2.1 Laplace smoothing . . . . . . . . . . . . . . . . . . . . 44
4.2.2 Event models for text classification . . . . . . . . . . . 46


5 Kernel methods 48
5.1 Feature maps . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
5.2 LMS (least mean squares) with features . . . . . . . . . . . . . 49
5.3 LMS with the kernel trick . . . . . . . . . . . . . . . . . . . . 49
5.4 Properties of kernels . . . . . . . . . . . . . . . . . . . . . . . 53

6 Support vector machines 59


6.1 Margins: intuition . . . . . . . . . . . . . . . . . . . . . . . . . 59
6.2 Notation (optional reading) . . . . . . . . . . . . . . . . . 61
6.3 Functional and geometric margins (optional reading) . . . . . 61
6.4 The optimal margin classifier (optional reading) . . . . . . . 63
6.5 Lagrange duality (optional reading) . . . . . . . . . . . . . . . 65
6.6 Optimal margin classifiers: the dual form (optional reading) . 68
6.7 Regularization and the non-separable case (optional reading) . 72
6.8 The SMO algorithm (optional reading) . . . . . . . . . . . . . 73
6.8.1 Coordinate ascent . . . . . . . . . . . . . . . . . . . . . 74
6.8.2 SMO . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

II Deep learning 79
7 Deep learning 80
7.1 Supervised learning with non-linear models . . . . . . . . . . . 80
7.2 Neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . 84
7.3 Modules in Modern Neural Networks . . . . . . . . . . . . . . 92
7.4 Backpropagation . . . . . . . . . . . . . . . . . . . . . . . . . 98
7.4.1 Preliminaries on partial derivatives . . . . . . . . . . . 99
7.4.2 General strategy of backpropagation . . . . . . . . . . 102
7.4.3 Backward functions for basic modules . . . . . . . . . . 105
7.4.4 Back-propagation for MLPs . . . . . . . . . . . . . . . 107
7.5 Vectorization over training examples . . . . . . . . . . . . . . 109

III Generalization and regularization 112


8 Generalization 113
8.1 Bias-variance tradeoff . . . . . . . . . . . . . . . . . . . . . . . 115
8.1.1 A mathematical decomposition (for regression) . . . . . 120
8.2 The double descent phenomenon . . . . . . . . . . . . . . . . . 121
8.3 Sample complexity bounds (optional readings) . . . . . . . . . 126

8.3.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . 126


8.3.2 The case of finite H . . . . . . . . . . . . . . . . . . . . 128
8.3.3 The case of infinite H . . . . . . . . . . . . . . . . . . 131

9 Regularization and model selection 135


9.1 Regularization . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
9.2 Implicit regularization effect (optional reading) . . . . . . . . . 137
9.3 Model selection via cross validation . . . . . . . . . . . . . . . 139
9.4 Bayesian statistics and regularization . . . . . . . . . . . . . . 142

IV Unsupervised learning 144


10 Clustering and the k-means algorithm 145

11 EM algorithms 148
11.1 EM for mixture of Gaussians . . . . . . . . . . . . . . . . . . . 148
11.2 Jensen’s inequality . . . . . . . . . . . . . . . . . . . . . . . . 151
11.3 General EM algorithms . . . . . . . . . . . . . . . . . . . . . . 152
11.3.1 Another interpretation of the ELBO . . . . . . . . . . . . 158
11.4 Mixture of Gaussians revisited . . . . . . . . . . . . . . . . . . 158
11.5 Variational inference and variational auto-encoders (optional reading) . 160

12 Principal components analysis 165

13 Independent components analysis 171


13.1 ICA ambiguities . . . . . . . . . . . . . . . . . . . . . . . . . . 172
13.2 Densities and linear transformations . . . . . . . . . . . . . . . 173
13.3 ICA algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

14 Self-supervised learning and foundation models 177


14.1 Pretraining and adaptation . . . . . . . . . . . . . . . . . . . . 177
14.2 Pretraining methods in computer vision . . . . . . . . . . . . . 179
14.3 Pretrained large language models . . . . . . . . . . . . . . . . 181
14.3.1 Opening up the black box of Transformers . . . . . . . . . 183
14.3.2 Zero-shot learning and in-context learning . . . . . . . 186

V Reinforcement Learning and Control 188


15 Reinforcement learning 189
15.1 Markov decision processes . . . . . . . . . . . . . . . . . . . . 190
15.2 Value iteration and policy iteration . . . . . . . . . . . . . . . 192
15.3 Learning a model for an MDP . . . . . . . . . . . . . . . . . . 194
15.4 Continuous state MDPs . . . . . . . . . . . . . . . . . . . . . 196
15.4.1 Discretization . . . . . . . . . . . . . . . . . . . . . . . 196
15.4.2 Value function approximation . . . . . . . . . . . . . . 199
15.5 Connections between Policy and Value Iteration (Optional) . . 203

16 LQR, DDP and LQG 206


16.1 Finite-horizon MDPs . . . . . . . . . . . . . . . . . . . . . . . 206
16.2 Linear Quadratic Regulation (LQR) . . . . . . . . . . . . . . . 210
16.3 From non-linear dynamics to LQR . . . . . . . . . . . . . . . 213
16.3.1 Linearization of dynamics . . . . . . . . . . . . . . . . 214
16.3.2 Differential Dynamic Programming (DDP) . . . . . . . 214
16.4 Linear Quadratic Gaussian (LQG) . . . . . . . . . . . . . . . . 216

17 Policy Gradient (REINFORCE) 220
