
Machine Learning

Logistic Regression Model

Lec 4
Instructor: Assoc. Lecturer Ahmed Yousry
The classification problem is just like the regression problem,
except that the values y we now want to predict take on only
a small number of discrete values.

Some Examples of Classification Problems

• Email: Spam / Not spam
• Tumor: Malignant / Benign
Binary Logistic Regression

• We have a set of feature vectors X with corresponding binary outputs:

  $X = \{x_1, x_2, \ldots, x_n\}^T$
  $Y = \{y_1, y_2, \ldots, y_n\}^T$, where $y_i \in \{0, 1\}$

• We want to model p(y | x).

Suppose we model the probability linearly:

$p(y_i = 1 \mid x_i, \theta) = \sum_j \theta_j x_{ij} = x_i \theta$

By definition, $p(y_i = 1 \mid x_i, \theta) \in [0, 1]$. We want to transform the probability to remove the range restrictions, as $x_i \theta$ can take any real value.
Odds

p : probability of an event occurring
1 − p : probability of the event not occurring

The odds for event i are then defined as:

$\text{odds}_i = \frac{p_i}{1 - p_i}$

Taking the log of the odds removes the range restrictions. This way we map the probabilities from the [0, 1] range to the entire real line.
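
As a quick illustration (not part of the original slides), here is a minimal Python sketch of the odds and log-odds transforms, assuming NumPy is available:

```python
import numpy as np

# Probabilities strictly inside (0, 1) so the odds stay finite.
p = np.array([0.1, 0.25, 0.5, 0.75, 0.9])

odds = p / (1 - p)       # maps (0, 1) -> (0, inf)
log_odds = np.log(odds)  # maps (0, 1) -> (-inf, inf), the whole real line

print(odds)      # [0.111... 0.333... 1. 3. 9.]
print(log_odds)  # symmetric around 0; the log-odds of p = 0.5 is exactly 0
```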
$\log\left(\frac{p_i}{1 - p_i}\right) = x_i \theta$

$\frac{p_i}{1 - p_i} = e^{x_i \theta}$

$p_i = \frac{e^{x_i \theta}}{1 + e^{x_i \theta}} = \frac{1}{1 + e^{-x_i \theta}}$

Standard logistic sigmoid function:

$g(z) = \frac{1}{1 + e^{-z}}$

$p_i = g(\theta^T x_i) = \frac{1}{1 + e^{-\theta^T x_i}}$

$h_\theta(x) = p(y = 1 \mid x; \theta)$
Maximum Likelihood Estimation (MLE)

1. Step 1: Get the probability for all observations.

2. Step 2: Express this as a function of θ, where X and y are fixed: $L(\theta) = p(y \mid X; \theta)$

3. Step 3: Maximize the likelihood function L(θ).

For independent Bernoulli observations, $L(\theta) = \prod_i p_i^{y_i} (1 - p_i)^{1 - y_i}$. We can simplify L(θ) by taking its log and then differentiate to get the gradient (a sketch follows below).
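
To make the maximization step concrete, here is a minimal gradient-ascent sketch in Python; the learning rate, iteration count, toy data, and function names are illustrative assumptions, not from the slides. It uses the standard logistic-regression log-likelihood gradient $\nabla_\theta \ell(\theta) = X^T (y - g(X\theta))$:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iters=1000):
    """Maximize the log-likelihood by gradient ascent.

    Gradient of the log-likelihood: X^T (y - g(X theta)).
    """
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        grad = X.T @ (y - sigmoid(X @ theta))
        theta += lr * grad
    return theta

# Toy data: intercept column plus one feature.
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta_hat = fit_logistic(X, y)
print(theta_hat)  # learned parameters; sigmoid(X @ theta_hat) approximates y
```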
Any Questions?
