Regression Analysis
Contents
• What is Regression?
• Why Regression?
• Linear Regression
– Linear Regression algorithm using least square method
– Evaluation of method
• Multiple Linear Regression
• Logistic Regression
What is Regression?
• Regression is a supervised learning algorithm in
Machine Learning terminology
• An important tool in Predictive Analytics
• Regression analysis is a predictive modeling technique
that investigates the relationship between a
dependent variable and one or more independent variables.
• It fits a line over a set of data points that most
closely matches the overall shape of the data.
• The regression relates changes in the dependent
variable on the Y axis to changes in the explanatory
variable on the X axis.
• Regression is a tool for finding existence of an association
relationship between a dependent variable (Y) and one or more
independent variables (X1, X2, …, Xn) in a study.
• The relationship can be linear or non-linear.
• A dependent variable (response variable) “measures an
outcome of a study (also called outcome variable)”.
• An independent variable (explanatory variable) “explains
changes in a response variable”.
Types of Regression
[Diagram: types of regression, split by the number of predictors:
one independent variable (simple regression) vs. more than one
independent variable (multiple regression)]
Most Common Regression Algorithms
● Simple linear regression
● Multiple linear regression
● Polynomial regression
● Multivariate adaptive regression splines
● Logistic regression
● Maximum likelihood estimation (equivalent to least squares
under normally distributed errors)
Use cases of Regression
• Predictive analytics
• Operation efficiency
• Supporting decisions
• Correcting errors
• New insights
• House Price Predictions
• Trend forecasting
– E.g., what will the price of gold be in the next six months?
• Finding associations among attributes:
– E.g., Mediclaim agencies: the effect of age on claims
Linear Regression
• Linear regression: It is a linear approach to modelling the
relationship between a scalar response and one or more
explanatory variables (also known as dependent and independent
variables).
• The case of one explanatory variable is called simple linear
regression; for more than one, the process is called multiple
linear regression.
• In linear regression, the relationships are modeled using linear
predictor functions whose unknown
model parameters are estimated from the data.
• Linear regression models are often fitted using the least
squares approach.
Simple Linear Regression
• One of the easiest algorithms in machine learning.
• Simple Linear regression: It is a statistical model that
attempts to show the relationship between two variables
through the linear equation.
• Data is modeled using a straight line (Y = mX + c)
• Correlation between X and Y variables
Simple Linear Regression: Understanding
[Figure: scatter plot with fitted line showing a positive (+ve)
relationship between speed of vehicle (dependent variable, Y axis)
and distance travelled in a fixed duration of time (independent
variable, X axis); m = slope of the line, c = y-intercept of the line]
Slopes of Simple Linear Regression Model
Slope = (Y2 − Y1) / (X2 − X1) = ΔY / ΔX = Change in Y / Change in X
Example:
(X1, Y1) = (−3, −2) and (X2, Y2) = (2, 2)
Rise = (Y2 − Y1) = (2 − (−2)) = 2 + 2 = 4
Run = (X2 − X1) = (2 − (−3)) = 2 + 3 = 5
Slope = Rise/Run = 4/5 = 0.8
[Figures: linear positive slope, linear negative slope,
curvilinear positive slope, curvilinear negative slope]
Relations in Regression
[Figures: linear positive slope, linear negative slope, curvilinear
positive slope, curvilinear negative slope]
Simple Linear Regression:
Least Square Method
• How to find the best Regression Line?
• Our challenge is to determine the values of m and c that give the
minimum error for the given dataset. We do this using the Least
Squares method.
• Loss function (sum of squared errors) for the line y = mx + c:
L(m, c) = Σi (yi − (m·xi + c))²
• For minimum loss, we take the partial derivatives of L with respect
to m and c, equate them to 0, and solve for m and c.
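Carrying out that minimization gives the familiar closed-form estimates; a short sketch of the derivation in terms of the sample means x̄ and ȳ:

```latex
\frac{\partial L}{\partial m} = -2\sum_i x_i\bigl(y_i - (m x_i + c)\bigr) = 0,
\qquad
\frac{\partial L}{\partial c} = -2\sum_i \bigl(y_i - (m x_i + c)\bigr) = 0
\;\Rightarrow\;
m = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2},
\qquad
c = \bar{y} - m\,\bar{x}
```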
Simple Linear Regression:
Least Square Method (Example)
• A method to predict the best-fit line.
Simple Linear Regression
• Measure of goodness of fit: R² (coefficient of determination)
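R² compares the residual sum of squares (SSE) to the total sum of squares (SST); a minimal sketch (the helper name r_squared is illustrative, not from the slides):

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SSE/SST."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    sse = np.sum((y_true - y_pred) ** 2)          # residual (error) sum of squares
    sst = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - sse / sst

print(r_squared([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8]))  # ≈ 0.98
```

An R² of 1 means the line explains all variation in Y; values near 0 mean it explains almost none.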
OLS algorithm
● Step 1: Calculate the mean of X and Y
● Step 2: Calculate the errors of X and Y
● Step 3: Get the product
● Step 4: Get the summation of the products
● Step 5: Square the difference of X
● Step 6: Get the sum of the squared difference
● Step 7: Divide output of step 4 by output of step 6 to calculate ‘b’
● Step 8: Calculate ‘a’ using the value of ‘b’
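The eight steps above can be sketched in plain Python (the function name ols_fit is illustrative):

```python
def ols_fit(x, y):
    # Step 1: means of X and Y
    mx, my = sum(x) / len(x), sum(y) / len(y)
    # Step 2: deviations (errors) of X and Y from their means
    dx = [xi - mx for xi in x]
    dy = [yi - my for yi in y]
    # Steps 3-4: products of the deviations, then their sum
    sxy = sum(dxi * dyi for dxi, dyi in zip(dx, dy))
    # Steps 5-6: squared deviations of X, then their sum
    sxx = sum(dxi * dxi for dxi in dx)
    # Step 7: slope b = output of step 4 / output of step 6
    b = sxy / sxx
    # Step 8: intercept a from b and the two means
    a = my - b * mx
    return a, b

a, b = ols_fit([1, 2, 3, 4], [3, 5, 7, 9])  # points on the line y = 1 + 2x
print(a, b)  # → 1.0 2.0
```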
Example of Simple Linear Regression
Calculation summary:
Sum of X = 299
Sum of Y = 852
Mean of X, Mx = 19.93
Mean of Y, My = 56.8
Error in Simple Regression
Y = (a + bX) + ε
[Figure: example of simple regression, scatter plot and regression line]
Sum of squares of residuals: SSE = Σi (Yi − Ŷi)²
A residual is the distance between the predicted point (on the
regression line) and the actual point, as depicted in the figure.
Multiple Linear Regression
• Two or more independent variables, i.e. predictors are involved in the
model.
• In the example of simple linear regression, we considered the Price of a
Property as the dependent variable and the Area of the Property (in sq.
m.) as the predictor variable.
• If we consider the Price of a Property (in $) as the dependent variable and
the Area of the Property (in sq. m.), location, floor, number of years since
purchase, and amenities available as the independent variables, we can
form a multiple regression equation as shown below:
Price = a + b1·(Area) + b2·(Location) + b3·(Floor) + b4·(Years since purchase) + b5·(Amenities) + ε
• The simple linear regression model and the multiple regression
model assume that the dependent variable is continuous.
• The following expression describes the equation involving the
relationship with two predictor variables, namely X1 and X2:
Ŷ = a + b1·X1 + b2·X2
• The model describes a plane in the three-dimensional space of
Ŷ, X1, and X2.
• Parameter ‘a’ is the intercept of this plane. Parameters ‘b1’ and
‘b2’ are referred to as partial regression coefficients.
• Parameter b1 represents the change in the mean response
corresponding to a unit change in X1 when X2 is held constant.
• Parameter b2 represents the change in the mean response
corresponding to a unit change in X2 when X1 is held constant.
• Consider the following example of a multiple linear regression
model with two predictor variables, namely X1 and X2.
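A minimal sketch of fitting such a two-predictor plane with NumPy's least-squares solver; the property data below is made up purely for illustration:

```python
import numpy as np

# Hypothetical toy data (values made up): property price driven by
# two predictors, area (X1) and floor (X2)
area  = np.array([50.0, 80.0, 120.0, 60.0, 100.0])
floor = np.array([1.0, 3.0, 5.0, 2.0, 4.0])
price = 20000 + 1500 * area + 3000 * floor      # exact plane, no noise

# Design matrix with a leading column of ones for the intercept 'a'
X = np.column_stack([np.ones_like(area), area, floor])
a, b1, b2 = np.linalg.lstsq(X, price, rcond=None)[0]
print(a, b1, b2)  # recovers approximately 20000, 1500, 3000
```

Because the toy prices lie exactly on a plane, least squares recovers the intercept and both partial regression coefficients almost exactly.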
The multiple regression estimating equation when there are ‘n’ predictor
variables is as follows:
Ŷ = a + b1·X1 + b2·X2 + … + bn·Xn
While finding the best-fit line, we can also fit a polynomial or
curvilinear relationship; these are known as polynomial and
curvilinear regression, respectively.
Assumptions in Regression Analysis
1. The dependent variable (Y) can be calculated / predicted as a
linear function of a specific set of independent variables (X’s) plus an
error term (ε).
2. The number of observations (n) is greater than the number of
parameters (k) to be estimated, i.e. n > k.
3. Relationships determined by regression are only relationships of
association based on the data set, and not necessarily of cause and
effect.
4. The regression line is valid only over a limited range of data. If the
line is extrapolated beyond that range, it may lead to wrong
predictions.
5. If the business conditions change and the business assumptions
underlying the regression model are no longer valid, then the past
data set will no longer be able to predict future trends.
6. The variance of the error term is the same for all values of X
(homoskedasticity).
7. The error term (ε) is normally distributed. This also means that the
mean of the error (ε) has an expected value of 0.
8. The values of the error (ε) are independent of each other and are not
related to the values of X. This means that the error for one particular
(X, Y) pair is unrelated to the error for any other (X, Y) pair.
Given the above assumptions, the OLS estimator is the Best Linear
Unbiased Estimator (BLUE); this result is known as the Gauss-Markov
Theorem.
Main Problems in Regression Analysis
• Two primary problems: Multicollinearity and heteroskedasticity
Multicollinearity
• Two variables are perfectly collinear if there is an exact linear
relationship between them.
• Multicollinearity is the situation in which the degree of correlation is
not only between the dependent variable and the independent
variable, but there is also a strong correlation within (among) the
independent variables themselves.
• A multiple regression equation can make good predictions when
there is multicollinearity, but it is difficult for us to determine how
the dependent variable will change if each independent variable is
changed one at a time.
• When multicollinearity is present, it increases the standard errors
of the coefficients.
• One way to gauge multicollinearity is to calculate the Variance
Inflation Factor (VIF), which assesses how much the variance of
an estimated regression coefficient increases if the predictors are
correlated.
• If no factors are correlated, the VIFs will be equal to 1.
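The VIF of predictor j is 1 / (1 − Rj²), where Rj² comes from regressing Xj on the remaining predictors. A sketch of that computation on synthetic data (function name and data are ours):

```python
import numpy as np

def vif(X):
    """VIF for each column of X: regress it on the remaining columns."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    factors = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # add an intercept
        coef = np.linalg.lstsq(A, y, rcond=None)[0]
        resid = y - A @ coef
        r2 = 1.0 - np.sum(resid**2) / np.sum((y - y.mean())**2)
        factors.append(1.0 / (1.0 - r2))            # VIF_j = 1 / (1 - R_j^2)
    return factors

# Synthetic predictors: x3 is nearly a copy of x1, x2 is independent
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.1 * rng.normal(size=200)
vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)  # large VIFs for x1 and x3, VIF near 1 for x2
```

A common rule of thumb treats VIF values above 5 or 10 as a sign of problematic multicollinearity.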
• The assumption of no perfect collinearity states that there is no
exact linear relationship among the independent variables.
• This assumption implies two aspects of the data on the
independent variables.
• First, none of the independent variables, other than the variable
associated with the intercept term, can be a constant.
• Second, variation in the X’s is necessary.
• In general, the more variation in the independent variables, the
better will be the OLS estimates in terms of identifying the impacts
of the different independent variables on the dependent variable.
Heteroskedasticity
• Refers to the changing variance of the error term.
• If the variance of the error term is not constant across data sets,
there will be erroneous predictions.
• In general, for a regression equation to make accurate predictions,
the error terms should be independent and identically (normally)
distributed (iid).
Improving Accuracy of the Linear Regression Model
• Accuracy refers to how close an estimate is to the actual value,
whereas prediction refers to continuously estimating the value.
High bias = low accuracy (not close to the real value)
High variance = low prediction (values are scattered)
Low bias = high accuracy (close to the real value)
Low variance = high prediction (values are close to each other)
• We want a regression model that is highly accurate and highly
predictive, so that the overall error of the model is low, implying
low bias (high accuracy) and low variance (high prediction). This
is highly preferable.
Accuracy of linear regression can be improved using the following
three methods:
1. Shrinkage Approach
2. Subset Selection
3. Dimensionality (Variable) Reduction
Polynomial Regression Model
• Extension of the simple linear model: extra predictors are
obtained by raising each of the original predictors to a power
(e.g., squaring or cubing them).
• This approach provides a simple way to fit a non-linear curve to
the data. For example, for degree 3:
Y = a + b1·X + b2·X² + b3·X³ + ε
• Let us use the below data set of (X, Y) for a degree-3 polynomial.
• As you can observe, the regression line is slightly curved for
polynomial degree 3 with the above 15 data points.
• The regression line will curve further if we increase the polynomial
degree.
• In the extreme case, the regression line will overfit, passing
through all the original values of X.
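A degree-3 fit can be sketched with NumPy; since the slide's 15 (X, Y) points are not reproduced here, we generate 15 similar synthetic points from a known cubic instead:

```python
import numpy as np

# Hypothetical data: 15 points from a cubic with a little noise
rng = np.random.default_rng(42)
x = np.linspace(-2, 2, 15)
y = 1 + 2 * x - x**2 + 0.5 * x**3 + 0.1 * rng.normal(size=15)

# Degree-3 polynomial fit: linear regression on the predictors x, x^2, x^3
coeffs = np.polyfit(x, y, deg=3)   # coefficients, highest power first
y_hat = np.polyval(coeffs, x)
print(coeffs)                      # close to [0.5, -1.0, 2.0, 1.0]
```

Raising the degree makes the curve bend more; at the extreme it chases the noise and overfits, as noted above.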
What is Logistic Regression?
• Logistic regression is a Classification algorithm.
• Logistic Regression is all about predicting binary variables,
not predicting continuous variables.
• Logistic regression models estimate how the probability of an
event may be affected by one or more explanatory variables.
• Logistic regression is a technique used for predicting “class
probability”, that is the probability that the case belongs to a
particular class.
Use cases of Logistic Regression
• Mail[Spam / Not Spam]
• Transaction [Fraudulent / Normal]
• Tumor [Malignant / Benign]
• Sentiment Analysis [Positive / Negative]
• Weather Prediction [Rain / No Rain]
• Medical Diagnosis [Fit / Ill]
Linear and Logistic Regression
[Figure: the straight line of linear regression vs. the logistic
(sigmoid, S-shaped) curve of logistic regression]
Logistic Regression Curve
[Figure: the logistic regression curve, an S-shaped sigmoid bounded
between 0 and 1]
Some fundamentals terms of Logistic Regression
• The probability that an event will occur is the fraction of times you expect
to see that event in many trials. If the probability of an event occurring
is Y, then the probability of the event not occurring is 1 − Y.
Probabilities always range between 0 and 1.
• The odds are defined as the probability that the event will occur divided
by the probability that the event will not occur. Unlike probability, the
odds are not constrained to lie between 0 and 1 but can take any value
from zero to infinity.
• If the probability of success is P, then the odds of that event are:
Odds = P / (1 − P)
• The logit function is the logarithmic transformation of the logistic function.
It is defined as the natural logarithm of the odds:
logit(P) = ln(P / (1 − P)) = ln(Odds)
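The definitions above can be sketched directly (the helper names odds and logit are ours):

```python
import math

def odds(p):
    """Odds = P / (1 - P)."""
    return p / (1.0 - p)

def logit(p):
    """Logit = natural log of the odds."""
    return math.log(odds(p))

print(odds(0.8))   # ≈ 4.0: the event is 4 times as likely to occur as not
print(logit(0.5))  # → 0.0: even odds map to a logit of zero
```

Note that odds run from 0 to infinity while the logit runs over the whole real line, which is what lets logistic regression model it linearly.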
Math behind Logistic Regression
• Logistic regression assumes the log-odds (logit) is linear in X:
ln(p / (1 − p)) = a + bX
• Exponentiating both sides gives the odds: p / (1 − p) = e^(a+bX)
• Solving for p yields the logistic (sigmoid) function:
p = e^(a+bX) / (1 + e^(a+bX)) = 1 / (1 + e^(−(a+bX)))
• Let us say we have a model that can predict whether a person is
male or female on the basis of their height.
• Given a height of 150 cm, we need to predict whether the person
is male or female.
• We know that the coefficients are a = −100 and b = 0.6.
• Using the above equation, we can calculate the probability of male
given a height of 150 cm, or more formally P(male|height = 150):
P(male|height = 150) = 1 / (1 + e^(−(−100 + 0.6×150))) = 1 / (1 + e^10) ≈ 0.00005,
or a probability of near zero that the person is a male.
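Plugging the slide's coefficients into the logistic function confirms the near-zero probability:

```python
import math

a, b = -100.0, 0.6                   # coefficients given on the slide
height = 150.0
z = a + b * height                   # linear predictor: -100 + 0.6*150 = -10
p_male = 1.0 / (1.0 + math.exp(-z))  # logistic (sigmoid) function
print(p_male)                        # ≈ 4.54e-05, near zero: predict female
```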
Linear vs Logistic Regression
Basis                | Linear Regression           | Logistic Regression
Core concept         | Data is modeled using a     | Data is modeled using a
(modeling of data)   | straight line.              | logistic (sigmoid) function.
Used with            | Continuous variable         | Categorical variable
Output/prediction    | Value of the variable       | Probability of occurrence
                     |                             | of an event
Problem solved       | Regression                  | Classification
Accuracy             | Loss, R², adjusted R²,      | Accuracy, Precision, Recall,
(goodness of fit)    | etc.                        | F1 score, ROC curve,
                     |                             | Confusion matrix, etc.
• The basic difference is the type of function used for mapping:
– Linear: continuous X -> continuous Y
– Logistic: continuous X -> binary Y (used for deciding category or
true/false decisions on the data)
Parameter Estimation by
Maximum Likelihood Method
● The coefficients in a logistic regression are estimated using a process called Maximum
Likelihood Estimation (MLE).
● Likelihood function:
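For binary outcomes yᵢ ∈ {0, 1} with pᵢ = P(yᵢ = 1 | xᵢ), the likelihood maximized by MLE takes the standard Bernoulli form:

```latex
L(\beta_0, \beta_1) = \prod_{i=1}^{n} p_i^{\,y_i}\,(1 - p_i)^{1 - y_i},
\qquad
p_i = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_i)}}
```

Taking logs gives the log-likelihood Σi [yi ln pi + (1 − yi) ln(1 − pi)], whose score equations have no closed-form solution and are solved iteratively.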
• The probability density function for binary logistic regression is
given by the Bernoulli form:
f(yᵢ) = pᵢ^yᵢ (1 − pᵢ)^(1 − yᵢ), yᵢ ∈ {0, 1}
The above system of equations is solved iteratively to
estimate β0 and β1.