ML Mod 2
ML Mod 2
ML Mod 2
Ans
● Predict the value of the dependent variable based on the values of the
independent variables.
● Understand the relative importance of each independent variable in
influencing the dependent variable.
● Identify potential interactions between independent variables.
Key Features:
Equation:
● Y: Dependent variable
● X1, X2, ..., Xp: Independent variables
● β0: Intercept (constant term)
● β1, β2, ..., βp: Regression coefficients, representing the change in Y for a
unit change in each X
● ε: Error term (random variation not explained by the model)
Example:
Applications:
This is the simplest form of linear regression, and it involves only one
independent variable and one dependent variable. The equation for
simple linear regression is:
where:
β0 is the intercept
β1 is the slope
Multiple Linear Regression
This involves more than one independent variable and one dependent
variable. The equation for multiple linear regression is:
where:
β0 is the intercept
Independent variable (x): Square footage of the house (in square feet)
Data Collection:
Scatter Plot:
The agent creates a scatter plot of the data, with square footage on the x-axis
and selling price on the y-axis. Similar to the previous example, the plot should
show a positive linear relationship, indicating that houses with larger square
footage tend to have higher selling prices in Mumbai.
Regression Line:
Interpretation:
● Intercept: 18 lakhs
● Slope: 1.2 lakhs per 1000 square feet
Predictions:
Regression Line:
● It's the straight line that best represents the relationship between the
independent variable (square footage) and the dependent variable
(selling price) in the given data.
● In the example, the regression line equation was y = 18 + 1.2x (where y is
in lakhs).
● It means for every additional 1000 square feet, the predicted selling price
increases by 1.2 lakhs.
Scatter Plot:
Error in Prediction:
● It's the difference between the actual selling price of a house and the
price predicted by the regression line.
● Regression models aren't perfect, so there will almost always be some
errors.
● It's the regression line that minimizes the overall error in prediction.
● It's the line that best balances fitting the existing data points while also
generalizing to new data.
● Statistical software is often used to find the best-fitting line, using
techniques like least squares regression.
4. Explain the concept of Logistic Regression
Ans
Logistic regression is a supervised machine learning algorithm mainly
used for binary classification where we use a logistic function, also known
as a sigmoid function that takes input as independent variables and
produces a probability value between 0 and 1.
For example, we have two classes Class 0 and Class 1 if the value of the
logistic function for an input is greater than 0.5 (threshold value) then it
belongs to Class 1 and if lower than 0.5 then it belongs to Class 0. It’s
referred to as regression because it is the extension of linear regression
but is mainly used for classification problems.
It is used for predicting the categorical dependent variable using a given set of
independent variables.