ML Mod 2


Q1. Explain Multivariate Linear regression method.

Ans

Multivariate Linear Regression is a statistical technique that extends simple
linear regression to model the relationship between multiple independent
variables and a single dependent variable. It aims to find the best linear
equation that fits the data, allowing you to:

● Predict the value of the dependent variable based on the values of the
independent variables.
● Understand the relative importance of each independent variable in
influencing the dependent variable.
● Identify potential interactions between independent variables, provided
interaction terms are included in the model.

Key Features:

● Multiple independent variables: It can handle multiple predictors, unlike
simple linear regression.
● Linear relationship: It assumes a linear relationship between each
independent variable and the dependent variable.
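● Continuous variables: The dependent variable should be continuous
(numerical). Independent variables are typically numerical; categorical
predictors must first be encoded as numbers (for example, as dummy variables).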

Equation:

Y = β0 + β1X1 + β2X2 + ... + βpXp + ε

● Y: Dependent variable
● X1, X2, ..., Xp: Independent variables
● β0: Intercept (constant term)
● β1, β2, ..., βp: Regression coefficients, each representing the change in Y
for a unit change in the corresponding X, with the other variables held constant
● ε: Error term (random variation not explained by the model)

Example:

Predicting house prices:

● Independent variables: square footage, number of bedrooms, age of the
house
● Dependent variable: house price
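
The house price example can be sketched in code as an ordinary least squares fit. This is a minimal illustration, not a reference implementation: the six houses, the coefficient values in true_beta, and the helper predict_price are all hypothetical, and the prices are generated without noise so that the fit recovers the coefficients exactly.

```python
import numpy as np

# Hypothetical data, invented for illustration: one row per house.
# Columns: square footage, number of bedrooms, age of the house (years).
X = np.array([
    [1000.0, 2.0, 20.0],
    [1500.0, 3.0, 15.0],
    [1800.0, 3.0, 10.0],
    [2400.0, 4.0, 5.0],
    [3000.0, 4.0, 2.0],
    [1200.0, 2.0, 30.0],
])

# Assumed "true" coefficients (beta0, beta1, beta2, beta3), used only to
# generate noise-free prices so the result is easy to check.
true_beta = np.array([10.0, 0.02, 3.0, -0.1])
y = true_beta[0] + X @ true_beta[1:]

# Prepend a column of ones so the intercept beta0 is estimated as well.
A = np.column_stack([np.ones(len(X)), X])

# Least squares: choose beta to minimize the sum of squared errors.
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict_price(features):
    """Apply Y = beta0 + beta1*X1 + beta2*X2 + beta3*X3 to one house."""
    return beta[0] + np.asarray(features, dtype=float) @ beta[1:]

print("estimated coefficients:", np.round(beta, 4))
print("predicted price:", round(float(predict_price([2000, 3, 8])), 2))
```

With real data the prices would contain noise, so the estimated coefficients would only approximate the underlying relationship.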

Applications:

● Economics: Predicting sales, consumer behavior, stock prices
● Medicine: Studying the effects of treatments, identifying risk factors for
diseases
● Engineering: Optimizing production processes, modeling system behavior

Q2. Explain Linear regression along with an example.

Ans

Linear regression is a statistical method used to model the relationship between
a dependent variable and one or more independent variables by fitting a linear
equation to observed data. The goal is to find the best-fitting line that
minimizes the sum of the squared differences between the observed values and
the values predicted by the linear model.

Types of Linear Regression

There are two main types of linear regression:

Simple Linear Regression

This is the simplest form of linear regression, and it involves only one
independent variable and one dependent variable. The equation for
simple linear regression is:

Y = β0 + β1X + ε

where:

Y is the dependent variable

X is the independent variable

β0 is the intercept

β1 is the slope

ε is the error term

Multiple Linear Regression

This involves more than one independent variable and one dependent
variable. The equation for multiple linear regression is:

Y = β0 + β1X1 + β2X2 + ... + βpXp + ε

where:

Y is the dependent variable

X1, X2, …, Xp are the independent variables

β0 is the intercept

β1, β2, …, βp are the slopes

ε is the error term

Predicting House Prices in India:

Problem: A real estate agent in Mumbai, Maharashtra wants to predict the
selling prices of houses based on their square footage.

Independent variable (x): Square footage of the house (in square feet)

Dependent variable (y): Selling price of the house (in rupees)

Data Collection:

The agent records the square footage and selling price of a sample of houses
sold in Mumbai.

Scatter Plot:

The agent creates a scatter plot of the data, with square footage on the x-axis
and selling price on the y-axis. The plot should show a positive linear
relationship, indicating that houses with larger square footage tend to have
higher selling prices in Mumbai.

Regression Line:

y = 18 + 1.2x (where y is in lakhs and x is in thousands of square feet)

Interpretation:

● Intercept: 18 lakhs
● Slope: 1.2 lakhs per 1000 square feet

Predictions:

● For an 1800 sq ft house: y = 18 + 1.2(1.8) = 20.16 lakhs
● For a 2500 sq ft house: y = 18 + 1.2(2.5) = 21 lakhs
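
The predictions can be checked by evaluating the regression line directly. The sketch below assumes, as the slope's units imply, that x is measured in thousands of square feet; the function name predict_price_lakhs is hypothetical.

```python
def predict_price_lakhs(square_feet):
    """Evaluate y = 18 + 1.2x, with x in thousands of square feet."""
    x = square_feet / 1000.0   # convert square feet to thousands of sq ft
    return 18 + 1.2 * x        # intercept 18 lakhs, slope 1.2 lakhs per 1000 sq ft

for sqft in (1800, 2500):
    print(f"{sqft} sq ft -> {predict_price_lakhs(sqft):.2f} lakhs")
```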

Q3. Explain Regression line, Scatter plot, Error in prediction and Best fitting line.

Ans

Regression Line:

● It's the straight line that best represents the relationship between the
independent variable (square footage) and the dependent variable
(selling price) in the given data.
● In the example, the regression line equation was y = 18 + 1.2x (where y is
in lakhs).
● It means for every additional 1000 square feet, the predicted selling price
increases by 1.2 lakhs.

Scatter Plot:

● It's a graph that visualizes the relationship between two variables by
plotting individual data points.
● In the house price example, the scatter plot would show each house's
square footage on the x-axis and its selling price on the y-axis.

Error in Prediction:

● It's the difference between the actual selling price of a house and the
price predicted by the regression line.
● Regression models aren't perfect, so there will almost always be some
errors.

Best Fitting Line:

● It's the regression line that minimizes the overall error in prediction,
most commonly measured as the sum of the squared errors.
● Because it captures the overall trend rather than passing through every
point, it can also be used to make predictions for new data.
● Statistical software is often used to find the best-fitting line, using
techniques like least squares regression.
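
The ideas of prediction error and best-fitting line can be illustrated with a small least squares computation. This is only a sketch: the five data points are hypothetical (invented for illustration), and the second candidate line is included purely for comparison.

```python
import numpy as np

# Hypothetical data, invented for illustration.
x = np.array([1.0, 1.5, 1.8, 2.2, 2.5])        # square footage (thousands of sq ft)
y = np.array([19.0, 20.0, 20.3, 20.5, 21.2])   # selling price (lakhs)

# Closed-form least squares estimates for a single predictor.
x_mean, y_mean = x.mean(), y.mean()
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

# Error in prediction: actual value minus value predicted by the line.
predicted = intercept + slope * x
errors = y - predicted
sse = np.sum(errors ** 2)          # sum of squared errors for the fitted line

# Any other line, such as y = 18 + 1.2x, cannot have a smaller sum of
# squared errors on the same data.
other_sse = np.sum((y - (18 + 1.2 * x)) ** 2)

print(f"best-fitting line: y = {intercept:.3f} + {slope:.3f}x")
print(f"SSE of fitted line: {sse:.4f}, SSE of y = 18 + 1.2x: {other_sse:.4f}")
```

On a scatter plot, each error corresponds to the vertical distance between a data point and the line.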

Q4. Explain the concept of Logistic Regression.

Ans

Logistic regression is a supervised machine learning algorithm mainly
used for binary classification. It applies a logistic function, also known
as a sigmoid function, to a linear combination of the independent variables
and produces a probability value between 0 and 1.

For example, suppose we have two classes, Class 0 and Class 1. If the value of
the logistic function for an input is greater than 0.5 (the threshold value),
the input is assigned to Class 1; if it is lower than 0.5, it is assigned to
Class 0. It is referred to as regression because it is an extension of linear
regression, but it is mainly used for classification problems.
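
The sigmoid function and the 0.5 threshold can be sketched as follows. The input values in z are hypothetical, and assigning an output of exactly 0.5 to Class 0 is an assumption made here, since the rule above only covers values greater or lower than the threshold.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

def classify(z, threshold=0.5):
    """Return 1 where the probability exceeds the threshold, otherwise 0."""
    return (sigmoid(np.asarray(z, dtype=float)) > threshold).astype(int)

# Hypothetical values of the linear combination of the inputs.
z = np.array([-2.0, 0.0, 3.0])
probs = sigmoid(z)
labels = classify(z)

print("probabilities:", np.round(probs, 3))
print("classes:", labels)
```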

The difference between linear regression and logistic regression is that
linear regression outputs a continuous value that can be anything,
while logistic regression predicts the probability that an instance belongs
to a given class.

It is used for predicting the categorical dependent variable using a given set of
independent variables.

● Logistic regression predicts the output of a categorical dependent
variable. Therefore the outcome must be a categorical or discrete
value.
● It can be either Yes or No, 0 or 1, True or False, etc. but instead of
giving the exact value as 0 and 1, it gives the probabilistic values
which lie between 0 and 1.
● Logistic Regression is very similar to Linear Regression; the difference
lies in how they are used. Linear Regression is used for solving regression
problems, whereas Logistic Regression is used for solving classification
problems.
● In Logistic Regression, instead of fitting a regression line, we fit an “S”
shaped logistic function whose output is bounded between the two limiting
values 0 and 1.
● The sigmoid function is a mathematical function used to map the
predicted values to probabilities.
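
Fitting the "S" shaped curve can be sketched with plain gradient descent on the log loss. This is one possible approach under stated assumptions, not the only way to fit the model: the data (hours studied against pass/fail), the learning rate, and the iteration count are all hypothetical choices made for illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical data, invented for illustration:
# hours studied (x) and whether the student passed (1) or failed (0).
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
y = np.array([0, 0, 0, 1, 0, 1, 1, 1])

w, b = 0.0, 0.0          # slope and intercept of the linear part
learning_rate = 0.1      # assumed step size

for _ in range(5000):
    p = sigmoid(w * x + b)              # predicted probabilities
    grad_w = np.mean((p - y) * x)       # gradient of the log loss w.r.t. w
    grad_b = np.mean(p - y)             # gradient of the log loss w.r.t. b
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

probs = sigmoid(w * x + b)
preds = (probs > 0.5).astype(int)       # apply the 0.5 threshold

print(f"w = {w:.3f}, b = {b:.3f}")
print("probabilities:", np.round(probs, 2))
print("predicted classes:", preds)
```

The fitted probabilities rise smoothly from near 0 to near 1 as x increases, which is the "S" shape described above.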
