Objects Oriented Programming OOP
Example of Objects Oriented Programming are:
Class
Class, Module and Package or Library
A�er running this code, the data from CSV will be load into the variable ‘data’
Since we yse panda, The data variable will be automa�cally converrted to a
dataFrame.
Steps 2 : type Data
Steps 3:
data.describe() to get the the sta��cal summary of the data. It a
panda method to get useful sta��c informa�on about the data
Steps 4:
Define y and x1 Values
We represent x0 as 1
In terms of code for crea�ng the value of x0, StatModel uses
the method add constant
Steps 5:
Crea�ng Regression
Type results.summary() to see more stat data table
Steps 6:
Create the scater plot using matplotlib
Steps 7:
Create the scater plot using Seaborn
Import seaborn as sns
Sns.set()
Then restart your Kernel and re-run the code
This equa�on form the Regression lIne
We can get the predicted value of GPA for any SAT score now by calcula�ng or tracing the value from
the regression line
OLS is the most common method for es�ma�ng linear
regression equa�on,
OLS is the most
common method for
es�ma�ng linear
regression equa�on
Don’t just jump into regression for analysis, Cri�cal thinking is crucial
tools that let you determine factors that affect the outcome of your
predicted outcome.
In the case of GPA score; gender, finance, and marital status may
affect the GPA score of students
Scien�st even dicover gender gap occur in educa�on.
MULTI LINEAR REGRESSION AND ADJUSTABLE R- SQUARED
Step 1
Step 2
The variable Rand 1,2,3 we added affected
the value of R-Square and it Adjustable R-
Square. We added information but lost
value.
Out Conclusion about the varable rand 1,2,3 is to drop
that variable.
Tes�ng for significant with F-sta�s�c
This is another tool that allows us to compare models for analysis
The biggest mistakes you can make is to perform regression
that violate one of these regression assump�ons!
Linear Regression
No Endogeneity
This problem is refers to as omited variable bias.
Predic�ng the value of an real estate in London where the
small apartment are expensive more than the bigger
apartments.
Solu�on with cri�cal thinking and ques�oning
Inpu�ng the missing variables
In conclusion
If the mean of the normality is not zero then the line is not a
best fi�ng one. In real life, it is usual to violate this part of
assup�ons.
Example Poor man and Rich men spending habity lead to Heteroscedas�city
To Prevent Heteroscedas�city we use log transforma�on, look for omited
variable, and remove outliers.
Log Transformation: Assumption_3
How to detect Autocorrelation
Plot all the residual data on grapgh and look is they is no patern. Then you are safe from
autocorrela�on.
Another way to detect autocorrela�on is the Durbin watson
Don’t use linear regression when the error are autocorrelated
or when the data is a �mestamp data.
We experience mul�collinearity when two or more valuable
have a high correla�on.
Preven�ons: Before crea�ng a mul� linear regression model, always
find the correla�on between each two pairs of independent variables.
Data Sample for convert string category data to number category data for our
regression model
Step 1
Step 2
Step 3
Step 4
Print out these
data
data.describe()
Step 5
Create Regression mode
The Comparism with the old data give a significant difference
Step 6
Plot Yhat_yes and Yhat_no model
Step 7
Change the color of the student that atended to green and red
for those that did not atend class
Plot all the three model for the data
Step 8
Predic�ng from our design regression model
X is the data frame of the model
Let create a new data frame and organize it in the same way as x
DataFrame are normally arrange in alphabe�cal order by default
Simple Linear Regression
They’re three type of machine Learning
Supervise
Unsupervised
Reinforcement
The Solu�on
Mul�linear Regression
Note: We didn’t get an error as the dimension is already a mul�ple linear regression, Sklearn is design to
work perfectly with arrays.
Print adjusted_r2
Features Selce�on