
Sample Lab File

The document outlines a lab practical focused on using linear regression to predict the weight of Savannah sparrows based on their wing length. It includes objectives, theoretical background, assumptions, dependencies, and a sample code for implementing linear regression. Additionally, it provides a lab assignment on predicting student scores and includes quiz and viva questions related to regression concepts.

Uploaded by

andrewtatehey
Copyright
© All Rights Reserved

Shri Vaishnav Vidyapeeth Vishwavidyalaya, Indore

Shri Vaishnav Institute of Information Technology

Sample Lab Practical

1. Title:- A scientist has collected data on a stratified random sample of 116 Savannah
sparrows at Kent Island. The weight (in grams) and wing length (in mm) were obtained for
birds from nests that were reduced, controlled, or enlarged. Is wing length a significant linear
predictor of weight for Savannah sparrows?

2. Outcome:- Must be able to predict the weight of a sparrow when unseen wing lengths are given.

3. Objectives
Understand the relationship between the observations (input data) and the response variable.

4. Nomenclature, theory with self-assessment questionnaire:-


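4.1 Nomenclature:
Y Response variable (dependent variable)
X Input data (explanatory variable)
a Bias (intercept of the regression line)
b Weight associated with the input (slope of the regression line)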

4.2 Solution:
Linear regression attempts to model the relationship between two variables by fitting
a linear equation to observed data. One variable is considered to be an explanatory
variable, and the other is considered to be a dependent variable. For example, a
modeler might want to relate the weights of individuals to their heights using a linear
regression model.
Before attempting to fit a linear model to observed data, a modeler should first
determine whether or not there is a relationship between the variables of interest. This
does not necessarily imply that one variable causes the other (for example, higher
SAT scores do not cause higher college grades), but that there is some significant
association between the two variables. A scatterplot can be a helpful tool in
determining the strength of the relationship between two variables. If there appears to
be no association between the proposed explanatory and dependent variables (i.e., the
scatterplot does not indicate any increasing or decreasing trends), then fitting a linear
regression model to the data probably will not provide a useful model. A valuable
numerical measure of association between two variables is the correlation coefficient,
which is a value between -1 and 1 indicating the strength of the association of the
observed data for the two variables.
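
The correlation coefficient described above can be computed directly from its definition. The following is a minimal sketch using NumPy; the function name pearson_r and the measurement values are made up for illustration and are not the sparrow data.

```python
import numpy as np


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations of x from its mean
    dy = y - y.mean()  # deviations of y from its mean
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


# Made-up illustrative values, NOT the actual sparrow measurements.
wing_mm = [23.0, 25.0, 27.0, 29.0, 31.0]
weight_g = [13.1, 14.0, 15.2, 15.9, 17.1]

r = pearson_r(wing_mm, weight_g)
print(f"r = {r:.3f}")
```

A value of r close to 1 or -1 suggests a strong linear association, while a value near 0 suggests that a straight line is unlikely to describe the data well.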

A linear regression line has an equation of the form Y = a + bX, where X is the
explanatory variable and Y is the dependent variable. The slope of the line is b, and a
is the intercept (the value of Y when X = 0).
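
As a rough illustration of how a and b in Y = a + bX can be estimated, the sketch below uses the ordinary least-squares formulas b = Sxy / Sxx and a = mean(Y) - b * mean(X). The helper name fit_line and the sample points are assumptions chosen for this example only.

```python
import numpy as np


def fit_line(x, y):
    """Least-squares estimates (a, b) for the line Y = a + b*X."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    # Slope: covariance of X and Y divided by the variance of X.
    b = (dx * (y - y.mean())).sum() / (dx ** 2).sum()
    # Intercept: the fitted line passes through the point of means.
    a = y.mean() - b * x.mean()
    return float(a), float(b)


# Hypothetical points lying exactly on Y = 1 + 2X.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]

a, b = fit_line(x, y)
print(f"intercept a = {a:.2f}, slope b = {b:.2f}")
```

Because the sample points lie exactly on a line, the recovered intercept and slope match that line; with real measurements the fit minimizes the sum of squared vertical distances instead.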

4.3 Assumptions:
4.3.1 Linearity: The relationship between X and the mean of Y is linear.
4.3.2 Homoscedasticity: The variance of the residuals is the same for any value of X.
4.3.3 Independence: Observations are independent of each other.
4.3.4 Normality: For any fixed value of X, Y is normally distributed.
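
One informal way to probe these assumptions is to inspect the residuals of a fitted line. The sketch below is only a rough check under stated assumptions: the data are synthetic (generated with a fixed seed), the helper residual_summary is hypothetical, and comparing the residual spread in the lower and upper halves of X is a simple heuristic rather than a formal test.

```python
import numpy as np


def residual_summary(x, y):
    """Fit Y = a + b*X and summarize the residuals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b, a = np.polyfit(x, y, 1)  # polyfit returns slope first, then intercept
    resid = y - (a + b * x)
    order = np.argsort(x)       # sort indices so the halves are split by X
    half = len(x) // 2
    low, high = resid[order[:half]], resid[order[half:]]
    return {
        "mean_residual": float(resid.mean()),
        "std_low_x": float(low.std(ddof=1)),
        "std_high_x": float(high.std(ddof=1)),
    }


# Synthetic data with constant noise variance (an assumption of this sketch).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, size=x.size)

summary = residual_summary(x, y)
print(summary)
```

If the two standard deviations differ greatly, the homoscedasticity assumption may be doubtful; a residual plot against X is usually the next thing to look at.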
4.4 Dependencies
4.4.1 NumPy
4.4.2 Matplotlib
4.4.3 scikit-learn (imported as sklearn)
4.5 Code/ Pseudo Code
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score

# Load the dataset. Assumption: scikit-learn's built-in diabetes data,
# loaded with load_diabetes, is used as the example dataset.
X, y = datasets.load_diabetes(return_X_y=True)

# Use only one feature (column 2), kept as a 2-D array of shape (n_samples, 1)
X = X[:, np.newaxis, 2]

# Split the data into training/testing sets (the last 20 samples are held out)
X_train = X[:-20]
X_test = X[-20:]

# Split the targets into training/testing sets
y_train = y[:-20]
y_test = y[-20:]

# Create linear regression object
regr = linear_model.LinearRegression()

# Train the model using the training sets
regr.fit(X_train, y_train)

# Make predictions using the testing set
y_pred = regr.predict(X_test)

# The coefficients
print('Coefficients: \n', regr.coef_)
# The mean squared error
print('Mean squared error: %.2f'
      % mean_squared_error(y_test, y_pred))
# The coefficient of determination: 1 is perfect prediction
print('Coefficient of determination: %.2f'
      % r2_score(y_test, y_pred))

# Plot outputs: test points in black, fitted line in blue
plt.scatter(X_test, y_test, color='black')
plt.plot(X_test, y_pred, color='blue', linewidth=3)

plt.xticks(())
plt.yticks(())

plt.show()
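
The two metrics printed by the code above can also be written out by hand, which may help when interpreting them. This is a minimal sketch of the standard definitions using NumPy only; the function names mse and r2 and the small arrays are illustrative assumptions, not part of the lab code.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error: average of the squared prediction errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = ((y_true - y_pred) ** 2).sum()         # unexplained variation
    ss_tot = ((y_true - y_true.mean()) ** 2).sum()  # total variation
    return float(1.0 - ss_res / ss_tot)


# Hypothetical targets and predictions, for illustration only.
y_true = [1.0, 2.0, 3.0]
perfect = [1.0, 2.0, 3.0]
mean_only = [2.0, 2.0, 2.0]  # always predicts the mean of y_true

print(mse(y_true, perfect), r2(y_true, perfect))
print(mse(y_true, mean_only), r2(y_true, mean_only))
```

A model that always predicts the mean of the targets scores 0 on the coefficient of determination, which is why values near 1 indicate that the fitted line explains most of the variation.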

4.6 Results
4.6.1 Test Case
Input: 4 Output: 6
Input: 7 Output: 8
4.6.2 Result Analysis
4.6.2.1 Advantages
4.6.2.2 Issues
4.7 References:

5. Lab Assignment: In this regression task you have to predict the percentage of marks that a
student is expected to score based upon the number of hours they studied. This is a simple
linear regression task as it involves just two variables.
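
One possible starting point for this assignment is sketched below. It assumes the data are available as two equal-length lists; the hours and scores shown are invented placeholders, not a real dataset, and the helper predict_score is hypothetical.

```python
import numpy as np

# Invented placeholder data: hours studied and percentage scored.
hours = np.array([1.5, 2.5, 3.0, 4.5, 5.5, 6.0, 7.5, 8.5])
scores = np.array([20.0, 30.0, 33.0, 48.0, 58.0, 62.0, 76.0, 87.0])

# Fit score = a + b * hours by least squares (polyfit returns slope first).
b, a = np.polyfit(hours, scores, 1)


def predict_score(h):
    """Predicted percentage for h hours of study, using the fitted line."""
    return float(a + b * h)


print(f"score = {a:.2f} + {b:.2f} * hours")
print(f"predicted score for 5 hours: {predict_score(5.0):.1f}")
```

For the actual submission, the same steps can be carried out with the scikit-learn workflow from section 4.5, including a held-out test set for evaluation.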

6. Quiz & Viva Questions


6.1 Quiz:

• Which of the following methods do we use to find the best-fit line for data in Linear
Regression?
(a) Least Square Error
(b) Maximum Likelihood
(c) Logarithmic Loss
(d) Both (a) and (b)
• Which of the following evaluation metrics can be used to evaluate a model while modeling a
continuous output variable?
(a) AUC-ROC
(b) Accuracy
(c) Logloss
(d) Mean Squared Error
6.2 Viva
• What is regression?
• What is the difference between observation and response variable?
• What is the expectation of response variable?
• What do you mean by residuals?
