BS UNIT-3 (1)
CLASS NOTES
M.B.A 1ST YEAR (SEMESTER-2)
SUBJECT CODE- KMBN104
BUSINESS STATISTICS AND ANALYSIS
UNIT -3
LDC GROUP OF INSTITUTIONS OFFERS: B.TECH., MBA, MCA, POLYTECHNIC DIPLOMA, ITI, BBA & BCA
n= number of observations
Assumptions
The coefficient of correlation cannot take a value less than -1 or more than
+1. Symbolically,
-1 <= r <= +1 or | r | <= 1.
The coefficient of correlation is independent of change of origin: if we
subtract any constant from all the values of X and Y, the coefficient of
correlation is not affected.
The coefficient of correlation is also independent of change of scale: if we
divide or multiply all the values of X and Y by a positive constant, the
coefficient of correlation is not affected.
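The two invariance properties above can be checked numerically. The following is a minimal sketch in Python; the data values and the helper name pearson_r are assumptions chosen only for illustration and do not come from these notes.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, for illustration only.
x = [2, 4, 5, 7, 9]
y = [10, 14, 13, 19, 22]

r_original = pearson_r(x, y)
# Change of origin: subtract a constant from every X and every Y.
r_shifted = pearson_r([xi - 5 for xi in x], [yi - 15 for yi in y])
# Change of scale: multiply X and divide Y by positive constants.
r_scaled = pearson_r([xi * 10 for xi in x], [yi / 2 for yi in y])

print(r_original, r_shifted, r_scaled)
```

All three printed values should agree up to floating-point rounding, and each lies between -1 and +1.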
Regression Explained
The two basic types of regression are simple linear regression and multiple
linear regression, although there are non-linear regression methods for more
complicated data and analysis. Simple linear regression uses one independent
variable to explain or predict the outcome of the dependent variable Y,
while multiple regression uses two or more independent variables to
predict the outcome.
used regression model in finance for pricing assets and discovering costs
of capital.
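Linear regression: Y = a + bX + u
Multiple regression: Y = a + b1X1 + b2X2 + b3X3 + … + btXt + u
Where:
Y = the dependent variable being predicted or explained.
X = the independent (explanatory) variable.
a = the intercept.
b = the slope.
u = the regression residual (error term).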
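To make the simple linear form Y = a + bX + u concrete, here is a minimal Python sketch that estimates a and b by ordinary least squares and then recovers the residuals u. The data are hypothetical and the variable names are assumptions for illustration; the multiple regression case is not covered here.

```python
# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [3.1, 4.9, 7.2, 8.8, 11.0]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Least squares estimates: b = Sxy / Sxx and a = mean(Y) - b * mean(X).
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
b = sxy / sxx
a = mean_y - b * mean_x

# Residuals u: the part of each Y not explained by a + bX.
u = [yi - (a + b * xi) for xi, yi in zip(x, y)]

print("intercept a =", a)
print("slope b =", b)
print("residuals u =", u)
```

With an intercept in the model, the least squares residuals sum to zero apart from rounding error.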
ASSUMPTIONS IN REGRESSION
REGRESSION LINE
Definition: The Regression Line is the line that best fits the data, such that
the overall distance from the line to the points (variable values) plotted on
a graph is the smallest. In other words, the line that minimizes the sum of
squared deviations of the predictions is called the regression line.
Note: The two regression lines (Y on X and X on Y) cut each other at the
point of the averages of X and Y. This means that if a perpendicular is
drawn from the point of intersection to the X axis, it meets the axis at the
mean value of X. Similarly, if a horizontal line is drawn from the point of
intersection to the Y axis, it meets the axis at the mean value of Y.
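The note above can be verified numerically. The following Python sketch fits both regression lines to hypothetical data (the values and names are assumptions for illustration) and solves for their point of intersection.

```python
# Hypothetical data, for illustration only.
x = [2, 4, 5, 7, 9]
y = [10, 14, 13, 19, 22]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)

# Regression of Y on X: y = a_yx + b_yx * x
b_yx = sxy / sxx
a_yx = mean_y - b_yx * mean_x

# Regression of X on Y: x = a_xy + b_xy * y
b_xy = sxy / syy
a_xy = mean_x - b_xy * mean_y

# Substitute the first line into the second and solve for x.
# This needs b_xy * b_yx != 1, i.e. the correlation is not perfect.
x_cross = (a_xy + b_xy * a_yx) / (1 - b_xy * b_yx)
y_cross = a_yx + b_yx * x_cross

print("intersection:", x_cross, y_cross)
print("means:       ", mean_x, mean_y)
```

The two printed pairs should match up to rounding. When the correlation is perfect the two lines coincide and the division step is undefined.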
represented as
6. The regression coefficients are independent of the change of
origin, but not of the scale. By origin, we mean that there will
be no effect on the regression coefficients if any constant is
subtracted from the values of X and Y. By scale, we mean
that if the values of X and Y are either multiplied or divided by
some constant, then the regression coefficients will also
change.
Thus, all these properties should be kept in mind while solving for the
regression coefficients.
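Property 6 can be illustrated with a short Python sketch. The data and the helper name slope_y_on_x are assumptions for illustration only; the sketch shifts the origin of both variables and then rescales X alone.

```python
def slope_y_on_x(x, y):
    """Regression coefficient of Y on X, computed as Sxy / Sxx."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical data, for illustration only.
x = [2, 4, 5, 7, 9]
y = [10, 14, 13, 19, 22]

b_original = slope_y_on_x(x, y)
# Change of origin: subtract constants from X and Y.
b_shifted = slope_y_on_x([xi - 5 for xi in x], [yi - 15 for yi in y])
# Change of scale: multiply every X by 10, leave Y unchanged.
b_scaled = slope_y_on_x([xi * 10 for xi in x], y)

print(b_original, b_shifted, b_scaled)
```

The shifted coefficient equals the original one, while multiplying X by 10 divides the coefficient of Y on X by 10.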
The value of a correlation coefficient can vary from minus one to plus one.
A minus one indicates a perfect negative correlation, while a plus one
indicates a perfect positive correlation. A correlation of zero means there
is no linear relationship between the two variables. When there is a negative
correlation between two variables, as the value of one variable increases,
the value of the other variable decreases, and vice versa. In other words,
for a negative correlation, the variables move in opposite directions. When
there is a positive correlation between two variables, as the value of one
variable increases, the value of the other variable also increases. The
variables move together.
Example
A company wanted to know if there is a significant relationship between
the total number of salespeople and the total number of sales. It
collected data for five months.
Variable 1    Variable 2
   207           6907
   180           5991
   220           6810
   205           6553
   190           6190
——————————–
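As a sketch of how this example could be worked, the Python code below computes the Pearson correlation coefficient for the five months of data and the t-statistic used to test it. It assumes that Variable 1 is the number of salespeople and Variable 2 is the number of sales; the variable names are illustrative.

```python
from math import sqrt

# Data from the example; the column meanings are an assumption.
salespeople = [207, 180, 220, 205, 190]
sales = [6907, 5991, 6810, 6553, 6190]

n = len(salespeople)
mean_x = sum(salespeople) / n
mean_y = sum(sales) / n

sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(salespeople, sales))
sxx = sum((xi - mean_x) ** 2 for xi in salespeople)
syy = sum((yi - mean_y) ** 2 for yi in sales)

r = sxy / sqrt(sxx * syy)

# t-statistic for testing whether the true correlation is zero,
# with n - 2 degrees of freedom.
t_stat = r * sqrt(n - 2) / sqrt(1 - r ** 2)

print("r =", round(r, 3))
print("t =", round(t_stat, 3), "with", n - 2, "degrees of freedom")
```

The correlation comes out at roughly 0.92. The t-statistic would then be compared with the critical value from a t table for 3 degrees of freedom (3.182 for a two-tailed test at the 5% level).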
Another Example
Since both variables are ordinal, Spearman’s method is chosen. The first
variable is the rating for the quality of the product. Responses are coded as
4=excellent, 3=good, 2=fair, and 1=poor. The second variable is the
perceived reputation of the company and is coded 3=good, 2=fair, and
1=poor.
Variable 1    Variable 2
    4             3
    2             2
    1             2
    3             3
    4             3
    1             1
    2             1
——————————————-
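One way to compute Spearman's coefficient for this example is sketched below in Python. Because the ratings contain many ties, the sketch assigns average ranks to tied values and then applies the Pearson formula to the ranks; this is a common treatment of ties, but it is an assumption here, not necessarily the method used in the original notes. The helper names are illustrative.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from smallest to largest; tied values share the average rank."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1   # 1-based position of the first occurrence
        count = ordered.count(v)       # number of tied observations
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

# Data from the example above.
quality = [4, 2, 1, 3, 4, 1, 2]      # 4=excellent ... 1=poor
reputation = [3, 2, 2, 3, 3, 1, 1]   # 3=good ... 1=poor

rho = pearson_r(average_ranks(quality), average_ranks(reputation))
print("Spearman rho =", round(rho, 3))
```

With average ranks for ties, the coefficient is roughly 0.82, which points to a strong positive association between perceived quality and reputation in this small sample.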
Regression Analysis
Simple regression is used to examine the relationship between one
dependent and one independent variable. After performing an analysis,
the regression statistics can be used to predict the dependent variable
when the independent variable is known. Regression goes beyond
correlation by adding prediction capabilities.
The regression line (known as the least squares line) is a plot of the
expected value of the dependent variable for all values of the
independent variable. Technically, it is the line that “minimizes the
squared residuals”. The regression line is the one that best fits the data
on a scatterplot.
y = a + bx + e
The significance of the slope of the regression line is determined from the
t-statistic. The significance level is the probability that a slope at least
as large as the one observed would occur by chance if the true slope (and
hence the true correlation) were zero. Some researchers prefer to report the
F-ratio instead of the t-statistic. In simple regression, the F-ratio is
equal to the t-statistic squared.
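The relationship between the t-statistic of the slope and the F-ratio can be checked with a short Python sketch. The data are hypothetical and the variable names are assumptions for illustration; the formulas are the usual ones for simple regression with n - 2 degrees of freedom.

```python
from math import sqrt

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2.3, 4.1, 6.2, 7.8, 10.1, 11.7]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

slope = sxy / sxx
intercept = mean_y - slope * mean_x
fitted = [intercept + slope * xi for xi in x]

sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual sum of squares
ssr = sum((fi - mean_y) ** 2 for fi in fitted)           # regression sum of squares
mse = sse / (n - 2)                                      # mean squared error

t_stat = slope / sqrt(mse / sxx)   # slope divided by its standard error
f_ratio = ssr / mse                # regression mean square over error mean square

print("t =", t_stat, " t squared =", t_stat ** 2, " F =", f_ratio)
```

The printed values of t squared and F should agree up to rounding.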
On the other hand, take an example where the slope is zero. It has no
prediction ability because for every value of the independent variable, the
prediction for the dependent variable would be the same. Knowing the
value of the independent variable would not improve our ability to predict
the dependent variable. Thus, if the slope is not significantly different from
zero, do not use the model to make predictions.
The standard error of the estimate for regression measures the amount of
variability in the points around the regression line. It is the standard
deviation of the data points as they are distributed around the regression
line. The standard error of the estimate can be used to develop
confidence intervals around a prediction.
Example
4.2 27.1
6.1 30.4
3.9 25.0
5.7 29.7
7.3 40.1
5.9 28.8
————————————————–
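A possible way to work this example is sketched below in Python. It assumes that the first column is the independent variable and the second column is the dependent variable, since the notes do not label them; the names x and y and the prediction point 5.0 are likewise illustrative. The sketch fits the least squares line, computes the standard error of the estimate, and makes one prediction.

```python
from math import sqrt

# Data from the example; treating column 1 as X and column 2 as Y is an assumption.
x = [4.2, 6.1, 3.9, 5.7, 7.3, 5.9]
y = [27.1, 30.4, 25.0, 29.7, 40.1, 28.8]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Standard error of the estimate: spread of the points around the fitted line.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
std_error_estimate = sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# Prediction for a hypothetical new value of the independent variable.
predicted = intercept + slope * 5.0

print("y =", round(intercept, 2), "+", round(slope, 2), "x")
print("standard error of the estimate =", round(std_error_estimate, 2))
print("prediction at x = 5.0:", round(predicted, 2))
```

Under these assumptions the fitted line is roughly y = 9.87 + 3.68x, with a standard error of the estimate of about 2.6.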