Simple linear regression analyzes the relationship between two continuous variables, where one variable is considered the independent or predictor variable (X) and the other is the dependent or response variable (Y). The regression line that best fits the data is the line for which the sum of the squared residuals is minimized. The correlation coefficient measures the strength and direction of the linear relationship between the two variables.
9.1 Simple Linear Regression Analysis
Regression is concerned with describing the nature of the relationship between variables and using it to obtain the best approximate value of one variable corresponding to a known value of the other. Simple linear regression fits a straight line (the regression line) to a sample of data on two variables, expressed as an equation, so that given the value of one variable we can predict the value of the other. Of the two variables under study, one may represent the cause and the other the effect. The variable representing the cause is known as the independent (predictor or regressor) variable and is usually denoted by X. The variable representing the effect is known as the dependent (predicted) variable and is usually denoted by Y.
The simple linear regression of Y on X in the population is given by:
    Y = α + βX + ε
where
    α = the y-intercept
    β = the slope of the line, or regression coefficient
    ε = the error term
The y-intercept α and the regression coefficient β are population parameters. We obtain estimates of α and β from the sample; their estimators are denoted by a and b, respectively. The fitted regression line is thus
    Ye = a + bX
The difference between the observed and the expected values, Y − Ye, is known as the error or residual and is denoted by e. The best-fitting line is the one for which the sum of squares of the residuals, ∑e², is a minimum. For this purpose the method of least squares is used. According to the principle of least squares, one selects a and b such that
    ∑e² = ∑(Y − Ye)² = ∑(Y − a − bX)²
is a minimum. To minimize this function, we take the partial derivatives of ∑e² with respect to a and b and set them equal to zero. Solving the resulting pair of equations gives
    b = ∑(X − X̄)(Y − Ȳ) / ∑(X − X̄)²
    a = Ȳ − bX̄
where X̄ and Ȳ are the sample means of X and Y.
Regression analysis is useful in predicting the value of one variable from given values of another variable. The measure of the degree of relationship between two continuous variables is known as the correlation coefficient.
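As a rough illustration of the least-squares principle described above, the following Python sketch computes a and b from the standard closed-form solution obtained by setting the partial derivatives to zero, and then checks that nudging either estimate increases ∑e². The function names and the small data set are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Return least-squares estimates (a, b) for the line Ye = a + b*X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-products and sum of squares about the means
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx          # slope (regression coefficient)
    a = y_bar - b * x_bar  # y-intercept
    return a, b


def sum_squared_residuals(xs, ys, a, b):
    """Return the sum of e^2, where e = Y - (a + b*X)."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, for illustration only
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, a, b)

# Moving the intercept or the slope away from the estimates
# should only increase the sum of squared residuals.
shifted = sum_squared_residuals(xs, ys, a + 0.1, b)
tilted = sum_squared_residuals(xs, ys, a, b + 0.1)

print(f"a = {a:.3f}, b = {b:.3f}, sum of e^2 = {best:.4f}")
print("fitted line is the minimum:", best < shifted and best < tilted)
```

The comparison with the shifted and tilted lines is only a spot check, not a proof; the minimum itself follows from the derivative argument.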
The population correlation coefficient is represented by ρ and its estimator by r. r is given as the ratio of the covariance of the variables x and y to the product of the standard deviations of x and y. Symbolically,
    r = cov(x, y) / (s_x · s_y) = ∑(x − x̄)(y − ȳ) / √[∑(x − x̄)² · ∑(y − ȳ)²]
where s_x and s_y are the sample standard deviations of x and y. The correlation coefficient always lies between −1 and +1, i.e. −1 ≤ r ≤ 1.
Interpretation:
r = +1 indicates a perfect positive linear relationship between X and Y.
r = −1 indicates a perfect negative linear relationship between X and Y.
r = 0 implies there is no linear relationship between the two variables X and Y.
As r approaches −1 or +1, it indicates a strong (negative or positive) linear relationship between the two variables.
As r approaches 0, it indicates a weak linear relationship between the two variables.
Trace metals in drinking water affect the flavor of the water, and unusually high concentrations can pose a health hazard. The following table shows trace-metal concentrations (zinc, in mg/L) for both surface water and bottom water at six different river locations. Our aim is to see whether surface water concentration (x) is predictive of bottom water concentration (y).
Location             1     2     3     4     5     6
Bottom (y), mg/L     0.43  0.27  0.58  0.53  0.71  0.72
Surface (x), mg/L    0.42  0.24  0.39  0.41  0.61  0.61
a) Estimate the regression parameters, fit the regression line and interpret the coefficients.
b) Estimate the bottom water concentration for a location with a surface water concentration of 0.5 mg/L.
c) Calculate the correlation coefficient and the coefficient of determination, and provide your interpretation.
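One way to check hand calculations for parts (a) to (c) is a short Python sketch using the zinc data from the table. It assumes the usual least-squares formulas for the slope and intercept and takes the coefficient of determination as r²; the variable names are illustrative and not part of the original exercise.

```python
import math

# Zinc concentrations (mg/L) from the table: x = surface, y = bottom
surface = [0.42, 0.24, 0.39, 0.41, 0.61, 0.61]
bottom = [0.43, 0.27, 0.58, 0.53, 0.71, 0.72]

n = len(surface)
x_bar = sum(surface) / n
y_bar = sum(bottom) / n

# Sums of squares and cross-products about the means
sxx = sum((x - x_bar) ** 2 for x in surface)
syy = sum((y - y_bar) ** 2 for y in bottom)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(surface, bottom))

# (a) Least-squares estimates of the slope and intercept
b = sxy / sxx
a = y_bar - b * x_bar

# (b) Predicted bottom concentration at a surface concentration of 0.5 mg/L
y_at_half = a + b * 0.5

# (c) Correlation coefficient and coefficient of determination
r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2

print(f"(a) Ye = {a:.4f} + {b:.4f} X")
print(f"(b) predicted bottom concentration at x = 0.5: {y_at_half:.2f} mg/L")
print(f"(c) r = {r:.4f}, r^2 = {r_squared:.4f}")
```

With these six observations the sketch gives b ≈ 1.125 and a ≈ 0.0375, a predicted bottom concentration of about 0.60 mg/L at x = 0.5 mg/L, r ≈ 0.93 and r² ≈ 0.87. In words, each 1 mg/L increase in surface concentration is associated with an estimated 1.125 mg/L increase in bottom concentration, and about 87% of the variation in bottom concentration is accounted for by the linear relationship with surface concentration.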