Activity #3' Answer (Yellchin P. Semblante)

The document provides answers to multiple questions about statistical concepts and methods for analyzing relationships between variables, including the differences between correlation and regression, Pearson and Spearman correlation, Kendall rank correlation, regression analysis, and other statistical topics.

Cebu Technological University

Main Campus
Corner M.J. Cuenco Ave.
R.Palma St., Cebu City 6000
Graduate School Studies

NAME: YELLCHIN P. SEMBLANTE

STUDENT I.D. NUMBER:1209253

PROGRAM/COURSE: DOCTOR IN DEVELOPMENT EDUCATION (Dev.Ed.D.)

SUBJECT: MULTIVARIATE ANALYSIS

PROFESSOR: DR. MA. MYRNA PEPITO

DATE ACCOMPLISHED: March 4, 2023

Answer completely the following questions:

    1. Differentiate correlation from regression.

Answer: Correlation measures the strength and direction of the linear relationship between two variables. As the name implies, it establishes the connection, or co-relationship, between the variables, and it does not distinguish between independent and dependent variables. Regression, in contrast, expresses the relationship in the form of an equation and explains how an independent variable is numerically associated with the dependent variable.
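
To make the distinction concrete, here is a minimal Python sketch on a small made-up data set (the names hours and score and all values are hypothetical, not taken from the activity). It suggests that the correlation coefficient is the same whichever variable is listed first, while the regression equation changes when the independent and dependent variables are swapped.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def regress(x, y):
    """Least-squares (slope, intercept) for predicting y from x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

hours = [1, 2, 3, 4, 5]        # hypothetical independent variable
score = [52, 55, 61, 64, 68]   # hypothetical dependent variable

# Correlation treats both variables symmetrically.
r_xy = pearson_r(hours, score)
r_yx = pearson_r(score, hours)

# Regression depends on which variable is being predicted.
slope, intercept = regress(hours, score)
slope_rev, intercept_rev = regress(score, hours)

print(f"r = {r_xy:.3f} in both directions")
print(f"score = {intercept:.1f} + {slope:.1f} * hours")
print(f"hours = {intercept_rev:.2f} + {slope_rev:.3f} * score")
```

With these numbers the fitted equation is score = 47.7 + 4.1 × hours, while r is about 0.99 in either direction.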

   2. What is the difference between Pearson Product Moment Correlation and

Spearman Rank Correlation?

Answer: The Pearson correlation coefficient, often known as Pearson's r or the bivariate correlation, is a statistic that assesses the linear correlation between two variables X and Y. Its value ranges from -1 to +1. The Pearson correlation can only assess a linear relationship between two continuous variables (a relationship is linear when a change in one variable is associated with a proportional change in the other variable). Spearman's rank correlation coefficient, also known as Spearman's rho and named after Charles Spearman, is a nonparametric measure of the statistical dependence between the rankings of two variables. It evaluates how well a monotonic function can describe the relationship between two variables. The Spearman correlation uses the ranked values of each variable rather than the raw data, so it can assess a monotonic relationship between two continuous or ordinal variables.
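
The contrast can be sketched in Python with a relationship that is monotonic but not linear. The data below are invented for illustration, and the ranks helper assumes there are no tied values; Spearman's coefficient is computed here as the Pearson correlation of the ranks.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks 1..n of the values (assumption: no tied values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman_rho(x, y):
    """Spearman's coefficient as the Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]   # always increasing, but along a curve

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}")
```

Here rho equals 1 because the two rankings agree perfectly, while r is about 0.94 because the points do not lie on a straight line.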

3. Discuss and give your idea about the following:

a. Kendall Rank Correlation Coefficient

Answer: Kendall's coefficient, also known as the Kendall rank correlation coefficient or Kendall's tau, is a statistic used to measure the ordinal association between two measured quantities. The test based on the coefficient is a non-parametric hypothesis test for statistical dependence. The rank correlation coefficient from Kendall (1955) measures how similar two sets of ranks given to the same collection of objects are. The coefficient depends on the number of inversions of pairs of objects required to change one rank order into the other. To do this, each rank order is represented by the set of all pairs of objects (for example, [a,b] and [b,a] are the two pairs that represent the objects a and b), and a value of 1 or 0 indicates whether or not the order of the pair matches the way the two objects are ordered in that ranking.
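
The pair-counting idea can be sketched in Python as follows. This is a minimal illustration that assumes no tied ranks (the tau-a form of the coefficient), and the two judges' rankings are hypothetical.

```python
from itertools import combinations

def kendall_tau(rank_a, rank_b):
    """Kendall's tau-a: (concordant - discordant) / total pairs, no ties assumed."""
    concordant = discordant = 0
    for i, j in combinations(range(len(rank_a)), 2):
        # A pair is concordant when both rankings order objects i and j the same way.
        product = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(rank_a) * (len(rank_a) - 1) // 2
    return (concordant - discordant) / n_pairs

judge_1 = [1, 2, 3, 4]   # hypothetical ranks given to four objects
judge_2 = [1, 3, 2, 4]   # same objects, with one adjacent pair swapped

tau = kendall_tau(judge_1, judge_2)
print(f"tau = {tau:.2f}")
```

The two rankings differ by a single inversion, so 5 of the 6 pairs are concordant and tau = (5 - 1)/6, or about 0.67.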

b. Rank-Biserial Coefficient

Answer: The rank-biserial correlation coefficient, $r_{rb}$, is used when one variable is dichotomous (nominal) and the other consists of rankings (ordinal). The formula is typically expressed as $r_{rb} = 2(\bar{Y}_1 - \bar{Y}_0)/n$, where n is the number of data pairs and $\bar{Y}_0$ and $\bar{Y}_1$ are the means of the Y scores for data pairs with an x score of 0 and 1, respectively. The Y values are ranks. This formula assumes there are no tied ranks. The result may coincide with Somers' D statistic, for which there is an online calculator.
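
As a rough illustration of the formula above, the following Python sketch ranks the Y scores and applies 2 × (mean rank of group 1 − mean rank of group 0) / n. The group labels and scores are invented, and the sketch assumes there are no tied scores.

```python
def rank_biserial(groups, scores):
    """Rank-biserial coefficient from 0/1 group labels and untied scores."""
    n = len(scores)
    # Convert the raw scores to ranks 1..n.
    order = sorted(range(n), key=lambda i: scores[i])
    rank = [0] * n
    for position, idx in enumerate(order, start=1):
        rank[idx] = position
    ranks_1 = [rank[i] for i in range(n) if groups[i] == 1]
    ranks_0 = [rank[i] for i in range(n) if groups[i] == 0]
    mean_1 = sum(ranks_1) / len(ranks_1)
    mean_0 = sum(ranks_0) / len(ranks_0)
    return 2 * (mean_1 - mean_0) / n

groups = [0, 0, 1, 0, 1, 1]          # hypothetical dichotomous variable x
scores = [10, 12, 15, 17, 20, 25]    # hypothetical Y scores, no ties

r_rb = rank_biserial(groups, scores)
print(f"r_rb = {r_rb:.2f}")
```

For these values the mean ranks are 14/3 and 7/3, giving a coefficient of 7/9, or about 0.78. When the two groups do not overlap at all, the same function returns 1.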

c. Regression Analysis

Answer: Regression analysis is a statistical technique that models the relationship between two or more variables. The method, whose result is often shown as a graph, examines the relationship between a dependent variable and one or more independent variables. A regression analysis allows you to predict the value of the dependent variable from the independent variables. For example, the relationship between age and height in growing children can be described using a linear regression model: because height increases steadily with age over that period, the relationship is approximately linear.

d. Contingency Coefficient

Answer: The contingency coefficient estimates the extent, or strength, of the relationship between two variables. It is a measure of association for variables arranged in a contingency table, including cases in which at least one of the variables can only be classified qualitatively. It is computed from the chi-square statistic of the table.
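
A minimal Python sketch of one common form, Pearson's contingency coefficient C = sqrt(chi-square / (chi-square + n)), is given below. The 2 × 2 table of counts is hypothetical, and the choice of this particular coefficient is an assumption, since the answer does not name a formula.

```python
import math

def contingency_coefficient(table):
    """Pearson's contingency coefficient C = sqrt(chi2 / (chi2 + n))."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (chi2 + n))

# Hypothetical counts: rows are one qualitative variable, columns another.
table = [[30, 10],
         [10, 30]]

c = contingency_coefficient(table)
print(f"C = {c:.2f}")
```

For this table the chi-square statistic is 20 with n = 80, so C is about 0.45. A table whose rows are exactly proportional, such as [[10, 20], [20, 40]], gives C = 0.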


e. Correlation Coefficient

Answer: A correlation coefficient is a number ranging from -1 to 1 that indicates the

strength and direction of a relationship between two variables.

In other words, it reflects how similar the measurements of two or more variables are across a dataset. A correlation coefficient is an example of a descriptive statistic: it summarizes sample data without, by itself, allowing you to draw conclusions about the population. When there are two variables, a correlation coefficient is a bivariate statistic, and when there are more than two variables, it is a multivariate statistic.

If your correlation coefficient is based on sample data, an inferential statistic is required to generalize your findings to the population. To do so, you can use an F test or a t test to calculate a test statistic that indicates the statistical significance of your finding.
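
As an illustration of the t test mentioned above, the usual test statistic for the null hypothesis of zero population correlation is t = r × sqrt(n − 2) / sqrt(1 − r²), with n − 2 degrees of freedom. The sketch below only computes the statistic; the values r = 0.6 and n = 27 are made up.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for H0: population correlation is zero (n - 2 degrees of freedom)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r = 0.6    # hypothetical sample correlation
n = 27     # hypothetical sample size

t = correlation_t_statistic(r, n)
degrees_of_freedom = n - 2
print(f"t = {t:.2f} with {degrees_of_freedom} degrees of freedom")
```

The resulting t of 3.75 would then be compared with a t distribution with 25 degrees of freedom, for example using a statistical table or package, to obtain the significance level.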

f. Linear Probability Model

Answer: The Linear Probability Model (LPM) is simply the application of ordinary least squares (OLS) to binary rather than continuous outcomes. Equation 1 illustrates the LPM in the context of experimental impact estimation, where Y is the outcome and T is the variable whose experimental impact is estimated.

A linear probability model (LPM) is a special case of a binary regression model in statistics. In this case, the dependent variable for each observation is either 0 or 1, and the probability of observing a 1 in any given case is treated as a linear function of one or more explanatory variables.
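
The idea can be sketched in Python by fitting an ordinary least squares line to a 0/1 outcome. The data and the names x, y, and predicted_probability are hypothetical; a single explanatory variable is assumed to keep the arithmetic simple.

```python
def fit_ols(x, y):
    """Ordinary least squares (intercept, slope) for one explanatory variable."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

x = [1, 2, 3, 4, 5, 6]   # hypothetical explanatory variable
y = [0, 0, 0, 1, 1, 1]   # binary outcome: each observation is 0 or 1

intercept, slope = fit_ols(x, y)

def predicted_probability(value):
    """Fitted value of the LPM, read as the probability that y equals 1."""
    return intercept + slope * value

p_low = predicted_probability(1)
p_mid = predicted_probability(3.5)
p_high = predicted_probability(6)
print(f"P(y=1) at x=1: {p_low:.2f}, at x=3.5: {p_mid:.2f}, at x=6: {p_high:.2f}")
```

In this example the fitted value is 0.5 at the middle of the data but falls below 0 at x = 1 and above 1 at x = 6, which is a well-known limitation of treating a probability as a straight line.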


g. Simple Linear Regression

Answer: Simple linear regression is used to estimate the relationship between two quantitative variables. You can use it when you want to know how strong the relationship between two variables is (for example, the relationship between rainfall and soil erosion).

In business, linear regression can be used to evaluate trends and make estimates or forecasts. For example, if a company's sales have steadily increased every month for the past few years, the company could forecast sales in future months by fitting a linear regression to its monthly sales data.

h. Concordant

Answer: Concordance is the percentage of data pairs that behave as expected (assuming no ties in relative risk values between any pairs of observations). In this sense, concordance can be viewed as a measure of predictive ability. In twin studies, the concordance rate is a statistical index used by researchers to assess the relative importance of nature and nurture. Because the word concordance means 'to agree,' the concordance rate is the rate of agreement.
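
One way to read "pairs that behave as expected" is sketched below in Python: every observation that experienced the event is paired with every observation that did not, and a pair counts as concordant when the event case has the higher predicted risk. This reading, the function name, and the data are assumptions for illustration, and tied risk values are assumed not to occur.

```python
def percent_concordant(risks, outcomes):
    """Percentage of (event, non-event) pairs in which the event case has the higher risk."""
    events = [r for r, o in zip(risks, outcomes) if o == 1]
    non_events = [r for r, o in zip(risks, outcomes) if o == 0]
    concordant = sum(1 for e in events for ne in non_events if e > ne)
    total_pairs = len(events) * len(non_events)
    return 100 * concordant / total_pairs

risks = [0.9, 0.8, 0.3, 0.6, 0.2]   # hypothetical predicted risks, no ties
outcomes = [1, 1, 1, 0, 0]          # 1 = event occurred, 0 = it did not

pct = percent_concordant(risks, outcomes)
print(f"{pct:.1f}% of pairs are concordant")
```

Here 5 of the 6 possible pairs agree with expectation, so the concordance is about 83.3%.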

i. Coefficient of Correlation

Answer: In a correlation analysis, the correlation coefficient is the specific measure

that quantifies the strength of the linear relationship between two variables. In a

correlation report, the coefficient is represented by the letter r.

How does the correlation coefficient work?

The formula compares the distance of each data point from the mean of its variable and uses this to determine how well the relationship between the two variables can be fit to an imaginary straight line drawn through the data. This is what is meant when we say that correlations look at linear relationships.

j. Coefficient of Determination

Answer: The coefficient of determination (R²) is a number between 0 and 1 that indicates how well a statistical model predicts an outcome. R² can be interpreted as the proportion of variation in the dependent variable that is predicted by the statistical model. The coefficient of determination can also be found using the following formula:

$R^2 = \frac{MSS}{TSS} = \frac{TSS - RSS}{TSS}$

where MSS is the model sum of squares (also known as ESS, or explained sum of squares), which is the sum of the squared differences between the linear regression predictions and the mean of the dependent variable; TSS is the total sum of squares, which is the sum of the squared differences between the measurements and their mean; and RSS is the residual sum of squares, which is the sum of the squared differences between the measurements and the linear regression predictions.
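
The sums of squares can be checked numerically with a short Python sketch. The five data points are invented, and the line is fitted by ordinary least squares, which is the setting in which the model and residual sums of squares add up to the total.

```python
x = [1, 2, 3, 4, 5]   # hypothetical independent variable
y = [2, 4, 5, 4, 5]   # hypothetical measurements

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Least-squares line y = intercept + slope * x.
slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
         / sum((a - mean_x) ** 2 for a in x))
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * a for a in x]

tss = sum((b - mean_y) ** 2 for b in y)                  # measurements minus their mean
mss = sum((p - mean_y) ** 2 for p in predicted)          # predictions minus the mean
rss = sum((b - p) ** 2 for b, p in zip(y, predicted))    # measurements minus predictions

r2_from_mss = mss / tss
r2_from_rss = (tss - rss) / tss
print(f"TSS = {tss:.1f}, MSS = {mss:.1f}, RSS = {rss:.1f}")
print(f"R^2 = {r2_from_mss:.2f} (MSS/TSS) = {r2_from_rss:.2f} ((TSS - RSS)/TSS)")
```

For these points TSS = 6, MSS = 3.6, and RSS = 2.4, so both forms of the formula give 0.6: the fitted line accounts for 60% of the variation in y.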

k. Confidence Intervals

Answer: Confidence intervals are one way to express how "good" an estimate is; the wider the 90% confidence interval for a specific estimate, the greater the caution required when using the estimate. Confidence intervals serve as a helpful reminder of the estimates' limitations. Their goal is to provide a range of values for the estimated population parameter rather than a single value, or point estimate. The confidence interval gives a range of values within which we believe the true population value falls, at a stated level of confidence (for example, 90% or 95%).
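
As a rough illustration, the following Python sketch computes a confidence interval for a mean as mean ± z × s / sqrt(n). It relies on the normal approximation, which is a simplifying assumption (for a sample this small a t-based interval would normally be wider), and the sample values are made up.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence):
    """Normal-approximation interval for the mean: mean +/- z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # e.g. about 1.645 for 90%
    centre = mean(sample)
    half_width = z * stdev(sample) / math.sqrt(len(sample))
    return centre - half_width, centre + half_width

sample = [48, 52, 50, 47, 53, 51, 49, 50]   # hypothetical measurements

low_90, high_90 = mean_confidence_interval(sample, 0.90)
low_95, high_95 = mean_confidence_interval(sample, 0.95)
print(f"90% interval: {low_90:.2f} to {high_90:.2f}")
print(f"95% interval: {low_95:.2f} to {high_95:.2f}")
```

Both intervals are centred on the sample mean of 50, and the 95% interval is wider than the 90% interval: asking for more confidence means accepting a less precise range.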

l. Method of Least Squares

Answer: The least squares method is a form of mathematical regression analysis used to determine the line of best fit for a set of data points, providing a visual representation of the relationship between them. Each data point represents the relationship between a known independent variable and an unknown dependent variable. The analysis begins with a set of data points plotted on a graph with x and y axes. Using the least squares method, an analyst generates a best-fit line that describes the potential relationship between the independent and dependent variables. The least squares method thus provides the overall rationale for locating the line of best fit among the data points under consideration.

The most common application of this method, known as "linear" or "ordinary" least squares, aims to generate a straight line that minimizes the sum of the squared errors, that is, the squared residuals resulting from the differences between the observed values and the values predicted by the model.
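
The minimizing property can be illustrated with a small Python sketch. It computes the least-squares line from the usual closed-form expressions and then checks that nudging the intercept or slope away from the fitted values only increases the sum of squared errors. The four data points and the sizes of the nudges are arbitrary choices for illustration.

```python
def sum_squared_errors(x, y, intercept, slope):
    """Sum of squared residuals of the line y = intercept + slope * x."""
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

def least_squares_line(x, y):
    """Closed-form ordinary least squares (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

x = [0, 1, 2, 3]   # hypothetical independent variable
y = [1, 3, 4, 6]   # hypothetical observed values

intercept, slope = least_squares_line(x, y)
best_sse = sum_squared_errors(x, y, intercept, slope)

# Any other line should fit worse in the least-squares sense.
nudged = [sum_squared_errors(x, y, intercept + di, slope + ds)
          for di in (-0.2, 0.2) for ds in (-0.1, 0.1)]

print(f"best line: y = {intercept:.1f} + {slope:.1f}x, SSE = {best_sse:.2f}")
print("SSE of nudged lines:", [round(s, 2) for s in nudged])
```

For these points the fitted line is y = 1.1 + 1.6x with a sum of squared errors of 0.2, and each of the four nudged lines has a larger sum.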

m. Line of Best Fit

Answer: The line of best fit is a straight line that minimizes the distance between itself and the data points. In a scatter plot of different data points, the line of best fit is used to express the relationship between them. It is a result of regression analysis and can be used to forecast indicators and price movements. The regression line is chosen using the least squares criterion, and it is also known as the "line of best fit" because it is the line that best fits the points when drawn through them: it minimizes the sum of the squared differences between the actual and predicted scores.

   4. How to interpret the coefficient of determination?

Answer: The coefficient of determination (R²) can be interpreted as the proportion of variance in the dependent variable that is predicted by the statistical model.

Another way to look at it is that R² is the proportion of variance shared by the independent and dependent variables.

R² is also known as the proportion of variance "explained" or "accounted for" by the model. The variance that the model does not predict is represented by the remaining proportion (1 − R²).

If you prefer, you can express R² as a percentage rather than a proportion. Simply multiply the proportion by 100.

The coefficient of determination is commonly interpreted as how well the regression model fits the observed data. A coefficient of determination of 60%, for example, indicates that the regression model explains 60% of the variance in the dependent variable. In general, a higher coefficient indicates a better fit for the model.
