Saghar 8612-2
Educational Statistics
Assignment No. 2
Roll No: CA651432
(Units: 6-9)
Answer:
T-Test
A t-test is a statistical technique used for comparing the mean values of two data sets obtained from two groups. The comparison tells us whether these data sets are different from each other, how significant the differences are, and whether these differences could have happened by chance. The statistical significance of a t-test indicates whether or not the difference between the means of two groups most likely reflects a real difference in the population from which the groups are selected. t-tests are used when there are two groups (e.g., male and female) or two sets of data (e.g., before and after), and the researcher wishes to compare the mean score on some continuous variable.
Types of T-Test
There are a number of t-tests available, but two main types, the independent sample t-test and the paired sample t-test, are most commonly used. Let us deal with these types in some detail.
The independent sample t-test is used when there are two different independent groups of people and the researcher is interested in comparing their scores. In this case the researcher collects information from two different groups of people on only one occasion. The independent sample t-test
is a statistical technique that is used to analyze the mean comparison of two independent groups. When we take two samples from the same population, the means of the two samples may be roughly identical; but when the samples are taken from two different populations, the sample means may differ. In this case, the independent samples t-test is used to draw conclusions about the means of the two populations and to tell whether or not they are similar.
An unpaired t-test (also known as an independent t-test) is a statistical procedure that compares the means of two independent groups. The hypotheses of an unpaired t-test are the same as those for a paired t-test. The two hypotheses are:
The null hypothesis (H0) states that there is no significant difference between the means of the
two groups.
The alternative hypothesis (H1) states that there is a significant difference between the two
population means, and that this difference is unlikely to be caused by sampling error or
chance.
An assumption of the unpaired t-test is that the variance of the data is the same between groups, meaning that they have the same standard deviation.
An unpaired t-test is used to compare the means of two independent groups. You use an unpaired t-test when you are comparing two separate groups with equal variance. Some examples of instances for which it is appropriate include:
Research, such as a pharmaceutical study or other treatment plan, where half of the subjects are randomly assigned to the treatment group and half to the control group.
Research with two independent groups, such as women and men, that examines whether the average bone density is significantly different between the two groups.
Comparing the average commuting distance traveled by New York City and San Francisco residents.
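As a minimal sketch of how such a comparison is carried out in practice (assuming Python with scipy is available; the score lists below are invented purely for illustration), an independent samples t-test can be run like this:

```python
# Minimal sketch of an independent samples t-test with scipy.
# The two score lists are hypothetical, invented for illustration.
from scipy import stats

group_a = [72, 85, 78, 90, 66, 81, 75, 88]  # e.g., one group of students
group_b = [68, 74, 70, 83, 65, 72, 69, 77]  # e.g., a second, independent group

# equal_var=True assumes homogeneity of variance (Student's t-test);
# use equal_var=False (Welch's t-test) when the variances differ.
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=True)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```

A p-value below the chosen significance level (commonly 0.05) would lead us to reject the null hypothesis of equal means.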
The paired sample t-test is also called the repeated measures t-test. It is used when the researcher is interested in comparing changes in the scores of the same group tested on two different occasions.
At this level it is necessary to know some general assumptions regarding the use of the t-test. The first assumption concerns the scale of measurement: it is assumed that the dependent variable is measured at the interval or ratio scale. The second assumption is that of a simple random sample, i.e. the data is collected from a representative, randomly selected portion of the total population. The third assumption is that the data, when plotted, results in a normal distribution, i.e. a bell-shaped distribution curve. The fourth assumption is that the observations that make up the data must be independent of one another. That is, each observation or
measurement must not be influenced by any other observation or measurement. The fifth assumption is that a reasonably large sample size is used. A large sample size means that the distribution of results should approach a normal bell-shaped curve. The final assumption is homogeneity of variance. Variance will be homogeneous or equal when the standard deviations of the samples are approximately equal.
A paired t-test (also known as a dependent or correlated t-test) is a statistical test that compares the averages/means and standard deviations of two related groups to determine if there is a significant difference between them. A significant difference occurs when the differences between groups are unlikely to be due to sampling error or chance. The groups can be related by being the same group of people or the same item measured twice. Paired t-tests are considered more powerful than unpaired t-tests because using the same participants or items eliminates variation between the samples that could be caused by anything other than what is being tested. The two hypotheses are:
The null hypothesis (H0) states that there is no significant difference between the means of the two groups.
The alternative hypothesis (H1) states that there is a significant difference between the two population means, and that this difference is unlikely to be caused by sampling error or chance.
The independent variable must consist of two related groups or matched pairs.
Paired t-tests are used when the same item or group is tested twice, which is known as a repeated measures t-test. Some examples of instances for which a paired t-test is appropriate include:
The before and after effect of a pharmaceutical treatment on the same group of people.
Body temperature using two different thermometers on the same group of participants.
Standardized test results of a group of students before and after a study prep course.
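The last example above can be sketched in a few lines (assuming Python with scipy; the before/after scores for the same eight students are invented):

```python
# Minimal sketch of a paired samples t-test with scipy.
# The before/after scores (same eight students) are invented.
from scipy import stats

before = [55, 60, 48, 72, 65, 58, 70, 62]
after = [61, 64, 50, 79, 70, 66, 75, 68]

# ttest_rel pairs each "before" score with the "after" score
# from the same participant.
t_stat, p_value = stats.ttest_rel(before, after)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```

Because every score increased after the course, the test yields a small p-value, suggesting a real change rather than chance.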
Question#2 Why do we use regression analysis? Write down the types of regression.
Answer:
Regression
A correlation quantifies the degree and direction to which two variables are related. It does not fit a line through the data points, and it does not deal with cause and effect. It does not matter which of the two variables is called dependent and which is called independent. On the other hand, regression finds the best line that predicts the dependent variable from the independent
variable. The decision of which variable is called dependent and which is called independent is an important matter in regression, as we will get a different best-fit line if we exchange the two variables, i.e. dependent to independent and independent to dependent. The line that best predicts the independent variable from the dependent variable will not be the same as the line that predicts the dependent variable from the independent variable.
Let us start with the simple case of studying the relationship between two variables X and Y, where Y is the dependent variable and X is the independent variable. We are interested in seeing how various values of the independent variable X predict corresponding values of the dependent variable Y. This statistical technique is called regression analysis. We can say that regression analysis is a technique used to model the dependency of one dependent variable on one or more independent variables. It can also be defined as a functional relationship between two or more correlated variables that is often empirically determined from data and is used especially to predict values of one variable when given values of the other. The statistical technique for finding the best-fitting straight line for a set of data is called regression, and the resulting line is called the regression line.
Regression analysis is used to explain variability in the dependent variable by means of one or more independent variables, to analyze relationships among variables in order to answer the question of how much the dependent variable changes with changes in the independent variables, and to forecast or predict the value of the dependent variable based on the values of the independent variables. The primary objective of regression is to develop a relationship between a response
variable and the explanatory variables for the purpose of prediction; it assumes that a functional relationship exists between them.
Regression analysis estimates the relationship between two or more variables and is used for forecasting or finding the cause and effect relationship between the variables. It offers a researcher multiple benefits:
i) It indicates the significant relationships between the dependent variable and the independent variables.
ii) It indicates the strength of the impact of multiple independent variables on the dependent variable.
iii) It allows us to compare the effects of variables measured on different scales.
These benefits help a researcher to estimate and evaluate the best set of variables to be used for building predictive models.
Types of Regression
i) Linear Regression
It is the most commonly used type of regression. In this technique the dependent variable is continuous, the independent variable can be continuous or discrete, and the nature of the regression line is linear. Linear regression establishes a relationship between the dependent variable (Y) and one or more independent variables (X) using a best-fit straight line (also known as the regression line).
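A minimal sketch with scipy (the hours-studied and exam-score values below are invented) shows how the best-fit line is obtained and used for prediction:

```python
# Minimal sketch of simple linear regression with scipy: predicting an
# exam score (Y) from hours studied (X). All values are invented.
from scipy import stats

hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [52, 55, 61, 65, 70, 74, 78, 85]

result = stats.linregress(hours, scores)
print(f"best-fit line: Y = {result.intercept:.2f} + {result.slope:.2f} * X")
print(f"r-squared = {result.rvalue ** 2:.3f}")

# Use the fitted line to predict the score for 5.5 hours of study.
predicted = result.intercept + result.slope * 5.5
```

The slope tells us how much the score is expected to change for each extra hour of study, and r-squared tells us how much of the variability in scores the line explains.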
ii) Logistic Regression
Logistic regression is a statistical method for analyzing a dataset in which one or more independent variables determine an outcome. The outcome is measured with a dichotomous (binary) variable. Like all regression analyses, logistic regression is a predictive analysis. It is used to describe and explain the relationship between one dependent binary variable and one or more nominal, ordinal, interval or ratio level independent variables.
iii) Polynomial Regression
It is a form of regression analysis in which the relationship between the independent variable X and the dependent variable Y is modeled as an nth degree polynomial in X. This type of regression fits a curved line to the data points.
iv) Stepwise Regression
It is a method of fitting regression models in which the choice of predictive variables is carried out by an automatic procedure. In each step, a variable is considered for addition to or subtraction from the set of explanatory variables based on some pre-specified criteria. The general idea behind this procedure is that we build our regression model from a set of predictor variables by entering and removing predictors, in a stepwise manner, until there is no justifiable reason to enter or remove any more.
v) Ridge Regression
It is a technique for analyzing multiple regression data that suffer from multi-collinearity
(independent variables are highly correlated). When multi-collinearity occurs, least squares
estimates are unbiased, but their variances are large so that they may be far from the true value.
By adding the degree of bias to the regression estimates, ridge regression reduces the standard
errors.
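As a sketch of the idea in plain numpy (the data is simulated, the two predictors are made almost collinear on purpose, and the ridge parameter k = 1.0 is an arbitrary choice), the ridge estimator simply adds k to the diagonal of X'X before solving the normal equations:

```python
# Sketch of the ridge estimator beta = (X'X + k*I)^(-1) X'y in plain numpy.
# The data is simulated; the two predictors are made almost collinear on
# purpose, and the ridge parameter k = 1.0 is an arbitrary choice.
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = x1 + rng.normal(scale=0.01, size=50)  # nearly identical to x1
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(scale=0.5, size=50)

def ridge(X, y, k):
    """Solve the ridge normal equations (X'X + k*I) beta = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

beta_ols = ridge(X, y, k=0.0)    # ordinary least squares (k = 0), unstable here
beta_ridge = ridge(X, y, k=1.0)  # biased but much more stable estimates
```

With k = 0 the near-singular X'X makes the coefficients unstable; a small positive k trades a little bias for much smaller variance.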
vi) Lasso Regression
LASSO stands for Least Absolute Shrinkage and Selection Operator. It is a method that performs both variable selection and regularization in order to enhance the prediction accuracy and interpretability of the statistical model it produces. This type of regression uses shrinkage, where data values are shrunk towards a central point, like the mean.
vii) ElasticNet Regression
This type of regression is a hybrid of the lasso and ridge regression techniques. It is useful when there are multiple features which are correlated with one another.
Question#3 Write a short note on one way ANOVA. Write down its main assumptions.
Answer:
The t-tests have one very serious limitation: they are restricted to tests of the significance of the difference between only two groups. There are many times when we would like to see if there are significant differences among three, four, or even more groups. For example, we may want to investigate which of three teaching methods is best for teaching ninth class algebra. In such a case we cannot use the t-test because more than two groups are involved. To deal with such cases,
one of the most useful techniques in statistics is the analysis of variance (abbreviated as ANOVA). Analysis of Variance (ANOVA) is a hypothesis testing procedure that is used to evaluate mean differences between two or more treatments (or populations). Like all other inferential procedures, ANOVA uses sample data as a basis for drawing general conclusions about populations. Sometimes it may appear that ANOVA and the t-test are two different ways of doing exactly the same thing: testing for mean differences. In some cases this is true, as both tests use sample data to test hypotheses about population means. However, ANOVA has many advantages over the t-test. t-tests are used when we have to compare only two groups or variables (one independent and one dependent). On the other hand, ANOVA is used when the independent variable (treatment) has two or more levels. Suppose we want to study the effects of three different models of teaching on the achievement of students. In this case we have three different samples to be treated using three different treatments, so ANOVA is the suitable technique to evaluate the difference.
The one way analysis of variance (ANOVA) is an extension of the independent two-sample t-test. It is a statistical technique by which we can test if three or more means are equal. It tests if the value of a single variable differs significantly among three or more levels of a factor. We can also say that one way ANOVA is a procedure for testing the hypothesis that K population means are equal, where K ≥ 2. It compares the means of the samples or groups in order to make inferences about the population means. The null hypothesis is:
Ho: µ1 = µ2 = µ3 = ... = µk
If one way ANOVA yields a statistically significant result, we accept the alternate hypothesis (HA), which states that at least two group means are statistically significantly different from each other. It should be kept in mind that one way ANOVA cannot tell which specific groups were significantly different from each other; to determine that, a researcher will have to use a post hoc test. As there is only one independent variable or factor in one way ANOVA, it is also called single factor ANOVA. The independent variable has nominal levels or a few ordinal levels. Also, there is only one dependent variable, and hypotheses are formulated about the means of the groups on the dependent variable. The main assumptions of one way ANOVA are given below.
i) Assumption of Independence
According to this assumption the observations are random and independent samples from the populations. The null hypothesis actually states that the samples come from populations that have the same mean. The samples must be random and independent if they are to be representative of the populations. The value of one observation is not related to any other observation; in other words, one individual's score should not provide any clue as to how any other individual will score. That is, one event does not depend on another. A lack of independence among the observations invalidates the test.
ii) Assumption of Normality
The distributions of the populations from which the samples are selected are normal. This assumption implies that the dependent variable is normally distributed in each of the groups.
One way ANOVA is considered a robust test against the assumption of normality and tolerates violations of this assumption. As regards the normality of grouped data, one way ANOVA can tolerate data that is non-normal (skewed or kurtotic distributions) with only a small effect on the Type I error rate. However, platykurtosis can have a profound effect when group sizes are small. This leaves the researcher with two options:
i) Transform the data using various algorithms so that the shape of the distribution becomes normal; or
ii) Choose the nonparametric Kruskal-Wallis H Test, which does not require the assumption of normality.
iii) Assumption of Homogeneity of Variance
The variances of the distributions in the populations are equal. This assumption provides that the distributions in the population have the same shapes, means, and variances; that is, they are the same populations. In other words, the variances on the dependent variable are equal across the groups.
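Putting the pieces together, a one way ANOVA can be run in a few lines (assuming Python with scipy; the achievement scores under three teaching methods below are invented for illustration):

```python
# Sketch of a one way ANOVA with scipy, comparing achievement scores under
# three teaching methods. All scores are invented for illustration.
from scipy import stats

method_1 = [78, 82, 75, 88, 80]
method_2 = [70, 65, 72, 68, 74]
method_3 = [85, 90, 88, 83, 91]

f_stat, p_value = stats.f_oneway(method_1, method_2, method_3)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A small p-value says at least two group means differ; a post hoc test
# (e.g., Tukey's HSD) is then needed to find out which ones.
```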
In order to test whether pairs of sample means differ by more than would be expected by chance, we might conduct a series of t-tests on the K sample means; however, this approach has a major problem. When we use a t-test once, there is a chance of a Type I error, whose magnitude is usually 5%. By running two tests on the same data we increase the chance of making an error to roughly 10%; for a third test it is roughly 15%, and so on. These are unacceptable error rates. The number of t-tests needed to compare all possible pairs of means would be:
K(K − 1) / 2
With c such tests, the probability of making one or more Type I errors in the series of t-tests is greater than α and is given by:
1 − (1 − α)^c
An ANOVA controls the chance of these errors so that the Type I error rate remains at 5%, and the researcher can be more confident that a significant result is not due to chance.
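The two formulas above are easy to check with a few lines of Python (the choice of K = 5 groups is an arbitrary example):

```python
# Number of pairwise t-tests for K groups, and the probability of making
# one or more Type I errors when c independent tests are run at alpha.
K = 5        # arbitrary example: five groups
alpha = 0.05

c = K * (K - 1) // 2               # K(K - 1)/2 pairwise comparisons
familywise = 1 - (1 - alpha) ** c  # 1 - (1 - alpha)^c

print(c)           # 10 comparisons for 5 groups
print(familywise)  # roughly 0.40, far above the nominal 0.05
```

Even with only five groups, the chance of at least one false positive balloons to about 40%, which is why a single ANOVA is preferred over many t-tests.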
Question#4 What do you know about the chi-square (χ2) goodness of fit test? Write down the conditions for its use.
Answer:
Introduction
The chi-square (χ2) statistic is commonly used for testing relationships between categorical variables. It is intended to test how likely it is that an observed difference is due to chance. In most situations it can be used as a quick test of significance. In this unit you will study this technique in detail.
The Chi-Square (or Chi-Squared, χ2) distribution is a special case of the gamma distribution (the gamma distribution is a family of right-skewed, continuous probability distributions. These
distributions are useful in real life where something has a natural minimum of 0). A chi-square distribution with n degrees of freedom is equal to a gamma distribution with shape a = n/2 and scale β = 2 (i.e. rate b = 0.5). Let us consider random samples taken from a standard normal distribution: the chi-square distribution is the distribution of the sum of the squares of these random samples. The degrees of freedom equal the number of samples. For example, if 10 samples are taken from the normal distribution, then the degrees of freedom df = 10. Chi-square distributions are always right-skewed, but the greater the degrees of freedom, the more closely the distribution resembles a normal distribution.
A chi-square statistic is one way to show a relationship between two categorical variables. The Chi-Square Statistic is a single number that tells us how much difference exists between the observed counts and the counts one would expect if there were no relationship in the population. There are two different types of chi-square tests, both involving categorical data: the chi-square goodness of fit test and the chi-square test of independence.
The chi-square (χ2) goodness of fit test (commonly referred to as the one-sample chi-square test) is the most commonly used goodness of fit test. It explores the proportion of cases that fall into the various categories of a single variable and compares these with hypothesized values. In simple words, it is used to find out whether the observed value of a given phenomenon is significantly different from the expected value. Or we can also say that it is used
to test if sample data fits a distribution from a certain population. In other words, the chi-square goodness of fit test tells us whether the sample data represents the data we would expect to find in the actual population, i.e. whether the sample data is consistent with a hypothesized distribution. It is a variation of the more general chi-square test. The setting for this test is a single categorical variable that can have many levels. In the chi-square goodness of fit test, the sample data is divided into intervals, and the numbers of points that fall into each interval are compared with the expected numbers of points. The null hypothesis for the chi-square goodness of fit test is that the data comes from the specified distribution; the alternate hypothesis is that it does not. The formula for chi-square is:
χ2 = Σ (Observed − Expected)^2 / Expected
To use the chi-square (χ2) goodness of fit test we have to set up null and alternate hypotheses. The null hypothesis assumes that there is no significant difference between the observed and the expected values; the alternate hypothesis then becomes that there is a significant difference between the observed and the expected values. Now compute the value of chi-square as
χ2 = Σ (Observed − Expected)^2 / Expected
and compare it with the critical value from the chi-square table at the chosen significance level.
a) The chi-square test can only be used on data that has been put into classes. If there is data that has not been put into classes, then it is necessary to make a frequency table or histogram before performing the test.
The chi-square goodness of fit test is appropriate when the following conditions are met:
The expected value of the number of sample observations in each level of the variable is at least 5.
For the chi-square goodness of fit test, the hypotheses take the form:
The null hypothesis (H0) specifies the proportion of observations at each level of the categorical variable. The alternative hypothesis (Ha) is that at least one of the specified proportions is not true.
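As an illustration (assuming Python with scipy is available; the die-roll counts below are invented), the whole test reduces to one call to scipy.stats.chisquare:

```python
# Sketch of a chi-square goodness of fit test with scipy: do 120 invented
# die rolls fit the uniform distribution expected of a fair die?
from scipy import stats

observed = [22, 18, 19, 25, 16, 20]  # counts for faces 1..6
expected = [20] * 6                  # fair die: 120 / 6 per face

chi2, p_value = stats.chisquare(f_obs=observed, f_exp=expected)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
# A large p-value means the observed counts are consistent with a fair die.
```

Here χ2 = Σ(O − E)^2/E = 50/20 = 2.5 with 5 degrees of freedom, so the data is consistent with the hypothesized (fair-die) distribution.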
Answer:
The chi-square (χ2) test of independence is the second important form of chi-square test. It is used to explore the relationship between two categorical variables, each of which can have two or more categories. It determines if there is a significant relationship between two
nominal (categorical) variables. The frequency of one nominal variable is compared across the different values of the second nominal variable. The data can be displayed in an R*C contingency table, where R is the number of rows and C is the number of columns. For example, suppose a researcher wants to examine the relationship between gender (male vs. female) and empathy (high vs. low); the researcher will use the chi-square test of independence. If the null hypothesis is accepted, there is no relationship between gender and empathy; if the null hypothesis is rejected, the conclusion is that there is a relationship between gender and empathy (e.g. females tend to score higher on empathy and males tend to score lower). Although the chi-square test of independence, being a non-parametric technique, follows less strict assumptions, there are some general assumptions that should be taken care of:
i) Random Sample - The sample should be selected using a simple random sampling method.
ii) Independent Observations - Each person or case should be counted only once, and none should appear in more than one category or group. The data from one subject should not influence the data from another.
iii) Expected Frequencies - If the data are displayed in a contingency table, the expected frequency count for each cell of the table should be at least 5.
Both chi-square tests are sometimes confused, but they are quite different from each other. The chi-square test for independence compares two sets of data to see if there is a relationship, whereas the chi-square goodness of fit test fits one categorical variable to a distribution.
The Chi-square test of independence checks whether two variables are likely to be related or not. We have counts for two categorical or nominal variables, and an idea that the two variables are not related; the test gives us a way to decide if our idea is plausible.
The sections below discuss what we need for the test, how to do the test, and how to understand the results.
What do we need?
For the Chi-square test of independence, we need two variables. Our idea is that the variables are not related. We also need:
Data values that are a simple random sample from the population of interest.
Two categorical or nominal variables. Don't use the independence test with continuous variables that define the category combinations; however, the counts for the combinations of categories of the two variables may be continuous.
For each combination of the levels of the two variables, we need at least five expected values. When we have fewer than five for any one combination, the test results are not reliable.
The test procedure described in this lesson is appropriate when the following conditions are met:
If sample data are displayed in a contingency table, the expected frequency count for each cell of the table is at least 5.
This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret the results.
Suppose that Variable A has r levels and Variable B has c levels. The null hypothesis states that knowing the level of Variable A does not help you predict the level of Variable B; that is, the two variables are independent. The alternative hypothesis is that knowing the level of Variable A can help you predict the level of Variable B, i.e. the variables are not independent.
The analysis plan describes how to use sample data to accept or reject the null hypothesis.
Test method. Use the chi-square test for independence to determine whether there is a significant relationship between the two categorical variables.
Using sample data, find the degrees of freedom, expected frequencies, test statistic, and the P-value associated with the test statistic. The degrees of freedom are:
DF = (r - 1) * (c - 1)
where r is the number of levels for one categorical variable, and c is the number of levels for the other categorical variable.
Expected frequencies. The expected frequency counts are computed separately for each level of one categorical variable at each level of the other categorical variable. Compute the r * c expected frequencies according to the formula:
Er,c = (nr * nc) / n
where Er,c is the expected frequency count for level r of Variable A and level c of Variable B, nr is the total number of sample observations at level r of Variable A, nc is the total number of sample observations at level c of Variable B, and n is the total sample size.
Test statistic. The test statistic is a chi-square random variable (Χ2) defined by the following equation:
Χ2 = Σ [ (Or,c − Er,c)^2 / Er,c ]
where Or,c is the observed frequency count at level r of Variable A and level c of Variable B, and Er,c is the expected frequency count at level r of Variable A and level c of Variable B.
P-value. The P-value is the probability of observing a sample statistic as extreme as the test
statistic. Since the test statistic is a chi-square, use the Chi-Square Distribution Calculator to
assess the probability associated with the test statistic. Use the degrees of freedom computed
above.
Interpret Results
If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null
hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting
the null hypothesis when the P-value is less than the significance level.
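The whole procedure above (expected frequencies, degrees of freedom, test statistic, and P-value) is automated by scipy.stats.chi2_contingency. In this sketch the gender-by-empathy counts are invented for illustration:

```python
# Sketch of a chi-square test of independence with scipy: gender (rows)
# versus empathy level (columns), with invented counts in a 2x2 table.
import numpy as np
from scipy import stats

table = np.array([[30, 20],   # female: high empathy, low empathy
                  [15, 35]])  # male:   high empathy, low empathy

chi2, p_value, dof, expected = stats.chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, df = {dof}, p = {p_value:.4f}")
# dof = (r - 1) * (c - 1) = 1 for a 2x2 table; `expected` holds the
# expected frequency counts E(r,c) = (row total * column total) / n.
```

If the printed p-value is below the significance level (say 0.05), we reject the null hypothesis and conclude that gender and empathy are related in these (hypothetical) data.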
The Chi-Square Test of Independence can only compare categorical variables; it cannot make comparisons between continuous variables or between categorical and continuous variables.
The frequency of each category for one nominal variable is compared across the categories of the
second nominal variable. The data can be displayed in a contingency table where each row
represents a category for one variable and each column represents a category for the other
variable. For example, say a researcher wants to examine the relationship between gender (male
vs. female) and empathy (high vs. low). The chi-square test of independence can be used to
examine this relationship. The null hypothesis for this test is that there is no relationship
between gender and empathy. The alternative hypothesis is that there is a relationship between
gender and empathy (e.g. there are more high-empathy females than high-empathy males).