Assignment 2
PART 1
Ques 1: In general, why is statistics conducted? Explain.
Answer: Statistics is conducted to give meaning to quantitative data. It converts raw data into
meaningful information and helps in making inferences from the data. It aids in measuring things,
examining relationships, making predictions, testing hypotheses, developing concepts and
theories, exploring problems, and explaining behaviors or attitudes.
Ques 2: What is the meaning of ‘inference’ as it relates to populations and samples? Why
is ‘random sampling’ relevant when it comes to inference?
Answer: Inference means drawing conclusions about a population from a sample taken from that
population. When making an inference, you generalize to the population based on the sample
available to you. Random sampling is relevant because it helps ensure that the findings from your
sample are close to what you would have obtained had you measured the complete population. In a
simple random sample, every unit in the population has an equal chance of being chosen.
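The point about random sampling can be illustrated with a short Python sketch. The population below is simulated and entirely hypothetical (its mean of 50, spread of 10, and size are assumptions made for illustration); the sketch only shows that the mean of a simple random sample tends to land close to the population mean.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 measurements (invented for illustration).
population = [random.gauss(50, 10) for _ in range(10_000)]

# Simple random sample: every unit has the same chance of being selected.
sample = random.sample(population, 200)

pop_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean: {pop_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

With a sample that is not random (for example, only the largest values), the two means could differ substantially, which is why inference relies on random selection.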
Ques 3: The following questions pertain to p-values
PART 2
1. Is the statement below true or false? Explain your answer.
The probability of an individual being a woman, given that the individual is a Fanshawe
College student, is automatically the same as the probability of an individual being
a Fanshawe College student, given that the individual is a woman; these
two probabilities mean the same thing.
Answer: This statement is false. The two conditional probabilities are conditioned on different
groups, so in general they are not equal. For example, the probability that an international
student is Indian, given that the student attends Fanshawe College, is not the same as the
probability that an international student attends Fanshawe College, given that the student is
Indian.
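A small numerical sketch in Python may make the distinction concrete. All counts below are invented purely for illustration and are not real enrolment or population figures; the only point is that the two conditional probabilities share a numerator but use different denominators.

```python
# Hypothetical counts, invented purely for illustration.
students_total = 20_000    # assumed number of Fanshawe College students
students_women = 11_000    # assumed number of those students who are women
women_total = 200_000      # assumed number of women in the wider population

# P(woman | Fanshawe student): restrict attention to students.
p_woman_given_student = students_women / students_total

# P(Fanshawe student | woman): restrict attention to women.
p_student_given_woman = students_women / women_total

print(f"P(woman | student) = {p_woman_given_student:.3f}")
print(f"P(student | woman) = {p_student_given_woman:.3f}")
```

Under these assumed counts the first probability is ten times the second, so treating them as interchangeable would be a serious error.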
PART 3
1. Research question: Is the consumption of oil, on average, different between the
cities of Akure and Ota?
1. State the null and alternative hypotheses: The first step is to state the null hypothesis
and the alternative hypothesis.
The null hypothesis (Ho) for this research question is that there is no difference in
the average consumption of oil between the cities of Akure and Ota.
The alternative hypothesis (Ha) is that there is a difference in the average consumption
of oil between the cities of Akure and Ota.
2. Collect data: Sampling and gathering data in a way that is designed to test your
hypothesis is crucial for a statistical test to be valid. You cannot draw
statistical conclusions about your target population if your data are not
representative of that population.
3. Perform a statistical test: Based on the data we have collected, we run a t-test to
test whether there is a difference in the consumption of oil between the two cities. The
t-test gives an estimate of the difference as well as a p-value.
4. Decide whether to reject or fail to reject the null hypothesis: If the analysis of
the difference in the consumption of oil between the two cities gives a p-value of 0.04,
which is below the conventional significance level of 0.05, we reject the null
hypothesis of no difference.
5. Present the findings: Finally, we report the results and state whether the null
hypothesis was rejected in favor of the alternative hypothesis or was not rejected.
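The five steps above can be sketched in Python using only the standard library. This is a rough illustration under stated assumptions, not the analysis performed for this assignment: the two data lists are invented, the function names `welch_t_test` and `two_sided_p` are hypothetical helpers, and the unequal-variance (Welch) form of the t-test is assumed. The p-value is obtained by numerically integrating the t density with Simpson's rule.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_coef) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus twice the area under the density from 0 to |t|."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps  # steps is even, as Simpson's rule requires
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1 - 2 * (total * h / 3))

def welch_t_test(a, b):
    """Return (t statistic, degrees of freedom, two-sided p-value)."""
    va, vb = statistics.variance(a) / len(a), statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df, two_sided_p(t, df)

# Step 2: hypothetical oil consumption samples (invented numbers).
akure = [12.1, 11.4, 13.0, 12.6, 11.9, 12.8]
ota = [11.8, 12.3, 11.1, 12.0, 11.5, 12.2]

# Step 3: run the test. Step 4: compare the p-value with alpha.
alpha = 0.05
t_stat, df, p_value = welch_t_test(akure, ota)
decision = "reject Ho" if p_value < alpha else "fail to reject Ho"

# Step 5: report the findings.
print(f"t = {t_stat:.3f}, df = {df:.1f}, p = {p_value:.3f}: {decision}")
```

In R or Excel the same test is available as a built-in procedure, so the hand-written integration here is only meant to show what the p-value represents.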
b. What statistical test would you conduct to answer this research question?
Run this statistical test (you can use R or Microsoft Excel); provide and explain your answer to
the above research question. Your answer should include major concepts and points covered in
class.
Answer: (i) The statistical test to be conducted is a two-sample t-test. Since we are comparing
two groups (cities), two hypotheses can be stated:
Ho: There is no difference in the average consumption of oil between the two cities.
Ha: There is a difference in the average consumption of oil between the two cities.
The Excel data were analyzed to test whether there is a difference.
The result showed no statistically significant difference in the consumption of oil between
these cities, so we fail to reject Ho.
c. Imagine the research question above was ‘updated’ to now include (i.e., add) the city of Port
Harcourt:
i. What statistical test would you use to answer this updated question? Would it be the same
statistical test as in your answer to 1(b) and why?
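It would not be the same test. As there are three groups (cities) to compare, a one-way ANOVA
should be used. When more than two groups are compared, running several pairwise t-tests
understates the true error: the chance of at least one false positive (Type I error) grows with
each additional test.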
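The way repeated t-tests understate the true error can be shown with a short calculation. The sketch below assumes a significance level of 0.05 per test and, as a simplification, treats the pairwise tests as independent (in practice they share data, so the figure is only an approximation).

```python
from math import comb

alpha = 0.05       # assumed per-test significance level
n_groups = 3       # Akure, Ota, and Port Harcourt

# Number of pairwise t-tests needed to compare every pair of cities.
n_tests = comb(n_groups, 2)

# Approximate chance of at least one false positive across all tests,
# assuming (as a simplification) that the tests are independent.
familywise_error = 1 - (1 - alpha) ** n_tests

print(f"{n_tests} pairwise tests, familywise error rate = {familywise_error:.3f}")
```

ANOVA avoids this inflation by testing all group means in a single test at the chosen significance level.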
ii. Provide and explain your answer to the updated research question. Your answer should
include major concepts and points covered in class.
First, state the null and alternative hypotheses.
Second, choose a significance level.
Third, calculate the F-statistic.
Fourth, use the F-statistic to obtain a p-value.
Finally, compare the p-value with the significance level to decide whether to reject the null
hypothesis.
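These steps can be sketched for a one-way ANOVA in Python. The data are invented for illustration, and the helper names `one_way_anova` and `p_value_three_groups` are hypothetical. The p-value formula used here is a closed form that holds only when exactly three groups are compared (between-group degrees of freedom equal to 2); for other designs, statistical software such as R or Excel would be the usual route.

```python
import statistics

def one_way_anova(groups):
    """Return (F statistic, between-group df, within-group df)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

def p_value_three_groups(f_stat, df_within):
    """Upper-tail probability of F(2, df_within); valid only for three groups."""
    return (df_within / (df_within + 2 * f_stat)) ** (df_within / 2)

# Hypothetical consumption data for the three cities (invented numbers).
akure = [1, 2, 3]
ota = [2, 3, 4]
port_harcourt = [3, 4, 5]

alpha = 0.05  # chosen significance level
f_stat, df_b, df_w = one_way_anova([akure, ota, port_harcourt])
p_value = p_value_three_groups(f_stat, df_w)
decision = "reject Ho" if p_value < alpha else "fail to reject Ho"

print(f"F({df_b}, {df_w}) = {f_stat:.3f}, p = {p_value:.3f}: {decision}")
```

Here Ho is that all three city means are equal and Ha is that at least one mean differs; a significant F-test does not by itself say which cities differ.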