EMPTY - Practice Test
Question 1:
A researcher conducts a survey to measure people's opinions on a political issue. The options for
responses are "Agree," "Disagree," and "Neutral." What is the measurement level of this variable?
a. Interval
b. Nominal
c. Ordinal
d. Ratio
Question 2:
x/y      0     1   Total
0      120    80     200
1       80   120     200
Total  200   200     400
Question 3:
I: Correlation measures the strength of the linear relationship between two variables.
II: Correlation can only range from -1 to +1.
Question 4:
Now, they are interested in estimating the range within which the population mean score is likely to
fall. Using a 95% confidence level, what is the confidence interval for the population mean score?
Question 5: Linear equation.
Question 6:
After the introduction of a new mobile app, a company receives feedback from its users. The product
development team wants to determine the proportion of users who preferred the previous version
of the app compared to the new one (assuming all users have a preference and none are indifferent).
The team asks you to design a study using a random sample and determine the required sample size.
They are willing to accept a margin of error of 3 percentage points.
How large should the sample be? (Rounding errors will be accepted).
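A minimal sketch of how the arithmetic could be checked in R. It assumes the usual sample-size formula for a proportion, n = z^2 * p * (1 - p) / E^2, a 95% confidence level (the question states no level), and the conservative planning value p = 0.5. All variable names are illustrative.

```r
# Assumptions (not stated in the question): 95% confidence level and
# the most conservative planning proportion p = 0.5.
conf_level <- 0.95
p_planning <- 0.5
margin     <- 0.03                        # 3 percentage points

z <- qnorm(1 - (1 - conf_level) / 2)      # two-sided critical value, about 1.96
n_required <- ceiling(z^2 * p_planning * (1 - p_planning) / margin^2)
n_required
```

Rounding up with ceiling() keeps the margin of error at or below the target; a different confidence level or planning proportion would change the result.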
Question 7:
Which measure is most appropriate for assessing the relationship between study hours and exam
scores?
Question 8:
A researcher conducted a multiple regression analysis to examine the relationship between customer
satisfaction (Var satisfaction) and three predictor variables: service quality (Var quality), price (Var
price), and advertising expenditure (Var advertising). The researcher obtained the following
information:
Question 9:
Given:
In a large country some people want to support farmers in the transition towards a more sustainable
operation of their farm, while others think farmers should not get that support. Imagine selecting a
large number of samples (all sized 100) from a population. For each of the samples the percentage of
people who want to support farmers is calculated. You put all percentages in a histogram. The middle
of the histogram is at 0.5 (50%).
Two statements:
Question 10:
Call:
lm(formula = exam_grade ~ study_time, data = data)
Residuals:
Min 1Q Median 3Q Max
-1.4200 -0.6884 -0.1655 0.5024 3.2341
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.07059 0.18992 5.637 1.67e-07 ***
study_time 0.58888 0.03311 17.784 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
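As a hedged illustration of reading this output, the sketch below writes the fitted line as exam_grade = 1.07059 + 0.58888 * study_time and evaluates it for one value. The coefficients are copied from the output above; the input value of 5 and the helper name predict_grade are hypothetical.

```r
# Coefficients copied from the lm() output above.
intercept <- 1.07059
slope     <- 0.58888

# Hypothetical helper: predicted exam grade for a given amount of study time.
predict_grade <- function(study_time) {
  intercept + slope * study_time
}

predicted <- predict_grade(5)   # 5 is an illustrative input, not from the exam
predicted
```

The slope is the expected change in exam_grade for a one-unit increase in study_time, and the intercept is the predicted grade when study_time is zero.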
Question 11:
A team of researchers conducted a study to assess the fitness levels of individuals in a particular
population. They found that the mean fitness score in the population is 120, with a standard
deviation of 15. Now, they are interested in estimating the percentage of people in the population
who are likely to have a lower fitness level than John. John's fitness score is 150.
Using this information, what is the estimated percentage of people in the population who have a
lower fitness level than John?
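One way to approach this in R is sketched below. It assumes that fitness scores are normally distributed, which the question does not state explicitly; the variable names are illustrative.

```r
# Assumption (not stated in the question): fitness scores follow a normal distribution.
pop_mean <- 120
pop_sd   <- 15
john     <- 150

z_john    <- (john - pop_mean) / pop_sd   # John's z-score
pct_below <- 100 * pnorm(z_john)          # percentage of the population scoring below John
z_john
round(pct_below, 1)
```

Under the normality assumption, pnorm() returns the area to the left of the z-score, which is the share of the population with a lower score.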
Question 12:
A researcher is studying the proportion of smartphone users who have installed a specific social
media app. In a random sample of 300 smartphone users, it was found that 180 of them had the app
installed.
Given this information, what is the 95% confidence interval for the proportion?
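A sketch of the calculation in R, assuming the normal-approximation (Wald) interval p ± z * sqrt(p * (1 - p) / n) is the intended method. The variable names are illustrative.

```r
x <- 180                                   # users with the app installed
n <- 300                                   # sample size

p_hat <- x / n                             # sample proportion
se    <- sqrt(p_hat * (1 - p_hat) / n)     # standard error of the proportion
z     <- qnorm(0.975)                      # critical value for 95% confidence

ci_wald <- p_hat + c(-1, 1) * z * se       # lower and upper bound
round(ci_wald, 3)
```

R's built-in prop.test(x, n) reports a Wilson score interval with continuity correction by default, so its bounds differ slightly from the hand calculation above.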
Question 13: Linear equation:
- Reference category
- Group 1
Question 14:
A researcher investigates the association between political affiliation (measured with three
categories: Democrat, Republican, Independent) and voting preference for a specific policy (four
options are considered, focusing on the "most preferred policy" only). The researcher utilizes the chi-
square statistic to analyze the significance of this relationship.
How many degrees of freedom are associated with the chi-square statistic in this test?
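The degrees of freedom of a chi-square test of independence are (rows - 1) * (columns - 1). The sketch below applies that rule and, as a cross-check, runs chisq.test() on a table of made-up counts with the same dimensions; the counts are purely hypothetical and only the table shape matters.

```r
n_affiliation <- 3                          # Democrat, Republican, Independent
n_policy      <- 4                          # four policy options

df_chisq <- (n_affiliation - 1) * (n_policy - 1)
df_chisq

# Cross-check with hypothetical counts in a 3 x 4 table.
made_up_counts <- matrix(10:21, nrow = n_affiliation, ncol = n_policy)
check <- chisq.test(made_up_counts)
unname(check$parameter)                     # degrees of freedom reported by R
```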
Question 15:
a. Interval
b. Nominal
c. Ordinal
d. Ratio
Question 16:
x/y      0     1   Total
0       40    10      50
1       30    20      50
Total   70    30     100
Question 17:
Suppose you calculated the chi-square statistic by hand for a 3x3 table. The outcome of your
calculation is 7.14.
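A hedged sketch of how such a hand-calculated value could be evaluated in R. It assumes the question concerns significance at the 5% level, which is not stated above; the variable names are illustrative.

```r
chi_sq   <- 7.14
df_chisq <- (3 - 1) * (3 - 1)               # 3x3 table: (rows - 1) * (columns - 1)

# Assumption: a significance level of 0.05.
critical_value <- qchisq(0.95, df = df_chisq)
p_value        <- pchisq(chi_sq, df = df_chisq, lower.tail = FALSE)

critical_value
p_value
chi_sq > critical_value                     # TRUE would mean "reject independence"
```

The statistic is compared with the critical value for the table's degrees of freedom, or equivalently the p-value is compared with the chosen significance level.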
Question 18:
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.27690 0.25924 -1.068 0.287
x1 2.03253 0.02767 73.447 <2e-16 ***
x2 -2.96574 0.02677 -110.767 <2e-16 ***
x3 0.04029 0.02616 ………. 0.125
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Question 19:
- Reference category
- Group 1
- Group 2
Question 20:
A sociologist wants to examine the association between marital status (e.g., single, married,
divorced) and job satisfaction among employees in a company. The researcher aims to determine the
strength and nature of this relationship.
Which measure is the most suitable choice for analyzing the association between marital status and
job satisfaction?
a. Spearman's correlation
b. Pearson's correlation
c. Cramer's V
d. Kendall's tau-b
For the next part, please open R and copy and paste the following code:
library(tidyverse)
library(haven)
library(broom)
library(modelr)
library(car)
library(lmtest)
library(dplyr)
##Made up assignment:
# Number of observations
n <- 236
Question 1: A researcher is interested in exploring the relationship between marital status and the
level of happiness in individuals. She believes that marital status could play a significant role in
shaping people's happiness levels. To investigate this relationship, the researcher gathers a dataset
consisting of information on individuals' marital status (married or single) and their corresponding
happiness scores, which are measured on a scale from 1 to 10. The researcher hypothesizes that
individuals who are married may experience higher levels of happiness compared to those who are
single.
Question 2: A researcher is interested in exploring the participation in yoga classes among individuals
in a certain population. Yoga is known to offer various benefits for physical and mental well-being.
The researcher conducts a survey to collect data on individuals' involvement in yoga classes, with
response options indicating whether they have applied to yoga classes (1) or not (0).
a- Briefly describe which statistical analysis would be appropriate to assess the participation in
yoga classes.
b- Explain in a few lines why you selected this analysis.
c- Upload a screenshot displaying the relevant output.
d- Upload a screenshot of the commands or steps you would use to perform the statistical
analysis (even if you were unable to execute the analysis).
e- Based on the output from the dataset:
a. Estimate the percentage of individuals who have applied to yoga classes.
b. Provide the lower bound of the confidence interval for the estimated percentage.
c. Provide the upper bound of the confidence interval for the estimated percentage.
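A sketch of one possible approach, a confidence interval for a single proportion via prop.test(). The vector yoga below is simulated stand-in data (only the sample size of 236 comes from the setup code), so the numbers it produces are not the exam's answers.

```r
# Hypothetical stand-in data: the real dataset is not reproduced here.
set.seed(1)
yoga <- rbinom(236, size = 1, prob = 0.4)   # 1 = applied to yoga classes, 0 = did not

# Confidence interval for one proportion (95% by default).
result <- prop.test(sum(yoga), length(yoga))

100 * result$estimate                       # estimated percentage who applied
100 * result$conf.int                       # lower and upper bound, in percent
```

With the real data, the 0/1 column from the dataset would replace the simulated yoga vector.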
Question 3:
After the summer season, a group of people decided to focus on their weight (measured in kg) and
get healthier. They started a weight management program in September, which included changes in
their diet and exercise habits over four months, running until December.
To assess the effectiveness of the program, the participants' weights were measured first in
September and then in December. Let's analyze the data and find out if the weight management
program led to noticeable weight loss.
a- Briefly describe which statistical analysis would be appropriate to assess the effectiveness of
the program.
b- Explain in a few lines why you selected this analysis.
c- Upload a screenshot displaying the relevant output.
d- Upload a screenshot of the commands or steps you would use to perform the statistical
analysis (even if you were unable to execute the analysis).
e- Based on your analysis, provide a conclusion regarding whether there was a significant
change in weight over the specified time period.
f- Use a 95% confidence interval to answer the question and explain what this interval
means.
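Because the same participants are weighed twice, a paired t-test is one natural candidate. The sketch below runs it on simulated stand-in data; the vectors weight_sept and weight_dec and all simulation parameters are hypothetical, so the output does not reflect the exam's dataset.

```r
# Hypothetical stand-in data for the two measurements (in kg).
set.seed(2)
weight_sept <- rnorm(236, mean = 80, sd = 10)
weight_dec  <- weight_sept - rnorm(236, mean = 1.5, sd = 3)

# Paired t-test: each December weight is matched to the same person's September weight.
paired <- t.test(weight_sept, weight_dec, paired = TRUE)

paired$estimate     # mean difference (September minus December)
paired$conf.int     # 95% confidence interval for the mean difference
paired$p.value
```

If the confidence interval for the mean difference excludes zero, the change in weight is statistically significant at the 5% level.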
Question 4:
Remember the weight management program from the previous question? Before starting the
program, some participants were already practicing yoga while others were not. Now, a question
arises: some researchers claim that the weight loss was actually due to the fact that some individuals
were already practicing yoga before the program started (that is, in September). The dataset contains
weights measured at different times. By examining it, we may
gain insights into the potential impact of pre-existing yoga practice.
a- Briefly describe which test is used to answer this question (name of the test).
b- Explain in a few (two or three) lines why you selected this test.
c- Upload a screenshot displaying the relevant (numerical, not graphical) output of the test.
d- Upload a screenshot of the commands used to perform the test (even if you were unable to
execute the test).
e- Based on the output, do you conclude that the distribution of employees across training
programs matches the expected distribution?
a- Briefly describe which test is used to assess the impact of participating in yoga classes on
happiness levels.
b- Explain in a few lines why you selected this test for evaluating the relationship between yoga
participation and happiness.
c- Provide the output and commands used to conduct the test.
d- Perform the test and draw conclusions regarding the effect of participating in yoga classes on
happiness levels. Avoid giving a simple yes/no answer and instead provide a brief
explanation based on the statistical analysis. Additionally, discuss the confidence interval and
its role in interpreting the results.
Question 7: The management of a wellness center is interested in understanding whether there are
significant differences in the happiness levels of its clients based on their diets. The center offers
various wellness programs and wants to explore if there is a connection between dietary choices and
happiness. The center believes that people with different dietary preferences may experience varying
levels of happiness and wants to validate this claim:
a- Which statistical test would you use to explore whether there are significant differences in
happiness levels among individuals with different diets?
b- Explain in a few lines why you selected this test.
c- Show (copy and paste) the commands or steps you would use to perform the statistical analysis
(even if you were unable to execute the analysis). Also upload the output.
d- Based on your analysis, provide a conclusion regarding whether there is a significant
difference in happiness levels.
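For comparing a numeric outcome across more than two groups, a one-way ANOVA is a common choice. The sketch below shows the commands on simulated stand-in data; the data frame wellness, its column names, and the "Omnivore" category are hypothetical.

```r
# Hypothetical stand-in data: diet group and happiness score (1 to 10).
set.seed(3)
wellness <- data.frame(
  diet      = factor(rep(c("Omnivore", "Vegetarian", "Vegan"), length.out = 236)),
  happiness = round(runif(236, min = 1, max = 10))
)

# One-way ANOVA: does mean happiness differ between diet groups?
fit <- aov(happiness ~ diet, data = wellness)
anova_table <- summary(fit)
anova_table

# Pairwise follow-up comparisons, relevant only if the overall F-test is significant.
TukeyHSD(fit)
```

The F-test in the summary addresses whether any group means differ; the Tukey comparisons indicate which pairs differ.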
A popular health magazine published a claim that individuals following a Vegan diet tend to exhibit
higher motivation levels compared to individuals with Vegetarian diets.
a- Which statistical test would you use to answer this question and why?
b- Show (copy and paste) the commands or steps you would use to perform the statistical analysis
(even if you were unable to execute the analysis). Also upload the output.
c- Based on your analysis, provide a conclusion regarding the claim.
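Since the claim is directional (Vegan higher than Vegetarian), a one-sided two-sample t-test is one option. The sketch below uses simulated stand-in data; the vector names, group sizes, and simulation parameters are hypothetical.

```r
# Hypothetical stand-in data: motivation scores for the two diet groups.
set.seed(4)
motivation_vegan      <- rnorm(40, mean = 6.5, sd = 1.5)
motivation_vegetarian <- rnorm(45, mean = 6.0, sd = 1.5)

# Welch two-sample t-test with a one-sided alternative:
# H1 is that the Vegan mean is greater than the Vegetarian mean.
one_sided <- t.test(motivation_vegan, motivation_vegetarian,
                    alternative = "greater")

one_sided$estimate    # the two group means
one_sided$p.value
```

R's t.test() does not assume equal variances by default (Welch's version); var.equal = TRUE would request the pooled-variance test instead.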