Skittles Project With Reflection

Skittles Project Group 3
Joel Hanes, Alamissi Ouro-Gneni, Virginia Darger, Lily Ratliff

Introduction:
Statistics comprises methods for planning data collection, analyzing results, and drawing
conclusions from a study. In the course Intro to Statistics 1040, we were assigned a semester
project to help us understand various statistical concepts and apply them to everyday life. At
the beginning of the semester, each student in the class purchased a 2.71 oz bag of Skittles
candy. Our first task was data collection: each student counted the number of Skittles of each
color in their own bag and the total number of Skittles in the bag, and the counts were compiled
into a spreadsheet for analysis. Our next task was to analyze and interpret the data, which we
did by constructing the graphs, charts, and statistical tables seen below. The goal of the
assignment is to better understand the concepts of statistics and the key components needed to
judge the validity of statistical studies. Understanding these concepts is useful in everyday
life, and the Skittles project made them more relatable.
Data Collection:
Organizing and Displaying Categorical Data: Colors
Observations:
The colors came out relatively close in number. We expected a color or two to be lower in
count and perhaps a favored color to be out in front. We saw this in half of the charts, while
the others were more evenly distributed. The data from our bag follow the pattern of the whole
class's bags, except for purple, which is higher in ours.
Group Three Bag:
Summary statistics:
Column   n   Mean   Variance   Median   Range   Min   Max   Q1   Q3   Sum
NUMBER   5   48.4   43.3       46       16      44    60    45   47   242
Class Bags:
Summary statistics:
Frequency table results for Total skittles per person:
Count = 26
Total skittles per person   Frequency   Relative Frequency   Percent of Total   Cumulative Frequency
54                          1           0.038461538          3.8461538          1
55                          1           0.038461538          3.8461538          2
56                          1           0.038461538          3.8461538          3
58                          5           0.19230769           19.230769          8
59                          6           0.23076923           23.076923          14
60                          6           0.23076923           23.076923          20
61                          5           0.19230769           19.230769          25
62                          1           0.038461538          3.8461538          26
Organizing and Displaying Quantitative Data: The Number of Candies per Bag
Mean: 59.1
Standard Deviation: 1.90
5 Number Summary:
Min: 54
Q1: 58
Q2: 59
Q3: 60
Max: 62
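As a rough check, the summary values above can be recomputed from the class frequency table. The sketch below uses only the Python standard library. It takes the quartiles as the medians of the lower and upper halves of the sorted data, which is an assumption about how the statistics software computed them.

```python
import statistics

# Class frequency table: total candies per bag -> number of bags
freq = {54: 1, 55: 1, 56: 1, 58: 5, 59: 6, 60: 6, 61: 5, 62: 1}

# Expand the table back into the individual bag totals, sorted low to high
bags = [total for total, count in sorted(freq.items()) for _ in range(count)]

n = len(bags)
mean = statistics.mean(bags)
sd = statistics.stdev(bags)            # sample standard deviation (divides by n - 1)

# Five-number summary; quartiles taken as medians of the two halves (assumed method)
minimum, maximum = min(bags), max(bags)
median = statistics.median(bags)
q1 = statistics.median(bags[: n // 2])
q3 = statistics.median(bags[(n + 1) // 2 :])

print(f"n = {n}, mean = {mean:.1f}, sd = {sd:.2f}")
print(f"min = {minimum}, Q1 = {q1}, median = {median}, Q3 = {q3}, max = {maximum}")
```

Other quartile conventions can give slightly different values for Q1 and Q3 on the same data.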
Observations:
The shape of the distribution is close to a normal bell shape, although it is skewed to the left.
The class data appear relatively normal, which can be expected considering there are no
extreme outliers. We expected there would be about the same number of candies in each bag,
which wasn't the case: the totals ranged from 54 to 62. The number of candies from my individual
bag was 26 and the total number of bags in the sample is 26.
Reflection
Categorical data are qualitative variables that consist of names or labels (not numbers
that represent counts or measurements). Pie charts and bar graphs make sense for categorical
data because they compare each category against the others. Computations such as averaging, or
ordering from low to high, do not make sense for categorical data. Survey responses of yes, no,
and undecided are a typical example of such data.
Quantitative data are numerical values that can be measured, ordered, or counted.
Scatterplots and stem plots make sense for quantitative data because a scatterplot helps
determine whether there is a relationship between two variables, and a stem plot separates
each value into two parts. Computation makes sense for quantitative data: we can find a mean,
standard deviation, five-number summary, and sum.
Confidence Interval Estimates

A confidence interval is a range of values used to estimate the true value of a
population parameter. It is reported as a range rather than a single number so that
statisticians can see how close the calculated estimate is likely to be to the population
value. We also attach a confidence level (in the form of a percentage) to the interval. The
confidence level is the success rate of the procedure: the proportion of such intervals that
would contain the population parameter if the sampling were repeated many times. It lets us
state how confident we are that the true population value lies within the interval. In the
following section, our group performed three confidence interval estimates using the class
data of Skittles.
The 95% confidence interval estimate for the true proportion of purple
candies:
Based on the calculations from our sample data, we are 95% confident that the interval from
0.182 to 0.222 contains the true value of the population proportion of purple candies.
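An interval of this kind can be sketched in Python with the normal-approximation formula p̂ ± z·sqrt(p̂(1 − p̂)/n). The function name `proportion_ci` is hypothetical, and we are assuming this is the formula the software applied. The purple count is not listed in this report, so the call below is an illustration only and uses the green count from the hypothesis test section (311 of 1536).

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Illustration only: green count and class total from the hypothesis test section
low, high = proportion_ci(311, 1536)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

Replacing 311 with the class count of purple candies would give the interval for purple.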
The 99% confidence interval estimate for the true mean number of candies
per bag:
99% confidence interval results:
μ: Mean of variable
Variable               Sample Mean   Std. Err.    DF   L. Limit    U. Limit
Mean candies per bag   59.076923     0.37178603   25   58.040593   60.113253
Based on the calculations from our data, we are 99% confident that the interval from 58.041
to 60.113 contains the true population mean number of candies per bag.
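The limits above follow from the t-interval formula x̄ ± t·s/sqrt(n). The sketch below reproduces them from the sample mean and standard deviation reported in this project. The critical value 2.787 is not from our output: it is a t-table lookup for 25 degrees of freedom with 0.005 in each tail, entered by hand as an assumption.

```python
from math import sqrt

n = 26                      # number of bags in the class sample
sample_mean = 59.076923     # from the class data
sample_sd = 1.895744234     # from the class data

# Assumed t-table value: 25 degrees of freedom, 99% confidence (0.005 per tail)
t_crit = 2.787

std_err = sample_sd / sqrt(n)
margin = t_crit * std_err
lower, upper = sample_mean - margin, sample_mean + margin

print(f"Std. Err. = {std_err:.4f}")
print(f"99% CI for the mean: ({lower:.3f}, {upper:.3f})")
```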
The 98% confidence interval estimate for the standard deviation of the
number of candies per bag:
98% confidence interval results:
σ: Standard deviation of variable
Variable               Standard Deviation   DF   L. Limit      U. Limit
Mean candies per bag   1.895744234          25   1.423897574   2.792213262
Based on the calculations from our data, we are 98% confident that the interval from 1.424
to 2.792 contains the true value of the population standard deviation of the number of
candies per bag.
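The interval for the standard deviation comes from the chi-square formula sqrt((n − 1)s²/χ²_R) < σ < sqrt((n − 1)s²/χ²_L). The sketch below applies it to our sample standard deviation. The two critical values are chi-square table lookups for 25 degrees of freedom, entered by hand as assumptions rather than taken from our output.

```python
from math import sqrt

df = 25                  # degrees of freedom, n - 1
s = 1.895744234          # sample standard deviation from the class data

# Assumed chi-square table values for 25 degrees of freedom, 98% confidence
chi2_right = 44.314      # area 0.01 to the right
chi2_left = 11.524       # area 0.99 to the right

lower = sqrt(df * s**2 / chi2_right)
upper = sqrt(df * s**2 / chi2_left)

print(f"98% CI for the standard deviation: ({lower:.3f}, {upper:.3f})")
```

The larger critical value gives the lower limit because it sits in the denominator.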
Hypothesis tests
A hypothesis is an assumption or claim about some aspect of a population. The population
parameters involved in hypothesis testing include the mean, standard deviation, proportion,
and variance. Hypothesis tests are used to evaluate whether sample data are consistent with
the claim (hypothesis) made about a property of a population. In the following section, our
group performed two hypothesis tests on our class's Skittles candy data.
Test the claim that 20% of all Skittles are green (class bags):
Hypothesis test results:
p: Proportion of successes
H0: p = 0.2
HA: p ≠ 0.2
Proportion   Count   Total   Sample Prop.   Std. Err.     Z-Stat       P-value
p            311     1536    0.20247396     0.010206207   0.24239742   0.8085
Since our p-value of 0.8085 is greater than the significance level α, we fail to reject H0.
There is not sufficient evidence to warrant rejection of the claim that 20% of all Skittles
candies are green.
The mean number of candies in a bag of Skittles is 56 (class bags):
Hypothesis test results:
μ: Mean of variable
H0: μ = 56
HA: μ ≠ 56
Variable                  Sample Mean   Std. Err.    DF   T-Stat      P-value
Candies per bag class     59.076923     0.37178603   25   8.2760589   <0.0001
Candies per bag group 3   60.5          0.8660254    3    5.1961524   0.0138
Since our p-value of less than 0.0001 for the class bags is less than the significance level
α, we reject the claim that the mean number of candies in a bag of Skittles is 56. There is
sufficient evidence to warrant rejection of the claim.
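Both test statistics above can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the report does not give the significance level, so α = 0.05 is assumed here, and the critical value 2.060 is a t-table lookup for 25 degrees of freedom (two-tailed) entered by hand. The standard library has no t-distribution, so the mean test is decided by comparing the statistic with that critical value instead of computing a p-value.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05   # assumed significance level (not stated in the report)

# Test 1: claim that 20% of all Skittles are green (two-tailed z test)
count, total, p0 = 311, 1536, 0.20
p_hat = count / total
se_p = sqrt(p0 * (1 - p0) / total)        # standard error under H0
z_stat = (p_hat - p0) / se_p
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
reject_green_claim = p_value < alpha

# Test 2: claim that the mean number of candies per bag is 56 (two-tailed t test)
sample_mean, std_err, mu0 = 59.076923, 0.37178603, 56
t_stat = (sample_mean - mu0) / std_err
t_crit = 2.060                            # assumed t-table value, df = 25, alpha = 0.05
reject_mean_claim = abs(t_stat) > t_crit

print(f"z = {z_stat:.4f}, p-value = {p_value:.4f}, reject H0: {reject_green_claim}")
print(f"t = {t_stat:.4f}, critical value = {t_crit}, reject H0: {reject_mean_claim}")
```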
Reflection:
The conditions for doing interval estimates and hypothesis tests:
The sample must be a simple random sample.
The population needs to have a normal distribution, or the sample size n must be > 30.
The data for our sample met both requirements: the class sample was treated as a simple
random sample, and although our sample size was less than 30, the data were approximately
normally distributed, as shown in the histogram from the previous section.
A possible source of error is that the distribution, although approximately normal, is
slightly skewed and the sample size is less than 30, which could distort the results.
The sampling method could be improved by making the sample more random. For
example, we would likely get more accurate results if the sample were taken from
students in statistics classes throughout Utah.
Reflective Writing of Math 1040
By Joel Hanes
Through the Skittles project and the course as a whole, I've learned that my preconceived ideas and beliefs about
statistics were unfounded and mostly wrong. Coming into the class, I believed statistics was just manipulating numbers
to make them fit your own ideas. I have found that, apart from people who manipulate data consciously or
subconsciously, using statistics is a scientific act, and you can create or find data that are useful in daily life or
on the job.
Using a software program such as StatCrunch has been a great experience, and I appreciate it as a tool. It let me
store data in an orderly way, organize and sort the data to see them from different viewpoints, apply the collected
data to find meaning in them, and create visual representations that show the data to others in an easy-to-understand
format.
I will be able to use visual representations such as pie charts and histograms, as well as confidence intervals and
hypothesis tests, in my other classes. In anthropology, I can show the differences and overlaps between cultures and
beliefs. In archaeology, I can show the differences and overlaps between sites and artifact populations.
I will also be able to use the tools learned through the project and the class in my personal life and at work. I am
active, and I will be able to calculate a mean for my respirations, blood pressure, and time spent during an activity
to see whether I improve over time. At work, I will be able to do the same for the patients I take care of by keeping
track of their vital signs and seeing whether they change over time.
Through this project I have learned how to create better reports for my classes by using graphs, and I have gained
an appreciation of how statistics is used to solve problems and understand the world around us. It is not just data
manipulation.
