
Erik Robbins

Skittles Term Project


This project involved counting and recording the contents of a 2.17-ounce bag of Skittles
candies. Each of us recorded the number of red, orange, yellow, green and purple candies in our own
bag. In all, 24 students recorded their counts, for a total of 1461 candies.
The counts were fairly close together, but as with any data set there were some outliers as well.
As the graphs below show, the five colors are quite evenly distributed among the 1461 candies.
The sample of 1461 comes from 24 bags, not from a single bag. The class total is quite close to
an equal number of candies of each color, but that is not necessarily true of any one bag. In
my bag I had twice as many yellow Skittles as green or purple, as shown below.
Below is my personally collected data (total number of bags = 1):

Color     Number of candies
Red       12
Orange    15
Yellow    18
Green     9
Purple    8

Below is the total collected data for the class (total number of bags = 24):

Color     Number of candies
Red       320
Orange    278
Yellow    289
Green     300
Purple    274

Class sample: 1461

Mean: 60.783
Standard Deviation: 1.99 or 2.0
Min: 274
Q1: 278
Median: 289
Q3: 300
Max: 320
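The class totals and the five-number summary can be checked with a few lines of Python. This is a minimal sketch, assuming the five color totals reported above are the only inputs. The variable names are illustrative, and the "inclusive" quartile method is an assumption, since the report does not say which quartile convention was used.

```python
from statistics import quantiles

# Class totals per color, as reported above (24 bags pooled).
class_totals = {
    "red": 320,
    "orange": 278,
    "yellow": 289,
    "green": 300,
    "purple": 274,
}

# Pooled sample size across all colors.
total = sum(class_totals.values())

# Share of each color in the pooled sample.
proportions = {color: count / total for color, count in class_totals.items()}

# Five-number summary of the five color totals.
# Assumption: quartiles use the "inclusive" method; other conventions can
# give different Q1 and Q3 values for such a small data set.
counts = sorted(class_totals.values())
q1, median, q3 = quantiles(counts, n=4, method="inclusive")
five_number = (counts[0], q1, median, q3, counts[-1])

for color, share in proportions.items():
    print(f"{color:<8}{class_totals[color]:>5}{share:>8.1%}")
print("total:", total)
print("five-number summary:", five_number)
```

Printing the shares next to the counts makes it easy to compare each color against an even 20% split.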

The distribution is not normal in shape but bimodal. The graphs do reflect what we would expect
to see from a random sample of bags of Skittles. The overall data collected by the class does not agree
with my bag: the class data is spread more evenly across the colors, whereas my personal data is more skewed.
Reflection - The data we recorded from the Skittles contain both categorical and quantitative
data. The categorical data is the color of each candy, and the quantitative data are the counts,
including the total number of Skittles and the total number of bags. Pie charts and bar charts are great ways to
graph categorical data. They give a visual representation of the data simply and in a manner that
is easily and quickly understood. For quantitative data, the boxplot is designed to be a
visual representation of the five-number summary. Quantitative data is usually
gathered and then used to find a mean and/or a standard deviation. Many calculations in
statistics require the mean. The mean and median only make sense for quantitative data, while graphs
and the mode work for both quantitative and categorical data. Charts are great for categorical data,
yet almost all data placed on a pie chart will include both quantitative and categorical data.

Confidence Interval Estimates


A confidence interval is a range of values used to estimate the true value of a
population parameter. Its purpose is to estimate that parameter from a sample, since it is
not possible to get everyone to respond to a poll (Triola).
99% Yellow - I am 99% confident that the true proportion of yellow candies is between 10.16
and 13.99.
95% Mean - I am 95% confident that the true mean number of candies per bag is
between 60.071 and 61.679.
98% S.D.
Discussion - Based on the class sample, I am 99% confident that the proportion of yellow candies is
between 10.16 and 13.99, and 95% confident that the mean number of candies per
bag is between 60.071 and 61.679.
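Both completed intervals follow the usual "estimate plus or minus margin of error" pattern, which can be sketched in Python. This is a minimal sketch, not the calculation used for the report: `proportion_ci` and `mean_ci` are hypothetical helper names, the yellow interval is computed from the pooled class totals (289 of 1461), the mean interval uses the reported mean and standard deviation, and the critical value 2.069 is the t value for 23 degrees of freedom taken from a standard t table. Because the per-bag counts are not shown, the output is not expected to reproduce the quoted figures.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Normal-approximation interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical z value for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def mean_ci(mean, sd, n, critical_value):
    """Interval for a population mean; the caller supplies the t (or z) value."""
    margin = critical_value * sd / sqrt(n)
    return mean - margin, mean + margin


# 99% interval for the proportion of yellow candies (pooled class totals).
yellow_low, yellow_high = proportion_ci(289, 1461, 0.99)

# 95% interval for the mean number of candies per bag.
# Assumption: t critical value for df = 24 - 1 = 23, read from a t table.
T_CRIT_95_DF23 = 2.069
mean_low, mean_high = mean_ci(60.783, 2.0, 24, T_CRIT_95_DF23)

print(f"yellow proportion: {yellow_low:.3f} to {yellow_high:.3f}")
print(f"mean per bag:      {mean_low:.3f} to {mean_high:.3f}")
```

Passing the critical value in as an argument keeps the sketch within the standard library, which has no t distribution.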

Hypothesis Test
The purpose of hypothesis testing is to test an alternative hypothesis against a null
hypothesis, the existing claim that has generally been accepted.
At the 95% confidence level, I conclude that red Skittles do not make up 20% of the candies in a single
bag.
At the 99% confidence level, I conclude that a bag of Skittles contains more than 55 candies.
Discussion - This information is accurate given the data collected and the stated error rates.
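The test statistics behind the two claims can be sketched in Python. This is a minimal sketch under stated assumptions, not the report's own working: `one_proportion_z_test` and `one_sample_t_statistic` are hypothetical helper names, the red proportion is taken from the pooled class totals (320 of 1461), and the mean and standard deviation are the reported class values. The resulting statistic is then compared with the critical value (or the p-value with the significance level) to decide whether to reject the null hypothesis.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Two-sided z test of H0: p = p0 using the normal approximation."""
    p_hat = successes / n
    # Standard error is computed under the null hypothesis.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def one_sample_t_statistic(mean, sd, n, mu0):
    """t statistic for H0: mu = mu0, to be compared with a t table (df = n - 1)."""
    return (mean - mu0) / (sd / sqrt(n))


# Claim 1: 20% of the candies are red (pooled class totals).
z_red, p_red = one_proportion_z_test(320, 1461, 0.20)

# Claim 2: the mean number of candies per bag is 55 (reported mean and SD).
t_bag = one_sample_t_statistic(60.783, 2.0, 24, 55)

print(f"red proportion: z = {z_red:.3f}, two-sided p-value = {p_red:.3f}")
print(f"candies per bag: t = {t_bag:.3f} with 23 degrees of freedom")
```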
Reflection - The conditions for doing interval estimates were not fully met. For the third question
on confidence intervals, the CI for the standard deviation did not come through. The standard
deviation was available, but I looked for a way to compute the interval and could not find one. The hypothesis
testing went a bit smoother. One possible source of error is that we had to compile all the data ourselves,
whereas usually (in the homework and in-class examples) it is given to us directly and correctly. If one
number is off slightly, it could throw off the whole calculation. From my research, my
conclusion is that the company does its best to get 20% of each color into a bag, and statistically
it does. Bag for bag, however, the percentage of each color is not quite on target.
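For the standard deviation interval that did not come through, one standard textbook approach uses the chi-square distribution. The following is a minimal sketch of that approach, assuming the bag totals are roughly normally distributed and using the reported standard deviation (2.0) and sample size (24). `sd_ci` is a hypothetical helper name, and the two critical values are read from a standard chi-square table for 23 degrees of freedom at the 98% confidence level.

```python
from math import sqrt


def sd_ci(sd, n, chi2_left, chi2_right):
    """Chi-square interval for a population standard deviation."""
    df = n - 1
    # The larger (right-tail) critical value gives the lower limit.
    lower = sqrt(df * sd ** 2 / chi2_right)
    upper = sqrt(df * sd ** 2 / chi2_left)
    return lower, upper


# Assumption: chi-square table values for df = 23 and 98% confidence.
CHI2_LEFT = 10.196   # area 0.99 to the right
CHI2_RIGHT = 41.638  # area 0.01 to the right

sd_low, sd_high = sd_ci(2.0, 24, CHI2_LEFT, CHI2_RIGHT)
print(f"98% interval for the standard deviation: {sd_low:.2f} to {sd_high:.2f}")
```

Unlike the intervals for a mean or proportion, this interval is not symmetric around the sample value, because the chi-square distribution is skewed.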

Bibliography:
Triola, Mario F. Elementary Statistics. 12th ed. Boston: Pearson, 2014. 1-822. Print.
