Standard Deviation and Its Applications
INTRODUCTION:
In statistics, the standard deviation (SD, also represented by the Greek letter sigma σ or the Latin
letter s) is a measure that is used to quantify the amount of variation or dispersion of a set of data
values.[1] A low standard deviation indicates that the data points tend to be close to
the mean (also called the expected value) of the set, while a high standard deviation indicates
that the data points are spread out over a wider range of values.
Standard deviation is a measure of how spread out a data set is. It's used in a huge number of
applications. In finance, standard deviations of price data are frequently used as a measure of
volatility. In opinion polling, standard deviations are a key part of calculating margins of error.
Standard deviation thus tells those interpreting the data how much the individual values differ
from the average, and hence how representative that average is.
A low standard deviation means that the data lie very close to the average, so the average
summarizes the set reliably.
A high standard deviation means that there is a large variation between the data and the
statistical average, so the average is not as reliable a summary.
Calculating Standard Deviation
The standard deviation is determined by finding the square root of what is called the variance.
The variance is found by averaging the squared differences from the mean.
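As a sketch of that calculation in Python (the data values here are arbitrary, chosen only for
illustration):

```python
import math

def population_sd(values):
    """Population standard deviation: the square root of the variance,
    where the variance is the average of the squared deviations from the mean."""
    mean = sum(values) / len(values)
    variance = sum((x - mean) ** 2 for x in values) / len(values)
    return math.sqrt(variance)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data set, not from the text
print(population_sd(data))  # mean is 5, variance is 4, so this prints 2.0
```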
1. Standard score
2. Percentages of Normal data lying within a certain number of standard deviations from the
mean
A distribution curve for a set of data is basically a frequency or relative frequency curve of the
data. It is found that the distribution curves for a lot of commonly occurring data sets follow a
certain pattern that came to be known as normal distributions.
[Figure: normal curve with shaded bands showing roughly 34%, 13.5%, and 2.35% of the data
between successive standard deviations on each side of the mean, from −3 to +3.]
A normal curve has the following characteristics:
1. Mean = mode = median. They all lie at the centre of the curve.
2. There are fewer data values the further one moves from the mean:
a) about 68% of the data lie within 1 standard deviation from the mean.
b) about 95% of the data lie within 2 standard deviations from the mean.
c) about 99.7% of the data lie within 3 standard deviations from the mean.
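These percentages can be checked empirically by simulation; the sample size and random seed
below are arbitrary choices for this sketch:

```python
import random
import statistics

# Simulate draws from a normal distribution and check the 68-95-99.7 rule.
random.seed(0)
sample = [random.gauss(0, 1) for _ in range(100_000)]
mu = statistics.fmean(sample)
sigma = statistics.pstdev(sample)

for k in (1, 2, 3):
    # Fraction of the sample lying within k standard deviations of the mean.
    within = sum(abs(x - mu) <= k * sigma for x in sample) / len(sample)
    print(f"within {k} SD: {within:.3f}")  # roughly 0.68, 0.95, 0.997
```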
ESTIMATION
One can find the standard deviation of an entire population in cases (such as standardized
testing) where every member of a population is sampled. In cases where that cannot be done, the
standard deviation σ is estimated by examining a random sample taken from the population and
computing a statistic of the sample, which is used as an estimate of the population standard
deviation. Such a statistic is called an estimator, and the estimator (or the value of the estimator,
namely the estimate) is called a sample standard deviation, and is denoted by s (possibly with
modifiers). However, unlike in the case of estimating the population mean, for which the sample
mean is a simple estimator with many desirable properties (unbiased, efficient, maximum
likelihood), there is no single estimator for the standard deviation with all these properties,
and unbiased estimation of standard deviation is a very technically involved problem. Most
often, the standard deviation is estimated using the corrected sample standard
deviation (using N − 1), defined below, and this is often referred to as the "sample standard
deviation", without qualifiers. However, other estimators are better in other respects: the
uncorrected estimator (using N) yields lower mean squared error, while using N − 1.5 (for the
normal distribution) almost completely eliminates bias.
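A minimal sketch comparing the three denominator conventions just described (N, N − 1, and
N − 1.5) on an arbitrary small sample:

```python
import math

sample = [4.0, 7.0, 6.0, 5.0, 8.0]  # arbitrary illustrative sample
N = len(sample)
mean = sum(sample) / N
ss = sum((x - mean) ** 2 for x in sample)  # sum of squared deviations

s_uncorrected = math.sqrt(ss / N)        # biased low, but lowest mean squared error
s_corrected = math.sqrt(ss / (N - 1))    # the usual "sample standard deviation"
s_low_bias = math.sqrt(ss / (N - 1.5))   # nearly unbiased for normal data

print(s_uncorrected, s_corrected, s_low_bias)  # the three estimates increase in that order
```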
Firstly, the formula for the population standard deviation (of a finite population) can be applied
to the sample, using the size of the sample as the size of the population (though the actual
population size from which the sample is drawn may be much larger). This estimator, denoted
by sN, is known as the uncorrected sample standard deviation, or sometimes the standard
deviation of the sample (considered as the entire population), and is defined as follows:

sN = sqrt( (1/N) Σ (xi − x̄)² ),  summed over i = 1, …, N,

where x1, …, xN are the observed values of the sample items and x̄ is the mean value of these
observations, while the denominator N stands for the size of the sample: this is the square root of
the sample variance, which is the average of the squared deviations about the sample mean.
This is a consistent estimator (it converges in probability to the population value as the number
of samples goes to infinity), and is the maximum-likelihood estimate when the population is
normally distributed. However, this is a biased estimator, as the estimates are
generally too low. The bias decreases as sample size grows, dropping off as 1/N, and thus is most
significant for small or moderate sample sizes; for sufficiently large N the bias is below 1%. Thus for very
large sample sizes, the uncorrected sample standard deviation is generally acceptable. This
estimator also has a uniformly smaller mean squared error than the corrected sample standard
deviation.
CORRECTED SAMPLE STANDARD DEVIATION:
If the biased sample variance (the second central moment of the sample, which is a downward-
biased estimate of the population variance) is used to compute an estimate of the population's
standard deviation, the result is

sN = sqrt( (1/N) Σ (xi − x̄)² ).

Here taking the square root introduces further downward bias, by Jensen's inequality, due to
the square root being a concave function. The bias in the variance is easily corrected, but the bias
from the square root is more difficult to correct, and depends on the distribution in question. An
unbiased estimator for the variance is given by applying Bessel's correction, using N − 1 instead
of N to yield the unbiased sample variance, denoted s²:

s² = (1/(N − 1)) Σ (xi − x̄)².
This estimator is unbiased if the variance exists and the sample values are drawn independently
with replacement. N − 1 corresponds to the number of degrees of freedom in the vector of
deviations from the mean, (x1 − x̄, …, xN − x̄).
Taking square roots reintroduces bias (because the square root is a nonlinear function, which
does not commute with the expectation), yielding the corrected sample standard
deviation, denoted by s:

s = sqrt( (1/(N − 1)) Σ (xi − x̄)² ).
As explained above, while s2 is an unbiased estimator for the population variance, s is still a
biased estimator for the population standard deviation, though markedly less biased than the
uncorrected sample standard deviation. The bias is still significant for small samples (N less than
10), and also drops off as 1/N as sample size increases. This estimator is commonly used and
generally known simply as the "sample standard deviation".
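A small simulation can illustrate the point: with Bessel's correction the sample variance
averages out to the true variance, while the sample standard deviation s still comes out low.
The population parameters, sample size, and trial count below are arbitrary choices for this
sketch:

```python
import random
import statistics

random.seed(1)
true_sd = 2.0  # population SD of the simulated normal population (assumed)
n = 5          # small sample size, where the bias of s is most visible

var_estimates, sd_estimates = [], []
for _ in range(50_000):
    sample = [random.gauss(0, true_sd) for _ in range(n)]
    var_estimates.append(statistics.variance(sample))  # uses N - 1 (Bessel's correction)
    sd_estimates.append(statistics.stdev(sample))      # square root of the above

print(statistics.fmean(var_estimates))  # close to the true variance, 4.0
print(statistics.fmean(sd_estimates))   # noticeably below the true SD, 2.0
```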
First, let's look at what a standard deviation is measuring. Consider two small businesses with
four employees each. In company A, two employees make $19 per hour and the other two make
$21 per hour. In company B, two employees make $15 per hour, one makes $24, and the
last makes $26.
In both companies, the average wage is $20 per hour, but the distribution of hourly wages is
clearly different. In company A, all four employees' wages are tightly bunched around that
average, while at company B there is a big spread between the two employees making $15 and
the two making $24 and $26.
Standard deviation is a measure of how far away individual measurements tend to be from the
mean value of a data set. The standard deviation of company A's employees is 1, while the
standard deviation of company B's wages is about 5. In general, the larger the standard deviation
of a data set, the more spread out the individual points are in that set.
The technical definition of standard deviation is somewhat complicated. First, for each data
value, find out how far the value is from the mean by taking the difference of the value and the
mean. Then, square all of those differences. Then, take the average of those squared differences.
Finally, take the square root of that average.
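The two-company wage example above can be checked with these steps; this sketch uses the
population standard deviation (divide by N), which is what the figures of 1 and about 5
correspond to:

```python
import statistics

company_a = [19, 19, 21, 21]  # hourly wages from the example
company_b = [15, 15, 24, 26]

for name, wages in [("A", company_a), ("B", company_b)]:
    mean = statistics.fmean(wages)
    sd = statistics.pstdev(wages)  # population SD: mean squared deviation, then sqrt
    print(f"Company {name}: mean = {mean}, SD = {sd:.2f}")
# Company A: mean = 20.0, SD = 1.00
# Company B: mean = 20.0, SD = 5.05
```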
The reason we go through such a complicated process to define standard deviation is that this
measure appears as a parameter in a number of statistical and probabilistic formulas, most
notably the normal distribution.
REFERENCE:
[1] https://en.wikipedia.org/wiki/Standard_deviation
[2] http://examples.yourdictionary.com/examples-of-standard-deviation.html
KG COLLEGE OF ARTS AND SCIENCE
ASSIGNMENT-I
NAME : V.SUSHMITHA
ROLL.NO :152AAB52
CLASS : II-B
DEPARTMENT : COMMERCE