Statistical Techniques - Formatted
Areas to be covered:
Forecast And Budget
Forecasting Techniques
Scatter Diagram
Seasonal Variation
Linear Regression
Index Numbers
FORECAST AND BUDGET
• Forecast: A forecast is an estimate of what might happen in the future. It is based
on assumptions about the conditions that are expected to apply.
• Budget: A budget is a plan of what the organization is aiming to achieve and what it has
set as a target. Budgets are more realistic because management will try to establish
some control over the conditions that will apply in the future.
FORECASTING METHODS
1. High-low method
2. Scatter graph method
3. Linear regression analysis
4. Time series analysis
1. High Low Method
It is a simple forecasting technique based on historical data. It has already been
discussed in the cost classification chapter.
Advantages:
• It is easy to use and understand.
• It needs just two activity levels (highest and lowest)
Disadvantages:
• It considers only two extreme points, which may not be representative of normal conditions.
• Because it is based on just two points, the resulting formula is not very accurate.
• Based on historical data.
2. Scatter graph method
The scatter graph method is a graphical way of forecasting.
The steps involved in forecasting under the scatter graph method are:
a. Collect data of past volumes of output and the associated cost of producing that
output.
b. Plot the data on the graph which has cost on vertical axis and volume of output on the
horizontal axis.
c. Draw the line of best fit through the middle of the plotted points so that the distance of
points above the line is the same as the distance of points below the line.
The point where the line of best fit meets the vertical axis is the fixed cost, and the slope of
the line represents the variable cost per unit. The disadvantage of this method is that the line
is drawn by visual judgment.
Correlation:
Two variables are said to be correlated if a change in the value of one variable is
accompanied by a change in the value of another variable.
For example:
• Total variable cost and production units.
• Selling price of a product and its demand.
The purpose of correlation analysis is to measure and interpret the strength of linear
relationship between two variables.
Degrees of correlation:
Two variables might be perfectly correlated, partly correlated or uncorrelated. Correlation
can be positive or negative. The differing degrees of correlation can be illustrated by scatter
diagrams.
Perfect correlation
[Scatter diagrams (a) and (b): Y plotted against X, with every point lying on a straight line]
All the pairs of values lie on a straight line. An exact linear relationship
exists between the two variables
Partial correlation
[Scatter diagrams (a) and (b): Y plotted against X, with the points scattered around a rising line in (a) and a falling line in (b)]
In (a), although there is no exact relationship, low values of X tend to
be associated with low values of Y, and high values of X with high
values of Y.
In (b), again, there is no exact relationship, but low values of X tend to be
associated with high values of Y and vice versa.
No correlation
[Scatter diagram (c): Y plotted against X, with no pattern in the points]
The values of these two variables are not correlated to each other.
Positive correlation: low values of one variable are associated with low
values of the other, and high values of one variable are associated with
high values of the other.
Negative correlation: low values of one variable are associated with high
values of the other, and high values of one variable are associated with
low values of the other.
The correlation coefficient (r) is calculated as:
$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} $$
Where x and y represent pairs of data for the two variables X and Y, and n stands for the
number of pairs of data used in the calculation. Remember that the correlation coefficient (r)
always lies between -1 and +1. If your calculation results in anything outside this range you
must revise your calculations.
Example 1:
Statistical department of a provincial government is currently developing a data base to find
out whether there is any relationship between an individual’s annual income and his level of
education. Following information has been so far collected:
Individual   Income ($'000)   Education (years)
             X                Y
1            45               20
2            63               19
3            36               16
4            52               20
5            29               12
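As a rough illustration, the correlation coefficient for the Example 1 data can be computed with a few lines of Python. This is only a sketch of the sums-based formula; the function name `correlation_coefficient` is chosen here for illustration and is not part of the original notes.

```python
from math import sqrt

def correlation_coefficient(xs, ys):
    """Correlation coefficient r from the sums-based formula (illustrative helper)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Example 1 data: income in $'000 (X) and years of education (Y)
income = [45, 63, 36, 52, 29]
education = [20, 19, 16, 20, 12]

r = correlation_coefficient(income, education)
print(f"r = {r:.3f}, r squared = {r * r:.3f}")
```

For this data r works out at roughly 0.80, which suggests a fairly strong positive correlation between income and education.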
The coefficient of determination, r², expresses the proportion of total
variance in the value of one variable that can be explained by
the other variable.
The coefficient of determination is such that 0 ≤ r² ≤ +1, i.e. it cannot be negative.
Although the correlation coefficient indicates whether there is a linear
relationship between two variables, it cannot by itself be used for forecasting. Once we have
found that two variables are correlated, we can use the line of best fit. We can use its
equation for forecasting by putting in a value for the independent variable X and deriving a
forecast value for the dependent variable Y.
3. Linear Regression
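The regression line (line of best fit) has the equation:
$$ y = a + bx $$
Where,
Dependent variable (y): the single variable being explained or predicted by the regression model.
Independent variable (x): the explanatory variable used to predict the dependent variable.
The coefficients are calculated as:
$$ b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} $$
and
$$ a = \frac{\sum y}{n} - b\,\frac{\sum x}{n} $$
Where n is the number of data pairs used in the analysis.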
Example 3: The following data is available for levels of output and the costs incurred at
each level:
Output ('000 units)   10   15   13   18   19   20
Cost ($'000)          40   55   48   65   69   81
Calculate the total cost at an activity level of 22,500 units using regression analysis.
Solution:
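One way to work through the calculation is sketched below in Python, applying the standard least-squares formulas for a and b to the Example 3 data. The helper name `fit_regression_line` is an assumption made for this illustration. Output is in '000 units, so 22,500 units is entered as 22.5.

```python
def fit_regression_line(xs, ys):
    """Return (a, b) for the line y = a + b*x using the least-squares formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b

# Example 3 data: output in '000 units (X) and cost in $'000 (Y)
output = [10, 15, 13, 18, 19, 20]
cost = [40, 55, 48, 65, 69, 81]

a, b = fit_regression_line(output, cost)
forecast = a + b * 22.5  # total cost in $'000 at 22,500 units
print(f"a = {a:.4f}, b = {b:.4f}, forecast cost = {forecast:.2f}")
```

On these figures b is about 3.76 and a about 0.07, giving a forecast total cost of roughly $84,760. Note that 22,500 units lies outside the observed range of 10,000 to 20,000 units, so this forecast is an extrapolation.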
Regression line and time series analysis
The regression line can also be used in time series analysis, with time taken as
the independent variable (X).
As is the case with any other model, results from regression analysis will not always be
accurate or reliable. There are a number of limitations of this model which cast doubt on
its results:
Reliability of Forecast in Linear Regression:
• The model assumes that a linear relationship exists between the two variables. This is not
always true; the relationship might be non-linear, in which case the model is not
appropriate.
• The model assumes that there are only two variables. The value of one variable, the
dependent variable Y, is predicted from the value of one other variable, the independent
variable X. This is quite unrealistic, as the value of Y might be affected by many other factors
not considered at all.
• Past behavior is used to forecast the future. The model assumes that the past pattern of
movement of the two variables will continue in the future. Again, this is an unrealistic assumption.
• The linear regression model is limited to predicting numeric output only. It cannot be used to
predict any other sort of information.
• A lack of explanation about what has been learned can be a problem. Prediction of a
figure is not all that is desired.
• The model is only appropriate if used to predict the value of the dependent variable within
the relevant range. Predicted results are not reliable if the model is used for extrapolation.
o Interpolation means using a line of best fit to predict a value within the two extreme
points.
o Extrapolation means using a line of best fit to predict a value outside the two extreme
points.
• There must be a sufficient number of data pairs. Even if the correlation between two
variables is high, any forecast value based on fewer than ten pairs of data should be
regarded as somewhat unreliable.
We can still use the forecast produced by the model with high confidence if the correlation
coefficient between the two variables is high. The coefficient of determination tells us how
much of the variation in cost can be explained by the volume level. The higher the coefficient
of determination, the greater the reliance that can be placed on the predicted result. As a
general rule, if correlation is high (say +0.9 or -0.9) the actual values will all lie close to the
regression line. If the correlation is weaker than 0.7 (-0.7 ≤ r ≤ +0.7), less reliance can be
placed on the predicted value.
• Many processes are linear so they are well defined by regression analysis.
4. Time Series Analysis
A time series is a series of values recorded over time. The data often conforms to a
certain pattern over time. This pattern can be extrapolated into the future and hence
forecasts are possible. It is used, for example, to forecast sales.
Time periods may be any measure of time including days, weeks, months and
quarters.
A time series has four components:
a. Trend: the underlying long term movement in the recorded values over time.
b. Seasonal variations: short term fluctuations in recorded values, a regular variation
around the trend over a fixed time period, usually one year.
c. Cyclical variations: long term fluctuations in recorded values, such as those of the economic cycle.
d. Random variations: irregular, random fluctuations in the data.
Trend
Years   Output/hour (units)   Cost/unit ($)   Number of employees
4       30                    1.00            100
5       24                    1.08            103
6       26                    1.20            96
7       22                    1.15            102
8       21                    1.18            103
9       17                    1.25            98
Trend   Downward trend        Upward trend    No clear movement/static
Finding a trend
One method of finding the trend is to use moving averages, taken over a
period that covers one full cycle.
Moving averages of
Year    Annual sales
2000    390
2001    380
2002    460
2003    450
2004    470
2005    440
2006    500
Take a moving average of the annual sales over a period of three years.
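The three-year moving average can be sketched in Python as follows. The function name `moving_average` is an illustrative choice; each average is matched to the middle year of the three years it covers.

```python
def moving_average(values, window):
    """Simple moving average over consecutive groups of `window` values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

years = [2000, 2001, 2002, 2003, 2004, 2005, 2006]
sales = [390, 380, 460, 450, 470, 440, 500]

trend = moving_average(sales, 3)

# Each three-year average relates to the middle year of its window,
# so the first and last years have no trend figure.
for year, value in zip(years[1:-1], trend):
    print(year, round(value, 2))
```

The first average, for example, is (390 + 380 + 460) / 3 = 410 and relates to 2001.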
Moving average of an even number of results
If the moving average were taken of results in an even number of time periods, the
basic technique would be the same, but the midpoint of the overall period would not
relate to a single period. The trend line figures need to relate to a particular time
period. To overcome this difficulty, take a moving average of the moving averages.
Example 5:
Calculate the trend using moving average.
Solution:
Year   Quarter   Actual volume   Moving average of    Midpoint of 2 moving
                 of sales        4 quarters' sales    averages (trend line)
2005   1         600
       2         840
                                 645
       3         420                                  650
                                 655
       4         720                                  657.50
                                 660
2006   1         640                                  660
                                 660
       2         860                                  662.50
                                 665
       3         420                                  668.75
                                 672.50
       4         740                                  677.50
                                 682.50
2007   1         670                                  683.75
                                 685
       2         900                                  687.50
                                 690
       3         430
       4         760
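The centred moving average used in Example 5 can be reproduced with a short Python sketch. The helper names `moving_average` and `centred_trend` are assumptions made for this illustration: a four-quarter moving average is taken first, then a two-period moving average of those figures so that each trend value lines up with an actual quarter.

```python
def moving_average(values, window):
    """Simple moving average over consecutive groups of `window` values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

def centred_trend(values, window):
    """Moving average of an even window, centred by averaging adjacent pairs."""
    averages = moving_average(values, window)
    return moving_average(averages, 2)

# Quarterly sales for 2005 Q1 to 2007 Q4 ('000 units), from Example 5
sales = [600, 840, 420, 720, 640, 860, 420, 740, 670, 900, 430, 760]

trend = centred_trend(sales, 4)
# The first trend value relates to 2005 Q3 and the last to 2007 Q2.
print(trend)
```

The result matches the trend line column above, starting at 650 for 2005 Q3 and ending at 687.50 for 2007 Q2.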
b. Seasonal Variation
Seasonal variations are short term fluctuations due to changes in season. They affect
seasonal businesses such as ice-cream manufacturing.
Additive model
Seasonal variations are the difference between actual and trend figures. An
average of the seasonal variations for each time period within the cycle must be
determined and then adjusted so that the total of the seasonal variations sums to
zero.
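Seasonal variation = actual sales – trend
So
Time series (actual sales) = trend + seasonal variation
In general, Y = T + S + R, where Y is the actual value, T is the trend, S is the seasonal
variation and R is the random variation.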
Continue Example 5:
Year   Quarter   Actual volume of sales   Trend         Seasonal variation
                 '000 units               '000 units    '000 units
2005   1         600
       2         840
       3         420                      650           -230
       4         720                      657.50        62.50
2006   1         640                      660           -20
       2         860                      662.50        197.50
       3         420                      668.75        -248.75
       4         740                      677.50        62.50
2007   1         670                      683.75        -13.75
       2         900                      687.50        212.50
       3         430
       4         760
The variation between the actual result for any particular quarter and the trend line
average is not the same from the year to year, but an average of these variations can be
taken.
                         Q1        Q2        Q3         Q4
2005                                         -230       62.50
2006                     -20       197.50    -248.75    62.50
2007                     -13.75    212.50
Total                    -33.75    410       -478.75    125
Average (divided by 2)   -16.875   205       -239.375   62.50
The estimate of the seasonal (quarterly) variation is almost complete, but there is one more
important step. Variations around the basic trend line should cancel each other out
and add up to zero. At the moment they do not: the four averages total 11.25. Therefore
spread this total across the four quarters by deducting 11.25/4 = 2.8125 from each
average, so that the final variations sum to zero.
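                       Q1         Q2         Q3          Q4        Total
Average variation      -16.875    205        -239.375    62.50     11.25
Adjustment             -2.8125    -2.8125    -2.8125     -2.8125   -11.25
Adjusted variation     -19.6875   202.1875   -242.1875   59.6875   0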
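The averaging and adjustment steps of the additive model can be sketched in Python as below. The function name `adjusted_seasonal_variations` is an illustrative assumption; the input figures are the seasonal variations (actual minus trend) from Example 5.

```python
def adjusted_seasonal_variations(variations_by_quarter):
    """Average the variations per quarter, then adjust so they sum to zero."""
    averages = {
        quarter: sum(values) / len(values)
        for quarter, values in variations_by_quarter.items()
    }
    # Spread the total of the averages equally across the quarters.
    correction = sum(averages.values()) / len(averages)
    return {quarter: avg - correction for quarter, avg in averages.items()}

# Seasonal variations (actual minus trend) from Example 5, '000 units
variations = {
    "Q1": [-20, -13.75],
    "Q2": [197.50, 212.50],
    "Q3": [-230, -248.75],
    "Q4": [62.50, 62.50],
}

adjusted = adjusted_seasonal_variations(variations)
for quarter, value in adjusted.items():
    print(quarter, value)
```

A forecast for a future quarter under the additive model would then be the extrapolated trend for that quarter plus the adjusted variation for the same quarter.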
Multiplicative model
The trend component will be the same in both models, but the seasonal and random
components will vary according to the model. In our example, we assume that the random
component is small and so ignore it. So:
Y = T × S
Then:
S = Y/T
Continue Example 5:
Year   Quarter   Actual volume of sales (Y)   Trend (T)   Seasonal variation (Y/T)
Q1 Q2 Q3 Q4 Total
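A possible Python sketch of the multiplicative calculation for the Example 5 figures is given below. The function name `seasonal_factors` is hypothetical, and the final step (scaling the averaged ratios so that they total 4, one per quarter) is an assumed adjustment convention rather than something stated in these notes.

```python
def seasonal_factors(observations):
    """Average Y/T per quarter, then scale so the factors sum to the number of quarters."""
    ratios = {}
    for quarter, actual, trend in observations:
        ratios.setdefault(quarter, []).append(actual / trend)
    averages = {q: sum(r) / len(r) for q, r in ratios.items()}
    # Assumed convention: the four factors should total 4.
    scale = len(averages) / sum(averages.values())
    return {q: avg * scale for q, avg in sorted(averages.items())}

# (quarter, actual sales Y, trend T) from Example 5, '000 units
observations = [
    (3, 420, 650), (4, 720, 657.50),
    (1, 640, 660), (2, 860, 662.50), (3, 420, 668.75), (4, 740, 677.50),
    (1, 670, 683.75), (2, 900, 687.50),
]

factors = seasonal_factors(observations)
for quarter, factor in factors.items():
    print(f"Q{quarter}: {factor:.4f}")
```

Under this model a forecast is the extrapolated trend multiplied by the factor for the relevant quarter, so a factor above 1 marks a quarter that runs above trend.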
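Index Numbers
$$ \text{Price index} = \frac{P_n}{P_0} \times 100 \qquad \text{Quantity index} = \frac{Q_n}{Q_0} \times 100 $$
Where P_n and Q_n are the price and quantity in the current period, and P_0 and Q_0 are the
price and quantity in the base period.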
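The index calculation is simple enough to sketch in a few lines of Python. The function name `index_number` and the prices of $20 and $23 are hypothetical values chosen only for illustration; the sales figures are the ones quoted later in these notes.

```python
def index_number(current_value, base_value):
    """Simple index: current period value relative to the base period, times 100."""
    return current_value / base_value * 100

# Hypothetical prices: base period $20, current period $23
price_index = index_number(23, 20)

# Sales figures quoted in the notes: $4,567,990 rising to $4,796,390
sales_index = index_number(4_796_390, 4_567_990)

print(f"Price index: {price_index:.1f}")
print(f"Sales index: {sales_index:.1f}")
```

An index of about 105 for sales corresponds to an increase of about 5% on the base period.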
Solution:
20x6 index number =
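• Indices present changes in relative terms; for example, saying that there is a 5% increase in
sales is more meaningful than saying that sales have increased from $4,567,990 to $4,796,390.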
• Comparing data and drawing conclusions is much easier with the help of indices.
• Calculating the quantity and price index separately helps the management to know
• The index can be calculated by different methods, therefore, there is no single correct
• The figures obtained are averages. Significant changes in variables cannot be seen with
• Indices consider new products that may appear; the old ones may be ignored.
Forecasting problems:
All forecasting methods are subject to error, but the extent varies from case to case. Some of
the main problems are:
future.
• Political and economic changes: (These create uncertainty, for example a change in interest
rates.)
• Environmental and social changes: (Changes in one market will affect other companies'
markets.)