MBA - Quantitative Methods

Quantitative Methods

How to Use Self-Learning Material?
The pedagogy used to design this course is to enable the student to assimilate the concepts
with ease. The course is divided into modules. Each module is categorically divided into units or
chapters. Each unit has the following elements:

Table of Contents: Each unit has a well-defined table of contents. For example, “1.1.1.
(a)” should be read as “Module 1. Unit 1. Topic 1. (Sub-topic a)” and “1.2.3. (iii)” should
be read as “Module 1. Unit 2. Topic 3. (Sub-topic iii)”.

Aim: It refers to the overall goal that can be achieved by going through the unit.

Instructional Objectives: These are behavioural objectives that describe intended
learning and define what the unit intends to deliver.

Learning Outcomes: These are demonstrations of the learner’s skills and experience
sequences in learning, and refer to what you will be able to accomplish after going
through the unit.

Self-Assessment Questions: These include a set of multiple-choice questions to be
answered at the end of each topic.

Did You Know?: You will learn some interesting facts about a topic that will help you
improve your knowledge. A unit can also contain a Quiz, Case Study, Critical Learning
Exercises, etc., as a metacognitive scaffold for learning.

Summary: This includes brief statements or restatements of the main points of the unit and
a summing up of the knowledge chunks in the unit.

Activity: It actively involves you through various assignments related to direct application
of the knowledge gained from the unit. Activities can be both online and offline.

Bibliography: This is a list of books and articles written by a particular author on a
particular subject referring to the unit’s content.

e-References: This is a list of online resources, including academic e-Books and journal
articles that provide reliable and accurate information on any topic.

Video Links: It has links to online videos that help you understand concepts from a
variety of online resources.

LEADERSHIP KLEF

President
Er. Koneru Satyanarayana

Vice Chancellor
Dr. G. Pardha Saradhi Varma

Pro-Vice Chancellor
Dr. N. Venkatram

Incharge Registrar
Dr. K. Subbarao

CREDITS

Author
Dr. J. Venkata Ramana

Director CDOE
C. Shanath Kumar

Instructional Designer
Nabina Das

Content Editor
M. Mounika Supriya

Project Manager
K. D. N. Lakshmi

Graphic Designer
B. V. Satyanarayana

First Edition, 2023.

KL Deemed to be University-CDOE has full copyright over this educational material. No
part of this document may be reproduced, stored in a retrieval system, or transmitted, in any
form or by any means.

Author Profile

Dr. J. Venkata Ramana

Dr. J. Venkata Ramana is an Assistant Professor in the Department of MBA, K L Business
School, Koneru Lakshmaiah Education Foundation, Andhra Pradesh, India.

He holds M.Sc., M.M.M. and Ph.D. degrees in the area of Management. He has specialised in
Marketing Management and Business Analytics and has 16 years of academic experience. He has
published 4 Scopus-indexed and 10 UGC Care-indexed research papers in reputed journals.

He has organised various faculty development programs, executive development programs,
national-level quiz competitions and workshops in Management, and has also attended various
national and international conferences.

He is a fellow member of the World Economics Association (WEA), Green ThinkerZ, Indian
Academicians and Researchers Association (IARA) and Institute of Supply Management
(ISM). He is a reviewer for various international journals and guides research scholars in the
Management domain. He has authored two books in the area of Management.

Quantitative Methods

Course Description

Quantitative Methods is all about the decision-making process. This course begins with
fundamental principles and progresses to diverse decision-making processes. Quantitative
methods are defined as procedures that give the decision maker a systematic and powerful
means of analysis, and assistance in developing policies for reaching pre-determined goals
based on measurable facts.

Probability, Random Variables, Probability Distributions, Introduction to R Programming,
Sampling, Theory of Estimation, Testing of Hypothesis, Parametric and Non-Parametric tests,
Correlation, Regression, Time Series, and Index Numbers are some of the topics addressed in
Quantitative Methods. All these topics help in data analysis and decision-making.

By the end of the course, students will be able to describe how to use various statistical methods
to solve problems and make decisions. Students will also examine uncertainty scenarios and
various forms of data, including uni-variate, bi-variate, and multi-variate data.

This course is designed to serve as a stepping stone for students to analyse managerial
decisions. The Quantitative Methods course contains four modules.

MODULE 1
INTRODUCTION TO PROBABILITY
Concept of Probability: Definitions and rules for probability, Conditional probability, Independence
of events, Bayes’ theorem. Probability distributions: Random variables, Binomial Distribution,
Poisson Distribution, Normal Distribution. Introduction to R Programming: Evolution and features
of R Programming, Operators in R Programming, Data Structures in R Programming.

MODULE 2
OVERVIEW OF SAMPLING
Basic Concepts: Types of Sampling, Sampling distributions, Sampling distribution of mean
and proportion, Application of Central Limit Theorem, Determining the sample size. Estimation
and Testing of Hypothesis: Introduction to Estimation, Point Estimation, Interval Estimation,
Introduction to Hypothesis, one sample and two sample tests for means and proportions of large
samples (z-test), one sample and two sample tests for means of small samples (t-test), F-test
for two sample standard deviations, ANOVA (Analysis of Variance) one- and two-way,
Chi-square tests for independence of attributes and goodness of fit, Sign test and Rank test.

MODULE 3
CORRELATION AND REGRESSION
Introduction to Correlation: Meaning, measurement, graphic and algebraic, Scatter Plot, Pearson
Correlation Coefficient, Spearman’s Rank Correlation, Testing the significance of regression
coefficients, Regression: Meaning, Types, Estimating the regression coefficients.

MODULE 4
INDEX NUMBERS AND TIME SERIES ANALYSIS
Time series analysis: Meaning and Components of Time Series, Variations in time series,
Smoothing Methods, trend analysis, cyclical variations, Seasonal variations, and irregular
variations. Index Numbers: Unweighted Index numbers, Weighted Index numbers.

Table of Contents

Module 1
INTRODUCTION TO PROBABILITY
Unit 1.1 Concept of Probability
Unit 1.2 Probability Distributions
Unit 1.3 Introduction to R Programming

Module 2
OVERVIEW OF SAMPLING
Unit 2.1 Introduction to Sampling
Unit 2.2 Estimation and Testing of Hypothesis
Unit 2.3 ANOVA & Non-parametric Testing of Hypothesis

Module 3
CORRELATION AND REGRESSION
Unit 3.1 Introduction to Correlation
Unit 3.2 Regression

Module 4
INDEX NUMBERS AND TIME SERIES ANALYSIS
Unit 4.1 Time Series Analysis
Unit 4.2 Index Numbers

QUANTITATIVE METHODS

Module - 1

INTRODUCTION TO PROBABILITY

Module Description

We use the words ‘probability’ and ‘chance’ frequently in our daily conversations, yet most
people have only a hazy understanding of what they represent. For example, we might hear
in a weather report, “There’s a chance of heavy rain tomorrow”, or “There’s a chance that both
teams A and B win tomorrow’s match”, or “Probably you’re right”, or “It’s likely that I won’t be able
to come to your house for the get-together”. It has been observed that keywords like likelihood,
chances, possible, likely, and others appear in the above sentences and communicate the same
meaning, i.e., the event is not certain to occur, or, in other words, there is uncertainty in the
event’s occurrence. In plain language, the term “probability” denotes that there is a degree of
uncertainty regarding the event’s outcome.

This Module is divided into the following units:


Unit 1.1 Concept of Probability
Unit 1.2 Probability Distributions
Unit 1.3 Introduction to R Programming

QUANTITATIVE METHODS

Module - 1
Unit - 1

CONCEPT OF PROBABILITY

Unit Table of Contents
Unit 1.1 Concept of Probability

Aim
Instructional Objectives
Learning Outcomes
Introduction
1.1.1 Definitions and Rules of Probability
      Self-Assessment Questions
1.1.2 Conditional Probability and Independence of Events
      Self-Assessment Questions
1.1.3 Bayes’ Theorem
      Self-Assessment Questions
Summary
Terminal Questions
Answer Keys
Glossary
Bibliography
External Resources
e-References
Image Credits
Video Links
Keywords

Aim
This unit aims to explain the basic concepts of probability and its applications in the
field of management.

Instructional Objectives
This unit intends to:
● Explain the basic concepts of probability
● Discuss the concepts of probability in various management applications

Learning Outcomes
At the end of this unit, you are expected to:
● Demonstrate conditional probabilities and various theorems of probability
● Apply the concepts of probability in marketing, HRM, finance, and other
functional areas

Introduction
We use the words ‘probability’ and ‘chance’ frequently in our daily conversations, yet most people
have only a hazy understanding of what they represent. For example, we might hear in a weather
report, “There is a chance of heavy rain tomorrow”. Other examples are “There is a chance that
both teams A and B win tomorrow’s match” and “It is likely that I will not be able to come to
your house for the get-together”. It has been observed that keywords like likelihood,
chances, possible, likely, and others appear in the above sentences and communicate the same
meaning, i.e., the event is not certain to occur, or, in other words, there is uncertainty in the event’s
occurrence. In plain language, the term “probability” denotes that there is a degree of uncertainty
regarding the event’s outcome. However, in mathematics and statistics, we attempt to describe
conditions under which we can make meaningful numerical assertions about uncertainty and use
specific methods for computing numerical probabilities and expectations. The term probability is
thus defined in the statistical sense and does not rest on personal belief or wishful thinking.

1.1.1 Definitions and Rules of Probability


Throwing a die, tossing a coin, picking cards from a pack of cards, and other games of chance
associated with gambling gave rise to probability theory. Girolamo Cardano (1501-1576), an
Italian mathematician, wrote the first book on the subject, Book on Games of Chance
(Liber de Ludo Aleae), which was published in 1663, after his death.

The concept of probability deals with uncertainty and randomness. The literal meaning of probability
is ‘Chance’. Galileo, an Italian mathematician, made early contributions to quantitative probability.

The concept of probability developed from three gambling games:


● Tossing coins
● Throwing dice
● Playing cards
Sometimes you can measure a probability with a number like “10% chance of rain”, or you can
use words such as impossible, unlikely, possible, even chance, likely, and certain.
Example: “It is unlikely to rain tomorrow”.

Fig. 1: Representation of Chance (a scale labelled Impossible, Unlikely, Even Chance, Likely and
Certain, with a 1-in-6 chance and a 4-in-5 chance marked as examples)

1.1.1.1 Basic Terminology in Probability

Random Experiment

An experiment is called a Random Experiment under the following conditions:

● The experiment can be repeated under identical conditions
● The result is not unique
● The result may be any one of a set of possible results
E.g.: Tossing a coin, throwing a die, selecting a card from a pack of 52 cards.

Sample space

● The set of all possible outcomes of a random experiment is called the sample space and is
denoted by S

E.g.: S = {H, T} for tossing a coin; S = {1, 2, 3, 4, 5, 6} for throwing a die.

Trial and Event

● Any performance of a random experiment is called Trial and its outcomes are called
Events.

E.g.: Tossing a coin for one time is called a trial and its outcomes { H , T } are events.

Exhaustive Events

● In a random experiment, ‘exhaustive events’ refers to the complete set of all possible
elementary outcomes. In other words, a set of events is said to be exhaustive when no
outcome can occur outside it.

Favourable Events

● Favourable events are the basic results that imply or favour the occurrence of an event,
i.e., the outcomes that aid in the occurrence of that event.

Mutually Exclusive Events

● Events are ‘mutually exclusive’ if the occurrence of any one of them prevents all the others
from occurring in the same trial. In other words, two mutually exclusive events A and B
cannot happen at the same time.

Equally likely or Equi-probable Events

● If there is no reason to expect one event over another, the outcome is said to be ‘equally
likely’, i.e., each of the exhaustive outcomes has an equal chance of occurring.

Complementary Events

● Let E denote the occurrence of an event. The non-occurrence of E is called the complement
of E and is commonly written E′ (or E^c), so that P(E′) = 1 − P(E).

E.g.: When a die is thrown, P(1) = 1/6, so P(Not 1) = 1 − P(1) = 1 − 1/6 = 5/6.

Equivalently, P(1) = 1 − P(2, 3, 4, 5, 6) = 1 − 5/6 = 1/6.

Independent Events

● In a sequence of trials, two or more events are ‘independent’ if the outcome of one does
not affect the outcome of the other or vice versa.

1.1.1.2 Definitions of Probability

There are four approaches to construct a measure of probability of occurrence of an event. They
are:
● Mathematical or Classical Approach
● Statistical or Empirical Approach
● Axiomatic Approach.
● Subjective Approach
Classical or Mathematical Approach
● If a trial results in ‘n’ exhaustive, mutually exclusive, equally likely and independent
outcomes, and if ‘m’ (m ≤ n) of them are favourable to the happening of the event E, then
the probability ‘P’ of occurrence of the event ‘E’ is given by

P(E) = (number of outcomes favourable to event E) / (exhaustive number of outcomes) = m/n

Example: P ( H ) = ½

Favourable cases of event (H) =1

Exhaustive cases of experiment (H,T)=2


Empirical or Statistical Approach
● The ‘frequency’ approach to probability is another name for this method. The probability
is calculated by repeating the experiment many times. We acquire more accurate results
as the number of trials n increases.
● Definition: When a random experiment is repeated under similar conditions, the ratio
of the number of favourable trials to the total number of trials approaches a limiting value
as the total number of trials becomes indefinitely large, i.e., if out of N trials, M trials are
favourable to the happening of an event E,

then P(E) = lim (N → ∞) M/N

Example: A die is rolled 100 times. The number 3 is rolled 12 times. The relative frequency of
rolling a 3 is 12/100 = 0.12.
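
The frequency definition can be illustrated with a short simulation. The sketch below is written in Python purely for illustration (the unit itself contains no code); the function name relative_frequency and the fixed seed are assumptions introduced here, and a fair six-sided die is assumed.

```python
import random

def relative_frequency(face, trials, seed=0):
    """Roll a fair six-sided die `trials` times and return M/N for `face`."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same rolls
    hits = sum(1 for _ in range(trials) if rng.randint(1, 6) == face)
    return hits / trials

# M/N should settle near the classical value 1/6 (about 0.1667) as N grows
for n in (100, 10_000, 200_000):
    print(n, relative_frequency(3, n))
```

With only 100 rolls the ratio can be noticeably off, as in the 12/100 example; with many more rolls it is expected to lie close to 1/6.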

Axiomatic Approach

This approach was proposed by Russian Mathematician A. N. Kolmogorov in 1933.


‘Axioms’ are statements which are accepted as true without
seeking any proof.
Definition: Let S be the sample space associated with a random experiment. Let A be
any event in S. Then P(A) is the probability of occurrence of A if the following axioms
are satisfied.

P(A) ≥ 0, where A is any event. (Axiom of Non-negativity)

P(S) = 1. (Axiom of Certainty)

P(A ∪ B) = P(A) + P(B), when events A and B are mutually exclusive. (Axiom of Additivity)

Personal or subjective probability

● A probability that is determined on the basis of human judgement, such as experience or
belief, is called subjective probability.
● These are values (between 0 and 1 or 0 and 100%) assigned by individuals based on
how likely they think events are to occur.
● Example: The probability that a student scores 90 marks in an exam is 60%

1.1.1.3 Rules of Probability

Probability of any event is always non-negative. That is, P(A) ≥ 0

Probability of any event is always less than or equal to unity. That is, P(A) ≤ 1

Hence 0 ≤ P(A) ≤ 1

If P(A) = 0, A is called an impossible event

If P(A) = 1, A is called a sure or certain event

Self-Assessment Questions

1. Literal meaning of probability is ____________.

A) Event
B) Chance
C) Relative
D) Case

2. Who made inventions in probability?

A) Galileo
B) Fisher
C) Pearson
D) Bowley

3. The set of all possible outcomes of a random experiment is called __________.

A) Sample
B) Sample Space
C) Space
D) Event

1.1.1.4 Solved examples

1. When a coin is tossed twice, what is the probability of getting

a) exactly one head;
b) two heads;
c) no heads;
d) at least one tail?

Solution:

Tossing a coin twice gives the exhaustive cases S = {HH, HT, TH, TT}, n = 4

a) Favourable cases of getting one head: m = {HT, TH} = 2

Therefore P(One head) = m/n = 2/4 = 1/2 or 0.5

b) Favourable cases of two heads: m = {HH} = 1

P(Two heads) = m/n = ¼

c) Favourable cases of no heads: m = {TT} = 1

P(No heads) = ¼

d) Favourable cases of at least one tail: m = {TH, HT, TT} = 3

P(At least one tail) = ¾
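
The classical calculation above (count the favourable cases m, divide by the exhaustive cases n) can be reproduced by listing the sample space. This is a minimal Python sketch for illustration only; the helper name classical_probability is an assumption introduced here, and all outcomes are assumed equally likely.

```python
from fractions import Fraction
from itertools import product

def classical_probability(sample_space, is_favourable):
    """Return m/n: favourable outcomes over exhaustive, equally likely outcomes."""
    favourable = [outcome for outcome in sample_space if is_favourable(outcome)]
    return Fraction(len(favourable), len(sample_space))

# Sample space for two tosses: HH, HT, TH, TT
two_tosses = ["".join(t) for t in product("HT", repeat=2)]

p_one_head = classical_probability(two_tosses, lambda o: o.count("H") == 1)
p_two_heads = classical_probability(two_tosses, lambda o: o.count("H") == 2)
p_no_heads = classical_probability(two_tosses, lambda o: o.count("H") == 0)
p_at_least_one_tail = classical_probability(two_tosses, lambda o: "T" in o)

print(p_one_head, p_two_heads, p_no_heads, p_at_least_one_tail)  # 1/2 1/4 1/4 3/4
```

Changing repeat=2 to repeat=3 generates the eight outcomes for three tosses in the same way.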

2. If a die is thrown, find the probability of getting

a) an odd face;
b) a prime face;
c) a face that is a multiple of 3;
d) a face of 7;
e) any number between 1 and 6 (both inclusive).

Solution:

Exhaustive cases: n = {1, 2, 3, 4, 5, 6} = 6

a) Favourable cases of odd face: m = {1, 3, 5} = 3

p(odd face) = m/n = 3/6 = 1/2

b) Favourable cases of prime face: m = {2, 3, 5} = 3

p(prime) = 3/6 = 1/2

c) Favourable cases of multiple of 3: m = {3, 6} = 2

p(multiple of 3) = 2/6 = 1/3

d) Favourable cases of face value 7: m = { } = 0 (no face of a die shows 7)

p(face 7) = 0/6 = 0

e) Favourable cases of getting 1 to 6: m = {1, 2, 3, 4, 5, 6} = 6

p(1 to 6) = 6/6 = 1

Axiomatic Definition of Probability

1. P(a) ≥ 0 (Axiom of Non-negativity)

2. P(S) = 1 (Axiom of Certainty)

3. P(a ∪ b) = P(a) + P(b) (a, b mutually exclusive) (Axiom of Additivity)

Example:

If a coin is tossed, then define the probability of getting a head using the axiomatic definition of
probability.

Ans: a: getting head (H), b: getting tail (T)

1) P(a) = ½ (≥ 0)

2) P(S) = P(H, T) = ½ + ½ = 1

3) P(H ∪ T) = P(H) + P(T)

P(S) = ½ + ½

1 = 1

Note: Union means or: a ∪ b means a or b.

Intersection means and: a ∩ b means a and b.

3. If a coin is tossed 3 times, then find the probability of

a) two heads;
b) alternating tails;
c) at least two tails;
d) no tails.

Solution:

If a coin is tossed 3 times, then the sample space is

S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}, n = 8

a) The favourable cases of getting two heads: m = {HHT, HTH, THH} = 3

P(E) = m/n = 3/8

b) The favourable cases of getting alternating tails: m = {THT} = 1

P(E) = m/n = 1/8

c) The favourable cases of getting at least two tails: m = {HTT, THT, TTH, TTT} = 4

P(E) = m/n = 4/8

d) The favourable cases of getting no tails: m = {HHH} = 1

P(E) = m/n = 1/8

4. If a die is thrown two times, then find the probability that a) the total on both the dice is
9; b) the total on both the dice is more than 10; c) the first die shows face 3; d) the second die
shows face 5; e) both the dice have the same face; f) the first die is even and the second die
is odd; g) the total on both the dice is 13; h) the total on both the dice is any number
between 2 and 12 (both inclusive).

Solution:

If a die is thrown two times, then the number of exhaustive cases is 6² = 36.

S = {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6),
(2,1), (2,2), (2,3), (2,4), (2,5), (2,6),
(3,1), (3,2), (3,3), (3,4), (3,5), (3,6),
(4,1), (4,2), (4,3), (4,4), (4,5), (4,6),
(5,1), (5,2), (5,3), (5,4), (5,5), (5,6),
(6,1), (6,2), (6,3), (6,4), (6,5), (6,6)} = 36

a) The favourable cases of getting a total of 9 on both the dice:

m = {(3,6), (6,3), (4,5), (5,4)} = 4, P(E) = m/n = 4/36

b) The favourable cases of getting a total of more than 10 on both the dice (i.e., 11 or 12):

m = {(5,6), (6,5), (6,6)} = 3, P(E) = m/n = 3/36

c) The favourable cases of the first die showing face 3:

m = {(3,1), (3,2), (3,3), (3,4), (3,5), (3,6)} = 6, P(E) = m/n = 6/36

d) The favourable cases of the second die showing face 5:

m = {(1,5), (2,5), (3,5), (4,5), (5,5), (6,5)} = 6, P(E) = m/n = 6/36

e) The favourable cases of both the dice having the same face:

m = {(1,1), (2,2), (3,3), (4,4), (5,5), (6,6)} = 6, P(E) = m/n = 6/36

f) The favourable cases of the first die being even and the second die being odd:

m = {(2,1), (2,3), (2,5), (4,1), (4,3), (4,5), (6,1), (6,3), (6,5)} = 9

P(E) = m/n = 9/36

g) The favourable cases of getting a total of 13 on both the dice: m = { } = 0

P(E) = m/n = 0/36 = 0. A total of 13 on both the dice is called an impossible event.

h) The favourable cases of getting a total on both the dice that is any number between 2 and 12
(both inclusive):

m = all 36 outcomes of S listed above = 36

P(E) = m/n = 36/36 = 1. A total on both the dice
of any number between 2 and 12 (both inclusive) is called a sure or certain event.
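
The 36 ordered pairs and the counts for several parts of this example can be checked by enumeration. The following Python sketch is illustrative only; the names S and prob are assumptions introduced here, and both throws are assumed fair.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered pairs (first throw, second throw)
S = list(product(range(1, 7), repeat=2))

def prob(event):
    """Classical probability m/n of `event`, a predicate on (first, second)."""
    m = sum(1 for outcome in S if event(*outcome))
    return Fraction(m, len(S))

p_total_9 = prob(lambda a, b: a + b == 9)                        # m = 4
p_total_over_10 = prob(lambda a, b: a + b > 10)                  # m = 3
p_same_face = prob(lambda a, b: a == b)                          # m = 6
p_even_then_odd = prob(lambda a, b: a % 2 == 0 and b % 2 == 1)   # m = 9
p_total_13 = prob(lambda a, b: a + b == 13)                      # impossible event
p_total_2_to_12 = prob(lambda a, b: 2 <= a + b <= 12)            # certain event
```

Fraction stores each result in lowest terms, so 4/36 is held as 1/9 and 9/36 as 1/4; the values are the same as those in the solution.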

Permutation:

The literal meaning of permutation is arrangement.

A, B, C: Permutations of size 2 are AB, BC, CA, BA, CB and AC

Out of n letters, r letters can be arranged in nPr ways.

nPr = n! / (n − r)!; where n! = n(n − 1)(n − 2) …… 3·2·1

3P2 = 3! / (3 − 2)! = 3·2·1 / 1 = 6

Combination:

The literal meaning of combination is selection, in which the order of the items does not matter.

A, B, C: Combinations of size 2 are AB, BC, CA

nCr = n! / ((n − r)! r!); where n! = n(n − 1)(n − 2) …… 3·2·1

3C2 = 3! / ((3 − 2)! 2!) = 3·2·1 / (1·2·1) = 3
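
The difference between arranging and selecting can be seen by listing both for the letters A, B, C. This is a small Python sketch for illustration; it uses the standard-library functions math.perm and math.comb (available from Python 3.8), and the variable names are assumptions introduced here.

```python
import math
from itertools import combinations, permutations

letters = ["A", "B", "C"]

# Arrangements of 2 letters out of 3: order matters
arrangements = ["".join(p) for p in permutations(letters, 2)]
# Selections of 2 letters out of 3: order does not matter
selections = ["".join(c) for c in combinations(letters, 2)]

n_p_r = math.perm(3, 2)   # n! / (n - r)!
n_c_r = math.comb(3, 2)   # n! / ((n - r)! r!)

print(arrangements, n_p_r)  # ['AB', 'AC', 'BA', 'BC', 'CA', 'CB'] 6
print(selections, n_c_r)    # ['AB', 'AC', 'BC'] 3
```

The selection printed as AC is the same selection as CA, because a combination ignores order.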

Playing Card System (52 cards)

4 suits – Diamond, Hearts, Spade and Club

Each suit – 13 cards: 2, 3, 4, 5, 6, 7, 8, 9, 10 (9 number cards) and A, J, Q, K (4 face cards).
The pack therefore has 36 number cards and 16 face cards.

Diamond, Hearts – Red cards

Spade and Club – Black cards

5. If 2 cards are selected from a pack of 52 cards, then find the probability of a) a spade and
a heart card; b) a number and a face card; c) both the cards being black.

Solution:
Selection of 2 cards from a pack of 52 cards can be done in n = 52C2 ways.

a) Favourable cases of getting a spade and a heart card: m = 13C1 × 13C1

p(e) = m/n = (13C1 × 13C1) / 52C2

b) Favourable cases of getting a number and a face card: m = 36C1 × 16C1

p(e) = m/n = (36C1 × 16C1) / 52C2

c) Favourable cases of both the cards being black: m = 26C2

p(e) = m/n = 26C2 / 52C2

6. If 4 cards are selected from a pack of 52 cards, then find the probability that a) there is
one card from each suit; b) two of the selected cards are diamonds; c) at least one card is a
spade; d) all are number cards.

Solution:

Selection of 4 cards from a pack of 52 cards can be done in n = 52C4 ways.

a) Favourable cases of getting one card from each suit: m = 13C1 × 13C1 × 13C1 × 13C1

p(e) = m/n = (13C1 × 13C1 × 13C1 × 13C1) / 52C4

b) Favourable cases of two of the selected cards being diamonds: m = 13C2 × 39C2

p(e) = m/n = (13C2 × 39C2) / 52C4

c) Favourable cases of at least one card being a spade:

m = (13C1 × 39C3) + (13C2 × 39C2) + (13C3 × 39C1) + 13C4

p(e) = m/n = [(13C1 × 39C3) + (13C2 × 39C2) + (13C3 × 39C1) + 13C4] / 52C4

d) Favourable cases of all being number cards: m = 36C4

p(e) = m/n = 36C4 / 52C4
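
The expressions in this example are left in nCr form; they can be evaluated with a few lines of Python. The sketch below is an illustration only, using the standard-library math.comb; the variable names are assumptions introduced here. It also checks part c) through the complement, 1 − P(no spade), which is a shorter route than summing four terms.

```python
from fractions import Fraction
from math import comb

n = comb(52, 4)  # ways to choose 4 cards from 52

# a) one card from each of the four suits
p_one_per_suit = Fraction(comb(13, 1) ** 4, n)

# b) exactly two diamonds and two non-diamonds
p_two_diamonds = Fraction(comb(13, 2) * comb(39, 2), n)

# c) at least one spade: sum over 1, 2, 3 or 4 spades ...
p_spade_sum = Fraction(sum(comb(13, k) * comb(39, 4 - k) for k in range(1, 5)), n)
# ... or the complement of "no spade at all"
p_spade_complement = 1 - Fraction(comb(39, 4), n)

# d) all four are number cards (36 number cards, the ace counted with the face cards)
p_all_number = Fraction(comb(36, 4), n)

print(p_spade_sum == p_spade_complement)  # True
```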

7. A basket contains 3 white, 4 blue and 5 red balls. If 3 balls are selected at random, then
find the probability that

a) there is one ball of each colour;
b) two of the drawn balls are white;
c) all are red balls;
d) at least one is a blue ball;
e) none is a white ball.

Solution:

3 balls can be selected from (3w + 4b + 5r = 12) balls in n = 12C3 ways.

a) The favourable cases for one ball of each colour: m = 3C1 · 4C1 · 5C1

p(e) = m/n = (3C1 · 4C1 · 5C1) / 12C3

b) The favourable cases for two of the drawn balls being white: m = 3C2 · 9C1

p(e) = m/n = (3C2 · 9C1) / 12C3

c) The favourable cases for all being red balls: m = 5C3

p(e) = m/n = 5C3 / 12C3

d) The favourable cases for at least one being a blue ball: m = 4C1 · 8C2 + 4C2 · 8C1 + 4C3

p(e) = m/n = (4C1 · 8C2 + 4C2 · 8C1 + 4C3) / 12C3, or

p(at least one blue ball) = 1 − p(no blue ball) = 1 − (8C3 / 12C3)

e) The favourable cases for none being a white ball: m = 9C3

p(e) = m/n = 9C3 / 12C3
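
Because only 12C3 = 220 selections are possible, the nCr formulas in this example can be cross-checked by listing every selection and counting. The Python sketch below does this by brute force; the labels W, B, R and the helper prob are assumptions introduced here for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# Label the 12 balls by colour: 3 white, 4 blue, 5 red
balls = ["W"] * 3 + ["B"] * 4 + ["R"] * 5
draws = list(combinations(range(12), 3))   # every way to pick 3 of the 12 balls

def prob(event):
    """m/n, where `event` receives the colours of the three drawn balls."""
    m = sum(1 for draw in draws if event([balls[i] for i in draw]))
    return Fraction(m, len(draws))

p_each_colour = prob(lambda c: set(c) == {"W", "B", "R"})
p_two_white = prob(lambda c: c.count("W") == 2)
p_all_red = prob(lambda c: c.count("R") == 3)
p_at_least_one_blue = prob(lambda c: "B" in c)
p_no_white = prob(lambda c: "W" not in c)

# Counting agrees with the formula for part a): 3C1 . 4C1 . 5C1 / 12C3
print(p_each_colour == Fraction(comb(3, 1) * comb(4, 1) * comb(5, 1), comb(12, 3)))  # True
```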

8. Find the probability of getting 53 Sundays in a randomly selected leap year.

Solution:

52 weeks × 7 days = 364 days.

364 days definitely contain 52 Sundays.

It is required to find the probability of a 53rd Sunday in the remaining 366 − 364 = 2 days.

The exhaustive cases for the remaining two days are

n = {(sun, mon), (mon, tue), (tue, wed), (wed, thu), (thu, fri), (fri, sat), (sat, sun)} = 7

The favourable cases of a Sunday are m = {(sun, mon), (sat, sun)} = 2

p(e) = m/n = 2/7
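
The seven possible pairs of extra days can be generated rather than written out by hand. This is a minimal Python sketch, assuming as the solution does that each of the seven pairs is equally likely; the names DAYS and extra_day_pairs are assumptions introduced here.

```python
from fractions import Fraction

DAYS = ["sun", "mon", "tue", "wed", "thu", "fri", "sat"]

# The two extra days are consecutive, so there is one pair per starting day
extra_day_pairs = [(DAYS[i], DAYS[(i + 1) % 7]) for i in range(7)]
favourable = [pair for pair in extra_day_pairs if "sun" in pair]

p_53_sundays = Fraction(len(favourable), len(extra_day_pairs))
print(favourable, p_53_sundays)  # [('sun', 'mon'), ('sat', 'sun')] 2/7
```

For a non-leap year only one extra day remains, so the same reasoning gives seven single days with one favourable case.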

1.1.1.5 Addition theorem of Probability

● For Two Events: If A and B are any two events, then the probability of happening of at
least one of the events is given by

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) (for any A, B, whether or not they are mutually exclusive)

P(A ∪ B) = P(A) + P(B) (if A, B are mutually exclusive, so that P(A ∩ B) = 0)

● For Three Events: If A, B and C are any three events, then the probability of happening
of at least one of the events is given by

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(B ∩ C) − P(C ∩ A) + P(A ∩ B ∩ C)

(for any A, B, C, whether or not they are mutually exclusive)

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) (if A, B, C are mutually exclusive)

1. A card is drawn at random from a pack of 52 cards. Find the probability that the drawn
card is either a spade or a king.

Solution:

Let A : Event of drawing a card of spade and B : Event of drawing a king card

The probability of drawing a spade is P(A) = 13/52

The probability of drawing a king is P(B) = 4/52

Because one of the kings is also a spade, these events are not mutually exclusive. The
probability of drawing the king of spades is P(A ∩ B) = 1/52

So, the probability of drawing a spade or a king is:

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 13/52 + 4/52 − 1/52 = 16/52 = 4/13

2. A herd contains 30 cows numbered from 1 to 30. One cow is selected at random. Find
the probability that number of the selected cow is a multiple of 5 or 8.

Solution:

Let A be the event of number being a multiple of 5 within 30 and B be the event of number
being a multiple of 8 within 30.

Favourable cases for event A are {5, 10, 15, 20, 25, 30} = 6

Similarly favourable cases for event B are {8, 16, 24} = 3

The probability of the number being a multiple of 5 within 30 is P(A) = 6/30

The probability of the number being a multiple of 8 within 30 is P(B) = 3/30

Since A and B are mutually exclusive, the probability that the number of the cow is a multiple of
5 or 8 is:

P(A ∪ B) = P(A) + P(B) = 6/30 + 3/30 = 9/30 = 3/10

3. A card is drawn at random from a pack of 52 cards. Find the probability that the drawn
card is either a club or an ace of diamond.

Solution:

Let A: Event of drawing a club; and B: Event of drawing the ace of diamonds

The probability of drawing a club is P(A) = 13/52

The probability of drawing the ace of diamonds is P(B) = 1/52

Since the events are mutually exclusive, the probability of the drawn card being a club or the ace
of diamonds is:

P(A ∪ B) = P(A) + P(B) = 13/52 + 1/52 = 14/52 = 7/26

4. A herd contains 30 cows numbered from 1 to 30. One cow is selected at random. Find the
probability that the number of the selected cow is a multiple of 5 or 6.

Solution:

Let A be the event of number being a multiple of 5 within 30 and B be the event of number being
a multiple of 6 within 30.

Favourable cases for event A are {5, 10, 15, 20, 25, 30} = 6

Similarly favourable cases for event B are {6, 12, 18, 24, 30} = 5

The probability of the number being a multiple of 5 within 30 is P(A) = 6/30

The probability of the number being a multiple of 6 within 30 is P(B) = 5/30

Since 30 is a multiple of 5 as well as 6, the events are not mutually exclusive:

P(A ∩ B) = P(A and B) = 1/30

The probability that the number of the selected cow is a multiple of 5 or 6 is:

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 6/30 + 5/30 − 1/30 = 10/30 = 1/3
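
The two herd examples differ only in whether the events overlap, which is easy to see when the events are written as sets. The Python sketch below is illustrative; the names herd, A, B6, B8 and p are assumptions introduced here, and each cow is assumed equally likely to be selected.

```python
from fractions import Fraction

herd = set(range(1, 31))                    # cows numbered 1 to 30

def p(event):
    """Classical probability of an event given as a set of cow numbers."""
    return Fraction(len(event), len(herd))

A = {k for k in herd if k % 5 == 0}         # multiples of 5
B6 = {k for k in herd if k % 6 == 0}        # multiples of 6
B8 = {k for k in herd if k % 8 == 0}        # multiples of 8

# No number up to 30 is a multiple of both 5 and 8: mutually exclusive events
overlap_5_8 = A & B8
p_5_or_8 = p(A) + p(B8)

# 30 is a multiple of both 5 and 6: subtract the overlap once
overlap_5_6 = A & B6
p_5_or_6 = p(A) + p(B6) - p(A & B6)

print(p_5_or_8, p_5_or_6)  # 3/10 1/3
```

In both cases the result equals p(A | B), the probability of the union counted directly.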

5. A herd contains 30 cows numbered from 1 to 30. One cow is selected at random. Find the
probability that the number of the selected cow is a multiple of 5 or 6 or 8.

Solution:

Let A - Multiple of 5, {5, 10, 15, 20, 25, 30}, 6 cases; B - Multiple of 6, {6, 12, 18, 24, 30}, 5 cases;

C - Multiple of 8, {8, 16, 24}, 3 cases

P(A) = 6/30

P(B) = 5/30

P(C) = 3/30

P(A ∩ B) = 1/30

P(B ∩ C) = 1/30

P(A ∩ C) = 0/30 = 0

P(A ∩ B ∩ C) = 0/30 = 0

The probability that the selected number is a multiple of 5 or 6 or 8 is

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(B ∩ C) − P(C ∩ A) + P(A ∩ B ∩ C)

= 6/30 + 5/30 + 3/30 − 1/30 − 1/30 − 0 + 0

= 12/30 = 2/5
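
The addition theorem used in the two cow problems can be checked by simply listing the favourable numbers. The following is a minimal Python sketch under that assumption; the helper name probability and the variable names are illustrative only and do not come from the course material.

```python
from fractions import Fraction


def probability(event, sample_space):
    """Classical probability: favourable cases / total equally likely cases."""
    return Fraction(len(event & sample_space), len(sample_space))


cows = set(range(1, 31))                  # cows numbered 1 to 30
A = {n for n in cows if n % 5 == 0}       # multiples of 5
B = {n for n in cows if n % 6 == 0}       # multiples of 6
C = {n for n in cows if n % 8 == 0}       # multiples of 8

# Two events: multiple of 5 or 6, counted directly from the union
p_two = probability(A | B, cows)

# Three events: direct count of the union
p_three_direct = probability(A | B | C, cows)

# Three events: addition theorem (inclusion-exclusion)
p_three_formula = (
    probability(A, cows) + probability(B, cows) + probability(C, cows)
    - probability(A & B, cows) - probability(B & C, cows) - probability(C & A, cows)
    + probability(A & B & C, cows)
)
```

Counting the union directly and applying the formula should always agree; a mismatch usually means an intersection was miscounted.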

6. If a card is selected at random from a pack of 52 cards, then find the probability that the
selected card is a Red or a Heart or a Jack card.

Solution:

Let A - Red card, P(A) = 26/52

Let B - Heart card, P(B) = 13/52

Let C - Jack card, P(C) = 4/52

A ∩ B - Red and Heart, P(A ∩ B) = 13/52

B ∩ C - Heart and Jack, P(B ∩ C) = 1/52

A ∩ C - Red and Jack, P(A ∩ C) = 2/52

A ∩ B ∩ C - Red and Heart and Jack, P(A ∩ B ∩ C) = 1/52

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(B ∩ C) − P(C ∩ A) + P(A ∩ B ∩ C)

= 26/52 + 13/52 + 4/52 − 13/52 − 1/52 − 2/52 + 1/52

= 28/52 = 7/13

Self-Assessment Questions

4. The number of approaches to probability is__________.

A) 3
B) 2
C) 4
D) 5

5. ∪ stands for the operation________.

A) And
B) Or
C) Not
D) No

6. ∩ stands for the operation____________.

A) No
B) Or
C) Not
D) And

1.1.2 Conditional Probability and Independence of Events

1.1.2.1 Conditional Probability

If A and B are any two events in the sample space S, then the probability of event A
happening, given that event B is known to have happened, is denoted by P(A / B) and
read as "A given B":

P(A / B) = P(A ∩ B) / P(B), where P(B) > 0

Similarly, P(B / A) = P(A ∩ B) / P(A), where P(A) > 0
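
The definition can be tried out on a small sample space. The sketch below is illustrative only: the die experiment and the function names are assumptions chosen for the example, not part of the course text.

```python
from fractions import Fraction


def probability(event, sample_space):
    """Classical probability of an event among equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))


def conditional_probability(a, b, sample_space):
    """P(A / B) = P(A ∩ B) / P(B), defined only when P(B) > 0."""
    p_b = probability(b, sample_space)
    if p_b == 0:
        raise ValueError("P(B) must be greater than 0")
    return probability(a & b, sample_space) / p_b


# Illustrative experiment (an assumption): one roll of a fair die
die = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}              # event A
greater_than_3 = {4, 5, 6}    # event B

p_a_given_b = conditional_probability(even, greater_than_3, die)
p_b_given_a = conditional_probability(greater_than_3, even, die)
```

Here P(A ∩ B) = 2/6 and P(B) = 3/6, so P(A / B) = 2/3. Knowing that the roll exceeded 3 raises the chance of an even face from 1/2 to 2/3.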

1.1.2.2 Multiplication Theorem

For two events:

If A and B are any two events in the sample space S, then the probability of both
event A and event B happening is given by

P(A ∩ B) = P(A) P(B / A); where P(A) > 0

= P(B) P(A / B); where P(B) > 0

For three events:

If A, B and C are any three events in the sample space S, then the probability of
events A, B and C all happening is given by

P(A ∩ B ∩ C) = P(A) P(B / A) P(C / A ∩ B)

Note: If A, B and C are independent, this reduces to P(A ∩ B ∩ C) = P(A) P(B) P(C)

Independence of events:

Two events A and B are said to be independent if P(A ∩ B) = P(A) P(B)

Three events A, B and C are said to be independent if P(A ∩ B ∩ C) = P(A) P(B) P(C)
and, in addition, each pair of the events is independent.

Examples on conditional Probability and Multiplication theorem:

1. Box A contains 5 red and 3 white balls and Box B contains 2 red and 6 white balls. If
a ball is drawn from each box, what is the probability that they are both of the same colour?

Solution:

Each box holds 8 balls, and one ball is drawn from each box, so the two draws are independent.

Let E1 = The ball drawn from Box A is red, P(E1) = 5/8

Let E2 = The ball drawn from Box B is red, P(E2) = 2/8

E1 ∩ E2 = A red ball is drawn from both Box A and Box B,

P(E1 ∩ E2) = P(E1) × P(E2) = 5/8 × 2/8 = 10/64 = 5/32

Let E3 = The ball drawn from Box A is white, P(E3) = 3/8

Let E4 = The ball drawn from Box B is white, P(E4) = 6/8

E3 ∩ E4 = A white ball is drawn from both Box A and Box B,

P(E3 ∩ E4) = P(E3) × P(E4) = 3/8 × 6/8 = 18/64 = 9/32

The probability that both balls are of the same colour is

P(E1 ∩ E2) + P(E3 ∩ E4) = 5/32 + 9/32 = 14/32 = 7/16 = 0.4375
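
The two-box problem can be cross-checked with exact fractions. This sketch assumes that exactly one ball is drawn from each box (so no random choice of box is involved) and that the two draws are independent; the dictionary layout and function name are illustrative.

```python
from fractions import Fraction

# Contents of the two boxes as given in the problem
box_a = {"red": 5, "white": 3}
box_b = {"red": 2, "white": 6}


def colour_probability(box, colour):
    """Probability of drawing one ball of the given colour from a box."""
    return Fraction(box[colour], sum(box.values()))


# Multiplication theorem for independent draws, summed over the two colours
p_both_red = colour_probability(box_a, "red") * colour_probability(box_b, "red")
p_both_white = colour_probability(box_a, "white") * colour_probability(box_b, "white")
p_same_colour = p_both_red + p_both_white
```

Under these assumptions the result is 5/32 + 9/32 = 7/16.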

2. The probabilities that 3 students solve a problem in statistics are 1/2, 1/3 and 1/4 respectively.
Find the probability that the problem will be solved.

Solution:

Let A denote the event that student 1 solves the problem; then P(A) = 1/2

Let B denote the event that student 2 solves the problem; then P(B) = 1/3

Let C denote the event that student 3 solves the problem; then P(C) = 1/4

Assuming the students work independently, the probability that the problem will be solved
(by at least one of them) is

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(B ∩ C) − P(C ∩ A) + P(A ∩ B ∩ C)

= P(A) + P(B) + P(C) − P(A) P(B) − P(B) P(C) − P(C) P(A) + P(A) P(B) P(C)

= 1/2 + 1/3 + 1/4 − (1/2)(1/3) − (1/3)(1/4) − (1/4)(1/2) + (1/2)(1/3)(1/4)

= 3/4
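
The same answer can be reached through the complement: the problem stays unsolved only if every student fails. The sketch below takes that route, assuming the three students work independently; it is offered as a cross-check rather than as the method of the text.

```python
from fractions import Fraction
from math import prod

# Probabilities that each student solves the problem
solve_probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]

# Under independence, P(nobody solves) is the product of the failure probabilities
p_nobody = prod(1 - p for p in solve_probs)

# Complement rule: P(at least one solves) = 1 - P(nobody solves)
p_solved = 1 - p_nobody
```

With three events the complement needs one product instead of seven inclusion-exclusion terms, and the saving grows with the number of events.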

3. The probability that a communication system has high fidelity is 0.81, and the probability
that it has both high fidelity and high selectivity is 0.18. What is the probability that a system
has high selectivity, given that it has high fidelity?

Solution:

Let A denote the event that the system has high fidelity; we have P(A) = 0.81

Let B denote the event that the system has high selectivity; we have P(A ∩ B) = 0.18

The probability that the system has high selectivity, given that it has high fidelity, is

P(B / A) = P(A ∩ B) / P(A)

= 0.18 / 0.81
= 2/9

Self-Assessment Questions

7. P ( A / B ) can be termed as Probability of _______________.

A) A or B
B) A and B
C) A by B
D) A given B

8. P ( A ∪ B ) can be termed as Probability of A ______ B .

A) Intersection
B) Union
C) Given
D) Not

9. P ( A ∩ B ) can be termed as Probability of A ______ B .

A) Intersection
B) Union
C) Given
D) Not

10. If A and B are mutually exclusive, then P ( A ∩ B ) is_____________.

A) 1
B) 2
C) 0
D) 3

11. If A and B are not mutually exclusive, then P ( A ∩ B ) is__________.

A) 1
B) 2
C) 0
D) 0<p<1

1.1.3 Bayes' Theorem
Bayes' theorem is applied to convert prior probabilities into posterior
(conditional) probabilities.

1.1.3.1 Statement of Bayes' theorem

Let A1, A2, …, An be n mutually exclusive events in the sample space S such that P(Ai) > 0
for each i, and let B be another event in the sample space S such that P(B) > 0 and B ⊂ ∪ Ai.
Then the conditional probability of Aj given B is

P(Aj / B) = P(Aj) P(B / Aj) / [ Σ (i = 1 to n) P(Ai) P(B / Ai) ]

Examples on Bayes theorem

1. In a bolt factory, machines A, B and C produce 20%, 30% and 50% of the total output,
and 6%, 3% and 2% of their respective outputs are defective. If a bolt is drawn at
random and found to be defective, find the probability that it was manufactured by i)
Machine A ii) Machine B iii) Machine C iv) Machine A or B.

Solution:

Let A1 – Production by Machine A, P(A1) = 20/100 = 0.2

Let A2 – Production by Machine B, P(A2) = 30/100 = 0.3

Let A3 – Production by Machine C, P(A3) = 50/100 = 0.5

Let B denote the event of producing a defective bolt; then we have

P(B / A1) = 6/100 = 0.06

P(B / A2) = 3/100 = 0.03

P(B / A3) = 2/100 = 0.02

i) The probability that the defective bolt was produced by machine A

P(A1 / B) = P(A1) P(B / A1) / [P(A1) P(B / A1) + P(A2) P(B / A2) + P(A3) P(B / A3)]
= (0.2 × 0.06) / [(0.2 × 0.06) + (0.3 × 0.03) + (0.5 × 0.02)]
= 0.012 / 0.031
= 0.387
= 38.7%

ii) The probability that the defective bolt was produced by machine B

P(A2 / B) = P(A2) P(B / A2) / [P(A1) P(B / A1) + P(A2) P(B / A2) + P(A3) P(B / A3)]
= (0.3 × 0.03) / [(0.2 × 0.06) + (0.3 × 0.03) + (0.5 × 0.02)]
= 0.009 / 0.031
= 0.290
= 29.0%

iii) The probability that the defective bolt was produced by machine C

P(A3 / B) = P(A3) P(B / A3) / [P(A1) P(B / A1) + P(A2) P(B / A2) + P(A3) P(B / A3)]
= (0.5 × 0.02) / [(0.2 × 0.06) + (0.3 × 0.03) + (0.5 × 0.02)]
= 0.010 / 0.031
= 0.323
= 32.3%

iv) The probability that the defective bolt was produced by machine A or B

P((A1 ∪ A2) / B) = P(A1 / B) + P(A2 / B)
= 0.387 + 0.290
= 0.677
= 67.7%

v) The probability that the defective bolt was produced by machine A or B or C

P((A1 ∪ A2 ∪ A3) / B) = P(A1 / B) + P(A2 / B) + P(A3 / B)
= 0.387 + 0.290 + 0.323 = 1 = 100%
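
The bolt-factory calculation follows one fixed pattern: multiply each prior by its likelihood, then divide by the total. A minimal Python sketch of that pattern is given below; the function name posteriors is an illustrative choice, and exact fractions are used so that no rounding enters the result.

```python
from fractions import Fraction


def posteriors(priors, likelihoods):
    """Bayes' theorem: P(Aj / B) for every j, from P(Aj) and P(B / Aj)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]   # P(Aj) P(B / Aj)
    total = sum(joint)                                     # P(B)
    return [j / total for j in joint]


# Shares of output for machines A, B, C and their defective rates
priors = [Fraction(20, 100), Fraction(30, 100), Fraction(50, 100)]
likelihoods = [Fraction(6, 100), Fraction(3, 100), Fraction(2, 100)]

post_a, post_b, post_c = posteriors(priors, likelihoods)
post_a_or_b = post_a + post_b
```

The exact posteriors are 12/31, 9/31 and 10/31 (about 0.387, 0.290 and 0.323), and they add up to 1 because the defective bolt must have come from one of the three machines.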

2. Of all staff in a university, 40% are women and 60% are men; 3% of the women and 5% of
the men smoke cigarettes. If an employee is selected at random and found to be a smoker,
find the probability that the employee is a) a man; b) a woman; c) a man or
a woman.

Solution:

Let A1 denote women employees; then

P(A1) = 40/100 = 0.4

Let A2 denote men employees; then

P(A2) = 60/100 = 0.6

Let B denote the habit of smoking

Then we have

P(B / A1) = 3/100 = 0.03

P(B / A2) = 5/100 = 0.05

a) The probability that the smoker is a man

P(A2 / B) = P(A2) P(B / A2) / [P(A1) P(B / A1) + P(A2) P(B / A2)]

= (0.6 × 0.05) / [(0.4 × 0.03) + (0.6 × 0.05)]

= 0.030 / 0.042
= 0.714
= 71.4%

b) The probability that the smoker is a woman

P(A1 / B) = P(A1) P(B / A1) / [P(A1) P(B / A1) + P(A2) P(B / A2)]

= (0.4 × 0.03) / [(0.4 × 0.03) + (0.6 × 0.05)]

= 0.012 / 0.042
= 0.286
= 28.6%

c) The probability that the smoker is a man or a woman

P((A1 ∪ A2) / B) = P(A1 / B) + P(A2 / B)

= 0.286 + 0.714 = 1

Self-Assessment Questions

12. Bayes' theorem is used to convert prior probabilities into _____________ probabilities.

A) Posterior
B) Union
C) Intersection
D) Cost

13. P ( A ) , P ( B ) , P ( A ∪ B ) , …. are called ____________ probabilities.

A) Intersection
B) Prior
C) Given
D) Posterior

14. P ( A / B ) , P ( B / A ) , …. are called ____________ probabilities.

A) Intersection
B) Prior
C) Given
D) Posterior

15. The sum of the probabilities of all outcomes in a sample space is equal to__________.

A) 1
B) 2
C) 0
D) 3

16. If P ( A ) = 0 then A is called _____________ event.

A) Even chance
B) Impossible
C) Sure
D) Certain

Summary

• Probability is used to analyse uncertainty.
• Various uncertain situations can be analysed with the help of probability.
• Tossing coins, rolling dice, and drawing playing cards are the most commonly used
experiments in probability.
• This unit provides the basic tools for analysing uncertainty.

Terminal Questions

1. Write a short note on the mathematical definition of probability.
2. Write a short note on the statistical definition of probability.
3. Write a short note on the subjective definition of probability.
4. If a coin is tossed twice, find the probability of getting a) at least one
head; b) no heads.
5. Find the probability of 53 Sundays in a randomly selected leap year.
6. A basket contains 3 red, 4 white and 5 blue balls. Find the probability of getting
i) one ball of each colour; ii) all blue balls; iii) 2 white balls.
7. If a die is thrown, find the probability of getting i) an odd face; ii) a multiple of 2.
8. If 4 cards are selected from a pack of 52 cards, find the probability of getting
i) a king, a queen, a jack and an ace; ii) all diamond cards; iii) three red cards
among the selected cards.
9. In a bolt factory, machines A, B and C produce 20%, 30% and 50% of the total
output, and 6%, 3% and 2% of their respective outputs are defective. If a bolt
is drawn at random and found to be defective, find the probability that it was
manufactured by i) machine A; ii) machine B; iii) machine C.

Answer Keys

Self-Assessment Questions

Question No    Answers
1              C
2              A
3              B
4              C
5              B
6              D
7              D
8              B
9              A
10             C
11             D
12             A
13             B
14             D
15             A
16             B

Glossary
• Conditional probability: Defined as the likelihood of an event or outcome
occurring, based on the occurrence of a previous event or outcome.
• Probability: A probability is a number that reflects the chance or likelihood that
a particular event will occur.
• Event: An outcome, or a set of outcomes, of an experiment.

BIBLIOGRAPHY
1. Chung, K. L. (2000). A Course in Probability Theory (3rd ed.). San Diego, CA:
Academic Press.
2. Feller, W. (1968). An Introduction to Probability Theory and Its Applications: Volume I.
John Wiley & Sons.
3. Gupta, S. C., & Kapoor, V. K. Fundamentals of Mathematical Statistics. S. Chand
Publications.

External Resources
1. Rukmangadachari, E. Probability and Statistics. Pearson Education.
2. Rohatgi, V. K., & Ehsanes Saleh, A. K. (2015). An introduction to probability and
statistics (3rd ed.). doi:10.1002/9781118799635

e-References
• Bayes Theorem: https://www.cuemath.com/data/bayes-theorem/

• Rules of Probability: https://stattrek.com/probability/probability-rules

Image Credits

Fig. 1: Representation of Chance
https://www.advancedhighermaths.co.uk/probability-2/

Video Links

Topic                                                                        Link
Probability                                                                  https://www.youtube.com/watch?v=lZSL7Tm5ViA
Introduction to Probability, Basic Overview - Sample Space, & Tree Diagrams  https://www.youtube.com/watch?v=SkidyDQuupA
Probability Equation Questions                                               https://www.youtube.com/watch?v=76-cMKLmWJY

Keywords

• Random experiment
• Sample space
• Equally likely events
• Bayes theorem

QUANTITATIVE METHODS

Module - 1
Unit - 2

PROBABILITY DISTRIBUTIONS

Unit Table of Contents
Unit 1.2 Probability Distributions

Aim
Instructional Objectives
Learning Outcomes
Introduction
1.2.1 Random Variable
Self-Assessment Questions
1.2.2 Binomial Distribution
Self-Assessment Questions
1.2.3 Poisson Distribution
Self-Assessment Questions
1.2.4 Normal Distribution
Self-Assessment Questions
Summary
Terminal Questions
Answer Keys
Glossary
Bibliography
External Resources
e-References
Image Credits
Video Links
Keywords

Aim
To explain basic concepts of probability distributions, random variables, and their
types.

Instructional Objectives
This unit intends to:
● Discuss the basic concepts of probability distributions
● Distinguish between continuous and discrete random variables
● Describe the properties of the Binomial Distribution, Poisson Distribution, and
Normal Distribution

Learning Outcomes
At the end of this unit, you are expected to:
● Distinguish between various types of random variables
● Calculate probabilities of random variables
● Identify continuous random variable and discrete random variable, and their
properties
● Apply probability distributions in marketing, HRM, finance, and other
functional areas

Introduction
The concept of a random variable is an extension of the concept of probability. The probability
distribution of a random variable describes how the probability is distributed over the values
of the random variable. The Binomial, Poisson, and Normal Distributions have a number of
applications in management.

1.2.1 Random Variable


A random variable is a real-valued function, which assigns a real number to every element in
the sample space.

X : S → R

For example, suppose we flip a fair coin three times and record whether it shows heads or tails
each time.

The sample space is S = {HHH, HHT, HTH, THH, TTT, TTH, THT, HTT}. There are
eight possible outcomes, and each outcome is equally likely. Now, suppose we flip a fair coin
four times. How many possible outcomes are there? There are 2^4 = 16. How about 8 times?
2^8 = 256 possible outcomes! Instead of considering all possible outcomes, we can define the
variable X to be, say, the number of heads in n flips of a fair coin. If we flip the coin n = 3
times (as above), then X can take on the possible values 0, 1, 2, or 3. By defining the variable
X in this way, we have created a random variable.

E.g.: A coin is tossed twice, and the number of heads obtained is a random variable.

S = {HH, HT, TH, TT}
R = {2, 1, 0} (the values the random variable can take)

1.2.1.1 Types of Random variable

Random variables are classified as discrete and continuous variables. The main difference
between them is the type of possible values each variable can take. Let us understand these two
variables in detail below.

a. Discrete Random variable


It is a random variable which assumes only a finite or countable number of values.
(Random variables in general are also known as stochastic variables.) The values of a
discrete random variable are typically counts, that is, whole numbers.

E.g.: X = 6; Y = 1, 2, 3, 4, 5.

b. Continuous Random variable
It is a random variable which can assume any of the infinitely many values within an
interval. Simply put, a random variable is called continuous if its possible values
make up an entire interval of numbers.

E.g.: X ∈ [2, 3], Y ∈ (5, 6)

1.2.1.2 Probability Distribution

It is the distribution of the probability among the various sample points in the sample space.

E.g.: A coin is tossed two times, and the number of heads obtained is a random variable X.

S = {HH, HT, TH, TT}

R = {2, 1, 0}

P = {1/4, 2/4, 1/4} is the probability distribution, i.e., P(X = 2) = 1/4, P(X = 1) = 2/4, P(X = 0) = 1/4
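
A probability distribution of this kind can be built mechanically: list the sample space, apply the random variable to each outcome, and count. The sketch below does this for the number of heads in a given number of tosses of a fair coin; the function name heads_distribution is an illustrative choice.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_tosses):
    """Probability distribution of X = number of heads in n tosses of a fair coin."""
    outcomes = list(product("HT", repeat=n_tosses))            # sample space S
    counts = Counter(outcome.count("H") for outcome in outcomes)
    return {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}


dist_two = heads_distribution(2)     # two tosses: S = {HH, HT, TH, TT}
dist_three = heads_distribution(3)   # three tosses: 8 equally likely outcomes
```

For two tosses the result assigns 1/4, 1/2 and 1/4 to X = 0, 1 and 2, and in every case the probabilities are non-negative and sum to 1, as a probability distribution requires.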

1.2.1.3 Conditions for Probability Distribution

1. P ( Xi ) ≥ 0

2. The sum of all probabilities is equal to 1

1.2.1.4 Discrete Probability Distribution

Discrete Probability Distribution is a distribution which assumes a finite or countable number of
values.

1. Binomial Distribution
2. Poisson Distribution

1.2.1.5 Continuous Probability Distribution

Continuous Probability Distribution is a distribution which assumes an infinite number of values
within certain intervals.

1. Normal Distribution

Self-Assessment Questions

1. Random variable is a _______ valued function.

A). Nominal
B). Real
C). Ordinal
D). Case

2. The number of types of random variables is_______.

A). 2
B). 3
C). 4
D). 1

3. The sum of all probabilities is always equal to ________.

A). 2
B). 3
C). 4
D). 1

DID YOU KNOW? The Binomial Distribution (BD) was discovered by James Bernoulli in 1700
as an extension to the Bernoulli Distribution.

1.2.2 Binomial Distribution
An experiment is said to follow the BD under the following conditions.

1. The experiment is repeated a finite number of times, say 'n' times, and the trials are independent.

2. Each trial has only two outcomes (Success (s) and Failure (f)).

3. p(s) = p and p(f) = q, where p + q = 1, and p is the same in every trial.

E.g.: Finding the probability of getting 3 heads when a coin is tossed 10 times.

1.2.2.1 Probability Mass Function (PMF)

If a trial is repeated n times, the probability of getting x successes is given by
the following Probability Mass Function:

P(X = x) = nCx p^x q^(n − x) ; x = 0, 1, 2, 3, …, n
= 0 ; otherwise

where n and p are called the parameters of the B.D

1.2.2.2 Properties of Binomial Distribution

• The mean of Binomial Distribution, E ( x ) = np

• The Variance of Binomial Distribution, V ( x ) = npq
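
The PMF and the two properties above can be checked numerically. The following sketch uses the standard library function math.comb for nCx; the function name binomial_pmf and the choice n = 10, p = 0.5 (ten tosses of a fair coin) are illustrative assumptions.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) = nCx p^x q^(n - x) for x = 0..n, and 0 otherwise."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.5                      # ten tosses of a fair coin
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance computed directly from the distribution
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

p_three_heads = binomial_pmf(3, n, p)   # probability of exactly 3 heads in 10 tosses
```

For this choice the computed mean and variance come out as np = 5 and npq = 2.5, matching the two properties listed above.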

1.2.2.3 Examples of Binomial Distribution

1. A fair coin is tossed 6 times. Find the probability of getting 4 heads.

Answer:

Given that the number of trials, n = 6

p (The probability of success (H)) = 1/2

q (The probability of failure (T)) = 1 − 1/2 = 1/2

The PMF of B.D is

P(X = x) = nCx p^x q^(n − x) ; x = 0, 1, 2, 3, …, n

P(X = x) = 6Cx (1/2)^x (1/2)^(6 − x)

P(X = x) = 6Cx (1/2)^6

Now the probability of getting 4 heads (successes) is

P(X = 4) = 6C4 (1/2)^6 = 15/64

= 0.234

2. A die is rolled 10 times. What is the chance of getting the face 1 exactly 2 times?

Answer:

n = 10

Success: face 1, p = 1/6

Failure: face 2, 3, 4, 5 or 6, q = 1 − p = 5/6

P(X = 2) = 10C2 (1/6)^2 (5/6)^(10 − 2) = 45 × (1/36) × 0.2326

= 0.29

3. A discrete variable X has mean 6 and variance 2. If it is assumed that the distribution is
binomial, find the PMF of the B.D and also P(5 ≤ X ≤ 7).

Answer:

Given that X follows a B.D with mean 6 and variance 2

That is, mean, np = 6 ----- equation 1

Variance, npq = 2 ----- equation 2

Substitute 1 in 2

→ 6q = 2 → q = 2/6 = 1/3

→ p = 1 − q = 1 − 1/3 = 2/3

From 1 → np = 6 → n(2/3) = 6 → n = 6(3/2) = 9

n = 9, p = 2/3, q = 1/3

Therefore, the PMF of B.D is P(X = x) = nCx p^x q^(n − x)

P(X = x) = 9Cx (2/3)^x (1/3)^(9 − x) ; x = 0, 1, …, 9 is the required PMF of B.D

P(5 ≤ X ≤ 7) = P(X = 5) + P(X = 6) + P(X = 7)

= 9C5 (2/3)^5 (1/3)^(9 − 5) + 9C6 (2/3)^6 (1/3)^(9 − 6) + 9C7 (2/3)^7 (1/3)^(9 − 7)

= (126 × 32 + 84 × 64 + 36 × 128) / 3^9

= (4032 + 5376 + 4608) / 19683

= 0.205 + 0.273 + 0.234 = 0.712
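
The step of recovering n and p from the mean and variance can be written as a small routine. This is a sketch with an illustrative function name (binomial_parameters); it assumes the given mean and variance really correspond to a binomial distribution with a whole-number n and raises an error otherwise.

```python
from fractions import Fraction
from math import comb


def binomial_parameters(mean, variance):
    """Solve np = mean and npq = variance for n and p."""
    q = Fraction(variance) / Fraction(mean)    # npq / np = q
    p = 1 - q
    n = Fraction(mean) / p
    if n.denominator != 1:
        raise ValueError("mean and variance do not give a whole number of trials")
    return int(n), p


n, p = binomial_parameters(6, 2)

# P(5 <= X <= 7) as a sum of three PMF terms, kept as an exact fraction
prob_5_to_7 = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(5, 8))
```

With mean 6 and variance 2 this gives n = 9 and p = 2/3, and the interval probability equals 14016/19683, about 0.712.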

4. Four coins are tossed 160 times, and the number of times x heads occur is given below.

x              0    1    2    3    4
No. of times   8    34   69   43   6

Fit a binomial distribution and find the expected frequencies.

Answer:

Given that n = 4, N = 160

p = The probability of success = getting a head

p = 0.5

q = 1 − 0.5 = 0.5

Therefore, the PMF of B.D is P(X = x) = nCx p^x q^(n − x)

P(X = x) = 4Cx (0.5)^x (0.5)^(4 − x)

P(X = x) = 4Cx (0.5)^4

The expected frequencies

X = x   P(X = x)                           E(x) = N P(X = x)
0       P(X = 0) = 4C0 (0.5)^4 = 0.0625    E(0) = 160 (0.0625) = 10
1       P(X = 1) = 4C1 (0.5)^4 = 0.25      E(1) = 160 (0.25) = 40
2       P(X = 2) = 4C2 (0.5)^4 = 0.375     E(2) = 160 (0.375) = 60
3       P(X = 3) = 4C3 (0.5)^4 = 0.25      E(3) = 160 (0.25) = 40
4       P(X = 4) = 4C4 (0.5)^4 = 0.0625    E(4) = 160 (0.0625) = 10

The PMF of the Binomial Distribution is P(X = x) = 4Cx (0.5)^4

The observed (Oi) and expected (Ei) frequencies

x     0    1    2    3    4
Oi    8    34   69   43   6
Ei    10   40   60   40   10
Ei

5. Fit a binomial distribution and obtain the expected frequencies for the following data.

x    0   1    2    3    4    5    6
f    6   28   56   60   36   12   2

Answer:

n = 6, N = Sum of the frequencies = (6 + 28 + … + 2) = 200

The experiment is not known, so calculate the mean of the given data and equate it to the
mean of the B.D

Xbar = sum (x * f) / sum (f)

x       0   1    2     3     4     5    6
f       6   28   56    60    36    12   2
x * f   0   28   112   180   144   60   12

Sum (f) = 200

Sum (x * f) = 536

Xbar = sum (x * f) / sum (f)

= 536/200 = 2.68

Mean of given data = the mean of B.D

2.68 = np → 2.68 = 6p → p = 2.68/6 = 0.4467

q = 1 − p = 1 − 0.4467 = 0.5533

Therefore, the PMF of B.D is

P(X = x) = nCx p^x q^(n − x)

= 6Cx (0.4467)^x (0.5533)^(6 − x) ; x = 0, 1, 2, 3, 4, 5, 6

The expected frequencies

X = x   P(X = x)                                        E(x) = N P(X = x)
0       P(X = 0) = 6C0 (0.4467)^0 (0.5533)^6 = 0.0287   E(0) = 200 (0.0287) = 5.7 ≈ 6
1       P(X = 1) = 6C1 (0.4467)^1 (0.5533)^5 = 0.1390   E(1) = 200 (0.1390) = 27.8 ≈ 28
2       P(X = 2) = 6C2 (0.4467)^2 (0.5533)^4 = 0.2805   E(2) = 200 (0.2805) = 56.1 ≈ 56
3       P(X = 3) = 6C3 (0.4467)^3 (0.5533)^3 = 0.3020   E(3) = 200 (0.3020) = 60.4 ≈ 60
4       P(X = 4) = 6C4 (0.4467)^4 (0.5533)^2 = 0.1828   E(4) = 200 (0.1828) = 36.6 ≈ 37
5       P(X = 5) = 6C5 (0.4467)^5 (0.5533)^1 = 0.0590   E(5) = 200 (0.0590) = 11.8 ≈ 12
6       P(X = 6) = 6C6 (0.4467)^6 (0.5533)^0 = 0.0079   E(6) = 200 (0.0079) = 1.6 ≈ 2

The rounded expected frequencies add up to 201 rather than 200 because of rounding.
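
Fitting a binomial distribution to a frequency table, when p has to be estimated from the data, follows the same steps every time: compute the mean, set p = mean / n, then multiply each PMF value by N. The sketch below assumes the table lists the frequencies for x = 0, 1, …, n in order; the function name fit_binomial is illustrative.

```python
from math import comb


def fit_binomial(observed):
    """Fit a binomial distribution to frequencies observed[x], x = 0..n.

    Returns the estimated p and the (unrounded) expected frequencies.
    """
    n = len(observed) - 1                                   # largest value of x
    total = sum(observed)                                   # N
    mean = sum(x * f for x, f in enumerate(observed)) / total
    p = mean / n                                            # from mean = np
    expected = [
        total * comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)
    ]
    return p, expected


p_hat, expected = fit_binomial([6, 28, 56, 60, 36, 12, 2])
expected_rounded = [round(e) for e in expected]
```

The unrounded expected frequencies always add up to N; the rounded ones may be off by one or two, which is a rounding effect and not an error in the fit.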

Self-Assessment Questions

4. The Binomial Distribution was discovered by _________.

A). James Bernoulli
B). Pearson
C). Fisher
D). Spearman

5. Mean of Binomial Distribution is ______.

A). p
B). np
C). npq
D). pq

6. Variance of Binomial Distribution is _________.

A). p
B). np
C). npq
D). pq

DID YOU KNOW? The Poisson Distribution was formulated by Siméon Denis Poisson as a
limiting case of the Binomial Distribution.

1.2.3 Poisson Distribution
The Binomial Distribution tends to the Poisson Distribution under the following conditions:

• The number of trials is indefinitely large, i.e., n → ∞

• The probability of success is indefinitely small, i.e., p → 0

• The mean of the Binomial Distribution is constant, i.e., np = λ

1.2.3.1 Probability Mass Function (PMF)

The Probability Mass Function (PMF) of Poisson Distribution is

P(X = x) = e^(−λ) λ^x / x! ; x = 0, 1, 2, …
= 0 ; otherwise

where λ is the parameter of the Poisson Distribution and e ≈ 2.718

1.2.3.2 Properties of Poisson Distribution

• Mean of Poisson Distribution, E ( x ) = λ

• Variance of Poisson Distribution, V ( x ) = λ

• Standard Deviation of Poisson Distribution, SD ( x ) = Sqrt ( λ )
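
The limiting relationship between the two distributions can be seen numerically by holding np fixed and letting n grow. The sketch below compares the binomial and Poisson probabilities of exactly 3 successes; the values of n and the fixed λ = 2 are illustrative choices, as are the function names.

```python
from math import comb, exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) = e^(-lam) lam^x / x! for x = 0, 1, 2, ..."""
    return exp(-lam) * lam**x / factorial(x)


def binomial_pmf(x, n, p):
    """P(X = x) = nCx p^x q^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


lam = 2.0
# Keep np = lam fixed while n increases, so p = lam / n shrinks towards 0
gaps = [
    abs(binomial_pmf(3, n, lam / n) - poisson_pmf(3, lam))
    for n in (20, 200, 2000)
]

# Mean of the Poisson distribution, summed over enough terms to be accurate
poisson_mean = sum(x * poisson_pmf(x, lam) for x in range(60))
```

The gap between the two probabilities shrinks as n increases, which is why the Poisson formula is a convenient substitute when n is large and p is small.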

1.2.3.3 Examples of Poisson distribution

1). The probability that an individual suffers a bad reaction from a certain drug is 0.001.
Determine the probability that, out of 2000 individuals,

(i) exactly 3
(ii) more than 2
will suffer a bad reaction.

Answer:

Given that n = 2000

p = 0.001

P(X = x) = e^(−λ) λ^x / x! ; x = 0, 1, 2, …

where λ = np = 2000 (0.001) = 2

i). The probability that exactly 3 individuals suffer a bad reaction is

P(X = 3) = e^(−2) 2^3 / 3!

= (0.13534) 8/6
= 0.18

ii). The probability that more than 2 individuals suffer a bad reaction is

P(X > 2) = P(X = 3) + P(X = 4) + … + P(X = 2000)

= 1 − P(X ≤ 2)

= 1 − [P(X = 0) + P(X = 1) + P(X = 2)]

= 1 − [(e^(−2) 2^0 / 0!) + (e^(−2) 2^1 / 1!) + (e^(−2) 2^2 / 2!)]

= 1 − e^(−2) (1 + 2 + 2)

= 1 − (0.13534) 5

= 1 − 0.6767

= 0.3233
= 32.3%
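
A "more than" probability under the Poisson distribution is most easily obtained through the complement, since only a few terms need to be added. The sketch below reworks the drug-reaction problem this way; the function name poisson_pmf is illustrative.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) = e^(-lam) lam^x / x!"""
    return exp(-lam) * lam**x / factorial(x)


lam = 2000 * 0.001                     # lam = np

p_exactly_3 = poisson_pmf(3, lam)

# P(X > 2) = 1 - [P(X = 0) + P(X = 1) + P(X = 2)]
p_more_than_2 = 1 - sum(poisson_pmf(x, lam) for x in range(3))
```

Carrying e^(−2) at full precision gives about 0.1804 for part (i) and about 0.3233 for part (ii); rounding e^(−2) to three decimals early shifts the last figure slightly.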

2). 2% of the products produced by a machine are defective. What is the probability that, out
of 100 products, there are (i) exactly 3 defectives; (ii) between 3 and 5 defectives (inclusive)?

Answer:

Given that n = 100, p = 0.02

λ = np = 100 (0.02) = 2

i). The probability of 3 defectives

P(X = 3) = e^(−2) 2^3 / 3!

= (0.13534) 8/6
= 0.18

ii). P(3 ≤ X ≤ 5) = P(X = 3) + P(X = 4) + P(X = 5)

= e^(−2) 2^3 / 3! + e^(−2) 2^4 / 4! + e^(−2) 2^5 / 5!

= e^(−2) (1.3333 + 0.6667 + 0.2667)

= (0.13534) (2.2667)
= 0.307
= 30.7%

3). Fit a Poisson distribution for the following data and also calculate the expected frequencies.

x    0     1    2    3   4
f    109   65   22   3   1

Answer:

Since the experiment is unknown, calculate the mean of the given data and take it as the mean of
the Poisson Distribution.

X bar = sum of (x * f) / sum of (f)

x       0     1    2    3   4   Total
f       109   65   22   3   1   200
x * f   0     65   44   9   4   122

X bar = 122/200 = 0.61

Therefore, λ = 0.61

The PMF of PD

P(X = x) = e^(−λ) λ^x / x! ; x = 0, 1, 2, …

P(X = x) = e^(−0.61) (0.61)^x / x! ; x = 0, 1, 2, … is the required PMF of PD

The expected frequencies (N = 200)

X = x   P(X = x)                            E(x) = N P(X = x)
0       e^(−0.61) (0.61)^0 / 0! = 0.5434    E(0) = 200 (0.5434) = 108.7 ≈ 109
1       e^(−0.61) (0.61)^1 / 1! = 0.3314    E(1) = 200 (0.3314) = 66.3 ≈ 66
2       e^(−0.61) (0.61)^2 / 2! = 0.1011    E(2) = 200 (0.1011) = 20.2 ≈ 20
3       e^(−0.61) (0.61)^3 / 3! = 0.0206    E(3) = 200 (0.0206) = 4.1 ≈ 4
4       e^(−0.61) (0.61)^4 / 4! = 0.0031    E(4) = 200 (0.0031) = 0.6 ≈ 1

The observed and expected frequencies are

x                      0     1    2    3   4
Observed Frequencies   109   65   22   3   1
Expected Frequencies   109   66   20   4   1
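
Fitting a Poisson distribution to a frequency table needs only the sample mean, which is used as λ. The following is a minimal sketch of that procedure, assuming the table lists the frequencies for x = 0, 1, 2, … in order; the function name fit_poisson is illustrative.

```python
from math import exp, factorial


def fit_poisson(observed):
    """Fit a Poisson distribution to frequencies observed[x], x = 0, 1, 2, ...

    Returns the estimated lam and the (unrounded) expected frequencies.
    """
    total = sum(observed)                                    # N
    lam = sum(x * f for x, f in enumerate(observed)) / total # sample mean
    expected = [
        total * exp(-lam) * lam**x / factorial(x) for x in range(len(observed))
    ]
    return lam, expected


lam_hat, expected = fit_poisson([109, 65, 22, 3, 1])
expected_rounded = [round(e) for e in expected]
```

Because the Poisson distribution has no upper limit on x, the expected frequencies for the listed values add up to slightly less than N; the small remainder belongs to values of x beyond the table.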

Self-Assessment Questions

7. The Poisson Distribution was discovered by _______.

A). James Bernoulli
B). Pearson
C). Siméon Denis Poisson
D). Spearman

8. Mean of Poisson Distribution is _______.

A). p
B). np
C). npq
D). λ

9. Variance of Poisson Distribution is ________.

A). λ
B). np
C). npq
D). pq

1.2.4 Normal Distribution
The Normal Distribution is a continuous probability distribution developed by Gauss. Therefore,
this distribution is also known as the Gaussian Distribution. As the number of trials increases,
many other distributions, such as the Binomial and Poisson Distributions, tend to the Normal
Distribution.

1.2.4.1 Probability Density function


The Probability Density function of Normal distribution is

f(x; µ, σ) = [1 / (σ √(2π))] exp{ −(x − µ)² / (2σ²) } ; −∞ < x < ∞

where µ and σ² are the parameters of the Normal Distribution.

1.2.4.2 Properties of Normal Distribution

• Mean of Normal Distribution is E(x) = µ
• Variance of Normal Distribution is V(x) = σ²
• The Normal Distribution has the range −∞ to +∞
• The Normal probability curve is a symmetrical, bell-shaped curve

Fig. 1: Representation of Normal Distribution

1.2.4.3 Examples of Normal Distribution

1). Let X be a normal variate with mean 30 and S.D 5. Find the probabilities

i). P(26 ≤ X ≤ 40) ii). P(X ≥ 45).

Answer:
Given that mean, µ = 30
Standard Deviation, σ = 5 and X is a normal variate. Let z = (X − µ) / σ denote the standard
normal variate.

i). P(26 ≤ X ≤ 40) = P((26 − µ) / σ ≤ (X − µ) / σ ≤ (40 − µ) / σ)
= P((26 − 30) / 5 ≤ z ≤ (40 − 30) / 5)

= P(−0.8 ≤ z ≤ 2)

= P(0 ≤ z ≤ 0.8) + P(0 ≤ z ≤ 2) (by the symmetry of the normal curve)
= 0.28814 + 0.47725 = 0.765

ii). P(X ≥ 45) = P((X − µ) / σ ≥ (45 − µ) / σ)
= P(z ≥ (45 − 30) / 5) = P(z ≥ 3) = 0.5 − P(0 ≤ z ≤ 3)
= 0.5 − 0.4987 = 0.0013
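
Where a table of the standard normal distribution is not at hand, the same probabilities can be computed from the error function in Python's standard library. The sketch below is one way to do this; the function name normal_cdf is an illustrative choice, and the relation to math.erf used here is the standard one for the normal cumulative probability.

```python
from math import erf, sqrt


def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a normal variate with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma                 # standardise
    return 0.5 * (1 + erf(z / sqrt(2)))


mu, sigma = 30, 5

# P(26 <= X <= 40) as a difference of two cumulative probabilities
p_between = normal_cdf(40, mu, sigma) - normal_cdf(26, mu, sigma)

# P(X >= 45) through the complement
p_above = 1 - normal_cdf(45, mu, sigma)
```

The results are about 0.7654 and 0.00135, in line with adding or subtracting the table areas measured from z = 0.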

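2). For a normally distributed variate with mean 1 and Standard Deviation 3, find

i). P(3.43 ≤ X ≤ 6.19) ii). P(−1.43 ≤ X ≤ 6.19) iii). P(X ≥ 2).

Answer: Given that mean, µ = 1, Standard Deviation, σ = 3

i). P(3.43 ≤ X ≤ 6.19) = P((3.43 − µ) / σ ≤ (X − µ) / σ ≤ (6.19 − µ) / σ)

= P((3.43 − 1) / 3 ≤ z ≤ (6.19 − 1) / 3)
= P(0.81 ≤ z ≤ 1.73)

= P(0 ≤ z ≤ 1.73) − P(0 ≤ z ≤ 0.81)

= 0.45818 − 0.29103
= 0.1672

ii). P(−1.43 ≤ X ≤ 6.19) = P((−1.43 − 1) / 3 ≤ z ≤ (6.19 − 1) / 3)
= P(−0.81 ≤ z ≤ 1.73)

= P(0 ≤ z ≤ 0.81) + P(0 ≤ z ≤ 1.73)

= 0.29103 + 0.45818
= 0.7492

iii). P(X ≥ 2) = P(z ≥ (2 − 1) / 3) = P(z ≥ 0.33) = 0.5 − P(0 ≤ z ≤ 0.33)
= 0.5 − 0.12930 = 0.3707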
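
All three parts of this problem are interval probabilities, so a single helper covers them. The sketch below is illustrative (the names normal_cdf and interval_probability are assumptions); it does not round z to two decimals, so its answers can differ from table-based answers in the third decimal place.

```python
from math import erf, inf, sqrt


def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal variate with mean mu and standard deviation sigma."""
    if x == inf:
        return 1.0
    if x == -inf:
        return 0.0
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


def interval_probability(low, high, mu, sigma):
    """P(low <= X <= high); use -inf or inf for an open-ended interval."""
    return normal_cdf(high, mu, sigma) - normal_cdf(low, mu, sigma)


mu, sigma = 1, 3

p_i = interval_probability(3.43, 6.19, mu, sigma)     # both limits above the mean
p_ii = interval_probability(-1.43, 6.19, mu, sigma)   # limits on either side of the mean
p_iii = interval_probability(2, inf, mu, sigma)       # upper tail
```

Parts i) and ii) come out near 0.167 and 0.749. Part iii) comes out near 0.369 because z = 1/3 is used exactly; reading the table at z = 0.33 gives a slightly larger value.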

Self-Assessment Questions

10. Normal Distribution is also known as __________ Distribution.

A) Binomial
B) Gaussian
C) Poisson
D) Geometric

11. Mean of Normal Distribution is ________.

A) p
B) np
C) µ
D) λ

12. Variance of Normal Distribution is _________.

A) σ²
B) np
C) µ
D) λ

Summary

• The concept of a random variable is an extension of the concept of probability.
• Discrete and continuous random variables are the two types of random variables.
• The Binomial and Poisson Distributions are discrete probability distributions.
• The Normal Distribution is a continuous probability distribution.

Terminal Questions

1. Discuss the Binomial Distribution and explain its properties.
2. Discuss the Normal Distribution and explain its properties.
3. Discuss the Poisson Distribution and explain its properties.
4. Four coins are tossed 160 times, and the number of times x heads occur is given
below. Fit a Poisson distribution and also find the expected frequencies.

x              0    1    2    3    4
No. of times   8    34   69   43   6

5. Four coins are tossed 160 times, and the number of times x heads occur is given
below. Fit a binomial distribution and also find the expected frequencies.

x              0    1    2    3    4
No. of times   8    34   69   43   6

6. Fit a Binomial distribution for the following data and also calculate the expected
frequencies.

x    0     1    2    3   4
f    109   65   22   3   1

7. For a normally distributed variate with mean 1 and Standard Deviation 3, find

i) P(3.43 ≤ X ≤ 6.19) ii) P(−1.43 ≤ X ≤ 6.19) iii) P(X ≥ 2)

Answer Keys

Self-Assessment Questions

Question No    Answers
1              B
2              A
3              D
4              A
5              B
6              C
7              C
8              D
9              A
10             B
11             C
12             A

Glossary
• Random Variable: A random variable is a real-valued function, which assigns
a real number to every element in the sample space.
• Discrete and continuous random variables: These are the two types of
random variables. A discrete random variable takes a finite or countable number
of values; a continuous random variable takes any value within an interval.
• Binomial Distribution: The B.D was discovered by James Bernoulli in 1700 as
an extension to the Bernoulli Distribution. An experiment follows the B.D when
it is repeated a finite number of times (say n times) and each trial has only
two outcomes, success and failure.
• Poisson Distribution: The Poisson Distribution was developed by Siméon Denis
Poisson as a limiting case of the Binomial Distribution.
• Normal Distribution: A continuous probability distribution, also known as the
Gaussian Distribution, with a symmetrical, bell-shaped curve.

BIBLIOGRAPHY
1. Chung, K. L. (2000). A Course in Probability Theory (3rd ed.). San Diego, CA:
Academic Press.
2. Feller, W. (1968). An Introduction to Probability Theory and Its Applications:
Volume I. John Wiley & Sons.
3. Gupta, S. C., & Kapoor, V. K. Fundamentals of Mathematical Statistics. S. Chand
Publications.

External Resources
1. Rukmangadachari, E. Probability and Statistics. Pearson Education.
2. Rohatgi, V. K., & Ehsanes Saleh, A. K. (2015). An introduction to probability and
statistics (3rd ed.). doi:10.1002/9781118799635

e-References
• Normal Distribution: What It Is, Properties, Uses, and Formula:
https://www.investopedia.com/terms/n/normaldistribution.asp

• Poisson distribution: https://www.britannica.com/topic/Poisson-distribution

Image Credits

Fig. 1  Representation of Normal Distribution
https://courses.lumenlearning.com/suny-natural-resources-biometrics/chapter/chapter-1-descriptive-statistics-and-the-normal-distribution/

Video Links

Topic                                          Link
Binomial distribution                          https://www.youtube.com/watch?v=WWv0RUxDfbs
An Introduction to the Poisson Distribution    https://www.youtube.com/watch?v=jmqZG6roVqU
Normal Distribution, Clearly Explained!!!      https://www.youtube.com/watch?v=rzFX5NWojp0

Keywords

• Binomial distribution
• Poisson distribution
• Normal distribution
• Discrete random variable
• Gaussian distribution

QUANTITATIVE METHODS

Module - 1
Unit - 3

INTRODUCTION TO
R PROGRAMMING

Unit Table of Contents
Unit 1.3 Introduction to R Programming

Aim
Instructional Objectives
Learning Outcomes
Introduction
1.3.1 Evolution and Features of R Programming
Self-Assessment Questions
1.3.2 Operators in R Programming
Self-Assessment Questions
1.3.3 Data Structures in R Programming
Self-Assessment Questions
Summary
Terminal Questions
Answer Keys
Glossary
Bibliography
External Resources
e-References
Video Links
Keywords

Aim
This unit aims to explain the basic concepts of R Programming and its applications
in Management.

Instructional Objectives
This unit intends to:
● Explain the basic concepts of R Programming
● Analyse the application of R Programming in management

Learning Outcomes
At the end of this unit, you are expected to:
● Describe the basic concepts of R Programming
● Apply the concepts of R Programming in Marketing, HRM, Finance and other
functional areas

Introduction
R is a programming language and software environment that makes it possible to apply statistical
tools through code. It is widely used for statistical computing and data analysis and is distributed
as open-source software. R comes with a command-line interface by default and is available on a
variety of platforms, including Windows, Linux, and macOS. It is actively maintained and
updated regularly.

1.3.1 R Programming’s Evolution and Features


1.3.1.1 R Programming Evolution

• R is an open-source programming language (free to download and update) that was
created in 1992 by Ross Ihaka and Robert Gentleman at the University of Auckland in New
Zealand.
• It was formerly known as the R&R Language and is now known as the R Programming
Language.
• R, a programming language and free software environment for statistical computing and
graphics, is distributed and maintained through the Comprehensive R Archive Network (CRAN).

1.3.1.2 R Programming Features

• R is a statistical programming language (SPL), that is, a language designed for
statistical computation.
• R is an analytical tool for Big Data (up to about 1 billion data points).
• Big Data is characterised by four V's: Volume, Variety, Velocity and Veracity.
• R has a large number of packages (sets of functions); about 14,000 packages are available.
• R is a well-developed, simple, and powerful programming language that supports
conditionals, loops, user-defined recursive functions, and input and output facilities.
• R features an effective data handling and storage system.
• R has several operators for working with arrays, lists, vectors, and matrices.
• R provides a wide, coherent, and integrated set of data analysis tools.
• R includes graphical tools for data analysis and visualisation, with output shown on
screen or printed.
• Compared with other technologies, R offers advantages in:
  - Graphics libraries
  - Cost and availability
  - Advancement in tools
  - Job opportunities
  - Community support
• R 1.0.0, released in 2000, was the first stable version.
• R 4.1.2 was the most recent version of R at the time of writing.

Platforms for R

• R Console (R GUI) is the command-line interface installed with R. It provides:
  - R Script (input)
  - R Console (output)
• RStudio is an Integrated Development Environment (IDE) for R. It provides:
  - R Script (input)
  - R Console (output)
  - R Environment (saved variables)
  - R Files and folders (files and packages)

1.3.1.3 Variables

• In R, a variable is a name given to a memory location. Variables can hold real
and complex numbers, text, matrices, and even tables.
• In R programming, everything is an object.
• A variable name is valid only if it follows these rules:
  - It contains only letters, numbers, and the dot or underscore characters.
  - It does not begin with a number (e.g., 2iota is invalid).
  - It does not begin with a dot followed by a number (e.g., .2iota is invalid).
  - It does not begin with an underscore (e.g., _iota is invalid).
• E.g.: x = 15
        mystring = "Hello, World!"
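The naming rules above can be checked in R itself. The sketch below is only an illustration: the helper is_valid_name and the candidate names are hypothetical, and the check relies on the base function make.names( ), which returns a name unchanged only when it is already syntactically valid.

```r
# Assign values with the = operator, as in the examples above
x = 15
mystring = "Hello, World!"

# Hypothetical helper: a name counts as valid when make.names() leaves it unchanged
is_valid_name <- function(name) {
  make.names(name) == name
}

# Hypothetical candidate names: the first three follow the rules, the last three do not
candidates <- c("x", "my_string", "sales.2023", "2iota", ".2iota", "_iota")
validity <- is_valid_name(candidates)
names(validity) <- candidates
print(validity)
```

Reserved words such as if are also reported as invalid by this check, because make.names( ) alters them.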

Self-Assessment Questions

1. R Programming is open source. What do you say?

A) Yes
B) No
C) May be
D) Can’t say

2. Number of characteristics of Big Data are __________.

A) 2
B) 3
C) 4
D) 1

3. ‘Number of Platforms in R’ are____________.

A) 5
B) 3
C) 2
D) 1

4. _____ maintains R Programming.

A) CLASS
B) CALM
C) COMB
D) CRAN

1.3.2 Operators in R Programming
An operator is a symbol that tells the R interpreter to perform a specific mathematical or logical
operation. R is rich in built-in operators and provides the following types:

• Assignment Operator
• Arithmetic Operator
• Combining Operator
• Sequence or Colon Operator
• Relational Operator
• Logical Operator

1.3.2.1 Assignment Operator

• These operators are used to assign values to vectors or variables.
• The assignment operators are:
  =     Assignment with the equals sign
  <-    Left assignment
  <<-   Left assignment (in an enclosing or the global environment)
  ->    Right assignment
  ->>   Right assignment (in an enclosing or the global environment)

E.g.: > x = 5
      > 5 -> x
      > 10 ->> y
      > x <- 4
      > y <<- 15
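The difference between <- and <<- only shows up inside a function. The following sketch, with hypothetical helper names bump_local and bump_global, illustrates it: <- creates a local copy, while <<- changes the variable in the enclosing environment.

```r
# Left and right assignment store a value in the same way
a <- 10
20 -> b

counter <- 0   # defined outside the functions below

# Hypothetical helper: <- inside a function changes only a local copy
bump_local <- function() {
  counter <- counter + 1
  counter
}

# Hypothetical helper: <<- changes the variable in the enclosing environment
bump_global <- function() {
  counter <<- counter + 1
  counter
}

local_result <- bump_local()   # the function returns 1
after_local  <- counter        # the outer counter is still 0
bump_global()
after_global <- counter        # the outer counter is now 1
```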

1.3.2.3 Arithmetic Operator

These operators are used to perform simple mathematical calculations:

+     Addition
-     Subtraction
*     Multiplication
/     Division
^     Exponent
%%    Modulus (remainder from division)
%/%   Integer division

E.g.: > x <- 5
      > y <- 16
      > x + y
      [1] 21
      > x - y
      [1] -11
      > x * y
      [1] 80
      > y / x
      [1] 3.2
      > y %/% x
      [1] 3
      > y %% x
      [1] 1
      > y ^ x
      [1] 1048576
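As a small management-style illustration of these operators, the sketch below packs units into boxes with %/% and %% and computes a compound amount with ^. All figures (units, box size, principal, rate) are made up for the example.

```r
units    <- 1250   # hypothetical number of units to ship
box_size <- 48     # hypothetical units per box

full_boxes <- units %/% box_size   # integer division: complete boxes
left_over  <- units %% box_size    # modulus: units left unpacked

# The two results always rebuild the original quantity
check <- full_boxes * box_size + left_over

# Compound amount on a hypothetical deposit of 1000 at 8% for 3 years
principal <- 1000
rate      <- 0.08
years     <- 3
amount    <- principal * (1 + rate)^years

print(c(full_boxes = full_boxes, left_over = left_over, amount = amount))
```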

1.3.2.4 Combining Operator

c( ) - Combine (concatenate) operator.

E.g.: a = c(1, 2, 3, 4, 5)

Output: 1 2 3 4 5

1.3.2.5 Sequence Operator / Colon

: - Sequence Operator
E.g.: 1:5
Output: 1 2 3 4 5
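The combine and sequence operators are often used together. A brief sketch with illustrative variable names follows; seq( ) is a base R function that generalises the colon operator to other step sizes.

```r
quarters   <- 1:4                   # sequence 1 2 3 4
countdown  <- 5:1                   # the colon operator also counts down
scores     <- c(quarters, 10, 20)   # c() joins a sequence and single values
even_steps <- seq(2, 10, by = 2)    # sequence with a step of 2

print(scores)
print(even_steps)
```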

1.3.2.6 Relational Operator

<     Less than
>     Greater than
<=    Less than or equal to
>=    Greater than or equal to
==    Equal to
!=    Not equal to

E.g.: > x <- 5
      > y <- 16
      > x < y
      [1] TRUE
      > x > y
      [1] FALSE
      > x <= 5
      [1] TRUE
      > y >= 20
      [1] FALSE
      > y == 16
      [1] TRUE
      > x != 5
      [1] FALSE

1.3.2.7 Logical Operator

!     Logical NOT
&     Element-wise logical AND
&&    Logical AND
|     Element-wise logical OR
||    Logical OR
• Note: The operators & and | work element by element and return a result as
long as the longer operand.
• The operators && and || compare single (length-one) values and return a single
logical value. Versions of R before 4.3.0 examined only the first element of longer
operands; R 4.3.0 and later report an error for operands longer than one element.
• Zero is treated as FALSE and non-zero numbers are treated as TRUE. An example
follows.

E.g.: > x <- c(1, 2, 3, 4)
      > y <- c(5, 6, 7, 8)
      > !x
      [1] FALSE FALSE FALSE FALSE
      > x & y
      [1] TRUE TRUE TRUE TRUE
      > x[1] && y[1]
      [1] TRUE
      > x | y
      [1] TRUE TRUE TRUE TRUE
      > x[1] || y[1]
      [1] TRUE
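Relational and logical operators are typically combined to flag records. The sketch below uses made-up sales figures, a made-up target and made-up return counts; it applies the element-wise operators to whole vectors and keeps && for a single yes/no decision.

```r
sales   <- c(120, 80, 150, 95)   # hypothetical sales of four employees
target  <- 100                   # hypothetical target
returns <- c(0, 3, 0, 0)         # hypothetical number of returned orders

met_target <- sales >= target    # relational operator, element by element
no_returns <- !returns           # zero is FALSE, so !0 is TRUE

bonus     <- met_target & no_returns      # both conditions must hold
follow_up <- !met_target | returns > 0    # either condition triggers a review

# && is reserved for single values, for example in a decision rule
team_bonus <- length(sales) > 0 && all(met_target)

print(bonus)
print(follow_up)
print(team_bonus)
```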

1.3.2.8 Data Types and objects in R

Everything in R is an object.

R has 5 basic data types (object types):

• Character
  "a", "swc"
• Numeric (real or decimal)
  2, 15.5
• Integer
  2L (the L tells R to store this as an integer)
• Logical
  TRUE, FALSE
• Complex
  1 + 4i (complex numbers with real and imaginary parts)

Note: R has three functions to check the data type: typeof( ), class( ) and mode( ).

R provides many functions to examine features of vectors and other objects, for example:
• class( ) - what kind of object is it (high-level)?
  > x <- "ramana"
  > class(x)
  [1] "character"
• typeof( ) - what is the object's data type (low-level)?
  > y <- 1:5
  > typeof(y)
  [1] "integer"
• length( ) - how long is it?
  > y <- 1:5
  > y
  [1] 1 2 3 4 5
  > length(y)
  [1] 5
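To see how the three checking functions differ, the sketch below applies typeof( ), class( ) and mode( ) to one example value of each basic type. The list name values and the table name type_table are illustrative only.

```r
# One example value for each of the five basic data types
values <- list(
  character = "swc",
  numeric   = 15.5,
  integer   = 2L,
  logical   = TRUE,
  complex   = 1 + 4i
)

# Apply each checking function to every value and collect the answers
type_table <- data.frame(
  typeof = sapply(values, typeof),
  class  = sapply(values, class),
  mode   = sapply(values, mode),
  stringsAsFactors = FALSE
)
print(type_table)
```

Note that a decimal number is reported as "double" by typeof( ) but as "numeric" by class( ) and mode( ), and that an integer also has mode "numeric".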

Self-Assessment Questions

5. %% operator is indicating ____________.

A) Percentage
B) Modulus
C) Addition
D) Subtraction

6. Number of operators in R are___________.

A) 2
B) 3
C) 4
D) 6

7. Number of Data types in R are___________.

A) 5
B) 3
C) 2
D) 5

8. Combining or concatenate operator is represented as ________.

A) f
B) a
C) c
D) d

1.3.3 Data Structures in R Programming
R has many data structures. These include:

• Vector
• Matrix
• Array
• List
• Data frame
• Factors

1.3.3.1 Vector

• A vector is a sequence of elements of the same type (numbers or characters)
created with the combine function c( ).

• E.g.: > x <- c(1, 2, 3, 4, 5)
        > y <- c('x', 'y', 'z')
        > x
        [1] 1 2 3 4 5
        > y
        [1] "x" "y" "z"
        > length(x)
        [1] 5
        > length(y)
        [1] 3
        > x[4]
        [1] 4
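Vectors can also carry names, be filtered by a condition, and be used in arithmetic as a whole. A minimal sketch with hypothetical monthly sales figures:

```r
# Hypothetical monthly sales, stored as a named numeric vector
monthly_sales <- c(Jan = 210, Feb = 185, Mar = 240, Apr = 260)

second_month <- monthly_sales[2]                     # select by position
strong       <- monthly_sales[monthly_sales > 200]   # select by condition
forecast     <- monthly_sales * 1.10                 # arithmetic on every element

total   <- sum(monthly_sales)
average <- mean(monthly_sales)

print(strong)
print(c(total = total, average = average))
```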

1.3.3.2 Matrix

• A matrix is a 2-dimensional data structure in R.

• Syntax: matrix(vector, nrow, ncol, byrow); Default: byrow = FALSE

• E.g.: > a <- 1:9
        > m <- matrix(a, nrow = 3, ncol = 3, byrow = F)
        > m
             [,1] [,2] [,3]
        [1,]    1    4    7
        [2,]    2    5    8
        [3,]    3    6    9
        > m <- matrix(a, nrow = 3, ncol = 3, byrow = T)
        > m
             [,1] [,2] [,3]
        [1,]    1    2    3
        [2,]    4    5    6
        [3,]    7    8    9
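Once a matrix exists, its rows and columns can be summarised directly. The sketch below uses a hypothetical table of sales by region and quarter; the region and quarter labels are made up for illustration.

```r
# Hypothetical sales by region (rows) and quarter (columns), filled row by row
sales <- matrix(c(10, 12,  8,
                  15,  9, 11),
                nrow = 2, ncol = 3, byrow = TRUE,
                dimnames = list(c("North", "South"), c("Q1", "Q2", "Q3")))

size           <- dim(sales)           # number of rows and columns
region_totals  <- rowSums(sales)       # total per region
quarter_totals <- colSums(sales)       # total per quarter
south_q2       <- sales["South", "Q2"] # one cell, selected by name
transposed     <- t(sales)             # quarters become rows

print(sales)
print(region_totals)
```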

1.3.3.3 List

• A list is a set of vectors of different sizes and different data types; its elements
can also be other objects, such as matrices.

• Syntax: list(v1, v2, v3, ...)

• E.g.: with
        a: 1 2 3 4 5 6 7 8 9
        f: 1 2
        m:
             [,1] [,2]
        [1,]    1    4
        [2,]    5    6

        > list(a, f, m)
        [[1]]
        [1] 1 2 3 4 5 6 7 8 9

        [[2]]
        [1] 1 2

        [[3]]
             [,1] [,2]
        [1,]    1    4
        [2,]    5    6
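List elements are easier to work with when they are named. The following sketch builds a small hypothetical employee record and reads it back with $ and [[ ]]; all names and values are illustrative.

```r
# Hypothetical employee record mixing text, numbers and a logical value
employee <- list(
  name      = "Raja",
  scores    = c(12, 14, 11),
  permanent = TRUE
)

emp_name     <- employee$name             # access an element by name
second_score <- employee[["scores"]][2]   # element first, then position inside it

employee$department <- "Finance"          # add a new element
n_elements <- length(employee)

str(employee)   # compact overview of the list's structure
```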

1.3.3.4 Array

• An array is a multi-dimensional data structure.

• Syntax: array(vector, dim = c(r, c, d), dimnames = list(rownames, colnames))

• E.g.: > ab <- c(1, 2, 3, 4, 5, 6)
        > ab
        [1] 1 2 3 4 5 6
        > ar <- array(ab, dim = c(2, 3, 1), dimnames = list(c('m', 'f'), c('a', 'b', 'c')))
        > ar
        , , 1

          a b c
        m 1 3 5
        f 2 4 6
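An array becomes more useful when the third dimension has more than one layer. In the sketch below the third dimension holds two hypothetical years, and names are supplied for all three dimensions; the labels and values are illustrative.

```r
# 2 rows x 3 columns x 2 layers, filled column by column with 1 to 12
counts <- array(1:12, dim = c(2, 3, 2),
                dimnames = list(c("m", "f"),
                                c("a", "b", "c"),
                                c("year1", "year2")))

shape       <- dim(counts)                  # 2 3 2
one_cell    <- counts["m", "b", "year2"]    # row m, column b, second layer
year_totals <- apply(counts, 3, sum)        # total of each layer

print(counts)
print(year_totals)
```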

1.3.3.5 Data frame

• A data frame is a set of vectors of the same size that may have different data types.

• Syntax: data.frame(v1, v2, v3, ...)

• Input:

  > sno <- c(1, 2, 3, 4, 5)
  > name <- c('Raja', 'Ramu', 'Krish', 'Hema', 'Ravi')
  > marks <- c(12, 12, 14, 11, 15)
  > df <- data.frame(sno, name, marks)

• Output:

  > df
    sno  name marks
  1   1  Raja    12
  2   2  Ramu    12
  3   3 Krish    14
  4   4  Hema    11
  5   5  Ravi    15

• Note: The function cbind( ) is used to add a column and rbind( ) is used to add a row to
the existing data frame.

  > region <- c('Vij', 'Nell', 'Ong', 'Hyd', 'Kdp')
  > df <- cbind(df, region)
  > df
    sno  name marks region
  1   1  Raja    12    Vij
  2   2  Ramu    12   Nell
  3   3 Krish    14    Ong
  4   4  Hema    11    Hyd
  5   5  Ravi    15    Kdp

  > df1 <- data.frame(sno = c(6, 7), name = c('Seeta', 'Geeta'), marks = c(15, 14), region = c('Gun', 'Gun'))
  > df <- rbind(df, df1)
  > df
    sno  name marks region
  1   1  Raja    12    Vij
  2   2  Ramu    12   Nell
  3   3 Krish    14    Ong
  4   4  Hema    11    Hyd
  5   5  Ravi    15    Kdp
  6   6 Seeta    15    Gun
  7   7 Geeta    14    Gun

• Note: To remove a record (row) or a column, use a negative index inside square
brackets [ ]; for example, df[-2, ] drops the second row and df[, -4] drops the fourth column.
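Removing and selecting records can be tried on the same marks data. This is a minimal sketch; the derived object names (df_without_ramu, df_without_marks, top_scorers) are illustrative.

```r
# The marks data used above, rebuilt so the example is self-contained
df <- data.frame(
  sno   = 1:5,
  name  = c("Raja", "Ramu", "Krish", "Hema", "Ravi"),
  marks = c(12, 12, 14, 11, 15),
  stringsAsFactors = FALSE
)

df_without_ramu  <- df[-2, ]              # negative row index drops the second record
df_without_marks <- df[, -3]              # negative column index drops the third column
top_scorers      <- df[df$marks >= 14, ]  # keep rows that satisfy a condition
average_marks    <- mean(df$marks)

print(df_without_ramu)
print(top_scorers)
```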

1.3.3.6 Factors

• Factor - a categorical variable
• is.factor( ) - checks whether a vector is a factor
• as.factor( ) - converts a vector into a factor

• E.g.: > gd <- c(1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1)
        > is.factor(gd)
        [1] FALSE
        > gd
         [1] 1 1 1 1 2 2 2 2 2 1 1
        > gd <- as.factor(gd)
        > gd
         [1] 1 1 1 1 2 2 2 2 2 1 1
        Levels: 1 2
        > is.factor(gd)
        [1] TRUE
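Numeric codes are usually given readable labels when they are turned into a factor. The sketch below assumes, purely for illustration, that code 1 stands for "Male" and code 2 for "Female"; the source does not say what the codes mean.

```r
gd <- c(1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1)

# Assumed meaning of the codes (hypothetical): 1 = Male, 2 = Female
gender <- factor(gd, levels = c(1, 2), labels = c("Male", "Female"))

is_categorical <- is.factor(gender)
categories     <- levels(gender)
counts         <- table(gender)    # frequency of each category

print(gender)
print(counts)
```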

Self-Assessment Questions

9. Categorical variable is represented by _______________.

A) Array
B) Modulus
C) Factor
D) Subtraction

10. Number of Data structures in R are_____________.

A) 2
B) 3
C) 4
D) 6

11._________ is a set of vectors of different sizes with different data types.

A) List
B) Factor
C) Array
D) Matrix

12. _____________is a set of vectors of different data types of same size.

A) List
B) Factor
C) Array
D) Data frame

Summary

• R is an open-source statistical programming language.
• R is used to analyse big data.
• R is rich in packages.
• R has various operators and data structures for entering and analysing data.

Terminal Questions

1. Discuss the evolution and development of R Programming.
2. Explain the basic features of R Programming.
3. Differentiate between R Console and RStudio.
4. Discuss the various operators in R Programming.
5. Explain the data types in R with suitable examples.
6. Illustrate the matrix with examples.
7. Illustrate the array with examples.
8. Construct a data frame for an employee database.

Answer Keys

Self-Assessment Questions

Question No Answers

1 A

2 C

3 C

4 D

5 B

6 D

7 D

8 C

9 C

10 D

11 A

12 D

Glossary
• Data frame: A data structure that organises data into a 2-dimensional table
of rows and columns, much like a spreadsheet.
• List: An ordered collection of elements that may differ in size and data type,
created in R with list( ).
• Array: A multi-dimensional data structure whose elements all have the same data
type, each identified by one index per dimension.
• Modulus: In R, the modulus operator %% gives the remainder left after one number
is divided by another.

BIBLIOGRAPHY
1. Chung, K. L. (2000). A Course in Probability Theory (3rd ed.). San Diego, CA:
Academic Press.
2. Feller, W. (1968). An introduction to Probability Theory and its applications:
Volume I. John Wiley & Sons.
3. Gupta, S. C., & Kapoor, V. K. Fundamentals of Mathematical Statistics. S. Chand
Publications.

External Resources
1. Rukmangadachari, E. Probability and Statistics. Pearson Education.
2. Rohatgi, V. K., & Ehsanes Saleh, A. K. (2015). An introduction to probability and
statistics (3rd ed.). doi:10.1002/9781118799635

e-References
• Operators in R programming:
https://intellipaat.com/blog/tutorial/r-programming/operators/

• Data Structures in R Programming:
https://study.com/academy/lesson/data-structures-in-r-programming.html

Video Links

Topic                                                                 Link
R Programming Tutorial - Learn the Basics of Statistical Computing    https://www.youtube.com/watch?v=_V8eKsto3Ug
R Programming for Beginners                                           https://www.youtube.com/watch?v=BvKETZ6kr9Q
Introduction to R Programming for Excel Users                         https://www.youtube.com/watch?v=Ekp2mfxQSzw

Keywords

• CRAN
• Console
• Studio
• Big data
• Matrix

QUANTITATIVE METHODS

Module - 2

OVERVIEW OF SAMPLING

Module Description

Sampling is a process in statistical analysis in which researchers take a specified number of
observations from a larger population. The method of sampling depends on the type of analysis
being performed; common choices include simple random sampling and systematic sampling.

In most biomedical studies, researchers form hypotheses about the connections between different
variables, gather data to verify those connections, and then attempt to infer connections from
the data gathered. Investigators frequently compare the average level of a factor between two
groups, or between one group and a standard reference, to examine relationships.

Hypothesis testing is a method for determining how well results observed in a study sample can
be extrapolated to the larger population from which the sample was drawn. It assesses the
strength of the evidence from the sample and provides a framework for making decisions about
the population.

Parametric tests rest on assumptions about the population distribution, and these assumptions
can rarely be met fully in practice. Non-parametric tests are either distribution-free or assume a
given distribution without specifying its parameters. They also rely on considerably looser
assumptions.

This Module is divided into the following units:

Unit 2.1 Introduction to Sampling


Unit 2.2 Estimation and Testing of Hypothesis
Unit 2.3 ANOVA & Non-parametric Testing of Hypothesis

QUANTITATIVE METHODS

Module - 2
Unit - 1

INTRODUCTION TO SAMPLING

Unit Table of Contents
Unit 2.1 Introduction to Sampling

Aim
Instructional Objectives
Learning Outcomes
Introduction
2.1.1 Basic Concept of Sampling
Self-Assessment Questions
2.1.2 Types of Sampling
Self-Assessment Questions
2.1.3 Sampling distributions
Self-Assessment Questions
2.1.4 Application of Central Limit Theorem
Self-Assessment Questions
Summary
Terminal Questions
Answer Keys
Glossary
Bibliography
External Resources
e-References
Video Links
Keywords

Aim
This unit aims to explain the basic concepts of probability sampling and its
applications in Management.

Instructional Objectives
This unit intends to:
● Explain the concepts of Sampling and sampling procedures
● Apply the sampling methods in various applications of management

Learning Outcomes
At the end of this unit, you are expected to:
● Utilise the concepts of Sampling in various management applications
● Compare and contrast different sampling methods to select sample from the
population

Introduction
The process of selecting a group of people from a population to research and characterise
the population is known as sampling. All members of a given group, as well as all conceivable
outcomes or measurements, are included in the population. The specific population will be
determined by the study’s scope.

When conducting research on a group of people, it’s uncommon that you’ll be able to collect data
from every single one of them. Rather, you choose a sample. The sample is the group of people
that will take part in the study. You must carefully consider how you will select a sample that is
representative of the entire group to make accurate conclusions from your findings.

2.1.1 The Basic Concept of Sampling


Sampling is the process of drawing inferences about an entire population from measurements
taken on only a small part of it. Sampling allows researchers to work in situations where
obtaining measurements from everyone or everything would be difficult. A sample is a subset, or
portion, of a larger population that is used to estimate population characteristics.

DID YOU KNOW?
The term "sampling" refers to the process of selecting individuals from a population to be
examined. A population (universe) is the whole group from which a sample is drawn: people,
sales territories, locations, items, or college students, for example, who all share a set of
qualities. Each individual member is called a population element.

Population:

A population is the aggregate of all animate or inanimate units included in a study. The number
of units in a population is referred to as the Population Size, and it is represented by the letter N.

E.g.: The students of a college form a finite population.

The stars in the sky form an infinite population.

Sample:

A sample is any finite subset of a population. The Sample Size is the number of units in a
sample and is denoted by n.

E.g.: The students in one class of a college form a sample of the college's students.

The stars in a certain area of the sky form a sample of the stars in the sky.

Parameter:

A parameter is any statistical constant of a population.

For example, the Population Mean (μ), Population Variance (σ²) and Population Proportion (P) are
all parameters.

Statistic:

Any statistical constant of a sample is referred to as a statistic.

For instance, the Sample Mean (x̄), Sample Variance (s²), Sample Proportion (p), and so on.

Sampling:

Sampling is the process of picking a representative sample from a larger population. That is, the
sample should reflect the characteristics of the population.
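The terms defined above can be illustrated with a short R sketch. The population here is simulated, so the salary figures, the seed and the sample size are all assumptions made for the example rather than real data.

```r
set.seed(42)   # arbitrary seed so the example can be repeated

# Hypothetical population: salaries of N = 1000 employees
population <- round(rnorm(1000, mean = 50000, sd = 8000))
N <- length(population)

# Parameters describe the population
mu     <- mean(population)
sigma2 <- mean((population - mu)^2)   # population variance divides by N

# Draw a sample of n = 50 employees without replacement
n   <- 50
smp <- sample(population, size = n)

# Statistics describe the sample
x_bar <- mean(smp)
s2    <- var(smp)                     # sample variance divides by n - 1

print(c(mu = mu, x_bar = x_bar))
print(c(sigma2 = sigma2, s2 = s2))
```

The sample mean and variance will be close to, but generally not equal to, the population values, which is why they are treated as estimates.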

Self-Assessment Questions

1. The aggregate of Animates or In-animates under a study is called as__________.

A) Population
B) Sample
C) Parameter
D) Statistic

2. Any finite subset of a population is called as ___________.

A) Population
B) Sample
C) Parameter
D) Statistic

3. Any statistical constant of a Population is called as ____________.

A) Population
B) Sample
C) Parameter
D) Statistic

4. Any statistical constant of a Sample is called as_____________.

A) Population
B) Sample
C) Parameter
D) Statistic

2.1.2 Types of Sampling
Different sampling methods are used depending on the population. There are two broad approaches
to picking a representative sample from a population:

(i) Probability-based or random sampling methods
(ii) Non-probability or non-random sampling methods

Probability Based Sampling Methods:

Methods in which every population unit has a known chance (in the simplest case, an equal
chance) of appearing in the sample. They can be classified as:

● Simple Random Sampling


● Stratified Sampling
● Systematic Sampling
● Cluster Sampling
● Area Sampling
● Multistage Sampling

Non-Probability Based Sampling Methods:

Methods in which the population units do not have a known or equal chance of appearing in the
sample, because selection depends on the researcher's judgement or convenience. They can be
classified as:

● Purposive Sampling
● Judgement Sampling
● Convenience sampling
● Quota Sampling
● Panel Sampling
● Snowball Sampling

Simple Random Sampling

In simple random sampling, the sample units are chosen at random. Once the 'parent
population' has been defined, each item in that population has an equal chance of being included
in any sample. This method requires great care to guarantee that samples really are chosen at
random. A completely random choice may not always be possible, but the investigator
should strive to come as close to the ideal of random selection as possible.
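In R, a simple random sample can be drawn with the base function sample( ). The sketch below assumes a hypothetical population of 2,000 numbered units and a sample of 20; the seed is arbitrary.

```r
set.seed(1)   # arbitrary seed so the draw can be repeated

population_ids <- 1:2000   # hypothetical numbered population units

# Every unit has the same chance of selection; no unit is drawn twice
srs <- sample(population_ids, size = 20, replace = FALSE)

print(sort(srs))
```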

Stratified Random Sampling

A stratified sampling strategy can be used when the population is heterogeneous in terms of the
variables under investigation but can be separated into reasonably homogeneous groups and
subgroups (strata). When there are large groups of known size within the 'parent population', this
type of sampling ensures that each subgroup is fairly represented within the overall sample. For
example, imagine a village with a population of 10,000 people that is divided into ten groups
based on economic characteristics. A random sample is then drawn from each subgroup so that
every income group is represented.

The fundamental benefit of stratified sampling is that it is simple to administer and that each
stratum is represented in the sample (which may not be the case with random and purposive
selection), allowing for separate estimates of stratum means if needed. Stratified random
sampling is commonly used in agricultural, industrial, and applied geographical research.
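The village example can be sketched in R with base functions only. The data below are simulated: the ten income groups are assumed to be of equal size (1,000 people each), and 20 people are drawn from each group. Group labels, sizes and the seed are assumptions for illustration.

```r
set.seed(2)   # arbitrary seed

# Hypothetical village: 10,000 people in ten equally sized income groups
village <- data.frame(
  person_id    = 1:10000,
  income_group = rep(paste0("G", 1:10), each = 1000),
  stringsAsFactors = FALSE
)

per_stratum <- 20   # assumed number of people drawn from each group

# Split the ids by group, then draw a simple random sample inside each group
ids_by_group <- split(village$person_id, village$income_group)
sampled_ids  <- unlist(lapply(ids_by_group, function(ids) sample(ids, per_stratum)))

strat_sample <- village[village$person_id %in% sampled_ids, ]
group_counts <- table(strat_sample$income_group)
print(group_counts)
```

Every group contributes exactly 20 people, so no income group can be missed by chance.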

Systematic Sampling

Instead of selecting each unit individually at random, this method uses a regular pattern of
selection. The technique is also referred to as quasi-random sampling. For example, if a crop
combination study is to be conducted in an areal unit of 2,000 villages and 20 sample villages are
to be chosen, the villages should be numbered from 1 to 2,000.

After arranging the villages serially, every hundredth village on the list is selected
(2,000 / 20 = 100), so the required sample villages are identified quickly. When used correctly,
systematic sampling can be more convenient and efficient than true random sampling. Although
the method is quick and effective, it has a drawback: once the pattern is fixed, not every village
in the area has an equal probability of being included in the sample.
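The village example can be reproduced in R as follows. The sketch assumes a random starting point within the first interval, which is one common way of reducing the drawback noted above; the seed is arbitrary.

```r
set.seed(3)   # arbitrary seed

N <- 2000         # villages numbered 1 to 2,000
n <- 20           # sample villages required
k <- N %/% n      # sampling interval: every 100th village

start    <- sample(1:k, 1)                          # random start between 1 and k
selected <- seq(from = start, by = k, length.out = n)

print(selected)
```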

Cluster sampling

Cluster sampling is a probability sampling approach in which the population is divided into
groups (clusters). Researchers then use a simple random or systematic random sampling
technique to pick whole clusters for data collection and analysis.
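A minimal R sketch of cluster sampling follows. It assumes a hypothetical population of 600 households grouped into 30 villages (the clusters); five villages are chosen at random and every household in them is surveyed.

```r
set.seed(4)   # arbitrary seed

# Hypothetical population: 600 households in 30 villages of 20 households each
households <- data.frame(
  household_id = 1:600,
  village      = rep(1:30, each = 20)
)

# Select whole clusters at random, then keep every household in them
chosen_villages <- sample(unique(households$village), 5)
cluster_sample  <- households[households$village %in% chosen_villages, ]

print(sort(chosen_villages))
print(nrow(cluster_sample))
```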

Area Sampling

Area sampling is used when a complete sampling frame is not available. The entire area
under inquiry is divided into small sub-areas, which are then sampled at random or according to
a set of rules (stratification of sampling).

Multistage sampling

Multistage sampling is a sampling strategy that splits the population into groups (or clusters).
Large clusters are divided into smaller sub-groups at successive stages of the sampling process
to make primary data collection easier.

Purposive sampling

Purposive sampling is a type of non-probability sampling in which researchers choose people
from the population according to the purpose of the study.

Judgment sampling

Judgment sampling, also known as judgmental sampling or authoritative sampling, is a
non-probability sampling approach in which the researcher selects sample units based on prior
knowledge or professional judgment.

Convenience sampling

Convenience sampling (also known as grab sampling, accidental sampling, or opportunity
sampling) is a non-probability sampling technique in which a sample is taken from a segment of
the population that is close to hand. This form of sampling is well suited to pilot testing.

Quota sampling

Quota sampling is a non-probability sampling technique in which researchers build a sample
of people who represent a population, choosing them on the basis of specific characteristics or
features. Such samples can be used to make estimates about the population as a whole.

Panel sampling

Panel sampling means selecting a set of people at random to form a panel that takes part in a
study several times over a period of time. In a longitudinal survey, for example, the same group
of people may be polled periodically.

Snowball sampling

Snowball sampling (also known as chain sampling, chain-referral sampling, or referral sampling)
is a non-probability sampling approach used in sociology and statistics research, in which
current study subjects recruit further subjects from among their acquaintances. As a result,
the sample group is described as growing like a snowball.

Self-Assessment Questions

5. Probability based sampling methods are also called ________sampling methods.

A) Population
B) Random
C) Parameter
D) Statistic

6. Sampling techniques are classified into___________ methods.

A) 3
B) 4
C) 2
D) 5

7. Relatively homogeneous groups and subgroups are the base for ____ Sampling.

A) Simple random
B) Quota
C) Space
D) Stratified

8. Judgement Sampling is _______ technique.

A) Random
B) Non-random
C) Space
D) Event

2.1.3 Sampling distributions
A sampling distribution is a probability distribution of a statistic derived from a large number
of samples gathered from a particular population. The sampling distribution of a population is
the frequency distribution of a range of alternative outcomes that could occur for a population
statistic.

Academicians, statisticians, researchers, marketers, analysts, and others usually work with
samples rather than whole populations. A sample is a subset of a population. For example,
a medical researcher who wanted to compare the average weight of all babies born in North
America from 1995 to 2005 with that of babies born in South America during the same period could
not collect data on the entire population of over a million births within a reasonable amount of
time. Instead, the researcher would base the comparison on the weights of, say, 100 babies
from each continent. The sample then consists of 200 newborns, and the average weight
computed from it is the sample mean.
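
The idea of a sampling distribution can be sketched with a small simulation. The population below is invented for illustration (100,000 birth weights in kg generated from a normal model); it is an assumption, not data from the example above. Repeatedly drawing samples of 100 and recording each sample mean gives an empirical sampling distribution of the mean.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100,000 birth weights in kg (invented values)
population = [random.gauss(3.3, 0.5) for _ in range(100_000)]


def sample_mean(pop, n):
    """Mean of one simple random sample of size n drawn without replacement."""
    return statistics.fmean(random.sample(pop, n))


# Repeating the sampling many times approximates the sampling distribution
sample_means = [sample_mean(population, 100) for _ in range(2_000)]

population_mean = statistics.fmean(population)
centre = statistics.fmean(sample_means)   # centre of the sampling distribution
spread = statistics.stdev(sample_means)   # spread of the sample means

print(round(population_mean, 2), round(centre, 2), round(spread, 3))
```

The sample means cluster tightly around the population mean, and their spread is much smaller than the spread of individual weights.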

2.1.3.1 Sampling distribution of mean and proportion

What most laypeople call an ‘average’ and statisticians call the arithmetic mean is the most
popular and widely used measure for describing a whole data set by a single value. It is calculated
by adding all the items together and dividing the sum by the number of items. There are two types
of arithmetic mean: the simple arithmetic mean and the weighted arithmetic mean.
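
A minimal sketch of the two types of arithmetic mean follows. The marks and weights are hypothetical values chosen for illustration, and `weighted_mean` is a helper written for this sketch.

```python
import statistics


def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w * x) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


marks = [60, 70, 80]   # hypothetical marks in three subjects
credits = [2, 3, 5]    # hypothetical weights (credits per subject)

simple = statistics.fmean(marks)          # (60 + 70 + 80) / 3
weighted = weighted_mean(marks, credits)  # (120 + 210 + 400) / 10

print(simple, weighted)
```

The weighted mean is pulled towards the items that carry larger weights, whereas the simple mean treats every item equally.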

It should be emphasised that statisticians dislike the term “average” because its meaning is
too ambiguous. It is used loosely in phrases such as average person, average pay, and average
height, and it can refer to any measure of central tendency, including the mean, median, mode,
geometric mean, and harmonic mean. In practice, the arithmetic mean is so widely used that
the term ‘mean’ or ‘average’ without a qualifier usually refers to it. That is, unless otherwise
stated, it is reasonable to presume that when someone says “the mean” or “the average” of a
set of observations, they are referring to the arithmetic mean.

Self-Assessment Questions

9. A sampling distribution is a probability distribution of _________.

A) Population
B) Random
C) Parameter
D) Statistic

10. Medical research on sampling distribution is conducted at_________.

A) North America
B) India
C) UK
D) Germany

11. Mean is also known as _________.

A) Population
B) Random
C) Average
D) Statistic

12. Arithmetic mean is of _____ types.

A) 3
B) 2
C) 4
D) 5

2.1.4 Application of Central Limit Theorem
The central limit theorem states that if you draw sufficiently large random samples with
replacement from a population with mean µ and standard deviation σ, the distribution of the
sample means will be approximately normal, with mean µ and standard deviation σ/√n. This holds
whether the source population is normal or skewed, as long as the sample size is large enough
(usually n ≥ 30). The theorem holds even for samples smaller than 30 if the population is normal.
It also applies when the population is binomial, as long as min(np, n(1 − p)) > 5, where n is the
sample size and p is the population’s probability of success. This means that when drawing
conclusions about a population mean from a sample mean, we can use the normal probability
model to measure uncertainty.
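
The theorem can be checked informally by simulation. The sketch below assumes a right-skewed exponential population with mean 1 and standard deviation 1 (a choice made for illustration only) and samples of size 30. If the theorem applies, the sample means should centre on 1, have a spread near 1/√30, and fall within 1.96 standard errors of the mean roughly 95% of the time.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible


def draw_exponential():
    """One draw from a right-skewed population with mean 1 and s.d. 1."""
    return random.expovariate(1.0)


def sample_means(draw, n, repeats):
    """Return `repeats` sample means, each computed from n independent draws."""
    return [statistics.fmean(draw() for _ in range(n)) for _ in range(repeats)]


means = sample_means(draw_exponential, n=30, repeats=5_000)

centre = statistics.fmean(means)       # expected to be close to mu = 1
spread = statistics.stdev(means)       # expected to be close to sigma / sqrt(n)
expected_spread = 1 / math.sqrt(30)

# Share of sample means within 1.96 standard errors of the population mean
within = sum(abs(m - 1) <= 1.96 * expected_spread for m in means) / len(means)

print(round(centre, 3), round(spread, 3), round(expected_spread, 3), round(within, 3))
```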

2.1.4.1 Determining the sample size

Sound research requires a grasp of the statistics that underlie the various sample size
decisions that must be made. A simple equation lets you sample with confidence, knowing
that with the proper sample size your survey results will be statistically reliable.

Variables in Sample Size Based on the Population Targeted

Before you can calculate a sample size, you must first settle a few details about the target
population and the precision you require:

1. Size of the Population – How many people in total fall into your target demographic? If you
want to know about mothers in the United States, for example, your population size would be
the total number of mothers in the United States. Not every population needs to be this big.
Even if your population is small, you should know who falls within it. If you are unsure of
the exact figure, don’t worry: the population size is frequently unknown or approximated
with an educated guess.

2. Margin of Error (Confidence Interval) – Because no sample is perfect, you must choose
how much error to allow. The margin of error specifies how far above or below the population
mean you are willing to let your sample mean fall. You have probably seen a margin of error
in a political survey on the news. It looks something like this: “68 percent of voters
responded yes to Proposition Z, with a margin of error of +/- 5%.”

3. Level of Confidence – How confident do you want to be that the actual mean falls inside
your confidence interval? The most common confidence levels are 90 percent, 95 percent,
and 99 percent.

4. Standard Deviation – How much variation do you expect in your responses? Because the
survey has not yet been conducted, a value of 0.5 is the safe choice: it is the most
conservative figure and ensures that your sample will be large enough.
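
The text refers to a simple equation without stating it. One commonly used form, assumed here, is n = z² · p(1 − p) / e², where z is the critical value for the chosen confidence level, e is the margin of error, and p plays the role of the 0.5 "standard deviation" setting described above. The function name and parameters are illustrative, and the sketch assumes a large population (no finite-population correction).

```python
import math
from statistics import NormalDist


def required_sample_size(confidence, margin_of_error, std_dev=0.5):
    """Sample size from n = z^2 * p * (1 - p) / e^2, rounded up.

    Assumes a large population; `std_dev` is the 0.5 setting from the text.
    """
    # Two-sided critical value of the standard normal distribution
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z ** 2) * std_dev * (1 - std_dev) / margin_of_error ** 2
    return math.ceil(n)


for level in (0.90, 0.95, 0.99):
    print(level, required_sample_size(level, margin_of_error=0.05))
```

With a 5% margin of error, the required size under this assumed formula grows as the confidence level rises, which is why the confidence level has to be fixed before sampling.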

Self-Assessment Questions

13. Population mean is denoted with ________ symbol.

A) .
B)
C)
D)

14. In large sample, _________.

A) 30
B) 25
C) 42
D) 51

15. There are _______number of methods to determine sample size.

A) 2
B) 3
C) 4
D) 1

16. Expected margin of error is_________.

A) 5%
B) 10%
C) 1%
D) 20%

17. Sample size is determined based on _________ size.

A) Error
B) Statistic
C) Parameter
D) Population

Summary

• The unit introduces sampling and explains its basic concepts, such as population,
sample, parameter and statistic.
• Sampling techniques such as simple random, stratified, systematic and cluster
sampling provide representative samples from the population.
• Along with random sampling techniques, non-random sampling methods are also
used to obtain samples from the population.

Terminal Questions

1. Differentiate between random and non-random sampling methods.
2. Explain:
a) Population
b) Sample
c) Parameter
d) Statistic
3. Discuss the applications of the central limit theorem.
4. Explain the methods of determining sample size.
5. Differentiate between sampling with and without replacement.

Answer Keys

Self-Assessment Questions

Question No Answers

1 A

2 B

3 C

4 D

5 B

6 C

7 D

8 B

9 D

10 A

11 C

12 B

13 D

14 A

15 C

16 A

17 D

Glossary
• Sampling distribution: A probability distribution of a statistic obtained from a
large number of samples drawn from a specific population.
• Snowball sampling: A recruitment technique in which research participants
are asked to assist researchers in identifying other potential subjects.
• Quota sampling: A non-probability sampling method that relies on the non-random
selection of a predetermined number or proportion of units.
• Cluster sampling: A probability sampling technique where researchers divide
the population into multiple groups (clusters) for research.

BIBLIOGRAPHY
1. Chung, K. L. (2000). A Course in Probability Theory (3rd ed.). San Diego, CA:
Academic Press.
2. Feller, W. (1968). An introduction to Probability Theory and its applications:
Volume I. John Wiley & Sons.
3. Gupta, S. C., & Kapoor, V. K. Fundamentals of Mathematical Statistics. S. Chand
Publications.

External Resources
1. Rukmangadachari, E. Probability and Statistics. Pearson Education.
2. Rohatgi, V. K., & Ehsanes Saleh, A. K. (2015). An introduction to probability and
statistics (3rd ed.). doi:10.1002/9781118799635

e-References
• Sample Distribution: Definition, How It’s Used, With an Example:
https://www.investopedia.com/terms/s/sampling-distribution.asp

• Types of Sampling:
https://www.qualitygurus.com/types-of-sampling/

Video Links

Topic: Sampling Distribution-I
Link: https://www.youtube.com/watch?v=l-9rfMOZXk0Y

Topic: Introduction to sampling distributions | Sampling distributions
Link: https://www.youtube.com/watch?v=z0Ry_3_qhDw

Topic: Sampling distribution example problem
Link: https://www.youtube.com/watch?v=0ZstEh_8bYc

Keywords

• Population
• Sample
• Parameter
• Stratified sampling

QUANTITATIVE METHODS

Module - 2
Unit - 2

ESTIMATION AND TESTING OF HYPOTHESIS

Unit Table of Contents
Unit 2.2 Estimation and Testing of Hypothesis

Aim
Instructional Objectives
Learning Outcomes
Introduction
2.2.1 Concept of Estimation
      Self-Assessment Questions
2.2.2 Introduction to Hypothesis
      Self-Assessment Questions
2.2.3 One sample and two sample tests for means and proportions of large samples (z-test)
      Self-Assessment Questions
2.2.4 One sample and two sample tests for means of small samples (t-test)
      Self-Assessment Questions
2.2.5 F-test
      Self-Assessment Questions
Summary
Terminal Questions
Answer Keys
Glossary
Bibliography
External Resources
e-References
Image Credits
Video Links
Keywords

Aim
This unit aims to explain the basic concepts of Estimation, Testing of Hypothesis,
and its applications in Management.

Instructional Objectives
This unit intends to:
● Explain the concepts of estimation and methods of estimation
● Discuss the concepts of testing of hypothesis and its process
● Apply estimation, testing of hypothesis in various applications of management

Learning Outcomes
At the end of this unit, you are expected to:
● Demonstrate the concepts of estimation and its methods
● Examine the concepts of hypothesis testing
● Utilise the concepts of estimation, testing of hypothesis in Marketing, HRM,
Finance, etc.

Introduction
The theory of estimation and the testing of hypotheses are statistical tools for analysing
population parameters using sample statistics. They are two different procedures for studying
a population, and together they are called statistical inference.

Estimation theory is a branch of statistics that deals with estimating the values of parameters
based on measured empirical data that has a random component. The parameters describe an
underlying physical setting in such a way that their value affects the distribution of the measured
data.

A hypothesis is an assumption about a population parameter. It is a statement about the
population that may or may not be true. Hypothesis testing aims to reach a statistical conclusion
about whether or not to accept the hypothesis. A statistical hypothesis is thus an assertion or
conjecture concerning one or more populations. To prove with absolute certainty that a
hypothesis is true or false, we would need absolute knowledge; that is, we would have to examine
the entire population. Instead, hypothesis testing uses a random sample to judge whether the
evidence supports the hypothesis or not.

2.2.1 Concept of Estimation


Estimation is the process of making inferences from a sample about an unknown population
parameter. An estimator is a statistic that is used to infer the value of an unknown parameter.

[Figure 1 labels: Objective Function; Z (Information set); Estimator; θ; θ0 (Convergence
region); Known Model / Constraints]

Fig. 1: Representation of process of estimation

2.2.1.1 Types of Estimation

Estimation can be broadly classified into


● Point estimation
● Interval estimation

Point estimation

Point estimation is the process of estimating a population parameter by a single value of a
sample statistic. The sample statistic used for this purpose is called the point estimator of the
parameter. For example, the sample mean (x̄) is the point estimator of the population mean (µ),
and the sample variance (s²) is the point estimator of the population variance (σ²).

Interval estimation

In general, a point estimate does not coincide exactly with the true value of the population
parameter. Hence, it is often preferable to estimate a population parameter by an interval.
The process of estimating a population parameter by an interval is called interval
estimation.

The formula for the confidence interval of a population parameter is “point estimate
± (critical value) × (standard error)”.

E.g.: Confidence interval for the population mean: $\mu = \bar{x} \pm z_{\alpha}\,\sigma/\sqrt{n}$

• 90% confidence interval: $\mu = \bar{x} \pm 1.645\,\sigma/\sqrt{n}$
• 95% confidence interval: $\mu = \bar{x} \pm 1.96\,\sigma/\sqrt{n}$
• 99% confidence interval: $\mu = \bar{x} \pm 2.58\,\sigma/\sqrt{n}$
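
The interval formula can be sketched in a few lines. The figures used (a sample of 64 with mean 70 and standard deviation 25) are illustrative inputs, and `mean_confidence_interval` is a helper name introduced for this sketch; the critical value is taken from the standard normal distribution rather than a rounded table value.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Return (lower, upper) for xbar +/- z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    margin = z * sigma / math.sqrt(n)                   # critical value x standard error
    return xbar - margin, xbar + margin


low, high = mean_confidence_interval(xbar=70, sigma=25, n=64, confidence=0.95)
print(round(low, 3), round(high, 3))
```

A higher confidence level uses a larger critical value and therefore gives a wider interval for the same data.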

Self-Assessment Questions

1. Estimation of population parameter with its sample statistic is known as ________.

A). Hypothesis
B). Estimation
C). Parameter
D). Statistic

2. Number of types of Estimation are ________.

A). 5
B). 4
C). 3
D). 2

3. Estimation of Parameter at a single numerical value is called ______ Estimation.

A). Population
B). Sample
C). Point
D). Interval

4. Estimation of Parameter at an interval is called ______ Estimation.

A). Population
B). Sample
C). Point
D). Interval

2.2.2 Introduction to Hypothesis

Hypothesis testing is used to draw conclusions about a population parameter from a
sample statistic. In hypothesis testing, an assumption about a population parameter is stated
and then tested.

[Figure 2 labels: Hypothesis Testing; Null Hypothesis (H0); Alternative Hypothesis (H1);
H0 Accepted; H0 Rejected (Rejection Region)]

Fig. 2: Hypothesis Testing

2.2.2.1 Notations and Terminology

Hypothesis:

Any statement or assumption about a population parameter is called a hypothesis.

E.g.: The population mean, µ = 15
The population variance, σ² ≠ 20
The population proportion, P > 0.5

Types of Hypothesis:

Based on their characteristics, hypotheses can be classified into two types.

Null Hypothesis:

The hypothesis of no difference, or of equality, is called the null hypothesis and is denoted
by H0.

E.g.:
H0: µ = 15
H0: σ² = 20
H0: P = 0.5

Alternative Hypothesis:

The hypothesis that contradicts the null hypothesis is called the alternative hypothesis and is
denoted by H1.

E.g.:
H1: µ > 15
H1: σ² < 20
H1: P ≠ 0.5

Statements of Hypothesis:

The possible decisions in testing of hypothesis can be classified as:

1. Accepting the null hypothesis H0 when H0 is true
2. Rejecting the null hypothesis H0 when H0 is false
3. Rejecting the null hypothesis H0 when H0 is true
4. Accepting the null hypothesis H0 when H0 is false

Of these, 1 and 2 are correct decisions, whereas 3 and 4 are errors.

Type I Error:

The statement of “Rejecting The Null Hypothesis H 0 , when H 0 is True” is called Type I Error.

Type II Error:

The statement of “Accepting The Null Hypothesis H 0 , when H 0 is False” is called Type II Error.

Level of Significance:

The probability of committing a Type I error is called the level of significance and is denoted by α.
i.e., P(Type I Error) = P(Rejecting the null hypothesis H0 when H0 is true) = α.

Power of the Test:

The probability of committing a Type II error is denoted by β, and (1 − β) is called the power of
the test.
i.e., 1 − P(Type II Error) = 1 − P(Accepting the null hypothesis H0 when H0 is false) = 1 − β.
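
The meaning of α and of the power (1 − β) can be illustrated by simulation. The sketch below assumes a two-tailed z-test of H0: µ = 0 with known σ = 1 and samples of size 36; the settings and the helper names are hypothetical, chosen for illustration. When H0 is true, the share of rejections estimates P(Type I error); when the true mean is 0.5, the share of rejections estimates the power.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible


def rejects_h0(sample, mu0, sigma, z_crit=1.96):
    """Two-tailed z-test at the 5% level: True when H0 is rejected."""
    z = (statistics.fmean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    return abs(z) > z_crit


def rejection_rate(true_mean, mu0=0.0, sigma=1.0, n=36, trials=4_000):
    """Share of simulated samples for which H0: mu = mu0 is rejected."""
    hits = 0
    for _ in range(trials):
        sample = [random.gauss(true_mean, sigma) for _ in range(n)]
        hits += rejects_h0(sample, mu0, sigma)
    return hits / trials


alpha_hat = rejection_rate(true_mean=0.0)  # H0 true: estimates P(Type I error)
power_hat = rejection_rate(true_mean=0.5)  # H0 false: estimates 1 - beta

print(round(alpha_hat, 3), round(power_hat, 3))
```

The first figure should sit near the chosen 5% level, while the second shows how often the test detects a real departure from H0 under these assumed settings.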

Critical Region:

The region of rejection of Null Hypothesis H 0 is called Critical Region. i.e., the region formed by
the sample points where Null Hypothesis H 0 is rejected is called Critical Region and is denoted
by W .

Acceptance Region:

The region of acceptance of the null hypothesis H0 is called the acceptance region, i.e., the
region formed by the sample points where H0 is accepted. It is the complement of the critical
region W and is denoted by W̄.

Two-tailed Test:

If the alternative hypothesis is H1: θ ≠ θ0, the test is called a two-tailed test. For a two-tailed
test, the critical region defined by Zα lies in both tails of the normal probability curve.

Left-tailed Test:

If the alternative hypothesis is H1: θ < θ0, the test is called a left-tailed test. For a left-tailed
test, the critical region defined by Zα lies in the left tail of the normal probability curve.

Right-tailed Test:

If the alternative hypothesis is H1: θ > θ0, the test is called a right-tailed test. For a right-tailed
test, the critical region defined by Zα lies in the right tail of the normal probability curve.

Types of Testing of Hypothesis:

Testing of hypothesis can be classified into large sample testing of hypothesis (n > 30) and
small sample testing of hypothesis (n ≤ 30). The statistical tests used are summarised in the
following flow chart.

[Figure 3 labels: Hypothesis (Null Hypothesis, Alternate Hypothesis); Hypothesis Testing;
Statistical Tests; Parametric Tests (Z-test, t-test, F-test, ANOVA); Non-Parametric Tests
(Chi-square test, U-test, H-test)]

Fig. 3: Different Tests on hypothesis

Procedure for Testing of Hypothesis

Step I: Construct the null hypothesis H0

H0: There is no significant difference between the population parameter θ and the sample
statistic t, i.e., H0: θ = θ0

Step II: Construct the alternative hypothesis H1

H1: There is a significant difference between the population parameter θ and the sample
statistic t, i.e., H1: θ ≠ θ0

Step III: Choose an appropriate level of significance, α = 5% or 1% or 10%

Step IV: Under H0, compute the test statistic

$z \text{ or } t \text{ or } F \text{ or } \chi^2 = \dfrac{t - E(t)}{S.E.(t)}$

Step V: If |t| ≤ tα, accept the null hypothesis H0.

If |t| > tα, reject the null hypothesis H0, i.e., accept the alternative hypothesis H1.

Here |t| is the absolute value of the calculated test statistic and tα is its critical (table)
value at the chosen level of significance.

The critical values of Zα

                       Level of significance (α)
                       1%        5%        10%
Two-tailed test        2.58      1.96      1.645
Right-tailed test      2.33      1.645     1.28
Left-tailed test       -2.33     -1.645    -1.28
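
The tabulated critical values can be reproduced from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; `z_critical` is a helper name introduced for illustration.

```python
from statistics import NormalDist


def z_critical(alpha, tail="two"):
    """Critical value of the standard normal for a given significance level."""
    std_normal = NormalDist()
    if tail == "two":
        return std_normal.inv_cdf(1 - alpha / 2)  # alpha split across both tails
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)      # all of alpha in the right tail
    if tail == "left":
        return std_normal.inv_cdf(alpha)          # all of alpha in the left tail
    raise ValueError("tail must be 'two', 'right' or 'left'")


for alpha in (0.01, 0.05, 0.10):
    print(alpha,
          round(z_critical(alpha, "two"), 3),
          round(z_critical(alpha, "right"), 3),
          round(z_critical(alpha, "left"), 3))
```

A two-tailed test splits α between the two tails, which is why its critical value at 10% equals the one-tailed value at 5%.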

Self-Assessment Questions
5. An assumption about a population parameter is known as a _____.

A). Hypothesis
B). Estimation
C). Parameter
D). Statistic

6. Number of types of Hypotheses are________.

A). 5
B). 4
C). 3
D). 2

7. Null Hypothesis is denoted with:

A). H1

B). H1

C). H 2

D). $z = \dfrac{\bar{x} - \bar{y}}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$

8. Number of steps in Hypothesis testing procedure are _____.

A). 4
B). 5
C). 3
D). 2

9. Rejecting H0 when H0 is true is called _______ Error.

A). Type III


B). Type 0
C). Type I
D). Type I

2.2.3 One sample and two sample tests for means and proportions of large
samples (z-test)

The Z-test is used to test hypotheses about population parameters using sample statistics
in the large sample case, i.e., n > 30. The main tests of this kind are described below.

2.2.3.1 Large Sample test for Population Mean or Z-test for Population Mean

Step I: Construct the null hypothesis H0

H0: There is no significant difference between the population mean µ and the sample mean x̄

H0: µ = µ0

Step II: Construct the alternative hypothesis H1

H1: There is a significant difference between the population mean µ and the sample mean x̄

H1: µ ≠ µ0

Step III: Choose an appropriate level of significance, α = 5% or 1% or 10%

Step IV: Under H0, the test statistic is

$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1)$

Step V: If |z| ≤ zα, accept the null hypothesis H0.

If |z| > zα, reject the null hypothesis H0, i.e., accept the alternative hypothesis H1.

Example

A sample of 64 students has a mean weight of 70 kg with a standard deviation of 25 kg.
Test at the 5% level of significance whether the population mean weight is 56 kg.

Solution

Given: sample size n = 64 (> 30), sample mean x̄ = 70, standard deviation σ = 25.

The test procedure for the population mean consists of the following steps.

Null hypothesis H0: The population mean weight is 56 kg
i.e., H0: µ = 56

Alternative hypothesis H1: The population mean weight is not 56 kg
i.e., H1: µ ≠ 56

Choose an appropriate level of significance, α = 5%

Under H0, the test statistic is

$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1)$

$z = \dfrac{70 - 56}{25 / \sqrt{64}} = \dfrac{14}{3.125} = 4.48$

Now, |z| = |4.48| = 4.48

zα at the 5% level of significance is 1.96

Since 4.48 > 1.96, reject the null hypothesis H0, i.e., accept the alternative hypothesis H1.

Hence, the population average weight is not equal to 56 kg.
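
The arithmetic of this example can be checked with a short sketch. `one_sample_z` is a helper name introduced for illustration; the inputs are the figures given in the example.

```python
import math


def one_sample_z(xbar, mu0, sigma, n):
    """z = (xbar - mu0) / (sigma / sqrt(n)) for a large-sample test of a mean."""
    return (xbar - mu0) / (sigma / math.sqrt(n))


z = one_sample_z(xbar=70, mu0=56, sigma=25, n=64)
z_alpha = 1.96                   # two-tailed critical value at the 5% level
reject_h0 = abs(z) > z_alpha     # decision rule of Step V

print(round(z, 2), reject_h0)
```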

2.2.3.2 Large Sample test for Population Proportion or Z-test for Population
Proportion

Step I: Construct the null hypothesis H0

H0: There is no significant difference between the population proportion (P) and the sample
proportion (p)

H0: P = P0

Step II: Construct the alternative hypothesis H1

H1: There is a significant difference between the population proportion (P) and the sample
proportion (p)

H1: P ≠ P0

Step III: Choose an appropriate level of significance, α = 5% or 1% or 10%

Step IV: Under H0, the test statistic is

$z = \dfrac{p - P}{\sqrt{PQ / n}} \sim N(0, 1)$, where Q = 1 − P

Step V: If |z| ≤ zα, accept the null hypothesis H0.

If |z| > zα, reject the null hypothesis H0, i.e., accept the alternative hypothesis H1.

Example

In a sample of 1000 people in Maharashtra, 540 are found to be wheat eaters and the rest are
rice eaters. Test at the 5% level of significance whether rice eaters and wheat eaters are in
equal proportion in the state of Maharashtra.

Solution

Given: sample size n = 1000 (> 30), sample proportion p = 540/1000 = 0.54.

The test procedure for the population proportion consists of the following steps.

Null hypothesis H0: Wheat eaters and rice eaters are in equal proportion in the state of
Maharashtra.

i.e., H0: P = 0.5

Alternative hypothesis H1: Wheat eaters and rice eaters are not in equal proportion in the
state of Maharashtra.

i.e., H1: P ≠ 0.5

Choose an appropriate level of significance, α = 5%

Under H0, the test statistic is

$z = \dfrac{p - P}{\sqrt{PQ / n}}$

$z = \dfrac{0.54 - 0.5}{\sqrt{(0.5)(0.5) / 1000}} = \dfrac{0.04}{0.0158} = 2.53$

Now, |z| = |2.53| = 2.53

zα at the 5% level of significance is 1.96

Since 2.53 > 1.96, reject the null hypothesis H0, i.e., accept the alternative hypothesis H1.

Hence, wheat eaters and rice eaters are not in equal proportion in the state of Maharashtra.
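
The same calculation can be sketched in code, this time also reporting a two-sided p-value, which is an addition not used in the worked solution. `one_proportion_z` is a helper name introduced for illustration.

```python
import math
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """z = (p - P) / sqrt(P * Q / n) with P = p0 and Q = 1 - p0."""
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


z = one_proportion_z(successes=540, n=1000, p0=0.5)

# Two-sided p-value: probability of a |z| at least this large when H0 is true
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(round(z, 2), round(p_value, 4))
```

A p-value below the chosen α = 0.05 leads to the same decision as comparing |z| with 1.96.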

2.2.3.3 Large Sample test for Equality of Two Population Means or Z-test for
Equality of Two Population Means

Step I: Construct the null hypothesis H0

H0: There is no significant difference between the two population means

H0: µ1 = µ2

Step II: Construct the alternative hypothesis H1

H1: There is a significant difference between the two population means

H1: µ1 ≠ µ2

Step III: Choose an appropriate level of significance, α = 5% or 1% or 10%

Step IV: Under H0, the test statistic is

$z = \dfrac{\bar{x} - \bar{y}}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0, 1)$

Step V: If |z| ≤ zα, accept the null hypothesis H0.

If |z| > zα, reject the null hypothesis H0, i.e., accept the alternative hypothesis H1.

Example

Two horses A and B were timed over several runs on a particular track, with the following
results:

Horse A: 28 30 32 33 33 29 34
Horse B: 29 30 30 24 27 29

Test whether the average running capacities of the two horses are the same.

Solution

Given: sample sizes n1 = 7, n2 = 6.

Because both samples are small, the sample variances s1² and s2² are used in place of σ1²
and σ2², the statistic is denoted by t, and it is compared with the t-table value at
n1 + n2 − 2 = 11 degrees of freedom.

From the data: x̄ = 31.286, ȳ = 28.167, s1² = 5.238, s2² = 5.367.

Null hypothesis H0: The average running capacities of both horses are the same.

i.e., H0: µ1 = µ2

Alternative hypothesis H1: The average running capacities of both horses are not the same.

i.e., H1: µ1 ≠ µ2

Choose an appropriate level of significance, α = 5%

Under H0, the test statistic is

$t = \dfrac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$

$t = \dfrac{31.286 - 28.167}{\sqrt{\dfrac{5.238}{7} + \dfrac{5.367}{6}}} = \dfrac{3.119}{1.282} \approx 2.43$

Now, |t| = |2.43| = 2.43

tα at the 5% level of significance with 11 degrees of freedom is 2.20

Since 2.43 > 2.20, reject the null hypothesis H0, i.e., accept the alternative hypothesis H1.

Hence, the average running capacities of both horses are not the same.
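
The statistic for the horse data can be recomputed from the raw timings. The sketch below assumes the form of the two-sample statistic given in Step IV with the sample variances s1² and s2² (divisor n − 1) standing in for σ1² and σ2²; a hand calculation with rounded intermediate values may differ slightly in the second decimal place.

```python
import math
import statistics

horse_a = [28, 30, 32, 33, 33, 29, 34]
horse_b = [29, 30, 30, 24, 27, 29]


def two_sample_statistic(x, y):
    """(mean(x) - mean(y)) / sqrt(s1^2 / n1 + s2^2 / n2) with sample variances."""
    standard_error = math.sqrt(
        statistics.variance(x) / len(x) + statistics.variance(y) / len(y)
    )
    return (statistics.fmean(x) - statistics.fmean(y)) / standard_error


t = two_sample_statistic(horse_a, horse_b)
degrees_of_freedom = len(horse_a) + len(horse_b) - 2  # 11, as used in the example
t_alpha = 2.20                                        # table value quoted in the example

print(round(t, 2), degrees_of_freedom, abs(t) > t_alpha)
```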

2.2.3.4 Large Sample test for Equality of Two Population Proportions or Z-test for
Equality of Two Population Proportions

Step I: Construct the null hypothesis H0

H0: There is no significant difference between the two population proportions

H0: P1 = P2

Step II: Construct the alternative hypothesis H1

H1: There is a significant difference between the two population proportions

H1: P1 ≠ P2

Step III: Choose an appropriate level of significance, α = 5% or 1% or 10%

Step IV: Under H0, the test statistic is

$z = \dfrac{p_1 - p_2}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \sim N(0, 1)$

where $P = \dfrac{x_1 + x_2}{n_1 + n_2}$ and $Q = 1 - P$

Step V: If |z| ≤ zα, accept the null hypothesis H0.

If |z| > zα, reject the null hypothesis H0, i.e., accept the alternative hypothesis H1.

Example

In a sample of 600 students at a certain college, 400 are boys and 200 are girls. In a sample
of 900 students at another college, 450 are boys and 450 are girls. Test at the 5% level of
significance whether the proportion of boys is the same in both colleges.

Solution

Given that n1 = 600, x1 = 400, n2 = 900, x2 = 450

Then the sample proportions are

p1 = x1 / n1 = 400 / 600 = 0.667

p2 = x2 / n2 = 450 / 900 = 0.5

H0: The proportions of boys are the same in both the colleges

H0: P1 = P2

H1: The proportions of boys are not the same in both the colleges

H1: P1 ≠ P2

Choose an appropriate level of significance, α = 5%

Under H0, the test statistic is

$z = \dfrac{p_1 - p_2}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \sim N(0, 1)$

where $P = \dfrac{x_1 + x_2}{n_1 + n_2} = \dfrac{850}{1500} = 0.567$ and $Q = 1 - P = 0.433$

$z = \dfrac{0.667 - 0.5}{\sqrt{(0.567)(0.433)\left(\dfrac{1}{600} + \dfrac{1}{900}\right)}} = \dfrac{0.1667}{0.02612} \approx 6.38$

zα = 1.96

Since 6.38 > 1.96, that is |z| > zα, reject the null hypothesis H0.

Hence, the proportions of boys are not the same in both the colleges.
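
The two-proportion statistic can be verified with a short sketch. `two_proportion_z` is a helper name introduced for illustration. Carrying the proportions at full precision gives a value of about 6.38; rounding p1 down to 0.66 before dividing gives a somewhat smaller figure, with the same conclusion.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """z = (p1 - p2) / sqrt(P * Q * (1/n1 + 1/n2)) with pooled proportion P."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # P, the combined proportion under H0
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


z = two_proportion_z(x1=400, n1=600, x2=450, n2=900)
reject_h0 = abs(z) > 1.96  # two-tailed test at the 5% level

print(round(z, 2), reject_h0)
```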

Self-Assessment Questions

10. ____ test is used to test the large sample test for means.

A). Z
B). t
C). F
D). Q

11. Proportion means________.

A). Mean
B). Ratio
C). Variance
D). S.D

12. Population mean is denoted by_________.

A). σ
B). µ
C). Σ

D). χ

13. Population Proportion is denoted by____________.

A). σ 2
B). µ
C). P

D). χ 2

2.2.4 One sample and two sample tests for means of small samples
(t-test)
The t-test is used to test hypotheses about population parameters using sample statistics
in the small sample case, i.e., n ≤ 30. The main tests of this kind are described below.

t – Table values

Table 4.4 Student-t Distribution

(v = degrees of freedom; t95, for example, is the value that encloses the central 95% of the
distribution, i.e., the two-tailed critical value at the 5% level of significance)

v      t50      t90      t95      t99
1      1.000    6.314    12.706   63.657
2      0.816    2.920    4.303    9.925
3      0.765    2.353    3.182    5.841
4      0.741    2.132    2.776    4.604
5      0.727    2.015    2.571    4.032
6      0.718    1.943    2.447    3.707
7      0.711    1.895    2.365    3.499
8      0.706    1.860    2.306    3.355
9      0.703    1.833    2.262    3.250
10     0.700    1.812    2.228    3.169
11     0.697    1.796    2.201    3.106
12     0.695    1.782    2.179    3.055
13     0.694    1.771    2.160    3.012
14     0.692    1.761    2.145    2.977
15     0.691    1.753    2.131    2.947
16     0.690    1.746    2.120    2.921
17     0.689    1.740    2.110    2.898
18     0.688    1.734    2.101    2.878
19     0.688    1.729    2.093    2.861
20     0.687    1.725    2.086    2.845
21     0.686    1.721    2.080    2.831
30     0.683    1.697    2.042    2.750
40     0.681    1.684    2.021    2.704
50     0.679    1.676    2.009    2.678
60     0.679    1.671    2.000    2.660
∞      0.674    1.645    1.960    2.576

2.2.4.1 Small Sample test for Population Mean or t-test for Population Mean

Step I: Construct the null hypothesis H0

H0: There is no significant difference between the population mean µ and the sample mean x̄

H0: µ = µ0

Step II: Construct the alternative hypothesis H1

H1: There is a significant difference between the population mean µ and the sample mean x̄

H1: µ ≠ µ0

Step III: Choose an appropriate level of significance, α = 5% or 1% or 10%

Step IV: Under H0, the test statistic is

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} \sim t_{n-1}$

where $\bar{x} = \dfrac{1}{n} \sum x_i$ and $s^2 = \dfrac{1}{n-1} \sum (x_i - \bar{x})^2$

Step V: If |t| ≤ tα, accept the null hypothesis H0.

If |t| > tα, reject the null hypothesis H0, i.e., accept the alternative hypothesis H1.

Problem 1:
A random sample of 10 students had the I.Q.s 70, 120, 110, 101, 88, 83, 95, 98, 107 and 100.
Test at the 5% level of significance whether the average I.Q. of students is 100.

Solution
Given: sample size n = 10 (< 30)

Sample mean: $\bar{x} = \dfrac{1}{n} \sum x_i = \dfrac{1}{10}(70 + 120 + \dots + 100) = 97.2$

Sample variance: $s^2 = \dfrac{1}{n-1} \sum (x_i - \bar{x})^2 = \dfrac{1833.6}{9} = 203.73$

Standard deviation: $s = \sqrt{203.73} = 14.27$

Then the test procedure for the population mean consists of the following steps.

Null hypothesis H0: The population mean I.Q. is 100

i.e., H0: µ = 100

Alternative hypothesis H1: The population mean I.Q. is not 100

i.e., H1: µ ≠ 100

Choose an appropriate level of significance, α = 5%

Under H0, the test statistic is

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} \sim t_{n-1}$

$t = \dfrac{97.2 - 100}{14.27 / \sqrt{10}} = \dfrac{-2.8}{4.51} = -0.62$

Now, |t| = |−0.62| = 0.62

tα at the 5% level of significance with (10 − 1 = 9) degrees of freedom is 2.26

Since 0.62 < 2.26, accept the null hypothesis H0.

Hence, the population mean I.Q. may be taken as 100.
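
The figures in this problem can be recomputed directly from the ten I.Q. values. `one_sample_t` is a helper name introduced for illustration; `statistics.variance` uses the n − 1 divisor, matching the definition of s² in Step IV.

```python
import math
import statistics

iq_scores = [70, 120, 110, 101, 88, 83, 95, 98, 107, 100]


def one_sample_t(sample, mu0):
    """t = (xbar - mu0) / (s / sqrt(n)), with s computed using the n - 1 divisor."""
    n = len(sample)
    xbar = statistics.fmean(sample)
    s = statistics.stdev(sample)
    return (xbar - mu0) / (s / math.sqrt(n))


mean_iq = statistics.fmean(iq_scores)
variance_iq = statistics.variance(iq_scores)
t = one_sample_t(iq_scores, mu0=100)
t_alpha = 2.262                   # t95 at 9 degrees of freedom, from the table
reject_h0 = abs(t) > t_alpha

print(round(mean_iq, 1), round(variance_iq, 2), round(t, 2), reject_h0)
```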

2.2.4.2 Small Sample test for Equality of Two Population Means or t-test for Equality of
Two Population Means

Step I: Construct the null hypothesis H0

H0: There is no significant difference between the two population means

H0: µ1 = µ2

Step II: Construct the alternative hypothesis H1

H1: There is a significant difference between the two population means

H1: µ1 ≠ µ2

Step III: Choose an appropriate level of significance, α = 5% or 1% or 10%

Step IV: Under H0, the test statistic is

$t = \dfrac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$

where $s_1^2 = \dfrac{1}{n_1 - 1} \sum (x_i - \bar{x})^2$ and $s_2^2 = \dfrac{1}{n_2 - 1} \sum (y_i - \bar{y})^2$

The statistic is compared with the t-table value tα at n1 + n2 − 2 degrees of freedom.

Step V: If |t| ≤ tα, accept the null hypothesis H0.

If |t| > tα, reject the null hypothesis H0, i.e., accept the alternative hypothesis H1.

Self-Assessment Questions
14. ____ test is used to test the small sample test for means.

A). Z
B). t
C). F
D). Q

15. Population standard deviation is denoted by _______.

A). σ
B). µ
C). Σ
D). s

16. A small sample is one with n value _________________.

A). n = 30
B). n ≠ 30
C). n > 30
D). n ≤ 30

2.2.5 F-test
The F-test is used to test the equality of variances under small sample case.

2.2.5.1 F-test for two population standard deviations


The F-test for two population standard deviations is used to test the equality of two population
standard deviations in a small sample case. i.e., n ≤ 30 .

Step I: Construct the Null Hypothesis H 0

H 0 : There is no significant difference between the two Population Standard deviations

H 0 : σ 12 = σ 2 2

Step II: Construct the Alternative Hypothesis H1

H1 : There is no significant difference between the two Population

H1 : σ 12 ≠ σ 2 2
Step III: Choose an appropriate level of Significance α = 5% or 1% or 10%

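Step IV: Under H0, the test statistic is

\[ F = \frac{s_1^2}{s_2^2} \sim F_{(n_1 - 1,\; n_2 - 1)} \]

where

\[ s_1^2 = \frac{1}{n_1 - 1}\sum (x_i - \bar{x})^2, \qquad s_2^2 = \frac{1}{n_2 - 1}\sum (y_i - \bar{y})^2 \]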

Step V: If F ≤ Fα, Accept the Null Hypothesis H0.

If F > Fα, Reject the Null Hypothesis H0, i.e., accept the Alternative Hypothesis H1.
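
As a rough sketch of Steps IV and V, the variance ratio can be computed as below. The two samples are hypothetical, and the critical value 6.39 is assumed to be the 5% table value of F for (4, 4) degrees of freedom; neither comes from the text.

```python
def sample_variance(values):
    """Sample variance with divisor (n - 1), as in the definition of s^2."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

def f_statistic(x, y):
    """F = s1^2 / s2^2, with degrees of freedom (n1 - 1, n2 - 1)."""
    return sample_variance(x) / sample_variance(y)

# Hypothetical samples, for illustration only.
sample_x = [4, 8, 6, 10, 12]
sample_y = [5, 6, 7, 8, 9]

f_stat = f_statistic(sample_x, sample_y)
f_critical = 6.39  # assumed table value: F at 5% for (4, 4) df
decision = "Reject H0" if f_stat > f_critical else "Accept H0"
print(f_stat, decision)
```

Here the variances are 10 and 2.5, so F = 4.0, which does not exceed the assumed table value and H0 is accepted.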

F Table Values

[Table: Critical Values of the F Distribution for α = 0.05]

Self-Assessment Questions
17. ____ test is used to test the variances.

A). Z
B). t
C). F
D). Q

18. Population variance is denoted by______________.

A). σ 2
B). µ

C). Σ
D). χ2

Summary

● The unit Estimation and Testing of Hypothesis is used to analyse various population parameters with their sample statistics.
● Estimation is the process of estimating various population parameters from their sample statistics.
● Testing of Hypothesis is the process in which an assumption about a population parameter is tested.
● Z-test is used to test the population means under large sample case.
● t-test is used to test the population means under small sample case.
● F-test is used to test the population variances under small sample case.

Terminal Questions

1. Explain the computational procedure for testing of Hypothesis.


2. A sample of 100 iron bars is drawn from a large number of iron bars, with a mean length of 4 feet and an S.D. of 0.6 feet. Test whether the average length of all iron bars is 4.2 feet or not at the 5% level of significance.
3. From a large consignment of oranges, a random sample of 64 oranges contains 18 spoiled oranges. Test whether 20% of the oranges in the consignment are spoiled or not at the 1% level of significance.
4. Two machines in a shop were tested 250 times, out of which the first machine failed 13 times and the second machine failed 7 times. Test at the 5% level of significance whether there is any difference between the failures of the two machines.

Answer Keys

Self-Assessment Questions

Question No Answers

1 B

2 D

3 C

4 D

5 A

6 D

7 A

8 B

9 C

10 A

11 B

12 B

13 C

14 B

15 D

16 D

17 C

18 A

Glossary
• Mean: Mean is the average of the given numbers and is calculated by dividing the sum of given numbers by the total number of numbers.
• Variance: The mean squared difference between each data point and the centre of the distribution measured by the mean.
• Proportion: A mathematical comparison between two numbers.
• Z-test: A statistical test to determine whether two population means are different when the variances are known and the sample size is large.

BIBLIOGRAPHY
1. Chung, K. L. (2000). A Course in Probability Theory (3rd ed.). San Diego, CA:
Academic Press.
2. Feller, W. (1968). An introduction to Probability Theory and its applications:
Volume I. John Wiley & Sons.
3. Gupta, S. C., & Kapoor, V. K. Fundamentals of Mathematical Statistics. S. Chand
Publications.

External Resources
1. Rukmangadachari, E. Probability and Statistics. Pearson Education.
2. Rohatgi, V. K., & Ehsanes Saleh, A. K. (2015). An introduction to probability and
statistics (3rd ed.). doi:10.1002/9781118799635

e-References
• The F-Distribution: http://surl.li/gngiy

• The t-tests: https://www.bmj.com/about-bmj/resources-readers/publications/statistics-square-one/7-t-tests

Image Credits

Fig. 1 | Representation of process of estimation | https://kindsonthegenius.com/blog/theory-of-estimation-unbiased-estimation/
Fig. 2 | Hypothesis Testing | https://www.statisticalaid.com/statistical-hypothesis-testing-step-by-step-procedure/
Fig. 3 | Different Tests on hypothesis | https://www.analyticsvidhya.com/blog/2022/01/learn-all-about-hypothesis-testing/

Video Links

Topic | Link
Introduction To Statistical Inference | https://www.youtube.com/watch?v=n3p4PK8kwOU
Introduction Hypothesis Testing in Statistics | https://www.youtube.com/watch?v=VK-rnA3-41c
Z-test | https://www.youtube.com/watch?v=bB-J6_wcGgE
T-test introduction | https://www.youtube.com/watch?v=fKZA5waOJ0U
F-test | https://www.youtube.com/watch?v=FlIiYdHHpwU

Keywords

● Estimation
● Hypothesis
● Level of Significance
● T-test
● F-test

QUANTITATIVE METHODS

Module - 2
Unit - 3

ANOVA & NON-PARAMETRIC TESTING OF HYPOTHESIS

Unit Table of Contents
Unit 2.3 ANOVA & Non-Parametric Testing of Hypothesis

Aim
Instructional Objectives
Learning Outcomes
Introduction
2.3.1 Analysis of Variance (ANOVA)
    Self-Assessment Questions
2.3.2 Chi-square test
    Self-Assessment Questions
2.3.3 Sign test
    Self-Assessment Questions
2.3.4 Wilcoxon Signed Rank Test
    Self-Assessment Questions
Summary
Terminal Questions
Answer Keys
Glossary
Bibliography
External Resources
e-References
Video Links
Keywords

Aim
This unit aims to explain the basic concepts of Analysis of Variance
(ANOVA), Non-parametric testing of hypothesis and its applications in
Management.

Instructional Objectives
This unit intends to:
• Explain the concepts of Analysis of Variance (ANOVA) and its methods
• Discuss the concepts of non-parametric testing of hypothesis
• Describe ANOVA, Non-parametric testing of hypothesis in management
applications

Learning Outcomes

At the end of this unit, you are expected to:


• Discuss the concepts of ANOVA and its methods
• Examine the concepts of non-parametric testing of hypothesis
• Evaluate application of ANOVA in Marketing, HRM, and Finance, etc.

INTRODUCTION
Analysis of variance (ANOVA) is a statistical technique used to check whether the means of two or more groups differ significantly from each other. ANOVA checks the impact of one or more factors by comparing the means of different samples: it analyses variances to determine whether the means of more than two populations are the same. In other words, we have a quantitative response variable and a categorical explanatory variable with more than two levels. In ANOVA, the categorical explanatory variable is typically referred to as the factor.

Non-parametric tests are used when the assumptions of the parametric tests are not met and are commonly called distribution-free tests. The advantage of non-parametric tests is that we do not assume that the data come from any particular distribution.

In general, non-parametric tests:


● make few or no assumptions about the distribution of the data
● reduce the effect of outliers and heterogeneity of variance
● can often be used even for ordinal, and sometimes even nominal, data

Since non-parametric tests do not estimate population parameters, in general, there are
● no estimates of variance/variability
● no confidence intervals
● fewer measures of effect size

Also, non-parametric tests are generally not as powerful as parametric alternatives when the
assumptions of the parametric tests are met.

2.3.1 ANALYSIS OF VARIANCE (ANOVA)


ANOVA is used to test the homogeneity of more than two population means by comparing the variation between the groups with the variation within the groups.

Assumptions of ANOVA
● Randomisation
● Replication
● Local Control

ANOVA is classified as
● ANOVA One-way Classification
● ANOVA Two-way Classification

2.3.1.1 ANOVA ONE-WAY CLASSIFICATION
ANOVA one-way classification tests the homogeneity of more than two population means with respect to one classification (Treatments). The procedure for ANOVA one-way classification consists of the following steps:

● H0 : There is homogeneity among the k population means

i.e. H0 : µ1 = µ2 = ……… = µk

● H1 : There is no homogeneity among the k population means

i.e. H1 : µ1, µ2, …, µk are not all equal
● Appropriate level of significance is α = 5% or 1% (given/chosen)
● To test the above hypotheses the procedure is as follows:
● Calculations:
Treatment | Observations | Ti | Ti²
1 | X11 X12 . . . X1n | T1 | T1²
2 | X21 X22 . . . X2n | T2 | T2²
. | . . . | . | .
k | Xk1 Xk2 . . . Xkn | Tk | Tk²

Row sum of squares (R.S.S) = $\sum_i \sum_j X_{ij}^2$, where $X_{ij}$ is the j-th observation in the i-th row

Correction factor (C.F) = $G^2/N$, where G is the Grand total and N is the no. of observations in the entire experiment

Total Sum of Squares (T.S.S) = $S_T^2$ = R.S.S − C.F

Treatments Sum of Squares (t.S.S) = $S_t^2 = \sum_i \dfrac{T_i^2}{n_i}$ − C.F, where $T_i$ = i-th row total and $n_i$ = no. of observations in the i-th row

Error Sum of Squares (S.S.E) = $S_e^2 = S_T^2 - S_t^2$

ANOVA TABLE

Source of Variation | Sum of Squares | Degrees of Freedom | Mean Sum of Squares | F-Ratio
Treatments | $S_t^2$ | $k - 1$ | $s_t^2 = S_t^2/(k - 1)$ | $F = s_t^2/s_e^2 \sim F_{k-1,\,N-k}$
Error | $S_e^2$ | $N - k$ | $s_e^2 = S_e^2/(N - k)$ |
Total | $S_T^2$ | $N - 1$ | |

Example:

Suppose 3 drying formulas for curing a glue are studied and the following times are observed.

Formula A 13 10 8 11 8

Formula B 13 11 14 14
Formula C 4 1 3 4 2 4

Carry out ANOVA one-way classification at 5% L.O.S and comment

Solution
H0 : There is homogeneity among the means of A, B & C, i.e. µA = µB = µC
H1 : There is no homogeneity among the means of A, B & C, i.e. µA, µB and µC are not all equal
Appropriate level of significance is 5% (given)

Formula | Observations | Ti | Ti²/ni
Formula A | 13 10 8 11 8 | 50 | 2500/5 = 500
Formula B | 13 11 14 14 | 52 | 2704/4 = 676
Formula C | 4 1 3 4 2 4 | 18 | 324/6 = 54
Total | | G = 120 | ΣTi²/ni = 1230

Row sum of squares (R.S.S) = $\sum_i \sum_j X_{ij}^2$ = 1262

Correction factor (C.F) = $G^2/N = 120^2/15$ = 960

Sum of Squares due to Total (S.S.T) = $S_T^2$ = R.S.S − C.F = 1262 − 960 = 302

Sum of Squares due to Treatments (S.S.tr) = $S_t^2 = \sum_i T_i^2/n_i$ − C.F = 1230 − 960 = 270

Sum of Squares due to Error (S.S.E) = $S_e^2$ = S.S.T − S.S.tr = $S_T^2 - S_t^2$ = 302 − 270 = 32

ANOVA Table

Source of Variation | Sum of Squares | Degrees of freedom | Mean Sum of Squares | Variance Ratio
Treatments (Formulae) | 270 | 2 | 270/2 = 135 | F = 135/2.667 = 50.625
Error | 32 | 12 | 32/12 = 2.667 |
Total | 302 | 14 | |

Inference

The table value of F at 5% level of significance for (2, 12) d.f is 3.89, i.e. $F_{\alpha,\,k-1,\,N-k} = F_{0.05,\,2,\,12} = 3.89$.

Since Fcal = 50.625 > Ftab = 3.89, we reject H0. Hence, we conclude that the mean drying times of the three formulas are not all equal.
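
The hand calculation for the glue example can be checked with a short Python sketch. It follows the same correction-factor method as the worked solution; the function name and dictionary keys are illustrative choices, and the table value 3.89 is the one quoted in the text.

```python
# Drying times from the glue example (three formulas, unequal sample sizes).
groups = {
    "A": [13, 10, 8, 11, 8],
    "B": [13, 11, 14, 14],
    "C": [4, 1, 3, 4, 2, 4],
}

def one_way_anova(groups):
    """Return the sums of squares and F ratio for a one-way classification."""
    observations = [x for values in groups.values() for x in values]
    n_total = len(observations)              # N
    k = len(groups)                          # number of treatments
    grand_total = sum(observations)          # G
    cf = grand_total ** 2 / n_total          # correction factor G^2 / N
    rss = sum(x * x for x in observations)   # sum of squared observations
    sst = rss - cf                           # total sum of squares
    sstr = sum(sum(v) ** 2 / len(v) for v in groups.values()) - cf
    sse = sst - sstr                         # error sum of squares
    f_ratio = (sstr / (k - 1)) / (sse / (n_total - k))
    return {"SST": sst, "SStr": sstr, "SSE": sse, "F": f_ratio}

result = one_way_anova(groups)
f_table = 3.89  # F(0.05; 2, 12) as quoted in the text
decision = "Reject H0" if result["F"] > f_table else "Accept H0"
print(result, decision)
```

The sketch reproduces S.S.T = 302, S.S.tr = 270, S.S.E = 32 and F = 50.625, so H0 is rejected.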

2.3.1.2 ANOVA TWO-WAY CLASSIFICATION

ANOVA two-way classification is used to test the homogeneity of more than two population means with respect to two classifications (Treatments and Blocks). The procedure for ANOVA two-way classification consists of the following steps.

● Null Hypothesis

H 0(tr ) : There is homogeneity among the treatments i.e. µ1 = µ2 = ……… = µk

H 0(b ) : There is homogeneity among the blocks i.e. µ1 = µ 2 = ……… = µ h

● Alternative Hypothesis

H1(tr) : There is no homogeneity among the treatments, i.e. the treatment means µ1, µ2, …, µk are not all equal

H1(b) : There is no homogeneity among the blocks, i.e. the block means µ1, µ2, …, µh are not all equal

● Appropriate level of significance is α % (given/chosen)

Treatments \ Blocks | 1 | 2 | . . . | h | Ti | Ti²
1 | X11 | X12 | . . . | X1h | T1 | T1²
2 | X21 | X22 | . . . | X2h | T2 | T2²
. | . | . | . . . | . | . | .
k | Xk1 | Xk2 | . . . | Xkh | Tk | Tk²
Bj | B1 | B2 | . . . | Bh | G | ΣTi²
Bj² | B1² | B2² | . . . | Bh² | ΣBj² |

Row sum of squares (R.S.S) = $\sum_i \sum_j X_{ij}^2$, where $X_{ij}$ is the observation for the i-th treatment in the j-th block

Correction factor (C.F) = $G^2/N$, where G is the Grand total and N is the no. of observations in the experiment

Total Sum of Squares (T.S.S) = $S_T^2$ = R.S.S − C.F

Treatments Sum of Squares (t.S.S) = $S_t^2 = \dfrac{1}{h}\sum_{i=1}^{k} T_i^2$ − C.F

Blocks Sum of Squares (b.S.S) = $S_b^2 = \dfrac{1}{k}\sum_{j=1}^{h} B_j^2$ − C.F

Error Sum of Squares (E.S.S) = $S_e^2$ = S.S.T − S.S.tr − S.S.b = $S_T^2 - S_t^2 - S_b^2$

ANOVA TABLE

Source of Variation | Sum of Squares | Degrees of Freedom | Mean Sum of Squares | F-Ratio
Treatments | $S_t^2$ | $k - 1$ | $s_t^2 = S_t^2/(k - 1)$ | $F_1 = s_t^2/s_e^2 \sim F_{k-1,\,(k-1)(h-1)}$
Blocks | $S_b^2$ | $h - 1$ | $s_b^2 = S_b^2/(h - 1)$ | $F_2 = s_b^2/s_e^2 \sim F_{h-1,\,(k-1)(h-1)}$
Error | $S_e^2$ | $(k - 1)(h - 1)$ | $s_e^2 = S_e^2/[(k - 1)(h - 1)]$ |
Total | $S_T^2$ | $kh - 1$ | |

Example

Carry out ANOVA two-way classification to the following data.

Blocks
B1 B2 B3 B4
Treatment 1 13 7 9 3
Treatment 2 6 6 3 1
Treatment 3 11 5 15 5

Solution
Null Hypothesis

H0(tr) : There is homogeneity among the treatments, i.e. µ1 = µ2 = µ3

H0(b) : There is homogeneity among the blocks, i.e. µ1 = µ2 = µ3 = µ4

Alternative Hypothesis

H1(tr) : There is no homogeneity among the treatments, i.e. the treatment means µ1, µ2, µ3 are not all equal

H1(b) : There is no homogeneity among the blocks, i.e. the block means µ1, µ2, µ3, µ4 are not all equal

Appropriate level of significance is α % (given/chosen); here α = 5%.

Treatments \ Blocks | B1 | B2 | B3 | B4 | Ti | Ti²
Treatment 1 | 13 | 7 | 9 | 3 | 32 | 1024
Treatment 2 | 6 | 6 | 3 | 1 | 16 | 256
Treatment 3 | 11 | 5 | 15 | 5 | 36 | 1296
Bj | 30 | 18 | 27 | 9 | G = 84 | ΣTi² = 2576
Bj² | 900 | 324 | 729 | 81 | ΣBj² = 2034 |

Row sum of squares (R.S.S) = $\sum_i \sum_j X_{ij}^2$ = 786

Correction factor (C.F) = $G^2/N = 84^2/12$ = 588

Sum of Squares due to Total (S.S.T) = $S_T^2$ = R.S.S − C.F = 786 − 588 = 198

Sum of Squares due to Treatments (S.S.tr) = $S_t^2 = \dfrac{1}{h}\sum_{i=1}^{k} T_i^2$ − C.F = (1/4)(2576) − 588 = 56

Sum of Squares due to Blocks (S.S.b) = $S_b^2 = \dfrac{1}{k}\sum_{j=1}^{h} B_j^2$ − C.F = (1/3)(2034) − 588 = 90

Sum of Squares due to Error (S.S.E) = S.S.T − S.S.tr − S.S.b = $S_T^2 - S_t^2 - S_b^2$ = 198 − 56 − 90 = 52
ANOVA Table

Source of Variation | Sum of Squares | Degrees of freedom | Mean Sum of Squares | Variance Ratio
Treatments | 56 | 2 | 56/2 = 28 | Ft = 28/8.67 = 3.23
Blocks | 90 | 3 | 90/3 = 30 | Fb = 30/8.67 = 3.46
Error | 52 | 6 | 52/6 = 8.67 |
Total | 198 | 11 | |

Inference

● $F_{\alpha,\,k-1,\,(k-1)(h-1)} = F_{0.05,\,2,\,6} = 5.14$; since Ft = 3.23 < Ftab = 5.14, we accept H0(tr).

● $F_{\alpha,\,h-1,\,(k-1)(h-1)} = F_{0.05,\,3,\,6} = 4.76$; since Fb = 3.46 < Ftab = 4.76, we accept H0(b).
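
The two-way example can be verified in the same spirit. The sketch below mirrors the hand method (correction factor, treatment totals, block totals); the function name and result keys are illustrative, and the table values 5.14 and 4.76 are those quoted in the inference.

```python
# Rows are treatments, columns are blocks, as in the worked example.
data = [
    [13, 7, 9, 3],
    [6, 6, 3, 1],
    [11, 5, 15, 5],
]

def two_way_anova(table):
    """Sums of squares and F ratios for a two-way layout, one value per cell."""
    k = len(table)       # number of treatments
    h = len(table[0])    # number of blocks
    grand_total = sum(sum(row) for row in table)
    cf = grand_total ** 2 / (k * h)                     # G^2 / N
    sst = sum(x * x for row in table for x in row) - cf
    row_totals = [sum(row) for row in table]            # Ti
    col_totals = [sum(col) for col in zip(*table)]      # Bj
    sstr = sum(t * t for t in row_totals) / h - cf
    ssb = sum(b * b for b in col_totals) / k - cf
    sse = sst - sstr - ssb
    mse = sse / ((k - 1) * (h - 1))
    return {
        "SST": sst, "SStr": sstr, "SSb": ssb, "SSE": sse,
        "Ft": (sstr / (k - 1)) / mse,
        "Fb": (ssb / (h - 1)) / mse,
    }

result = two_way_anova(data)
accept_treatments = result["Ft"] < 5.14  # F(0.05; 2, 6) from the text
accept_blocks = result["Fb"] < 4.76      # F(0.05; 3, 6) from the text
print(result, accept_treatments, accept_blocks)
```

Both ratios (about 3.23 and 3.46) fall below their table values, which agrees with accepting H0(tr) and H0(b).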

Self-Assessment Questions

1. _____ Test is used to test the ANOVA.

A). t
B). F
C). Z

D). χ 2

2. Analysis of Variance is denoted by__.

A). ANOVA
B). COVA
C). ANCOVA
D). ANO

3. Number of classifications of ANOVA are ______.

A). 3
B). 4
C). 2
D). 5

4. ANOVA is used to test the homogeneity of several population means. What do you
say?

A). No
B). Can’t say
C). May be
D). Yes

2.3.2 CHI-SQUARE TEST
The square of a standard normal variate is called Chi-square variate and is denoted by χ 2 .
The Chi-square test is used to test the homogeneity of the given data and can be classified into
two types.

● Chi-square test for goodness of fit


● Chi-square test for Independence of attributes.

2.3.2.1 Chi-square test for goodness of fit

The Chi-square test for goodness of fit is used to test the homogeneity of the given data by
comparing the observed and expected frequencies. The procedure of Chi-square test for good-
ness of fit consists of the following steps:

Step I: Construct the Null Hypothesis H 0

H0 : There is no significant difference between the observed and expected frequencies, i.e., H0 : Oi = Ei

Step II: Construct the Alternative Hypothesis H1

H1 : There is a significant difference between the observed and expected frequencies, i.e., H1 : Oi ≠ Ei

Step III: Choose an appropriate level of Significance α = 5% or 1% or 10%

Step IV: Under H0, the test statistic is

\[ \chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i} \]

Step V: If χ² ≤ χ²α, Accept the Null Hypothesis H0.

If χ² > χ²α, Reject the Null Hypothesis H0, i.e., accept the Alternative Hypothesis H1.

Example

The number of road accidents per day in a week on a highway are distributed as follows

Day Monday Tuesday Wednesday Thursday Friday Saturday Sunday

No. of Accidents 12 8 20 5 14 10 15

Use chi-square test and test whether the accidents are uniformly distributed throughout the week.
Test at 5% level of significance

Solution

Given that the observed frequencies, Oi: 12 8 20 5 14 10 15

The number of classes (days), n = 7

The expected frequency, Ei = (1/n) Σ Oi = (1/7)(12 + 8 + 20 + 5 + 14 + 10 + 15) = 84/7 = 12

Null Hypothesis ( Ho ) : the accidents are uniformly distributed throughout the week.
Ho : Oi = Ei

Alternative Hypothesis ( H 1) : the accidents are not uniformly distributed throughout the week.
H 1: Oi ≠ Ei
The level of significance, α = 5%

The test statistic is $\chi^2 = \sum_i \dfrac{(O_i - E_i)^2}{E_i}$

To determine χ², the following table is used

Oi | Ei | (Oi − Ei)² | (Oi − Ei)²/Ei
12 | 12 | (12 − 12)² = 0 | 0/12 = 0
8 | 12 | (8 − 12)² = 16 | 16/12 = 1.33
20 | 12 | (20 − 12)² = 64 | 64/12 = 5.33
5 | 12 | (5 − 12)² = 49 | 49/12 = 4.08
14 | 12 | (14 − 12)² = 4 | 4/12 = 0.33
10 | 12 | (10 − 12)² = 4 | 4/12 = 0.33
15 | 12 | (15 − 12)² = 9 | 9/12 = 0.75
χ² | | | 12.15

Therefore χ² = 12.15

χ²α at (n − 1) df = χ²5% at (7 − 1) df = χ²5% at 6 df = 12.592

Since 12.15 < 12.592, that is χ² < χ²α, Accept H0.

Hence, the accidents are uniformly distributed throughout the week.
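
A minimal Python sketch of the same goodness-of-fit calculation is given below. It uses the accident counts from the example and the table value 12.592 quoted above; the variable names are illustrative. Because it does not round the individual terms, it gives about 12.17 rather than the 12.15 obtained from the rounded table entries, and the conclusion is unchanged.

```python
# Observed accidents, Monday to Sunday, from the example.
observed = [12, 8, 20, 5, 14, 10, 15]

# Under a uniform distribution every day has the same expected frequency.
expected_each = sum(observed) / len(observed)

# Chi-square statistic: sum of (O - E)^2 / E over all classes.
chi_square = sum((o - expected_each) ** 2 / expected_each for o in observed)

chi_table = 12.592  # chi-square at 5% for 6 df, as quoted in the text
decision = "Reject H0" if chi_square > chi_table else "Accept H0"
print(expected_each, round(chi_square, 2), decision)
```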

2.3.2.2 Chi-square (χ²) test for Independence of Attributes

The Chi-square test for the Independence of Attributes is used to test the independence of the attributes (Factors) with observed and expected frequencies, that is, to test whether the two factors (F1 and F2) affecting the given data are independent or not. The observed frequencies are arranged in an r × c contingency table:

F1 \ F2 | 1 | 2 | . | j | . | c
1 | O11 | O12 | . | O1j | . | O1c
2 | O21 | O22 | . | O2j | . | O2c
. | . | . | . | . | . | .
i | Oi1 | Oi2 | . | Oij | . | Oic
. | . | . | . | . | . | .
r | Or1 | Or2 | . | Orj | . | Orc

Procedure

Step 1: Null Hypothesis ( H 0 ) : The two factors F1 and F2 are independent.

H 0 : Oij = Eij

Step 2: Alternative Hypothesis (H1) : The two factors F1 and F2 are not independent

H1 : Oij ≠ Eij

Step 3: Choose an appropriate level of significance α = 5% or 1% or 10%

Step 4: Test Statistic

\[ \chi^2 = \sum_i \sum_j \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \]

where

\[ E_{ij} = \frac{(i\text{th row total}) \times (j\text{th column total})}{\text{Grand total } (G)} \]

Step 5: If χ² ≤ χ²α at (r − 1)(c − 1) degrees of freedom, Accept H0

If χ² > χ²α at (r − 1)(c − 1) degrees of freedom, Reject H0

Example

The following table gives the classification of 100 workers according to their Gender and Nature
of work.

Nature of work
Skilled Unskilled
Gender Male 40 20
Female 10 30

Use the Chi-square test to test whether Gender is independent of Nature of work or not. Test at 5% level of significance.

Solution

Given That

Gender \ Nature of work | Skilled | Unskilled | Row Total
Male | 40 (O11) | 20 (O12) | 60
Female | 10 (O21) | 30 (O22) | 40
Column Total | 50 | 50 | 100 = G

Expected frequency, Eij = (i-th row total × j-th column total) / Grand total (G)

E11 = (1st row total × 1st column total) / G = (60 × 50)/100 = 30

E12 = (1st row total × 2nd column total) / G = (60 × 50)/100 = 30

E21 = (2nd row total × 1st column total) / G = (40 × 50)/100 = 20

E22 = (2nd row total × 2nd column total) / G = (40 × 50)/100 = 20

Null Hypothesis (H0) : Gender is independent of Nature of work

H0 : Oij = Eij

Alternative Hypothesis (H1) : Gender is not independent of Nature of work

H1 : Oij ≠ Eij
The level of Significance, α = 5%

The test statistic is $\chi^2 = \sum_i \sum_j \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$

To determine χ², the following table is used

Oij | Eij | (Oij − Eij)² | (Oij − Eij)²/Eij
40 | 30 | (40 − 30)² = 100 | 100/30 = 3.33
20 | 30 | (20 − 30)² = 100 | 100/30 = 3.33
10 | 20 | (10 − 20)² = 100 | 100/20 = 5
30 | 20 | (30 − 20)² = 100 | 100/20 = 5
χ² | | | 16.66

Therefore, χ² = 16.66

Now χ²α at (r − 1)(c − 1) df = χ²5% at (2 − 1)(2 − 1) df = χ²5% at 1 df = 3.841

Since 16.66 > 3.841, that is χ² > χ²α, Reject H0, that is, Accept H1.

Hence Gender is not independent of Nature of work.
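
The expected frequencies and the χ² value for this 2 × 2 table can be reproduced with the sketch below. The function name is an illustrative choice; the observed counts and the table value 3.841 are those of the example.

```python
# Observed frequencies: rows = Male, Female; columns = Skilled, Unskilled.
observed = [
    [40, 20],
    [10, 30],
]

def chi_square_independence(table):
    """Return expected frequencies, the chi-square statistic and its df."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # E_ij = (i-th row total x j-th column total) / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi_square = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return expected, chi_square, df

expected, chi_square, df = chi_square_independence(observed)
chi_table = 3.841  # chi-square at 5% for 1 df, as quoted in the text
decision = "Reject H0" if chi_square > chi_table else "Accept H0"
print(expected, round(chi_square, 2), df, decision)
```

Without intermediate rounding the statistic is about 16.67, which matches the hand value of 16.66 to the precision of the table and leads to the same decision.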

Self-Assessment Questions

5. The square of standard normal variate is called ___ variate.

A). t
B). F
C). Z

D). χ 2

6. Observed frequencies are denoted with_______.

A). Eij
B). Bij
C). Oij
D). Aij

7. χ 2 Test is of __Types.

A). 3
B). 4
C). 2
D). 5

8. Expected frequencies are denoted with______.

A). Eij
B). Bij
C). Oij
D). Aij


2.3.3 Sign test
The sign test is a statistical method to test for consistent differences between pairs of observations, such as the weight of subjects before and after treatment. Given pairs of observations (such as weight pre- and post-treatment) for each subject, the sign test determines if one member of the pair (such as pre-treatment) tends to be greater than (or less than) the other member of the pair.

● It is one of the earliest non-parametric tests.
● Its name comes from the fact that it is based on the direction of plus and minus signs to denote changes in magnitude of observations in a given data.
● The objective of the sign test is to test the significance of the difference between the medians of two random samples.

Procedure for Sign Test

● Set up the Null Hypothesis H0 : There is no significant difference between the medians of the two random samples.
● Set up the Alternative Hypothesis H1 : There is a significant difference between the medians of the two random samples.
● Choose an appropriate level of significance.
● Suppose Xi and Yi are the paired observations on two random variables.
● Calculate the sign of each difference (Xi − Yi).
● Omit those pairs where Xi = Yi.
● If the number of observations in one variable is more than in the other variable, omit the unpaired observations.
● Count the total number of “+” and “−” signs and call it n.
● Count the number of “+” signs and call it r.
● The test statistic is

\[ Z = \frac{r - \dfrac{n}{2}}{\sqrt{\dfrac{n}{4}}} \]

● If |Z| ≤ Zα, Accept the Null Hypothesis H0

● If |Z| > Zα, Reject the Null Hypothesis H0

Example

Two archaeologists, X and Y worked at an ancient building for 25 days and found the following
artifacts. Use sign test to test the null hypothesis that the two archaeologists are equally good at
finding artifacts against the alternative hypothesis that X is better.

X 1 0 2 3 1 0 2 2 2 3 0 1 1 1 4 1 2 1 3 5 2 1 3 2 2
Y 0 0 1 0 2 0 0 1 1 2 0 1 2 1 1 0 2 2 6 0 2 3 0 2 1

Solution

Given that

X 1 0 2 3 1 0 2 2 2 3 0 1 1 1 4 1 2 1 3 5 2 1 3 2 2
Y 0 0 1 0 2 0 0 1 1 2 0 1 2 1 1 0 2 2 6 0 2 3 0 2 1
Xi − Yi + 0 + + - 0 + + + + 0 0 - 0 + + 0 - - + 0 - + 0 +

Now, n = Number of ‘+’ and ‘−‘ = 17


r = Number of ‘+’ = 12

Null Hypothesis H 0 : the two archaeologists are equally good at finding artifacts.

H0 : X = Y

Alternative hypothesis ( H1 ) : the archaeologist X is better than Y in finding artifacts.

H1 : X > Y

The Level of Significance α = 5%


The test statistic is

\[ Z = \frac{r - \dfrac{n}{2}}{\sqrt{\dfrac{n}{4}}} = \frac{12 - \dfrac{17}{2}}{\sqrt{\dfrac{17}{4}}} = \frac{3.5}{2.061} \approx 1.70 \]

Now Zα = Z at 5% L.O.S = 1.64

Since 1.70 > 1.64, Reject H0, that is, Accept H1.

The archaeologist X is better than Y in finding artifacts.
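
The counting of signs and the Z statistic for the archaeologists example can be sketched as follows. The function name is illustrative; the data and the one-sided 5% value 1.64 are taken from the example.

```python
from math import sqrt

# Daily artifact counts for archaeologists X and Y (25 days), from the example.
x = [1, 0, 2, 3, 1, 0, 2, 2, 2, 3, 0, 1, 1, 1, 4, 1, 2, 1, 3, 5, 2, 1, 3, 2, 2]
y = [0, 0, 1, 0, 2, 0, 0, 1, 1, 2, 0, 1, 2, 1, 1, 0, 2, 2, 6, 0, 2, 3, 0, 2, 1]

def sign_test(x, y):
    """Return n (non-tied pairs), r (number of '+' signs) and Z."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # ties are omitted
    n = len(diffs)
    r = sum(1 for d in diffs if d > 0)
    z = (r - n / 2) / sqrt(n / 4)
    return n, r, z

n, r, z = sign_test(x, y)
z_table = 1.64  # one-sided 5% value used in the text
decision = "Reject H0" if z > z_table else "Accept H0"
print(n, r, round(z, 3), decision)
```

The sketch gives n = 17 and r = 12, with Z close to 1.70, so H0 is rejected in favour of X being better.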

Self-Assessment Questions

9. The Sign test is based on the deviations of observations. What do you say?

A). Yes
B). No
C). May be
D). Can’t say

10. The test statistic used in Sign test is__________.

A). t
B). F
C). Z

D). χ 2

11. The number of ‘+’ signs in the sign test is denoted with__________.

A). s
B). p
C). r
D). n

2.3.4 Wilcoxon Signed Rank Test
The Wilcoxon signed-rank test is a non-parametric statistical hypothesis test used either to test
the location of a set of samples or to compare the locations of two populations using a set of
matched samples.
● Rank test is a test which is based on ranks
● The Wilcoxon signed-rank test is the non-parametric test equivalent to the
parametric t-test.
● As the Wilcoxon signed-ranks test does not assume normality in the data, it can be
used when this assumption has been violated and the use of the t-test is inappropriate.
● The Wilcoxon signed-rank test is used when comparing two related samples, matched
samples, or repeated measurements on a single sample to assess whether their
population mean ranks differ.

Procedure for calculating Wilcoxon Signed rank test

● Find the difference between each pair of values and then take the absolute values of the differences.
● Assign ranks to the absolute values of the differences.
● If absolute differences are tied, give each of them the average of the ranks they occupy.
● Re-attach to each rank the positive or negative sign that was removed earlier.
● Calculate the sum of the negative ranks and the sum of the positive ranks.
● Calculate the statistic W, which is equal to the smaller of the two sums.
● Compare W with the critical value Wα from the Wilcoxon table; H0 is rejected when W ≤ Wα.

Example

The weekly sales revenues of a product in 14 randomly selected retail stores before and after a new competing product is released in the market are given below. Test whether there is a significant difference in sales of the product after the release of the competing product at 5% level.

Retail Store 1 2 3 4 5 6 7 8 9 10 11 12 13 14

Before 8 9 6 9 4 8 5 9 2 8 9 10 12 7

After 7 8 8 7 3 5 4 8 4 7 8 7 9 9

Solution

H0 : There is no significant difference in sales of the product after the release of the competing product.

H1 : There is a significant difference in sales of the product after the release of the competing product.
The level of Significance α = 5%
To prepare the test statistic W, the given data can be tabulated as follows (Xa = Before, Xb = After)

Sample Xa Xb Xa − Xb |Xa − Xb| Rank(|Xa − Xb|) Signed Rank

1 8 7 1 1 4 +4
2 9 8 1 1 4 +4
3 6 8 -2 2 9.5 -9.5
4 9 7 2 2 9.5 +9.5
5 4 3 1 1 4 +4
6 8 5 3 3 13 +13
7 5 4 1 1 4 +4
8 9 8 1 1 4 +4
9 2 4 -2 2 9.5 -9.5
10 8 7 1 1 4 +4
11 9 8 1 1 4 +4
12 10 7 3 3 13 +13
13 12 9 3 3 13 +13
14 7 9 -2 2 9.5 -9.5

Sum of (+ Ranks) = 4 + 4 + 9.5 + 4 + 13 + 4 + 4 + 4 + 4 + 13 + 13 = 76.5

Sum of (− Ranks) = 9.5 + 9.5 + 9.5 = 28.5

W = Min(Sum of + Ranks, Sum of − Ranks) = Min(76.5, 28.5) = 28.5

The critical value Wα at n = 14 and 5% level is 21. H0 is rejected only when W ≤ Wα.

Since 28.5 > 21, Accept H0.

Hence, there is no significant difference in sales of the product after the release of the competing product.
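
The ranking with ties and the two rank sums can be checked with the sketch below. It is an illustration under stated assumptions: the helper names are hypothetical, the "before" value for store 14 is taken as 7 so that it agrees with the solution table, and the decision uses the usual table convention that H0 is rejected only when W is at or below the critical value 21.

```python
# Weekly sales before and after the competing product (14 stores).
# Store 14 "before" is taken as 7, the value used in the solution table.
before = [8, 9, 6, 9, 4, 8, 5, 9, 2, 8, 9, 10, 12, 7]
after = [7, 8, 8, 7, 3, 5, 4, 8, 4, 7, 8, 7, 9, 9]

def average_ranks(values):
    """Rank values from smallest to largest, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for p in range(i, j + 1):
            ranks[order[p]] = avg
        i = j + 1
    return ranks

def wilcoxon_signed_rank(xa, xb):
    """Return (sum of + ranks, sum of - ranks, W); zero differences are dropped."""
    diffs = [a - b for a, b in zip(xa, xb) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus, min(w_plus, w_minus)

w_plus, w_minus, w = wilcoxon_signed_rank(before, after)
w_critical = 21  # table value for n = 14 at the 5% level, as quoted in the text
decision = "Reject H0" if w <= w_critical else "Accept H0"
print(w_plus, w_minus, w, decision)
```

The rank sums come out as 76.5 and 28.5, so W = 28.5; since this exceeds 21, H0 is not rejected under the stated convention.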

Self-Assessment Questions

12. Signed Ranks are the base for Wilcoxon Signed Rank test. What do you say?

A). Yes
B). No
C). May be
D). Can’t say

13. The test statistic used in Wilcoxon Signed Rank test is___________.

A). W
B). F
C). X

D). χ 2

14. Wilcoxon Signed Rank test is________ Test.

A). Parametric
B). Mean
C). Non-parametric
D). variance

Summary

● The unit ANOVA & Non-parametric Testing of Hypothesis helps in understanding ANOVA and the non-parametric tests.
● ANOVA, with its two classifications, is used to analyse the means of more than two populations with respect to treatments (one-way) or treatments and blocks (two-way).
● The sign test and Wilcoxon Signed Rank test are the significant non-parametric tests based on the signs and ranks of sample observations.

Terminal Questions

1. As a part of the investigation of the collapse of the roof of a building, a testing laboratory is given all the available bolts that connected the steel structure at 3 different positions on the roof. The forces required to shear each of these bolts are as follows.

Position 1 90 82 79 98 83 91
Position 2 105 89 93 104 89 95 86
Position 3 83 89 80 94

Perform an ANOVA to test at 0.05 L.O.S whether the differences among the sample means at the 3 positions are significant.
Tab: $F_{\alpha,\,k-1,\,N-k} = F_{0.05,\,2,\,14} = 3.74$

2. Explain the computational procedure for Chi-square test for goodness of fit.
3. Treating the detergents as treatments and the engines as blocks, obtain the appropriate ANOVA table and test at 0.01 level of significance whether there are differences in the detergents or in the engines.

Detergent | Engine 1 | Engine 2 | Engine 3
Detergent 1 | 45 | 43 | 51
Detergent 2 | 47 | 46 | 52
Detergent 3 | 48 | 50 | 55
Detergent 4 | 42 | 37 | 49

Tab: $F_{\alpha,\,k-1,\,(k-1)(h-1)} = F_{0.01,\,3,\,6} = 9.78$, $F_{\alpha,\,h-1,\,(k-1)(h-1)} = F_{0.01,\,2,\,6} = 10.92$

4. From a telephone directory, a sample of 100 digits is distributed as follows

Digit 0 1 2 3 4 5 6 7 8 9
Frequency 11 8 12 6 5 13 14 7 8 16

Use the Chi-square test to test whether the digits are uniformly distributed or not.
5. Explain the computational procedure for Chi-square test for independence of attributes.
6. Discuss the procedure for Sign Test with an example.
7. Discuss the procedure for Wilcoxon Signed Rank test with an example.

Answer Keys

Self-Assessment Questions

Question No Answers

1 B

2 A

3 C

4 D

5 D

6 C

7 C

8 A

9 A

10 C

11 C

12 A

13 A

14 C

Glossary
● Absolute value: The distance of a number from zero on the number line.
● Null hypothesis: A type of statistical hypothesis that proposes that no statistical significance exists in a set of given observations.
● Homogeneity: The quality of being similar or comparable in kind or nature.

BIBLIOGRAPHY
1. Chung, K. L. (2000). A Course in Probability Theory (3rd ed.). San Diego, CA:
Academic Press.
2. Feller, W. (1968). An introduction to Probability Theory and its applications:
Volume I. John Wiley & Sons.
3. Gupta, S. C., & Kapoor, V. K. Fundamentals of Mathematical Statistics. S. Chand
Publications.

External Resources
1. Rukmangadachari, E. Probability and Statistics. Pearson Education.
2. Rohatgi, V. K., & Ehsanes Saleh, A. K. (2015). An introduction to probability and
statistics (3rd ed.). doi:10.1002/9781118799635

e-References
● Two-Way ANOVA:
https://www.investopedia.com/terms/t/two-way-anova.asp

● What is a Chi-Square Test?


https://www.simplilearn.com/tutorials/statistics-tutorial/chi-square-test

Video Links

Topic | Link
One way ANOVA Steps Involved in ANOVA | https://www.youtube.com/watch?v=7Nt-PeITLbY
Two Way ANOVA Steps Involved in Analysis of Variance ANOVA | https://www.youtube.com/watch?v=xMtmhctKyOU
Pearson's chi square test (goodness of fit) Probability and Statistics | https://www.youtube.com/watch?v=2QeDRsxSF9M
Sign Test Concept and Example | https://www.youtube.com/watch?v=GV5y5IzhyEU&t=258s
Wilcoxon-Test (Wilcoxon Signed Rank Test) | https://www.youtube.com/watch?v=NZsL2eDQiDQ

Keywords

● Analysis of Variance (ANOVA)


● Chi-square
● Rank test

QUANTITATIVE METHODS

Module - 3

CORRELATION & REGRESSION

Module Description

Any statistical association between two random variables or bivariate data, whether causal or
not, is referred to in statistics as correlation or dependency. Although “correlation” can mean
any kind of association in the broadest sense, in statistics it typically refers to the degree to
which a pair of variables are linearly related. Examples of common dependent phenomena include the
relationship between parent and child height and the relationship between a good’s price and the
number of units buyers are prepared to buy, as shown in the so-called demand curve.

In everyday language, the word correlation refers to an association of some kind. We could
claim, for example, to have observed a connection between wheezy episodes and foggy days. In
statistics, however, correlation describes the relationship between two quantitative
variables. We also assume that the relationship is linear, meaning that for every unit rise or
fall in one variable, the other increases or decreases by a constant amount. Regression,
which entails estimating the best straight line to summarise the relationship, is the other method
that is frequently employed in similar situations.

Regression is a statistical method that links a dependent variable to one or more independent
(explanatory) variables. A regression model can show whether changes in one or more of
the explanatory variables are related to changes in the dependent variable. This is accomplished
by fitting a best-fit line and observing how the data are distributed around this line.
Financial analysts and economists can use regression to make predictions and value assets,
amongst other things. To interpret regression findings correctly, several assumptions regarding
the data and the model itself must hold.

This Module is divided into the following units:

Unit 3.1 Introduction to Correlation


Unit 3.2 Regression

QUANTITATIVE METHODS

Module - 3
Unit - 1

INTRODUCTION TO
CORRELATION

Unit Table of Contents
Unit 3.1 Introduction to Correlation

Aim
Instructional Objectives
Learning Outcomes
Introduction
3.1.1 Meaning of Correlation
Self-Assessment Questions
3.1.2 Measurement of Correlation
Self-Assessment Questions
3.1.3 Scatter Plot
Self-Assessment Questions
3.1.4 Pearson Correlation Coefficient
Self-Assessment Questions
3.1.5 Spearman’s Rank Correlation
Self-Assessment Questions
Summary
Terminal Questions
Answer Keys
Glossary
Bibliography
External Resources
e-References
Image Credits
Video Links
Keywords

Aim
This unit aims to explain the basic concepts of Correlation and its applications in
Management.

Instructional Objectives
This unit intends to:
● Explain the concepts of correlation and types of correlation
● Apply the methods of correlation in various applications of management

Learning Outcomes
At the end of this unit, you are expected to:
● Demonstrate the basic concepts of correlation and the different methods of
correlation used to analyse data
● Interpret the concepts of correlation in management applications such as
marketing, HRM, finance, etc.

INTRODUCTION
Correlation and regression are Bi-variate statistical tools used to analyse Bi-variate data
(Xi, Yi), i = 1, 2, ..., n. The idea underlying correlation is ‘relation’, which means dependency. Let
Y = a + bX be the equation of a straight line, in which Y is the dependent variable and X, the
variable on which Y depends, is the independent variable. Example: among the variables Price and
Demand, the Demand (Y) is the dependent variable and the Price (X) is the independent
variable. Therefore, the variables X and Y are related.

3.1.1 MEANING OF CORRELATION


A correlation is a relationship between two variables. The data can be represented by the ordered
pairs ( x, y ) where x is the independent (or explanatory) variable and y is the dependent (or
response) variable.

Correlation: If a change in one variable leads to a change in another variable, then the two
variables are said to be correlated. Examples:

• Price and demand of a product


• Volume and pressure of perfect Gas
• Demand and supply of a product
• Income and expenditure of a group of employees

3.1.1.1 TYPES OF CORRELATION

• Positive Correlation
o If the two variables deviate in the same direction, i.e., increase in one variable
leads to increase in another variable then the two variables are said to have
Positive Correlation
o Ex: Income and Expenditure of a group of employees
• Negative Correlation
o If the two variables deviate in the opposite direction, i.e., increase in one variable
leads to decrease in another variable then the two variables are said to have
Negative Correlation
o E.g.: Price and Demand of a product
• Perfectly Positive Correlation
o If the two variables deviate in the same direction with equal ratio and forms a
straight line, then the two variables are said to have Perfectly positive Correlation.
o E.g.: Income and Expenditure of a group of employees with equal ratio

• Perfectly Negative Correlation.
o If the two variables deviate in the opposite direction with equal ratio and forms a
straight line, then the two variables are said to have Perfectly negative Correlation.
o E.g.: Price and Demand of a product with equal ratio
• No or Nonsense Correlation
o If a correlation is observed between two variables that are not actually related,
such a correlation is called Nonsense Correlation
o E.g.: A correlation between the hair colours of people in the US and rainfall in India
is a nonsense correlation

Self-Assessment Questions

1. Change in one variable leads to change in another variable is called __.

A). Correlation
B). Relation
C). Parameter
D). Statistic

2. Number of types of correlation are________.

A). 4
B). 5
C). 3
D). 2

3. If two variables deviate in the same direction, it is called_______ correlation.

A). Zero
B). Negative
C). Positive
D). non-sense

4. If two variables deviate in the opposite direction, it is called_______ correlation.

A). Zero
B). Negative
C). Positive
D). Nonsense

3.1.2 MEASUREMENT OF CORRELATION

Measurement of correlation means finding the nature and the amount of correlation
in Bi-variate data (Xi, Yi), i = 1, 2, ..., n. The following methods are used to determine the
nature and the amount of correlation in Bi-variate data:

● Scatter Diagram
● Karl Pearson’s Coefficient of Correlation (K P C C)
● Spearman’s Rank Coefficient of Correlation (S R C C)

Self-Assessment Questions

5. Measurement of correlation means to find the nature of correlation and the amount
of correlation in Bi-variate data. What do you say?

A). Yes
B). No
C). May be
D). Can’t say

6. There are ______number of methods to find correlation.

A). 4
B). 5
C). 3
D). 2

3.1.3 SCATTER PLOT
● A Scatter Plot or Scatter Diagram is the diagrammatic or graphical representation of
Bi-variate data (Xi, Yi), i = 1, 2, ..., n in a two-dimensional plane.
● A Scatter Diagram shows the nature of correlation in Bi-variate data (Xi, Yi), i = 1, 2, ..., n.
● If all the points plotted for the ordered pairs lie very close to each other, then a good
amount of correlation is expected, and if the points are widely scattered, then a poor
amount of correlation is expected.

The following scatter diagrams illustrate the nature of correlation in Bi-variate data
(Xi, Yi), i = 1, 2, ..., n.

Positive Correlation

Negative Correlation

Perfectly Positive Correlation

Perfectly Negative Correlation

No Correlation

Non linear Correlation

3.1.3.1 Examples:

1. Draw the Scatter Diagram and comment on the nature of correlation for the following
Bi-variate Data.

X 1 1.5 2 2.5 3 3.5


Y 1.5 2.5 3 4 4.5 6

Solution:

The Scatter Diagram for the given data is as follows

Since Y increases as X increases (the points rise from left to right), the variables X & Y are
positively correlated.

2. Draw the Scatter Diagram and comment on the nature of correlation for the following
Bi-variate Data.

X 1 1.5 2 2.5 3 3.5


Y 7 6.5 5 4 2 1

Solution:

The scatter diagram for the given data is as follows

Since Y decreases as X increases (the points fall from left to right), the variables X & Y are
negatively correlated.

3. Draw the Scatter Diagram and comment on the nature of correlation for the following
Bi-variate Data.

X 1 2 3 4 5 6

Y 2 4 6 8 10 12

Solution:

The Scatter Diagram for the given data is as follows

Since Y increases with X in equal ratio (the points lie on a rising straight line), the variables
X & Y are perfectly positively correlated.

4. Draw the Scatter Diagram and comment on the nature of correlation for the following
Bi-variate Data.

X 1 2 3 4 5 6
Y 12 10 8 6 4 2

Solution:

The Scatter Diagram for the given data is as follows

Since Y decreases as X increases in equal ratio (the points lie on a falling straight line), the
variables X & Y are perfectly negatively correlated.

5. Draw the Scatter Diagram and comment on the nature of correlation for the following
Bi-variate Data.

X 1 4 6 8 10 12
Y 2 7 4 6 10 2

Solution:

The Scatter Diagram for the given data is as follows

Since Y neither consistently increases nor consistently decreases as X increases, the variables
X & Y have no correlation.
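The "left to right" reasoning used in the five examples above can be sketched in Python. This is only a rough heuristic for reading a scatter diagram, not a statistical measure: the function name describe_trend is hypothetical, and the sketch assumes that all X values are distinct.

```python
import math

def describe_trend(xs, ys):
    """Describe how Y moves as X increases, using slopes between consecutive points.

    Hypothetical helper for illustration; assumes all X values are distinct.
    """
    # Sort the pairs by X so that "left to right" matches the scatter diagram.
    pairs = sorted(zip(xs, ys))
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in zip(pairs, pairs[1:])
    ]
    all_up = all(s > 0 for s in slopes)
    all_down = all(s < 0 for s in slopes)
    # "Equal ratio" means every step has the same slope, i.e. a straight line.
    constant = all(math.isclose(s, slopes[0]) for s in slopes)
    if all_up:
        return "perfectly positive" if constant else "positive"
    if all_down:
        return "perfectly negative" if constant else "negative"
    return "no consistent direction"

# The five data sets from the examples above.
examples = {
    1: ([1, 1.5, 2, 2.5, 3, 3.5], [1.5, 2.5, 3, 4, 4.5, 6]),
    2: ([1, 1.5, 2, 2.5, 3, 3.5], [7, 6.5, 5, 4, 2, 1]),
    3: ([1, 2, 3, 4, 5, 6], [2, 4, 6, 8, 10, 12]),
    4: ([1, 2, 3, 4, 5, 6], [12, 10, 8, 6, 4, 2]),
    5: ([1, 4, 6, 8, 10, 12], [2, 7, 4, 6, 10, 2]),
}
results = {k: describe_trend(xs, ys) for k, (xs, ys) in examples.items()}
for number, label in results.items():
    print(number, label)
```

Real data rarely rise or fall at every single step, so this check is far stricter than judging a scatter diagram by eye; the amount of correlation is better measured with a coefficient.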

Self-Assessment Questions

7. Scatter Diagram is the graphical representation of Correlation. What do you say?

A). Yes
B). No
C). Maybe
D). Can’t say

8. Scatter Diagram explains the Nature of Correlation. What do you say?

A). Yes
B). No
C). Maybe
D). Can’t say

9. If all the ordered pairs in the Bi-variate data are very close to each other, then ____
amount of correlation is expected.

A). Worst
B). No
C). Poor
D). Good

10. If all the ordered pairs in the Bi-variate data are widely scattered, then ____amount
of correlation is expected.

A). Worst
B). No
C). Poor
D). Good

3.1.4 Pearson Correlation Coefficient
● A Scatter Diagram shows only the nature of correlation (positive, negative, perfectly
positive, etc.) but not the amount of correlation between the two variables in the
Bi-variate data.
● Karl Pearson’s Coefficient of Correlation (K P C C), developed by Karl Pearson, is used to
determine not only the nature but also the amount of correlation in Bi-variate data.
● The K P C C is a measure of the linear relation between two variables X & Y in the
Bi-variate data.
● The K P C C is represented by ρxy or Rxy:

$$R_{xy} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y}$$

$$R_{xy} = \frac{\frac{1}{n}\sum xy - \bar{x}\,\bar{y}}{\sqrt{\frac{1}{n}\sum x^2 - (\bar{x})^2}\;\sqrt{\frac{1}{n}\sum y^2 - (\bar{y})^2}}$$

$$\bar{x} = \frac{1}{n}\sum x_i, \qquad \bar{y} = \frac{1}{n}\sum y_i$$

● n is the number of pairs of observations
● cov(x, y) is the covariance of x and y
● σx is the Standard Deviation of X
● σy is the Standard Deviation of Y

3.1.4.1 Properties of Correlation

● The Correlation (KPCC or SRCC) always lies between −1 and +1, i.e., −1 ≤ ρxy ≤ +1
● If −1 < ρxy < 0, the variables x and y are Negatively Correlated.
● If 0 < ρxy < 1, the variables x and y are Positively Correlated.
● If ρxy = −1, the variables x and y are Perfectly Negatively Correlated.
● If ρxy = +1, the variables x and y are Perfectly Positively Correlated.
● If ρxy = 0, the variables x and y have No Correlation
● If ρxy is between 0 and 0.3 (or between 0 and −0.3) – Poor Correlation
● If ρxy is between 0.4 and 0.7 (or between −0.4 and −0.7) – Moderate Correlation
● If ρxy is 0.8 or above (or −0.8 or below) – High Correlation

3.1.4.2 Examples

1. Calculate the K P C C for the following data:

∑X = 15, ∑Y = 15, ∑XY = 44, ∑X² = 49, ∑Y² = 49 and n = 5.

Solution

Given that ∑X = 15, ∑Y = 15, ∑XY = 44, ∑X² = 49, ∑Y² = 49 and n = 5.

The KPCC is

$$R_{xy} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} = \frac{\frac{1}{n}\sum xy - \bar{x}\,\bar{y}}{\sqrt{\frac{1}{n}\sum x^2 - (\bar{x})^2}\;\sqrt{\frac{1}{n}\sum y^2 - (\bar{y})^2}}$$

$$\bar{x} = \frac{1}{5}(15) = 3, \qquad \bar{y} = \frac{1}{5}(15) = 3$$

$$\rho_{xy} = \frac{\frac{1}{5}(44) - (3)(3)}{\sqrt{\frac{1}{5}(49) - 9}\;\sqrt{\frac{1}{5}(49) - 9}} = \frac{8.8 - 9}{\sqrt{0.8}\,\sqrt{0.8}} = \frac{-0.2}{(0.894)(0.894)} = \frac{-0.2}{0.799} = -0.250$$

Since ρxy = −0.25, the variables x and y are negatively correlated.
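The calculation in Example 1 can be checked with a short Python sketch that works directly from the summary totals. The function name pearson_from_sums and its parameter names are illustrative assumptions, not part of the course material.

```python
import math

def pearson_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Karl Pearson's coefficient computed from summary totals (hypothetical helper)."""
    mean_x = sum_x / n
    mean_y = sum_y / n
    # Covariance: (1/n) * sum(xy) - x_bar * y_bar
    cov_xy = sum_xy / n - mean_x * mean_y
    # Variances: (1/n) * sum(x^2) - x_bar^2, and likewise for y
    var_x = sum_x2 / n - mean_x ** 2
    var_y = sum_y2 / n - mean_y ** 2
    return cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))

# Totals given in Example 1.
r = pearson_from_sums(n=5, sum_x=15, sum_y=15, sum_xy=44, sum_x2=49, sum_y2=49)
print(round(r, 3))  # -0.25
```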

2. In a marketing survey, the prices of milk and coffee in a town, based on quality, were found
to be as shown below. Is there any relation between the price of milk and the price of coffee?

Price of Milk      89   90   95   70   60   75   50
Price of Coffee   120  134  150  115  110  140  100

Solution
The number of pairs of observations is n = 7. Let Xi denote the price of milk and Yi the price
of coffee:

Xi :  89   90   95   70   60   75   50
Yi : 120  134  150  115  110  140  100

To determine the relation between the price of milk and the price of coffee, we use
Karl Pearson’s Coefficient of Correlation (KPCC):

$$R_{xy} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} = \frac{\frac{1}{n}\sum xy - \bar{x}\,\bar{y}}{\sqrt{\frac{1}{n}\sum x^2 - (\bar{x})^2}\;\sqrt{\frac{1}{n}\sum y^2 - (\bar{y})^2}}$$

The given data can be tabulated as follows:

Xi    Yi     X²      Y²       XY
89    120    7921    14400    10680
90    134    8100    17956    12060
95    150    9025    22500    14250
70    115    4900    13225    8050
60    110    3600    12100    6600
75    140    5625    19600    10500
50    100    2500    10000    5000

∑X = 529   ∑Y = 869   ∑X² = 41671   ∑Y² = 109781   ∑XY = 67140

From the table, ∑X = 529, ∑Y = 869, ∑X² = 41671, ∑Y² = 109781, ∑XY = 67140 and n = 7.

$$\bar{x} = \frac{1}{n}\sum X = \frac{1}{7}(529) = 75.57, \qquad \bar{y} = \frac{1}{n}\sum Y = \frac{1}{7}(869) = 124.14$$

$$\rho_{xy} = \frac{\frac{1}{7}(67140) - (75.57)(124.14)}{\sqrt{\frac{1}{7}(41671) - 5710.82}\;\sqrt{\frac{1}{7}(109781) - 15410.73}} = \frac{9591.42 - 9381.25}{(15.56)(16.50)} = \frac{210.17}{256.74} = 0.8186$$

Since ρxy = 0.8186, the price of milk and the price of coffee are strongly positively correlated.

3. Calculate the coefficient of correlation between the age of cars and their annual maintenance
cost, and comment.

Age of Cars (Yrs)                        2     4     6     7     8     10    12
Annual maintenance cost (Rupees)      1600  1500  1800  1900  1700  2100  2000

Solution

Given that

Xi :    2     4     6     7     8     10    12
Yi : 1600  1500  1800  1900  1700  2100  2000

where n = number of pairs of observations = 7.

$$R_{xy} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} = \frac{\frac{1}{n}\sum xy - \bar{x}\,\bar{y}}{\sqrt{\frac{1}{n}\sum x^2 - (\bar{x})^2}\;\sqrt{\frac{1}{n}\sum y^2 - (\bar{y})^2}}$$

The given data can be tabulated as follows:

X     Y       X²     Y²         XY
2     1600    4      2560000    3200
4     1500    16     2250000    6000
6     1800    36     3240000    10800
7     1900    49     3610000    13300
8     1700    64     2890000    13600
10    2100    100    4410000    21000
12    2000    144    4000000    24000

∑X = 49   ∑Y = 12600   ∑X² = 413   ∑Y² = 22960000   ∑XY = 91900

$$\bar{x} = \frac{1}{n}\sum X = \frac{1}{7}(49) = 7, \qquad \bar{y} = \frac{1}{n}\sum Y = \frac{1}{7}(12600) = 1800$$

$$R_{xy} = \frac{\frac{1}{7}(91900) - (7)(1800)}{\sqrt{\frac{1}{7}(413) - 49}\;\sqrt{\frac{1}{7}(22960000) - 3240000}} = \frac{13128.57 - 12600}{\sqrt{10}\,\sqrt{40000}} = \frac{528.57}{632.46} = 0.836$$

Since Rxy = 0.836, there exists a strong positive correlation between the age of cars and their
annual maintenance cost.
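Examples 2 and 3 start from raw observations rather than totals. A minimal Python sketch of the same formula follows; the function name pearson_r is an assumption made for illustration. Because the code does not round the means before multiplying, the milk and coffee result can differ from the hand calculation in the fourth decimal place.

```python
import math

def pearson_r(xs, ys):
    """Karl Pearson's coefficient from paired observations (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    var_x = sum(x * x for x in xs) / n - mean_x ** 2
    var_y = sum(y * y for y in ys) / n - mean_y ** 2
    return cov_xy / math.sqrt(var_x * var_y)

# Example 2: price of milk and price of coffee.
milk = [89, 90, 95, 70, 60, 75, 50]
coffee = [120, 134, 150, 115, 110, 140, 100]

# Example 3: age of cars (years) and annual maintenance cost (rupees).
age = [2, 4, 6, 7, 8, 10, 12]
cost = [1600, 1500, 1800, 1900, 1700, 2100, 2000]

r_milk_coffee = pearson_r(milk, coffee)
r_age_cost = pearson_r(age, cost)
print(round(r_milk_coffee, 3))  # 0.818
print(round(r_age_cost, 3))     # 0.836
```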

Self-Assessment Questions

11. Range of Correlation is___________.

A). −1 ≤ ρ xy ≤ + 1


B). −1 < ρ xy < 0
C). −1 < ρ xy < − 2
D). 0 < ρ xy < 1

12. If 0 < ρ xy < 1 , the variables x and y are __________ Correlated.

A). Zero
B). No
C). Positively
D). Negatively

13. If −1 < ρ xy < 0 , the variables x and y are __________ Correlated.

A). Zero
B). No
C). Positively
D). Negatively

14. If ρ xy = + 1 , the variables x and y are __________ Correlated.

A). Perfectly positive


B). Perfectly Negative
C). Poor
D). Good

3.1.5 Spearman’s Rank Correlation Coefficient (SRCC)
● The S R C C is used to calculate the correlation for Qualitative Bi-variate Data.
● First assign ranks to the Xi and Yi observations.
● The SRCC is

$$\rho_{xy} = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$

where di = Rx − Ry (the difference between the two ranks of each pair) and
n = number of pairs of observations.
3.1.5.1 Properties of Rank Correlation

● The Correlation (KPCC or SRCC) always lies between −1 and +1, i.e., −1 ≤ ρxy ≤ +1
● If −1 < ρxy < 0, the variables x and y are Negatively Correlated.
● If 0 < ρxy < 1, the variables x and y are Positively Correlated.
● If ρxy = −1, the variables x and y are Perfectly Negatively Correlated.
● If ρxy = +1, the variables x and y are Perfectly Positively Correlated.
● If ρxy = 0, the variables x and y have No Correlation
● If ρxy is between 0 and 0.3 (or between 0 and −0.3) – Poor Correlation
● If ρxy is between 0.4 and 0.7 (or between −0.4 and −0.7) – Moderate Correlation
● If ρxy is 0.8 or above (or −0.8 or below) – High Correlation

Examples
1. Following are the marks obtained by 10 students in a class in two tests. Calculate the Rank
Correlation.

Test – I    70  68  67  55  60  60  75  63  60  72
Test – II   65  65  80  60  68  58  75  63  60  70

Solution:
Given that

X   70  68  67  55  60  60  75  63  60  72
Y   65  65  80  60  68  58  75  63  60  70

n = number of pairs of observations = 10

The rank correlation is

$$\rho_{xy} = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}, \qquad d_i = R_x - R_y$$

Ranks are assigned in descending order (the highest mark gets rank 1). Tied values share the
average of the ranks they would occupy; for example, the three marks of 60 in X would occupy
ranks 7, 8 and 9, so each receives rank 8.

The given data can be tabulated as follows:

X     Y     Rx    Ry     di = Rx − Ry    di²
70    65    3     5.5    −2.5            6.25
68    65    4     5.5    −1.5            2.25
67    80    5     1      4               16
55    60    10    8.5    1.5             2.25
60    68    8     4      4               16
60    58    8     10     −2              4
75    75    1     2      −1              1
63    63    6     7      −1              1
60    60    8     8.5    −0.5            0.25
72    70    2     3      −1              1
                                         ∑di² = 50

$$\rho_{xy} = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)} = 1 - \frac{6 \times 50}{10 \times 99} = 1 - \frac{300}{990} = 1 - 0.303 = 0.697$$

Therefore, the Test I and Test II marks are positively correlated.
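The ranking step, including shared ranks for tied marks, can be sketched in Python as follows. The helper names average_ranks and spearman_rho are assumptions for illustration. The sketch applies the same simple formula as the worked solution, which is only approximate when ranks are tied, so a statistical package that corrects for ties may report a slightly different value.

```python
def average_ranks(values):
    """Rank values in descending order (largest = rank 1); ties share the average rank."""
    ordered = sorted(values, reverse=True)
    ranks = {}
    for v in set(values):
        first = ordered.index(v) + 1        # first position occupied by v
        count = ordered.count(v)            # how many times v occurs
        ranks[v] = first + (count - 1) / 2  # mean of positions first .. first+count-1
    return [ranks[v] for v in values]

def spearman_rho(xs, ys):
    """Spearman's rank coefficient using 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

test1 = [70, 68, 67, 55, 60, 60, 75, 63, 60, 72]
test2 = [65, 65, 80, 60, 68, 58, 75, 63, 60, 70]

ranks_test1 = average_ranks(test1)
ranks_test2 = average_ranks(test2)
rho = spearman_rho(test1, test2)
print(ranks_test1)
print(ranks_test2)
print(round(rho, 3))  # 0.697
```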

2. Ten competitors in a musical test were ranked by 3 judges in the following order.

Ranks by A 1 6 5 10 3 2 4 9 7 8
Ranks by B 3 5 8 4 7 10 2 1 6 9
Ranks by C 6 4 9 8 1 2 3 10 5 7

Use Rank Correlation and determine which pair of judges have the nearest approach in their
judgment.

Solution

Given that the data

RA 1 6 5 10 3 2 4 9 7 8

RB 3 5 8 4 7 10 2 1 6 9

RC 6 4 9 8 1 2 3 10 5 7

To determine the pair of judges having the nearest approach in their judgment, we have to find
the rank correlations between the pairs of judges (A, B), (B, C) and (A, C) respectively.

The Spearman’s Rank Coefficient of Correlation (SRCC) is

$$\rho_{xy} = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$

The given data can be tabulated as follows:

RA    RB    RC    d1 = RA − RB    d1²    d2 = RB − RC    d2²    d3 = RC − RA    d3²
1     3     6     −2              4      −3              9      5               25
6     5     4     1               1      1               1      −2              4
5     8     9     −3              9      −1              1      4               16
10    4     8     6               36     −4              16     −2              4
3     7     1     −4              16     6               36     −2              4
2     10    2     −8              64     8               64     0               0
4     2     3     2               4      −1              1      −1              1
9     1     10    8               64     −9              81     1               1
7     6     5     1               1      1               1      −2              4
8     9     7     −1              1      2               4      −1              1

∑d1² = 200   ∑d2² = 214   ∑d3² = 60

With n = 10, n(n² − 1) = 990:

ρAB = 1 − 6(200)/990 = −0.212
ρBC = 1 − 6(214)/990 = −0.297
ρAC = 1 − 6(60)/990 = 0.636

0.636 > −0.212 > −0.297
ρAC > ρAB > ρBC
Therefore, judges A and C have the nearest approach in their judgment.
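Since the judges' data are already ranks, the comparison reduces to applying the SRCC formula to each pair. The following Python sketch does this; the function name rho_from_ranks and the variable names are illustrative assumptions.

```python
def rho_from_ranks(r1, r2):
    """Spearman's rank coefficient for two lists of ranks (no ties assumed)."""
    n = len(r1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

judge_a = [1, 6, 5, 10, 3, 2, 4, 9, 7, 8]
judge_b = [3, 5, 8, 4, 7, 10, 2, 1, 6, 9]
judge_c = [6, 4, 9, 8, 1, 2, 3, 10, 5, 7]

pairs = {
    "AB": rho_from_ranks(judge_a, judge_b),
    "BC": rho_from_ranks(judge_b, judge_c),
    "AC": rho_from_ranks(judge_a, judge_c),
}
# The pair with the largest coefficient agrees most closely.
closest = max(pairs, key=pairs.get)
for name, value in pairs.items():
    print(name, round(value, 3))
print("Closest pair:", closest)
```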

Self-Assessment Questions

15. Rank Correlation is used to calculate the correlation between ____ Bi-variate data.

A). Quantitative
B). Numeric
C). Qualitative
D). Alphabet

16. If ρ xy = 0 , the variables x and y are __________ Correlated.

A). Some
B). Not
C). Positively
D). Negatively

17. If ρ xy = 0.95 , the variables x and y are __________ Correlated.

A). Zero
B). Moderate
C). Low
D). Highly

18. If ρ xy = − 1 , the variables x and y are __________ Correlated.

A). Perfectly positive


B). Perfectly Negative
C). Poor
D). Good

Summary

• The unit introduces the concept of correlation and its applications in the
field of management.
• Basically, correlation is of two types: positive correlation and negative correlation.
• The Scatter Plot, Karl Pearson’s Coefficient of Correlation (KPCC) and Spearman’s Rank
Correlation Coefficient (SRCC) are the methods used to determine the nature and amount of
correlation in Bi-variate data.

Terminal Questions

1. Differentiate between Pearson’s and Spearman’s correlation.


2. Explain:
a) Positive correlation
b) Negative correlation
3. Following are the marks obtained by 10 students in a class in two tests. Calculate the
correlation coefficient between the marks of two tests.

Test- 1 70 68 67 55 60 60 75 63 60 72
Test-2 65 65 80 60 68 58 75 63 60 70

4. Following are the ranks given by two judges for 12 Contestants in a singing
competition. Find out whether judges agree or not.

X 1 9 2 10 3 11 8 4 12 7 5 6
Y 2 9 1 7 4 10 8 3 12 6 5 10

Answer Keys

Self-Assessment Questions

Question No Answers

1 A

2 B

3 C

4 B

5 A

6 C

7 A

8 A

9 D

10 C

11 A

12 C

13 D

14 A

15 C

16 B

17 D

18 B

Glossary
• Correlation: A statistical term describing the degree to which two variables
move in coordination with one another.
• Positive Correlation: A relationship between two variables that tend to move
in the same direction.
• Bi-variate data: Data on each of two variables, where each value of one of the
variables is paired with a value of the other variable.

BIBLIOGRAPHY
1. Chung, K. L. (2000). A Course in Probability Theory (3rd ed.). San Diego, CA:
Academic Press.
2. Feller, W. (1968). An introduction to Probability Theory and its applications:
Volume I. John Wiley & Sons.
3. Gupta, S. C., & Kapoor, V. K. Fundamentals of Mathematical Statistics. S. Chand
Publications.

External Resources
1. Rukmangadachari, E. Probability and Statistics. Pearson Education.
2. Rohatgi, V. K., & Ehsanes Saleh, A. K. (2015). An introduction to probability and
statistics (3rd ed.). doi:10.1002/9781118799635

e-References
• Measurement of Correlation: https://theintactone.com/2019/02/10/qt-u2-topic-3-measurement-of-correlation-karl-pearsons-method-spearman-rank-correlation/
• Pearson Correlation Coefficient: https://www.scribbr.com/statistics/pearson-correlation-coefficient/

Image Credits

Curves are drawn from: https://www.embibe.com/exams/correlation/

Video Links

Topic | Link
Types of correlation & what is correlation coefficient: Correlation and Regression | https://www.youtube.com/watch?v=8dPkvu4gAvc&t=85s
Correlation and Regression Analysis: Simplest Way to Learn with Examples | https://www.youtube.com/watch?v=xTpHD5WLuoA
Introduction to Correlation & Regression, Part 1 | https://www.youtube.com/watch?v=z7kMeJQWr4Y

Keywords

• Negative Correlation
• Rank Correlation
• No Correlation
• Rank Coefficient
• Perfectly Positive Correlation

QUANTITATIVE METHODS

Module - 3
Unit - 2

REGRESSION

Unit Table of Contents
Unit 3.2 Regression

Aim
Instructional Objectives
Learning Outcomes
Introduction
3.2.1 Meaning
Self-Assessment Questions
3.2.2 Types of Regression
Self-Assessment Questions
3.2.3 Estimating the regression coefficients
Self-Assessment Questions
Summary
Terminal Questions
Answer Keys
Glossary
Bibliography
External Resources
e-References
Image Credits
Video Links
Keywords

Aim
This unit aims to explain the basic concepts of Regression and its applications in
Management.

Instructional Objectives
This unit intends to:
● Explain the concepts of Regression and types of Regression
● Apply the methods of Regression in various applications of management

Learning Outcomes
At the end of this unit, you are expected to:
● Elaborate upon the different types of Regression used to analyse data
● Optimise using Regression in Marketing, HRM, finance, etc.

INTRODUCTION
After understanding the relationship in the Bi-variate data (Xi, Yi), i = 1, 2, ..., n, we may
be interested in estimating or predicting the value of one variable given the value of the other.
The variable that is predicted is called the ‘Dependent’ or ‘Explained’ variable,
and the variable used to predict it is called the ‘Independent’ or ‘Predicting’ variable.

The prediction is based on an average relationship derived statistically by Regression Analysis.
The equation, linear or otherwise, is called the regression equation or explaining equation.

3.2.1 MEANING
● The literal meaning of Regression is ‘stepping back towards the average value’. That means
Regression is used to determine the average relationship between two variables.
● Correlation measures the strength and direction of the relationship between two variables,
whereas Regression is used to determine the amount of dependency between the two variables.
● Let Y = a + bX be the equation of a straight line, where
Y – Dependent Variable
X – Independent Variable
a – Intercept
b – Slope

Self-Assessment Questions

1. ‘Stepping Back towards Average Value’ is the meaning of __.

A). Correlation
B). Relation
C). Parameter
D). Regression

2. In Y = a + bX , a stands for __.

A). Intercept
B). Slope
C). Correlation
D). Regression

3. In Y = a + bX , b stands for __.

A). Intercept
B). Slope
C). Correlation
D). Regression

4. In Y = a + bX , Y is _____Variable.

A). Intercept
B). Slope
C). Dependent
D). Independent

5. In Y = a + bX , X is _____Variable.

A). Intercept
B). Slope
C). Dependent
D). Independent

3.2.2 TYPES OF REGRESSION
Regression is classified into
● Linear Regression
● Multiple Regression
● Non-Linear Regression

3.2.2.1 Linear Regression

Linear relationships are based on a straight-line trend, the equation of which has no power higher
than one. A linear relationship can be either simple or multiple. A linear relationship is normally
preferred because, besides its simplicity, it has good predictive value. A linear trend
can easily be projected into the future.

[Figure: scatter plot of data points with a line of best fit, a 95% confidence band and an
outlier. y-axis: dependent variable (the thing we want to explain); x-axis: independent variable
(the factor we think might influence the dependent variable). Annotations in the figure:
● R² = 0.77, i.e., 77% of the variance in y is explained by x. Below c. 30% means the variables
are hardly connected; above 95% they are practically the same.
● Line of best fit: if you only had data on x, this line provides your best estimate of y. If the
fit is strong and there are no major outliers, x could be used as a surrogate or forecast of y.
● 95% confidence band: if a data point falls outside these lines, you can be 95% sure there is
something special about it causing it to do better or worse than the others; it is an ‘outlier’
worth understanding.]

Fig. 1: Linear Regression


Types of Linear Regression

Linear regression is of two types:

• Regression Line of ‘Y on X’

● Let Y = a + bX be the equation of a straight line, where Y is the dependent
variable and X is the independent variable.
● Then the regression line of Y on X is represented as

$$y - \bar{y} = b_{yx}(x - \bar{x})$$

where byx is called the Regression Coefficient of Y on X and is given by

$$b_{yx} = \rho_{xy}\,\frac{\sigma_y}{\sigma_x} = \frac{\operatorname{cov}(x, y)}{\sigma_x^2}$$

• Regression Line of ‘X on Y’

● Let X = a + bY be the equation of a straight line, where X is the dependent
variable and Y is the independent variable.
● Then the regression line of X on Y is represented as

$$x - \bar{x} = b_{xy}(y - \bar{y})$$

where bxy is called the Regression Coefficient of X on Y and is given by

$$b_{xy} = \rho_{xy}\,\frac{\sigma_x}{\sigma_y} = \frac{\operatorname{cov}(x, y)}{\sigma_y^2}$$

3.2.2.2 Multiple Regression

● One dependent variable with multiple independent variables.
● Let y be the dependent variable and x1, x2, x3, ..., xn be n independent
variables; then the multiple regression equation can be represented as

$$y = a_0 + a_1 x_1 + a_2 x_2 + \dots + a_n x_n$$

where y is the dependent variable and x1, x2, x3, ..., xn are the independent variables.

[Figure: diagram with the labels Body length, Tail length and Mouse weight]

Fig. 2: Multiple weight

3.2.2.3 Non-Linear Regression

● Second-degree Parabola: $y = a + bx + cx^2$
● nth-degree Polynomial: $y = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n$
● Exponential Curve: $y = ae^{x}$
● Power Curve: $y = ax^{b}$

Self-Assessment Questions

6. Linear Regression is based on a straight-line trend. What do you say?

A). Yes
B). No
C). May be
D). Never

7. Number of Regression Lines are _________.

A). 4
B). 5
C). 3
D). 2

8. Number of Dependent variables in multiple Regression are__________.

A). 2
B). 1
C). 3
D). 4

9. y = ae^x is the form of ______Curve.

A). Straight Line


B). Power
C). Exponential
D). Polynomial

10. y = ax^b is the form of ______Curve.

A). Straight Line


B). Power
C). Exponential
D). Polynomial

3.2.3 ESTIMATING THE REGRESSION COEFFICIENTS
Regression coefficients can be estimated by the direct method and by the principle of least squares,
which minimises the sum of squares of the deviations of the actual values from their estimated values.

Example 1

Estimate the regression equations of Y on X and of X on Y, given the following data: $\sum X = 15$, $\sum Y = 15$, $\sum XY = 44$, $\sum X^{2} = 49$, $\sum Y^{2} = 49$ and $n = 5$.

Solution

Given data: $\sum X = 15$, $\sum Y = 15$, $\sum XY = 44$, $\sum X^{2} = 49$, $\sum Y^{2} = 49$ and $n = 5$.

The regression line of X on Y is
$$x - \bar{x} = b_{xy}\,(y - \bar{y}), \qquad b_{xy} = \rho_{xy}\,\frac{\sigma_x}{\sigma_y} = \frac{\operatorname{cov}(x, y)}{\sigma_y^{2}}$$

$$\bar{X} = \frac{1}{n}\sum X = \frac{1}{5}(15) = 3, \qquad \bar{Y} = \frac{1}{n}\sum Y = \frac{1}{5}(15) = 3$$

$$b_{xy} = \frac{\frac{1}{n}\sum xy - \bar{x}\,\bar{y}}{\frac{1}{n}\sum y^{2} - \bar{y}^{2}} = \frac{\frac{1}{5}(44) - (3)(3)}{\frac{1}{5}(49) - 3^{2}} = \frac{8.8 - 9}{9.8 - 9} = \frac{-0.2}{0.8} = -0.25$$

Substituting into the regression line of X on Y:

$$X - 3 = -0.25\,(Y - 3)$$
$$X - 3 = -0.25Y + 0.75$$
$$X + 0.25Y - 3.75 = 0$$

which is the required regression line of X on Y.

The regression line of Y on X is
$$y - \bar{y} = b_{yx}\,(x - \bar{x}), \qquad \text{where } b_{yx} = \frac{\operatorname{cov}(x, y)}{\sigma_x^{2}}$$

$$b_{yx} = \frac{\frac{1}{n}\sum xy - \bar{x}\,\bar{y}}{\frac{1}{n}\sum x^{2} - \bar{x}^{2}} = \frac{\frac{1}{5}(44) - (3)(3)}{\frac{1}{5}(49) - 3^{2}} = \frac{8.8 - 9}{9.8 - 9} = -0.25$$

Substituting into the regression line of Y on X:

$$Y - 3 = -0.25\,(X - 3)$$
$$Y - 3 = -0.25X + 0.75$$
$$0.25X + Y - 3.75 = 0$$

which is the required regression line of Y on X.
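The arithmetic of Example 1 can be checked from the summary totals alone. This is a minimal Python sketch; the function name `coefficients_from_sums` is a hypothetical helper introduced for illustration.

```python
def coefficients_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Return (mean_x, mean_y, b_yx, b_xy) from the summary totals."""
    mean_x = sum_x / n
    mean_y = sum_y / n
    cov_xy = sum_xy / n - mean_x * mean_y
    var_x = sum_x2 / n - mean_x ** 2
    var_y = sum_y2 / n - mean_y ** 2
    return mean_x, mean_y, cov_xy / var_x, cov_xy / var_y


# Totals given in Example 1.
mean_x, mean_y, b_yx, b_xy = coefficients_from_sums(5, 15, 15, 44, 49, 49)

# Rewrite each line in intercept form.
intercept_y_on_x = mean_y - b_yx * mean_x   # Y = intercept + b_yx * X
intercept_x_on_y = mean_x - b_xy * mean_y   # X = intercept + b_xy * Y
print(round(b_yx, 4), round(b_xy, 4))
print(round(intercept_y_on_x, 4), round(intercept_x_on_y, 4))
```

Both coefficients coincide here only because $\sum X^{2} = \sum Y^{2}$ and $\bar{X} = \bar{Y}$ in this data set; in general the two regression lines differ.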

Example 2:
From the following data, obtain the two regression lines

Sales 91 97 108 121 67 124 51 73 111 57


Purchases 71 75 69 97 70 91 39 61 80 47

Solution:
Let X be the sales and Y be the Purchases

The regression line of Y on X is given by
$$y - \bar{y} = b_{yx}\,(x - \bar{x})$$
where
$$b_{yx} = \frac{\operatorname{cov}(x, y)}{\sigma_x^{2}} = \frac{\frac{1}{n}\sum xy - \bar{x}\,\bar{y}}{\frac{1}{n}\sum x^{2} - \bar{x}^{2}}$$

Now the given data can be tabulated as follows:

X      Y      X²       Y²      XY
91     71     8281     5041    6461
97     75     9409     5625    7275
108    69     11664    4761    7452
121    97     14641    9409    11737
67     70     4489     4900    4690
124    91     15376    8281    11284
51     39     2601     1521    1989
73     61     5329     3721    4453
111    80     12321    6400    8880
57     47     3249     2209    2679
ΣX = 900   ΣY = 700   ΣX² = 87360   ΣY² = 51868   ΣXY = 66900

From the given data, n = 10.

$$\bar{X} = \frac{1}{n}\sum X = \frac{1}{10}(900) = 90, \qquad \bar{Y} = \frac{1}{n}\sum Y = \frac{1}{10}(700) = 70$$

$$b_{yx} = \frac{\frac{1}{n}\sum xy - \bar{x}\,\bar{y}}{\frac{1}{n}\sum x^{2} - \bar{x}^{2}} = \frac{\frac{1}{10}(66900) - (90)(70)}{\frac{1}{10}(87360) - 90^{2}} = \frac{6690 - 6300}{8736 - 8100} = \frac{390}{636} = 0.6132$$

$$b_{xy} = \frac{\frac{1}{n}\sum xy - \bar{x}\,\bar{y}}{\frac{1}{n}\sum y^{2} - \bar{y}^{2}} = \frac{\frac{1}{10}(66900) - (90)(70)}{\frac{1}{10}(51868) - 70^{2}} = \frac{390}{286.8} = 1.360$$

The regression line of Sales (X) on Purchases (Y) is given by
$$x - \bar{x} = b_{xy}\,(y - \bar{y})$$
$$X - 90 = 1.360\,(Y - 70)$$
$$X = 1.360Y - 5.20$$
which is the required regression line of Sales (X) on Purchases (Y).

The regression line of Purchases (Y) on Sales (X) is given by
$$y - \bar{y} = b_{yx}\,(x - \bar{x})$$
$$Y - 70 = 0.6132\,(X - 90)$$
$$Y = 0.6132X + 14.812$$
which is the required regression line of Purchases (Y) on Sales (X).

Example 3:

The following data give the advertisement expenditure (in crores) and the sales (in tons) of a detergent. Fit a regression equation and estimate the likely sales when the advertisement expenditure is 100 crores.

Ads ( X ) 20 43 63 26 53 31 58 46 58 70

Sales (Y ) 120 128 141 126 134 128 136 132 140 144

Solution:
The regression line of Sales (Y) on Ads (X) is given by
$$y - \bar{y} = b_{yx}\,(x - \bar{x})$$
where
$$b_{yx} = \frac{\operatorname{cov}(x, y)}{\sigma_x^{2}} = \frac{\frac{1}{n}\sum xy - \bar{x}\,\bar{y}}{\frac{1}{n}\sum x^{2} - \bar{x}^{2}}$$

Now the given data can be tabulated as follows:

X      Y      X²      Y²       XY
20     120    400     14400    2400
43     128    1849    16384    5504
63     141    3969    19881    8883
26     126    676     15876    3276
53     134    2809    17956    7102
31     128    961     16384    3968
58     136    3364    18496    7888
46     132    2116    17424    6072
58     140    3364    19600    8120
70     144    4900    20736    10080
ΣX = 468   ΣY = 1329   ΣX² = 24408   ΣY² = 177137   ΣXY = 63293

From the given data, n = 10.

$$\bar{X} = \frac{1}{n}\sum X = \frac{1}{10}(468) = 46.8, \qquad \bar{Y} = \frac{1}{n}\sum Y = \frac{1}{10}(1329) = 132.9$$

$$b_{yx} = \frac{\frac{1}{n}\sum xy - \bar{x}\,\bar{y}}{\frac{1}{n}\sum x^{2} - \bar{x}^{2}} = \frac{\frac{1}{10}(63293) - (46.8)(132.9)}{\frac{1}{10}(24408) - (46.8)^{2}} = \frac{6329.3 - 6219.72}{2440.8 - 2190.24} = \frac{109.58}{250.56} = 0.437$$

The regression line of Sales (Y) on Ads (X) is given by
$$y - \bar{y} = b_{yx}\,(x - \bar{x})$$
$$Y - 132.9 = 0.437\,(X - 46.8)$$
$$Y - 132.9 = 0.437X - 20.452$$
$$0.437X - Y + 112.448 = 0$$
which is the required regression line of Sales (Y) on Ads (X).

If X (Ads) = 100, the estimated sales (Y) are
$$Y = 0.437(100) + 112.448 = 43.7 + 112.448 = 156.148 \approx 156.15$$

Hence, if the ad budget is Rs 100 crores, the estimated sales are about 156.15 tons.
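The fit and the forecast in Example 3 can be reproduced from the raw data. The sketch below is a minimal Python illustration (the function name `fit_y_on_x` is hypothetical). It keeps the slope unrounded, so its estimate differs slightly from the hand calculation, which rounds $b_{yx}$ to 0.437 before substituting.

```python
def fit_y_on_x(xs, ys):
    """Return (intercept, slope) of the regression line of Y on X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    var_x = sum(x * x for x in xs) / n - mean_x ** 2
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Data from Example 3.
ads = [20, 43, 63, 26, 53, 31, 58, 46, 58, 70]
sales = [120, 128, 141, 126, 134, 128, 136, 132, 140, 144]

intercept, slope = fit_y_on_x(ads, sales)
estimate = intercept + slope * 100  # sales when the ad expenditure is 100
print(round(slope, 4), round(intercept, 3), round(estimate, 2))
```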

Self-Assessment Questions

11. The regression coefficient of Y on X is denoted by__________.

A). byx
B). bxy
C). bxx
D). byy

12. The regression coefficient of X on Y is denoted by___________.

A). byx
B). bxy
C). bxx
D). byy

13. Standard deviation of x is denoted by__________.

A). $\sigma_x$
B). $\sigma_y$
C). $\sigma_x^{2}$
D). $\sigma_y^{2}$

14. Variance of x is denoted by_____.

A). $\sigma_x$
B). $\sigma_y$
C). $\sigma_x^{2}$
D). $\sigma_y^{2}$

15. The symbol Σ is used for ________.

A). Subtraction
B). Division
C). Product
D). Sum

Summary

• This unit introduces the concept of regression and its applications in the field of management.
• Regression is classified into linear, multiple and non-linear regression.
• Linear regression has two regression lines: (i) the regression line of Y on X and (ii) the regression line of X on Y.

Terminal Questions

1. Differentiate between Correlation and Regression.


2. Explain different types of Regression.
3. A panel of two judges, P and Q, graded 7 dramatic performances by independently awarding marks as follows.

Performance   1    2    3    4    5    6    7
Marks by P    46   42   44   40   43   41   45
Marks by Q    40   38   36   35   39   37   41

Judge Q did not attend the 8th performance, to which Judge P awarded 37 marks. Estimate the marks Judge Q would have awarded to the 8th performance.

4. Price indices of cotton and wool for 12 months are given as

Price index of Cotton (X)   78 7 85 88 87 82 81 77 76 83 97 93
Price index of Wool (Y)     84 82 82 85 89 90 88 92 83 89 98 99

Obtain the equations of the lines of regression between the indices.

Answer Keys

Self-Assessment Questions

Question No Answers

1 D

2 A

3 B

4 C

5 D

6 A

7 D

8 B

9 C

10 B

11 A

12 B

13 A

14 C

15 D

Glossary
• Regression: A statistical technique that relates a dependent variable to one or
more independent (explanatory) variables.
• Linear Regression: Used to predict the value of a variable based on the value
of another variable.
• Multiple Regression: Is used to estimate the relationship between two or
more independent variables and one dependent variable.

BIBLIOGRAPHY
1. Chung, K. L. (2000). A Course in Probability Theory (3rd ed.). San Diego, CA:
Academic Press.
2. Feller, W. (1968). An introduction to Probability Theory and its applications:
Volume I. John Wiley & Sons.
3. Gupta, S. C., & Kapoor, V. K. Fundamentals of Mathematical Statistics. S. Chand
Publications.

External Resources
1. Rukmangadachari, E. Probability and Statistics. Pearson Education.
2. Rohatgi, V. K., & Ehsanes Saleh, A. K. (2015). An introduction to probability and
statistics (3rd ed.). doi:10.1002/9781118799635

e-References
• Regression Coefficients: https://www.cuemath.com/data/regression-coefficients/

• What is Regression? https://www.investopedia.com/terms/r/regression.asp

Image Credits

Fig. 1 (Linear regression): https://towardsdatascience.com/linear-regression-explained-1b36f97b7572
Fig. 2 (Multiple regression): https://www.youtube.com/watch?v=zITIFTsivN8

Video Links

Topic: Link
Regression Analysis, Regression Coefficient, Linear Regression: https://www.youtube.com/watch?v=QAEZOhE13Wg
Regression Analysis, Regression Coefficient, Linear Regression Part-II: https://www.youtube.com/watch?v=ddYNq1TxtM0
Regression Analysis, Angle Between Two Regression Lines, Proof: https://www.youtube.com/watch?v=YciBHHeswBM

Keywords

● Regression line
● Non-Linear Regression
● Intercept
● slope

QUANTITATIVE METHODS

Module - 4

INDEX NUMBERS AND TIME SERIES ANALYSIS
Module Description

An index number is a statistic that shows changes in a variable, or in a group of related variables, over time, across regions, or in relation to some other characteristic of the variable being studied. It serves as a measure of change. Index numbers act as a barometer of variations in economic activity, and they offer a framework for making decisions and projecting the future. Three kinds of index numbers are typically employed: the price index, the quantity index, and the value index.

A company’s past performance can be compared with its present figures to assess how it is doing. Time Series Analysis is the process of comparing data from the past and the present. A time series is not restricted to a short period; it extends over a span of time. Time series analysis is significant because it helps to forecast the future on the basis of present and past trends.

Financial planning benefits from time series analysis because it provides insight into future data
based on previous and present performance data. By comparing the data from the present and
the past, one can estimate the data for an anticipated period.

This Module is divided into the following units:

Unit 4.1 Time Series Analysis


Unit 4.2 Index Numbers

QUANTITATIVE METHODS

Module - 4
Unit - 1

TIME SERIES ANALYSIS

Unit Table of Contents
Unit 4.1 Time Series Analysis

Aim
Instructional Objectives
Learning Outcomes
Introduction
4.1.1 Meaning of Time Series
Self-Assessment Questions
4.1.2 Components of Time Series
Self-Assessment Questions
4.1.3 Trend analysis
Self-Assessment Questions
4.1.4 Seasonal, cyclical and Irregular variations
Self-Assessment Questions
Summary
Terminal Questions
Answer Keys
Glossary
Bibliography
External Resources
e-References
Image Credits
Video Links
Keywords

Aim
This unit aims to explain the concept of Time Series and its applications in
Management.

Instructional Objectives
This unit intends to:
● Explain the concepts and components of Time Series
● Describe Time Series in various applications of management

Learning Outcomes
At the end of this unit, you are expected to:
● Elaborate upon the concepts and components of Time Series
● Demonstrate the concepts of Time Series in Marketing, HRM, Finance

INTRODUCTION
A time series is a collection of statistical data arranged in chronological order, that is, in the order in which the observations occurred. It reflects the dynamic pace at which a phenomenon moves through time. Most series in economics, business, and commerce, such as prices, consumption, agricultural and industrial production, national income, foreign reserves, investment, sales and profits of business units, bank deposits and clearings, stock exchange shares, and so on, are time series spread over a long period of time. Time series are therefore crucial in business and economics.

4.1.1 MEANING OF TIME SERIES


The arrangement of statistical data in chronological order, in accordance with different time periods, is called a time series.

Time series play a vital role in business management for future planning.
● E.g.: money in circulation for a decade, bank deposits and clearings for a financial
year, sales and profits of a departmental store in a quarter, agricultural and industrial
production of a calendar year, etc.

A time series is a set of observations taken at specified times, usually at equal intervals.

“A time series may be defined as a collection of readings belonging to different time periods of
some economic or composite variables” - Ya-Lun-Chau

A time series involves two variables: “Time”, which is the independent variable, and “Data”, which is the
dependent variable.
Mathematically, a time series is defined by the functional relationship $U_t = f(t)$,
where $U_t$ is the value of the phenomenon (dependent variable)
and $f(t)$ is a function of time $t$ (independent variable).
● E.g.: The population ($U_t$) of India in different years ($t$)
The sales ($U_t$) of a departmental store in different months ($t$)
The production ($U_t$) of a manufacturing unit in different hours ($t$)

Examples
Table 1
Day          No. of packets of milk sold
Monday       90
Tuesday      88
Wednesday    85
Thursday     75
Friday       72
Saturday     90
Sunday       102

Table 2
Year    Population (in millions)
1921    251
1931    279
1941    319
1951    361
1961    439
1971    548
1981    685

• In Table 1, the sale of milk packets decreases from Monday to Friday and then increases again over the weekend.
• In Table 2, the population increases continuously.

Self-Assessment Questions

1. A collection of readings belonging to different time periods is called ___.

A). Time Series


B). Relation
C). Correlation
D). Statistic

2. Arrangement of Statistical data in Chronological order is called ___.

A). Statistic
B). Relation
C). Correlation
D). Time Series

3. Money in circulation for a decade is an example of time series ___.

A). Yes
B). No
C). May be
D). Can’t say

4. Mathematically, a time series is defined as ___.

A). $U_t \neq f(t)$
B). $U_t \leq f(t)$
C). $U_t = f(t)$
D). $U_t \geq f(t)$

5. Time series have an importance in business and economics.

A). Yes
B). No
C). Maybe
D). Can’t say

4.1.2 Components of Time Series
The changes observed in a time series are caused by economic, social, natural, industrial and political factors. Their effects are called the components of a time series. There are four components (secular trend, seasonal variations, cyclic variations and irregular variations), grouped as follows:
• Secular trend or Long-term Movement
• Periodic Changes or Short-term Movement
    Seasonal Variations
        Resulting from natural forces
        Resulting from man-made conventions
    Cyclic Variations
• Random or Irregular Variations

4.1.2.1 Secular trend or Long-term Movement

• A secular trend is the general tendency of a time series to increase or decrease over a long period.
• Over a number of years, time series data may show an upward or downward trend,
depending on factors such as:
a. Population growth.
b. The advancement of technology.
c. Large-scale shifts in consumer demand.
• Examples of rising trends are population growth and price increases over time.
• A decreasing or downward trend occurs when, for example, the sales of a commodity diminish over time
because superior products enter the market.

4.1.2.2 Periodic Changes or Short-term Movement

• Changes in a time series that occur within a period of one year are called Periodic Changes
or Short-term Movement.
E.g.: Sales of a departmental store in every month of the year.
• Periodic Changes are broadly classified as
Seasonal Variations
i. These variations in a time series are due to rhythmic forces which
operate in a regular and periodic manner over a span of less than a year.
ii. Seasonal variations can be classified as
1. Resulting from natural forces
2. Resulting from man-made conventions
Cyclic Variations
Cyclical variations are recurrent upward or downward movements in a time series
whose period is greater than a year. They are also less regular than
seasonal variations.

E.g.: Business Cycle

Fig. 1: Business Cycle

4.1.2.3 Random or Irregular Variations

• Irregular variations are short-lived fluctuations in a time series caused by unpredictable events such as
floods, earthquakes and wars.

4.1.2.4 Mathematical Model for Time Series

● Additive Model
According to the additive model, the time series can be expressed as
$$U_t = T_t + S_t + C_t + R_t$$
where $U_t$ - time series value at time t
$T_t$ - Trend Value
$S_t$ - Seasonal Variations
$C_t$ - Cyclic Variations
$R_t$ - Random or Irregular Variations

4.1.2.5 Multiplicative Model

According to the multiplicative model, the time series can be expressed as
$$U_t = T_t \times S_t \times C_t \times R_t$$
where $U_t$ - time series value at time t
$T_t$ - Trend Value
$S_t$ - Seasonal Variations
$C_t$ - Cyclic Variations
$R_t$ - Random or Irregular Variations
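The difference between the two models can be seen by composing a single observation from its components. This is a minimal Python sketch with hypothetical component values. It assumes the usual convention that, in the additive model, all components are in the units of the series, while in the multiplicative model the seasonal, cyclic and irregular components are ratios around 1.

```python
def additive(trend, seasonal, cyclic, irregular):
    """U_t = T_t + S_t + C_t + R_t"""
    return trend + seasonal + cyclic + irregular


def multiplicative(trend, seasonal, cyclic, irregular):
    """U_t = T_t * S_t * C_t * R_t"""
    return trend * seasonal * cyclic * irregular


# Hypothetical components: trend 100, seasonal +5 (or 5% above trend),
# cyclic -2 (or 2% below), irregular +1 (or 1% above).
u_additive = additive(100.0, 5.0, -2.0, 1.0)
u_multiplicative = multiplicative(100.0, 1.05, 0.98, 1.01)
print(u_additive, round(u_multiplicative, 3))
```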

Self-Assessment Questions

6. St means _______ Variations.

A). Trend
B). Cyclic
C). Seasonal
D). Random

7. Number of components of Time Series are _________.

A). 4
B). 5
C). 3
D). 2

8. Ct means _______ Variations.

A). Trend
B). Cyclic
C). Seasonal
D). Random

9. Tt means _______ Variations.

A). Trend
B). Cyclic
C). Seasonal
D). Random

10. Rt means _______ Variations.

A). Trend
B). Cyclic
C). Seasonal
D). Random

4.1.3 TREND ANALYSIS
Trend is the long-term movement in a time series. The part of the variation in the data that is due to long-term changes is called the trend. Trend can be measured and analysed in different ways.

● Measures of Secular Trend

The various methods used to measure the secular trend or long-term movement are:

● Free-Hand Curve Method


● Semi-Average Method
● Moving Average Method
● Least Square Method

4.1.3.1 Free-Hand Curve Method

In this method, the time series data are plotted on a two-dimensional graph and a smooth curve is drawn by hand through the points to show the tendency of the data. “Time” is taken on the ‘x’ axis and “Data” on the ‘y’ axis.

Example

Draw a free-hand curve on the basis of the following data, draw the trend line, and estimate
the profit for 1997.

Years 1989 1990 1991 1992 1993 1994 1995 1996


Profit 148 149 149.5 149 150.5 152.2 153.7 153

Solution

Plot the years on the X axis and the profits on the Y axis, and draw a free-hand smooth curve through the points.

From the free-hand smooth curve, it can be concluded that the profits show an increasing trend
from 1989 to 1996. The trend line can be represented with a dotted line.

4.1.3.2 Semi-Average Method

● This method is similar to the free-hand curve method, except that the semi-averages are
calculated and plotted on the graph.
● In this method the given data are divided into two parts, preferably with an equal number
of years.
● If the data contain an even number of periods, divide them into two equal parts and find the
average of each part (the semi-averages).
● If the data contain an odd number of periods, ignore the middle period, divide the rest into two
equal parts and find the semi-averages.
● Plot the data as a free-hand curve and draw the trend line through the two semi-averages.

Example

Draw the trend line from the following data by Semi-Average Method

Year 1989 1990 1991 1992 1993 1994 1995 1996


Production
150 152 153 151 154 153 156 158
(M.Ton.)

Solution
There are 8 periods in the data, which can be divided into two equal parts of 4 years each.
We calculate the average of each part.

$$\text{First part} = \frac{150 + 152 + 153 + 151}{4} = 151.50$$
$$\text{Second part} = \frac{154 + 153 + 156 + 158}{4} = 155.25$$

From the graph of these values, it can be concluded that production shows an increasing trend from 1989
to 1996. The trend line can be represented with a dotted line.
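The semi-average calculation can be sketched in Python as follows. The function name `semi_averages` is hypothetical. The annual change is computed on the assumption that each semi-average is plotted at the midpoint of its half of the series.

```python
def semi_averages(values):
    """Split the series into two equal halves (dropping the middle value
    when the number of periods is odd) and return the mean of each half."""
    half = len(values) // 2
    first = values[:half]
    second = values[-half:]
    return sum(first) / half, sum(second) / half


# Production data (M.Ton.) for 1989-1996 from the example above.
production = [150, 152, 153, 151, 154, 153, 156, 158]
first_avg, second_avg = semi_averages(production)

# Assumption: the two averages sit at the midpoints of their halves, so the
# gap between them in years is the series length minus one half-length.
half = len(production) // 2
annual_change = (second_avg - first_avg) / (len(production) - half)
print(first_avg, second_avg, annual_change)
```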

4.1.3.3 Moving Averages Method

• It is one of the most popular methods for calculating the long-term trend. This method is
also used in the study of seasonal, cyclical and irregular fluctuations. In this
method we calculate the moving average over a fixed number of years.
• For example, the three-yearly moving averages are

$$\frac{(1) + (2) + (3)}{3},\quad \frac{(2) + (3) + (4)}{3},\quad \frac{(3) + (4) + (5)}{3},\ \ldots$$

where (1), (2), (3), … are the values for the successive years of the time series.

Example

Find the five-yearly moving averages:

Year 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996

Price 20 25 33 33 27 35 40 43 35 32 37 48 50 37 45

Solution
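A minimal Python sketch of the computation follows, assuming each five-yearly average is placed against the middle year of its window; the function name `moving_averages` is a hypothetical helper.

```python
def moving_averages(values, window):
    """Return the simple moving averages for every full window."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


years = list(range(1982, 1997))
prices = [20, 25, 33, 33, 27, 35, 40, 43, 35, 32, 37, 48, 50, 37, 45]

window = 5
averages = moving_averages(prices, window)
# With an odd window, each average is centred on the middle year.
centre_years = years[window // 2: len(years) - window // 2]

for year, average in zip(centre_years, averages):
    print(year, round(average, 2))
```

The first window (1982 to 1986) sums to 138, giving 27.6 against 1984; no averages are available for the first two and last two years.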

Example

Calculate the 4 yearly moving averages for the following data

Years 2010 2011 2012 2013 2014 2015 2016


Profit 142 120 102 87 92 102 160

Solution

The 4-yearly moving averages are calculated as follows:

Year   Profit   4-yearly moving total   4-yearly moving average   Centred moving average
2010   142
2011   120
                451                     112.75
2012   102                                                        106.50
                401                     100.25
2013   87                                                         98.00
                383                     95.75
2014   92                                                         103.00
                441                     110.25
2015   102
2016   160
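When the period is even, the moving averages fall between two years, so adjacent pairs are averaged again to centre them. The following Python sketch reproduces the last column of the table; the function name `centred_moving_averages` is hypothetical.

```python
def centred_moving_averages(values, window):
    """For an even window: take the simple moving averages, then average
    each adjacent pair so the result lines up with an actual period."""
    plain = [sum(values[i:i + window]) / window
             for i in range(len(values) - window + 1)]
    return [(a + b) / 2 for a, b in zip(plain, plain[1:])]


profits = [142, 120, 102, 87, 92, 102, 160]   # 2010-2016
centred = centred_moving_averages(profits, 4)
print(centred)  # values for 2012, 2013 and 2014
```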

4.1.3.4 Method of Least squares fit

The method of least squares is an important method for determining the trend in time series data
by fitting linear or non-linear curves.

Linear Trend

Fitting of a straight line ($Y = a + bx$)

Non-linear Trend

● Second-degree parabola: $y = a + bx + cx^{2}$
● nth-degree polynomial: $y = a_0 + a_1 x + a_2 x^{2} + \cdots + a_n x^{n}$
● Exponential function: $y = ae^{x}$
● Power function: $y = ax^{b}$

4.1.3.5 Fitting of a straight line ($Y = a + bx$)

Let $Y = a + bx$ be the form of the straight line,

where Y - Dependent Variable (Data)
X - Independent Variable (Time)
a - Intercept
b - Slope

The normal equations to determine the values of a and b are
$$\sum y = na + b\sum x$$
$$\sum xy = a\sum x + b\sum x^{2}$$
Substituting the values of a and b obtained from these equations into $Y = a + bx$ gives the straight line of best fit for the given
data.

Example

Draw a straight-line trend and estimate trend value for 1996

Year (X) 1991 1992 1993 1994 1995


Production (Y) 8 9 8 9 16

Solution

Given that the form of the straight line is Y = a + bx, the normal equations to determine the values of a and b are
ΣY = Na + bΣX
ΣXY = aΣX + bΣX²

Taking X as the deviation of each year from 1990 (X = 1, 2, 3, 4, 5), we obtain N = 5, ΣX = 15, ΣY = 50, ΣXY = 166 and ΣX² = 55. Substituting these values into the normal equations:
50 = 5a + 15b ---------------------------- (i)
166 = 15a + 55b ---------------------------- (ii)
Multiplying equation (i) by 3 and subtracting equation (ii):
−10b = −16
b = 1.6
Substituting the value of b into equation (i):
5a + 15(1.6) = 50
5a = 26
a = 26/5 = 5.2
With these values of a and b, the trend line is
Y = a + bX
Y = 5.2 + 1.6X
The estimated production for 1996 (X = 6) is
Y1996 = 5.2 + 1.6(6) = 14.8
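The normal equations have a closed-form solution, which the following Python sketch applies to the example above. The function name `fit_trend_line` is hypothetical, and the years are coded as deviations from 1990, as in the hand calculation.

```python
def fit_trend_line(xs, ys):
    """Solve the normal equations
         sum(y)  = n*a + b*sum(x)
         sum(xy) = a*sum(x) + b*sum(x^2)
    for the intercept a and slope b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


years = [1991, 1992, 1993, 1994, 1995]
production = [8, 9, 8, 9, 16]
coded = [year - 1990 for year in years]   # X = 1, 2, 3, 4, 5

a, b = fit_trend_line(coded, production)
forecast_1996 = a + b * (1996 - 1990)
print(round(a, 2), round(b, 2), round(forecast_1996, 2))
```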

Example

Given below, seasonal demand for electricity in Ahmedabad (2001-2011). Forecast the demand
for 2012 by fitting a trend line to the data.

Time 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011
period
Quantity 11 14 16 13 17 20 23 25 27 31 35

Solution

The form of a straight line is Y= a + bx

The normal equations for solving a and b are

ΣY = na + bΣX

ΣXY = aΣX + bΣX²

The given data can be tabulated as follows:

Time   X (deviation from 2000)   Quantity (Y)   XY     X²     Trend value Y = 7.41 + 2.28X
2001   1                         11             11     1      7.41 + 2.28(1) = 9.69
2002   2                         14             28     4      7.41 + 2.28(2) = 11.97
2003   3                         16             48     9      7.41 + 2.28(3) = 14.25
2004   4                         13             52     16     7.41 + 2.28(4) = 16.53
2005   5                         17             85     25     7.41 + 2.28(5) = 18.81
2006   6                         20             120    36     7.41 + 2.28(6) = 21.09
2007   7                         23             161    49     7.41 + 2.28(7) = 23.37
2008   8                         25             200    64     7.41 + 2.28(8) = 25.65
2009   9                         27             243    81     7.41 + 2.28(9) = 27.93
2010   10                        31             310    100    7.41 + 2.28(10) = 30.21
2011   11                        35             385    121    7.41 + 2.28(11) = 32.49
ΣX = 66   ΣY = 232   ΣXY = 1643   ΣX² = 506

n = 11

ΣY = na + bΣX

ΣXY = aΣX + bΣX²

232 = 11a + 66b --------------(1)

1643 = 66a + 506b --------------(2)

Eq (1) × 6 → 66a + 396b = 1392

Eq (2) → 66a + 506b = 1643

Subtracting: −110b = −251

→ b = 251/110

→ b = 2.28

Eq (1) → 11a + 66(2.28) = 232

→ 11a = 232 − 150.48

→ 11a = 81.52

→ a = 81.52 / 11

→ a = 7.41

Hence a = 7.41 and b = 2.28.

The required straight line is Y = a + bx, that is,

Y = 7.41 + 2.28X

For the year 2012, X = 2012 − 2000 = 12, so

Y2012 = 7.41 + 2.28(12) = 34.77

Hence the estimated demand for the year 2012 is about 34.77 units.

Self-Assessment Questions

11. Trend is a long-term movement of Time series. What do you say?

A). No
B). Yes
C). Maybe
D). Can’t say

12. Number of Methods to determine Trend are________.

A). 4
B). 5
C). 3
D). 2

13. Best method to measure the trend is_________.

A). Free-Hand Curve


B). Semi-Average
C). Moving Average
D). Least Square

14. In which method, Time series is divided into Two equal parts?

A). Free-Hand Curve


B). Semi-Average
C). Moving Average
D). Least Square

15. Form of a straight line is given by______________.

A). Y = bx
B). Y=a
C). Y= a + bx
D). Y= a + b

4.1.4 SEASONAL, CYCLICAL, AND IRREGULAR VARIATIONS
Apart from the trend, or long-term variation, time series data fluctuate because of seasonal, cyclic
and irregular variations.

4.1.4.1 Seasonal Variations

A seasonal variation in a time series refers to changes caused by forces that operate in a regular,
periodic manner over a span of less than a year. Such changes are prevalent in most economic and
business time series, and a business or sales manager’s understanding of them is critical for
planning future production, scheduling purchases, inventory control, personnel requirements,
and selling and advertising programmes. The aims of studying seasonal patterns in a time series are:

• To isolate the seasonal variations, that is, to determine the effect of seasonal swings on the
value of the given phenomenon, and
• To eliminate them, that is, to determine what the value of the phenomenon would be if there
were no seasonal ups and downs in the series. This is referred to as de-seasonalising the data,
and it is required for the analysis of cyclic variations.

Obviously, the study of seasonal variations requires time series data for parts of a year, such as
monthly, quarterly, weekly or daily data. Seasonal variations are studied under the assumption
that the seasonal pattern is superimposed independently on the values of the series. The methods
of measuring seasonal variations are:

• Method of Simple Averages
• Ratio to Trend Method
• Ratio to Moving Average Method
• Link Relative Method

4.1.4.2 Cyclic Variations

A crude method of measuring cyclic variations is the residual approach, which consists of first
estimating the trend (T) and seasonal (S) components and then eliminating their effects from the
given time series. Assuming a multiplicative model of the time series, these components (T and S)
are eliminated by dividing the given time series values by T × S:
$$\frac{Y}{T \times S} = \frac{T \times S \times C \times I}{T \times S} = C \times I$$

4.1.4.3 Irregular Variations

Because of the nature of irregular movements, no formula can be proposed for estimating the irregular
component of a time series exactly. In practice, the three components of a time series, namely
Trend (T), Seasonal (S), and Cyclic (C), are obtained first, and the irregular component, which is
what remains unaccounted for after eliminating them from the given series, is obtained as a residual.
Under the multiplicative model, the random or irregular component is calculated as
$$\frac{Y}{T \times S \times C} = \frac{T \times S \times C \times I}{T \times S \times C} = I$$
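The residual approach under the multiplicative model amounts to two divisions. The following Python sketch uses hypothetical values for a single period and assumes that the trend, seasonal index and cyclic index have already been estimated by some other method.

```python
def cyclic_irregular(y, trend, seasonal):
    """Y / (T * S) = C * I under the multiplicative model."""
    return y / (trend * seasonal)


def irregular(y, trend, seasonal, cyclic):
    """Y / (T * S * C) = I under the multiplicative model."""
    return y / (trend * seasonal * cyclic)


# Hypothetical observation and component estimates for one period.
y, trend, seasonal, cyclic = 120.0, 100.0, 1.10, 1.05

ci = cyclic_irregular(y, trend, seasonal)   # cyclic-irregular relative
i = irregular(y, trend, seasonal, cyclic)   # irregular component alone
print(round(ci, 4), round(i, 4))
```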

Self-Assessment Questions

16. In Time Series, T Stands for__.

A). Trend
B). Cyclic
C). Seasonal
D). Random

17. In Time Series, C Stands for __.

A). Trend
B). Cyclic
C). Seasonal
D). Random

18. In Time Series, S Stands for __.

A). Trend
B). Cyclic
C). Seasonal
D). Random

19. In Time Series, I Stands for __.

A). Trend
B). Cyclic
C). Seasonal
D). Random

20. Link Relatives method is used to find _________ variations.

A). Trend
B). Cyclic
C). Seasonal
D). Random

Summary

● This unit introduces the concept of time series and its applications in
the field of management.
● Time series data fluctuate because of trend, seasonal, cyclic and irregular
variations.
● The methods used to determine the trend are the free-hand curve method, the
semi-average method, the moving average method and the least squares method.
Of these, the least squares method is the best method for determining the
trend.
● Apart from the trend, seasonal, cyclic and irregular variations also cause
fluctuations in a time series.

Terminal Questions

1. Define Time Series and explain the components of Time Series.


2. Draw a free-hand curve based on the following data, draw the trend line, and
estimate the profit for 1999 and 2008:
Years 2000 2001 2002 2003 2004 2005 2006 2007
Profit 47 52 68 72 78 70 88 92
3. Draw a free-hand curve based on the following data; also draw the trend line and
estimate the profit for 2018 and 2008:
Years 2010 2011 2012 2013 2014 2015 2016
Profit 142 120 102 87 92 70 60
4. Find the trend line from the following data by Semi – Average Method
Years 2000 2001 2002 2003 2004 2005 2006 2007
Profit 47 52 68 72 78 70 88 92
5. Find the trend line from the following data by Semi-Average Method
Years 2010 2011 2012 2013 2014 2015 2016
Profit 142 120 102 87 92 70 60
6. Find the three yearly and four yearly moving averages for the following data
Years 2000 2001 2002 2003 2004 2005 2006 2007
Profit 47 52 68 72 78 70 88 92
7. Find the four yearly moving averages for the following data
Years 2010 2011 2012 2013 2014 2015 2016
Profit 142 120 102 87 92 102 160
8. The demand for the past 10 years is given below. Draw the trend line and
estimate the demand in the year 2005.
Year     1994 1995 1996 1997 1998 1999 2000 2001 2002 2003
Demand   80   84   80   88   98   92   84   88   80   100
9. Given below, seasonal demand for electricity in Ahmedabad (2001-2011). Forecast
the demand for 2012 by fitting a trend line to the data.

Time period   2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011
Quantity      11   14   16   13   17   20   23   25   27   31   35

Answer Keys

Self-Assessment Questions

Question No Answers

1 A

2 D

3 A

4 C

5 A

6 C

7 A

8 B

9 A

10 D

11 B

12 A

13 D

14 B

15 C

16 A

17 B

18 C

19 D

20 C

Glossary
• Time series: A data set that tracks a sample over time.
• Least square method: The process of finding the best-fitting curve or line of
best fit for a set of data points by reducing the sum of the squares of the offsets
(residual part) of the points from the curve.
• Cyclic variations: A type of variation that occurs in a cyclical pattern over a
period of time.

BIBLIOGRAPHY
1. Chung, K. L. (2000). A Course in Probability Theory (3rd ed.). San Diego, CA:
Academic Press.
2. Feller, W. (1968). An introduction to Probability Theory and its applications:
Volume I. John Wiley & Sons.
3. Gupta, S. C., & Kapoor, V. K. Fundamentals of Mathematical Statistics. S. Chand
Publications.

External Resources
1. Rukmangadachari, E. Probability and Statistics. Pearson Education.
2. Rohatgi, V. K., & Ehsanes Saleh, A. K. (2015). An introduction to probability and
statistics (3rd ed.). doi:10.1002/9781118799635

e-References
• Understanding Trend Analysis and Trend Trading Strategies:
https://www.investopedia.com/terms/t/trendanalysis.asp

• Time Series Analysis:
https://www.tableau.com/learn/articles/time-series-analysis

Image Credits

Business Cycle
Fig. 1: https://corporatefinanceinstitute.com/resources/economics/business-cycle/

Video Links

Topic Link
Time Series Analysis https://www.youtube.com/watch?v=BBoUJYT0jxY
“Semi Averages Method” in Time Series from Statistics Subject https://www.youtube.com/watch?v=VmOZ7_Fjn-s
“Time Series” Chapter Introduction in Statistics https://www.youtube.com/watch?v=RxhmWTxrTs0
Introducing Time Series Analysis and forecasting https://www.youtube.com/watch?v=GUq_tO2BjaU
“Freehand Smooth Curve” in Time Series Chapter from Statistics https://www.youtube.com/watch?v=CgebqU_I9tE

Keywords

• Seasonal variations
• Irregular variations
• Moving average
• Semi average

QUANTITATIVE METHODS

Module - 4
Unit - 2

INDEX NUMBERS

Unit Table of Contents
Unit 4.2 Index Numbers

Aim
Instructional Objectives
Learning Outcomes
Introduction
4.2.1 Meaning of Index Numbers
Self-Assessment Questions
4.2.2 Unweighted Index Numbers
Self-Assessment Questions
4.2.3 Weighted Index Numbers
Self-Assessment Questions
Summary
Terminal Questions
Answer Keys
Glossary
Bibliography
External Resources
e-References
Image Credits
Video Links
Keywords

Aim
This unit aims to explain the concept of Index Numbers and their applications in
Management.

Instructional Objectives
This unit intends to:
● Explore the concepts of Index Numbers and types of Index Numbers
● Describe Index Numbers in various applications of management

Learning Outcomes
At the end of this unit, you are expected to:
● Analyse the concepts of Index Numbers
● Compare the different types of Index Numbers
● Apply the concepts of Index Numbers in Marketing, HRM, Finance, etc.

INTRODUCTION
Index numbers are indicators that show how the level of a phenomenon in a given period (the
current period) has changed in comparison with its level in a fixed period (called the base
period). The phenomenon being measured may be:

• The price of a particular commodity, such as steel, gold, or leather, or the price of a set
of commodities, such as consumer goods, cereals, milk and dairy products, cosmetics,
and so on.
• Volume of trade, factory output, industrial or agricultural output, imports and exports,
stocks and shares, sales and profits of businesses, and so on.
• A country’s national income, the wage structure of workers in various industries, bank
deposits, foreign exchange reserves, people’s cost of living in a specific community,
class, or profession, and so on.

4.2.1 MEANING OF INDEX NUMBERS


• A change in price, quantity, value, or some other metric from one time period to the next
is represented by an index number.
• A simple index number represents the change in one or more variables over time.
• Index numbers are quantitative measures of growth of prices, production, inventory and
other quantities of economic interest.
• An index number indicates how much a variable has changed over time.
• An index number is calculated by dividing the current value by the base value; the result is
usually expressed as a percentage (multiplied by 100).
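
The last point can be illustrated with a minimal Python sketch. It is not part of the original material: the function name `simple_index` and the sample figures are assumptions chosen for illustration, and the result is expressed as a percentage of the base value.

```python
def simple_index(current_value: float, base_value: float) -> float:
    """Return the current value as a percentage of the base value."""
    if base_value == 0:
        raise ValueError("base value must be non-zero")
    return current_value / base_value * 100


# Hypothetical figures: a commodity priced at 50 in the base period and 60 now.
index = simple_index(60, 50)
print(round(index, 2))  # 120.0, i.e. a 20% rise over the base period
```

An index above 100 indicates a rise relative to the base period, and an index below 100 indicates a fall.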

4.2.1.1 Definition

• Index numbers are statistical devices designed to measure the relative change in the
level of a phenomenon (variable or group of variables) with respect to time, geographical
location or other characteristics such as income, profession, etc.
• “Index Number is a statistical device for indicating the relative movement of the data where
measurement of actual movements is difficult or incapable of being made” – Wheldon.
• “Index Number shows by its variations the changes in a magnitude which is not susceptible
either of accurate measurement in itself or of direct valuation in practice” – F. Y. Edgeworth.

4.2.1.2 Characteristics of Index Numbers

• Index numbers are specialised averages.
• Index numbers measure the change in the level of a phenomenon.
• Index numbers measure the effect of changes over a period of time.
• They help in framing appropriate policies.
• They reveal patterns and trends.
• Index numbers are very useful for deflating values, i.e., adjusting nominal figures for
changes in prices.

4.2.1.3 Classification of Index Numbers

Index numbers can be classified into the following types:

• Price Index Number
• Quantity Index Number
• Value Index Number
• Composite Index Number

4.2.1.3.1 Price Index Number

• A price index (PI) is a measurement of how prices vary over time, or in other words, a
means to track inflation and deflation.
• A rise in the price level indicates that a certain economy’s currency is losing buying power
(i.e., less can be bought with the same amount of money)

4.2.1.3.2 Quantity Index Number

• A volume index, also known as a quantity index, is a numerical time series measure used
to compare how the output of a certain class of goods and/or services differs over time or
between geographic regions.

4.2.1.3.3 Value Index Number

• A value index is a statistic (ratio) that shows how a nominal value has evolved over time in
comparison to the base year’s value. For each point in time, the index point figure shows
what percentage of its relevant value at the base point in time a given value is at that time.

4.2.1.3.4 Composite Index Number

• A composite index is a statistical tool that combines a number of different stocks, assets,
or indices to indicate overall market or sector performance. Investment analysis, economic
trends, and market forecasting are all done via composite indices.

4.2.1.4 Methods of Constructing Index Numbers

The various methods of constructing index numbers can be classified as follows (Fig. 1):

Index Numbers
• Unweighted: Simple Aggregative; Simple Average of Price Relatives
• Weighted: Weighted Aggregative; Weighted Average of Price Relatives

Fig. 1 Classification of Index Numbers

Self-Assessment Questions

1. ________ numbers are indicators which reflect the relative changes.

A). time series


B). index
C). natural
D). prime

2. In index numbers, the given period is called _______ period.

A). current
B). base
C). real
D). nominal

3. In index numbers, fixed period is called _______ period.

A). current
B). base
C). real
D). nominal

4. There are ______ types of index numbers.

A). 3
B). 2
C). 4
D). 5

5. There are _________ methods of constructing index numbers.

A). 3
B). 5
C). 4
D). 2

4.2.2 UN-WEIGHTED INDEX NUMBERS
An unweighted price index number measures the percentage change in the price of a single item
or a group of goods between two periods of time. In unweighted index numbers, all of the values
studied are given equal weight. Unweighted index numbers can be calculated by the following
methods:

• Simple Aggregative Method


• Simple Average of Relatives Method

4.2.2.1 Simple Aggregative Method

It is calculated by expressing the current year’s aggregate price of all commodities as a
percentage of the base year’s aggregate price.

\[ P_{01} = \frac{\sum p_1}{\sum p_0} \times 100 \]

Where P01 = Index number of the current year.
Σp1 = Total of the current year’s prices of all commodities.
Σp0 = Total of the base year’s prices of all commodities.

Example

Create the Index number for the year 2008 in Rajasthan based on the data provided.

COMMODITIES UNITS PRICE (Rs) 2007 PRICE (Rs) 2008
Sugar Quintal 2200 3200
Milk Liter 18 20
Oil Liter 68 71
Wheat Quintal 900 1000
Clothing Meter 50 60

Solution

The given data can be tabulated as follows:

COMMODITIES UNITS PRICE (Rs) 2007 PRICE (Rs) 2008
Sugar Quintal 2200 3200
Milk Liter 18 20
Oil Liter 68 71
Wheat Quintal 900 1000
Clothing Meter 50 60

ΣP0 = 3236 ΣP1 = 4351

Index Number for 2008:

\[ P_{01} = \frac{\sum p_1}{\sum p_0} \times 100 = \frac{4351}{3236} \times 100 = 134.46 \]

Hence, the aggregate price in 2008 was 34.46% higher than in 2007.
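
The simple aggregative calculation can also be sketched in Python. This is a minimal illustration rather than part of the original material; the dictionary `prices` and the function name `simple_aggregative_index` are assumed names, while the figures are those of the example above.

```python
# Prices (Rs) from the example: commodity -> (2007 price, 2008 price)
prices = {
    "Sugar": (2200, 3200),
    "Milk": (18, 20),
    "Oil": (68, 71),
    "Wheat": (900, 1000),
    "Clothing": (50, 60),
}


def simple_aggregative_index(price_pairs):
    """Total of current-year prices as a percentage of total base-year prices."""
    pairs = list(price_pairs)
    base_total = sum(p0 for p0, _ in pairs)
    current_total = sum(p1 for _, p1 in pairs)
    return current_total / base_total * 100


p01 = simple_aggregative_index(prices.values())
print(round(p01, 2))
```

Because the prices are simply added, commodities quoted in large units (such as sugar per quintal) dominate the result, which is one reason weighted methods are often preferred.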

4.2.2.2 Simple Average of Relatives Method

Each commodity’s current year price is expressed as a percentage of its base year price; this
percentage is called a price relative. The index number is calculated by averaging these price
relatives. The arithmetic mean, geometric mean, or even the median could be employed as the
average. Using the arithmetic mean:

\[ P_{01} = \frac{\sum \left( \frac{p_1}{p_0} \times 100 \right)}{N} \]

Where N is the number of items.

Example

From the data given below, construct the index number for the year 2008, taking 2007 as the
base year and using the arithmetic mean.

Commodities Price (2007) Price (2008)


P 6 10
Q 2 2
R 4 6
S 10 12
T 8 12

Solution

Using the arithmetic mean of the price relatives, the given data can be tabulated as follows:

Commodities Price (2007) p0 Price (2008) p1 Price Relative (p1/p0 × 100)
P 6 10 10/6 × 100 = 166.67
Q 2 2 2/2 × 100 = 100.00
R 4 6 6/4 × 100 = 150.00
S 10 12 12/10 × 100 = 120.00
T 8 12 12/8 × 100 = 150.00

\[ \sum \left( \frac{p_1}{p_0} \times 100 \right) = 686.67 \]

\[ P_{01} = \frac{\sum \left( \frac{p_1}{p_0} \times 100 \right)}{N} = \frac{686.67}{5} = 137.33 \]

Hence, commodity prices in 2008 were on average 37.33% higher than in 2007.
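
The same steps can be sketched in Python. This is an illustrative sketch under stated assumptions, not part of the original material: the names `prices` and `price_relatives` are hypothetical, and the arithmetic mean is used as the average, as in the example.

```python
from statistics import mean

# Prices from the example: commodity -> (2007 price, 2008 price)
prices = {"P": (6, 10), "Q": (2, 2), "R": (4, 6), "S": (10, 12), "T": (8, 12)}


def price_relatives(price_pairs):
    """Express each current-year price as a percentage of its base-year price."""
    return [p1 / p0 * 100 for p0, p1 in price_pairs]


relatives = price_relatives(prices.values())
p01 = mean(relatives)  # simple (unweighted) arithmetic mean of the relatives

for commodity, relative in zip(prices, relatives):
    print(commodity, round(relative, 2))
print("Index:", round(p01, 2))
```

Replacing `mean` with `statistics.median` or `statistics.geometric_mean` gives the other averages mentioned in the text.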

Self-Assessment Questions

6. An unweighted price index number measures the percentage change in price of a
single item or a group of items between two periods of time. What do you say?

A). Yes
B). No
C). Maybe
D). Can’t say

7. In index numbers, P0 represents_______ period.

A). current
B). base
C). real
D). nominal

8. In index numbers, P1 represents_______ period.

A). current
B). base
C). real
D). nominal

9. ________ is the number of items in index number.

A). N
B). P
C). Q
D). R

10. There are________ number of methods of un-weighted index numbers.

A). 3
B). 5
C). 4
D). 2

4.2.3 WEIGHTED INDEX NUMBERS
When the commodities are not all of equal importance, each one is assigned a weight according to
its relative importance, and the resulting index is a weighted index number. Quantities are
commonly used as weights; when base year quantities are used, the index is known as a base year
weighted index. Weighted index numbers can be calculated by the following methods:

• Laspeyres’s method
• Paasche’s method
• Dorbish and Bowley’s method
• Fisher’s ideal method

4.2.3.1 Laspeyres’s method

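Laspeyres introduced this approach in 1871. The weights in this method are the base year’s
quantities.

\[ P_{01} = \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100 \]

Where p: Price, q: Quantity Produced; the subscripts 0 and 1 denote the base year and the
current year respectively.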

4.2.3.2 Paasche’s Method

Hermann Paasche, a German statistician, introduced this method in 1874. In Paasche’s index
number, the current year’s quantities are used as the weights.

\[ P_{01} = \frac{\sum p_1 q_1}{\sum p_0 q_1} \times 100 \]

Where p: Price, q: Quantity Produced; the subscripts 0 and 1 denote the base year and the
current year respectively.

4.2.3.3 Dorbish & Bowley’s method

This method combines Laspeyres’s and Paasche’s approaches. The index provided by Dorbish
& Bowley is obtained by taking the arithmetic average of Laspeyres’s and Paasche’s indexes.

\[ P_{01} = \frac{\dfrac{\sum p_1 q_0}{\sum p_0 q_0} + \dfrac{\sum p_1 q_1}{\sum p_0 q_1}}{2} \times 100 \]

4.2.3.4 Fisher’s Ideal Index

Fisher’s ideal index number is the geometric mean of Laspeyres’s and Paasche’s index numbers.

\[ P_{01} = \sqrt{\frac{\sum p_1 q_0}{\sum p_0 q_0} \times \frac{\sum p_1 q_1}{\sum p_0 q_1}} \times 100 \]

Example

The price quantity data is listed below, with prices in Rs. per kilogram and production in quintals.
Find the following indexes: (1) Laspeyres’s Index (2) Paasche’s Index (3) Fisher’s Ideal Index.

2002 2007
ITEMS PRICE PRODUCTION PRICE PRODUCTION
BEEF 15 500 20 600
MUTTON 18 590 23 640
CHICKEN 22 450 24 500

Solution

To determine the Laspeyres’s Index, Paasche’s Index and Fisher’s Ideal Index, the given data
can be tabulated as follows

ITEMS Price 2002 (p0) Production 2002 (q0) Price 2007 (p1) Production 2007 (q1) p1q0 p0q0 p1q1 p0q1
BEEF 15 500 20 600 10000 7500 12000 9000
MUTTON 18 590 23 640 13570 10620 14720 11520
CHICKEN 22 450 24 500 10800 9900 12000 11000
TOTAL 34370 28020 38720 31520

1. Laspeyres’s Index:

\[ P_{01} = \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100 = \frac{34370}{28020} \times 100 = 122.66 \]

2. Paasche’s Index:

\[ P_{01} = \frac{\sum p_1 q_1}{\sum p_0 q_1} \times 100 = \frac{38720}{31520} \times 100 = 122.84 \]

3. Fisher’s Ideal Index:

\[ P_{01} = \sqrt{\frac{\sum p_1 q_0}{\sum p_0 q_0} \times \frac{\sum p_1 q_1}{\sum p_0 q_1}} \times 100 = \sqrt{\frac{34370}{28020} \times \frac{38720}{31520}} \times 100 = 122.75 \]

Thus prices rose by 22.66% according to Laspeyres’s Index, 22.84% according to Paasche’s
Index and 22.75% according to Fisher’s Ideal Index.
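
The weighted index calculations can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not part of the original material: the names `data` and `weighted_indices` are hypothetical, and Dorbish and Bowley’s index is included although the example does not ask for it.

```python
from math import sqrt

# Data from the example: item -> (p0, q0, p1, q1)
# p0, q0 = price and production in 2002; p1, q1 = price and production in 2007
data = {
    "Beef": (15, 500, 20, 600),
    "Mutton": (18, 590, 23, 640),
    "Chicken": (22, 450, 24, 500),
}


def weighted_indices(rows):
    """Return Laspeyres, Paasche, Dorbish-Bowley and Fisher price indices."""
    rows = list(rows)
    sum_p1q0 = sum(p1 * q0 for p0, q0, p1, q1 in rows)
    sum_p0q0 = sum(p0 * q0 for p0, q0, p1, q1 in rows)
    sum_p1q1 = sum(p1 * q1 for p0, q0, p1, q1 in rows)
    sum_p0q1 = sum(p0 * q1 for p0, q0, p1, q1 in rows)

    laspeyres = sum_p1q0 / sum_p0q0 * 100  # base-year quantities as weights
    paasche = sum_p1q1 / sum_p0q1 * 100    # current-year quantities as weights
    return {
        "laspeyres": laspeyres,
        "paasche": paasche,
        "dorbish_bowley": (laspeyres + paasche) / 2,  # arithmetic mean
        "fisher": sqrt(laspeyres * paasche),          # geometric mean
    }


indices = weighted_indices(data.values())
for name, value in indices.items():
    print(name, round(value, 2))
```

Since both means lie between the two values being averaged, the Dorbish and Bowley and Fisher indices always fall between the Laspeyres and Paasche indices.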

4.2.3.5 Fixed Base Index Numbers

In fixed base index numbers, the prices of the subsequent years are expressed as relatives of
the price of a single fixed base year.

• FBI = (Price of Current Year / Price of Base Year) × 100

\[ P_{0n} = \frac{p_n}{p_0} \times 100 \]
Example

Find the index numbers for the following data using the Fixed Base method, taking 1980 as the
base year.

Year 1980 1981 1982 1983 1984 1985 1986 1987


Price 40 50 60 70 80 100 90 110

Solution

Since FBI = (Price of Current Year / Price of Base Year) × 100

\[ P_{0n} = \frac{p_n}{p_0} \times 100 \]

Year Price Index number (1980 as base): P0n = Pn/P0 × 100
1980 40 40/40×100=100
1981 50 50/40×100=125
1982 60 60/40×100=150
1983 70 70/40×100=175
1984 80 80/40×100=200
1985 100 100/40×100=250
1986 90 90/40×100=225
1987 110 110/40×100=275
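
A short Python sketch of the fixed base calculation is given below. It is an illustration only; the function name `fixed_base_indices` and its `base_position` parameter are assumptions and do not come from the original material.

```python
years = list(range(1980, 1988))
prices = [40, 50, 60, 70, 80, 100, 90, 110]


def fixed_base_indices(prices, base_position=0):
    """Express every price as a percentage of the price in one fixed base year."""
    base_price = prices[base_position]
    return [price / base_price * 100 for price in prices]


fbi = fixed_base_indices(prices)  # 1980 (position 0) is the base year
for year, value in zip(years, fbi):
    print(year, round(value, 2))
```

Every index is measured against the same year, so the series shows the cumulative change since 1980.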

4.2.3.6 Chain Base Index Numbers

In chain base index numbers, the price of each year is expressed as a relative of the price of
the immediately preceding year, which serves as its base year.

• CBI = (Price in the Current Year / Price in the Preceding Year) × 100

\[ P_{n-1,n} = \frac{p_n}{p_{n-1}} \times 100 \]

Example

Find the index numbers for the following data using the Chain Base method, starting from
1980.

Year 1980 1981 1982 1983 1984 1985 1986 1987


Price 40 50 60 70 80 100 90 110

Solution

Since CBI = (Price in the Current Year / Price in the Preceding Year) × 100

\[ P_{n-1,n} = \frac{p_n}{p_{n-1}} \times 100 \]

Year Price Index number (preceding year as base): Pn-1,n = Pn/Pn-1 × 100
1980 40 40/40×100=100
1981 50 50/40×100=125
1982 60 60/50×100=120
1983 70 70/60×100=116.67
1984 80 80/70×100=114.29
1985 100 100/80×100=125
1986 90 90/100×100=90
1987 110 110/90×100=122.22
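
The chain base calculation can be sketched in Python as follows. This is a minimal sketch, not part of the original material; the function name `chain_base_indices` is an assumption, and the first year is set to 100 because it has no preceding year, as in the table above.

```python
years = list(range(1980, 1988))
prices = [40, 50, 60, 70, 80, 100, 90, 110]


def chain_base_indices(prices):
    """Express each price as a percentage of the preceding period's price."""
    indices = [100.0]  # the first period has no predecessor
    for previous, current in zip(prices, prices[1:]):
        indices.append(current / previous * 100)
    return indices


cbi = chain_base_indices(prices)
for year, value in zip(years, cbi):
    print(year, round(value, 2))
```

Unlike the fixed base series, each figure here shows only the year-on-year change, so a value below 100 signals a fall from the previous year.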

Self-Assessment Questions

11. There are ___________ methods of weighted index numbers.

A). 3
B). 5
C). 4
D). 2

12. ____ is an ideal index number.

A). Laspeyres’s
B). Paasche’s
C). Dorbish and Bowley’s
D). Fisher’s

13. Arithmetic average of Laspeyres’s and Paasche’s index is_______index number.

A). Laspeyres’s
B). Paasche’s
C). Dorbish and Bowley’s
D). Fisher’s

14. Geometric mean of the Laspeyres’s and Paasche’s index numbers is _____index
number.

A). Laspeyres’s
B). Paasche’s
C). Dorbish and Bowley’s
D). Fisher’s

15. The Laspeyres index number was introduced in the year ________.

A). 1820
B). 1920
C). 1881
D). 1871

Summary

● The unit aims to introduce the concept of Index Numbers and their applications
in the field of management.
● Index numbers are indicators that represent the relative changes in the level of
a phenomenon over time in comparison to its value in a fixed base period.
● Price, Quantity, Value, and Composite Index Numbers are the four types of
index numbers.
● There are two types of index number methods: weighted and unweighted index
numbers.

Terminal Questions

1. Define Index Number and explain the classification of Index Numbers


2. Differentiate Weighted and Un-Weighted Index Numbers
3. Using the information below, calculate the index number for the year 2008 based
on the base year 2007.
Commodities Price (2007) Price (2008)
P 6 10
Q 2 2
R 4 6
S 10 12
T 8 12
4. Using the data below, calculate the index number for the year 2008, taking 2007 as
the base year.
COMMODITIES UNITS PRICE (Rs) 2007 PRICE (Rs) 2008
Sugar Quintal 2200 3200
Milk Liter 18 20
Oil Liter 68 71
Wheat Quintal 900 1000
Clothing Meter 50 60
5. For the following data Find (1) Laspeyres’s Index (2) Paasche’s Index (3) Fisher’s
Ideal Index. (4) Dorbish and Bowley’s Index.
2010 2015
ITEMS PRICE PRODUCTION PRICE PRODUCTION
X 20 50 12 60
Y 12 59 13 64
Z 15 45 14 50
6. Calculate the fixed base and chain base Index Numbers for the following data
Years 1974 1975 1976 1977 1978 1979
Price 18 21 25 23 28 30

Answer Keys

Self-Assessment Questions

Question No Answers

1 B

2 A

3 B

4 C

5 D

6 A

7 B

8 A

9 A

10 D

11 C

12 D

13 C

14 D

15 D

Glossary
• The Laspeyres Index: A price index used to measure the economy’s general
price level and cost of living, and to calculate inflation.
• Paasche’s Method: A composite price index number arrived at by the
weighted sum method, using current year quantities as weights.
• Index Number: The measurement of any change in a variable or variables
across a determined period.

BIBLIOGRAPHY
1. Chung, K. L. (2000). A Course in Probability Theory (3rd ed.). San Diego, CA:
Academic Press.
2. Feller, W. (1968). An introduction to Probability Theory and its applications:
Volume I. John Wiley & Sons.
3. Gupta, S. C., & Kapoor, V. K. Fundamentals of Mathematical Statistics. S. Chand
Publications.

External Resources
1. Rukmangadachari, E. Probability and Statistics. Pearson Education.
2. Rohatgi, V. K., & Ehsanes Saleh, A. K. (2015). An introduction to probability and
statistics (3rd ed.). doi:10.1002/9781118799635

e-References
● What Is a Price-Weighted Index, and How Does It Work? :
https://www.investopedia.com/terms/p/priceweightedindex.asp

● Unweighted Index Numbers:
https://www.brainkart.com/article/Unweighted-Index-Numbers_39262/

Image Credits

Classification of Index Numbers
Fig. 1: https://www.flexiprep.com/NIOS-Notes/Senior-Secondary/Economics/NIOS-Economics-Ch-9-Index-Number.html

Video Links

Topic Link
Index Numbers https://www.youtube.com/watch?v=-dUe3U0BTb4k
Electrostatics L5 | Electric Field https://www.youtube.com/watch?v=2_glZ6bMI9w
"Index Numbers" Introduction in Statistics https://www.youtube.com/watch?v=cv5iPhRmSAg
STATISTICS | Index Numbers – Introduction https://www.youtube.com/watch?v=3P62OdhegsI

Keywords

● Price Index
● Quantity Index
● Weighted Index
● Un-weighted Index
● Fisher’s Ideal Index
● Dorbish and Bowley’s Index
