Business Statistics KMBN-104 - Q - Ans
Unit – I
Meaning, Scope, functions and limitations of statistics, Measures of Central tendency –
Mean, Median, Mode, Quartiles, Measures of Dispersion – Range, Inter quartile range,
Mean deviation, Standard deviation, Variance, Coefficient of Variation, Skewness and
Kurtosis.
Ans)
Questions for Business Statistics and Analytics (KMBN-104) Compiled by Dr. Ritesh Singhal, AKGIM
Sol. An average is a single value that represents the entire mass of data. Generally, it lies in the central part of the
distribution. Characteristics of an ideal average are:
Ans) Coefficient of variation is the measure of relative dispersion which relates the standard deviation
and the mean such that the standard deviation is expressed as a percentage of mean.
CV is used:
When two or more distributions having unequal means and equal SDs are to be compared.
When two or more distributions expressed in different units of measurement are to be compared.
CV is a unitless quantity.
Symbolically, CV (coefficient of variation) = $\frac{\sigma}{\mu} \times 100\%$
Series having larger CV is more variable, whereas the series having lesser CV is more consistent.
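The comparison rule above can be sketched in Python. This is a minimal illustration only: the two series and the names `cv`, `series_a` and `series_b` are hypothetical, and each series is treated as a complete population (so the population SD is used).

```python
from statistics import mean, pstdev

def cv(values):
    """Coefficient of variation: population SD expressed as a percentage of the mean."""
    return pstdev(values) / mean(values) * 100

# Hypothetical data, for illustration only (both series have mean 50).
series_a = [48, 50, 52, 49, 51]
series_b = [30, 70, 50, 20, 80]

cv_a = cv(series_a)
cv_b = cv(series_b)

# The series with the smaller CV is the more consistent one.
more_consistent = "A" if cv_a < cv_b else "B"
print(f"CV(A) = {cv_a:.2f}%, CV(B) = {cv_b:.2f}%, more consistent: {more_consistent}")
```

Because CV is a ratio, the same comparison would still work if the two series were recorded in different units.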
Q.5. What is Statistics? What are the various uses of statistics in the management of an
organization?
OR
Q.5. “Our managers can improve managerial decisions to a great extent, if they are
adequately familiar with the basic tools of statistics and mathematics.” Explain and
illustrate.
1. Marketing: Statistical analysis is frequently used to provide information for decision making in the field of marketing. It is necessary first to find out what can be sold and then to evolve a suitable strategy so that the goods reach the ultimate consumer.
2. Production: In the field of production, statistical data and methods play a very important role. Decisions about what to produce, how to produce, when to produce and for whom to produce are based largely on statistical analysis.
3. Finance: Financial organizations depend very heavily on statistical analysis to discharge their finance function effectively.
4. Investment: Statistics greatly assists investors in making clear and valued judgments in their investment decisions, in selecting securities which are safe and have the best prospects of yielding a good income.
5. Human Resource: Statistics may be used to handle data generated through human resources for planning, organizing and staffing.
Tools in Statistics:
Time Series – Used to analyze the trend in data and make prediction based on that trend.
Probability – Used to find chance of success or failure of any project.
Measure of Central Tendency – Used to find a single value that represents the data, such as the mean, median or mode.
Measure of Dispersion – Used to understand the variation between data.
Index Number – Used to understand changes in commodity prices and inflation.
Correlation – Used to understand the relation between variables.
Regression – Used for forecasting, prediction and estimation.
Hypothesis – Used for research in management.
Decision Theory – Used to help in decision making.
Ans) Statistics is concerned with the collection, classification (or organization), presentation and analysis
of data which are measurable in numerical terms.
Scope of statistics
Statistical data and techniques of statistical analysis are immensely useful in solving economic
problems such as wages, prices, time series analysis and demand analysis.
It can be used in medical and actuarial sciences.
Business executives are relying more and more on statistical techniques for studying the
preferences of customers.
In production engineering, statistical tools such as inspection plans, control charts etc. are
extensively used to find out whether the product conforms to the specifications or not.
Statistics are useful to bankers, insurance companies, social workers, labour unions, trade
associations, chambers and politicians.
Functions of statistics
It simplifies complex data
It provides techniques for comparison
It studies relationships
It helps in formulating policies
It helps in forecasting
It is helpful for common man
Statistical methods merged with the speed of computers can work wonders, using tools such as SPSS, STATA, MATLAB,
MINITAB, MS-Excel etc.
Limitations of statistics
Ans)
Measure   Merits                              Demerits
Mode      Used with nominal level data        Not representative of all data
          Not influenced by extreme scores    Depends on group selection
                                              Limited use in statistics
Median    Used with ordinal level data        May not appear in data
          Not influenced by extreme scores    Restricted statistical uses
Mean      Used with interval level data       Influenced by extreme scores
          Useful statistical properties       May not appear in data
          Widely understood                   Requires interval level data
Kurtosis - measures how peaked a distribution is and the lightness or heaviness of the tails of the
distribution. In other words, how much of the distribution is actually located in the tails? A normal
distribution has an excess kurtosis (kurtosis minus 3) of zero (0) and is said to be mesokurtic. A positive value means
that the tails are heavier than those of a normal distribution and the distribution is said to be leptokurtic (with a
higher, more acute "peak"). A negative value means that the tails are lighter than those of a normal
distribution and the distribution is said to be platykurtic (with a smaller, flatter "peak").
Range - It is the difference between the maximum value and the minimum value of the data.
Interquartile range = Q3 - Q1. It denotes the difference between the third quartile and the first quartile. The
semi-interquartile range, or quartile deviation, = (Q3 - Q1)/2.
Mean deviation (MD) - It is the average of the absolute amounts by which the individual items deviate from
the mean: $MD = \frac{\sum |x|}{n}$, where x is the deviation of an item from the mean.
Standard deviation (σ) - It is a measure that is used to quantify the amount of variation or dispersion of a
set of data values. Deviations are measured from the mean. It has desirable mathematical properties.
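The measures of dispersion defined above can be computed with Python's standard `statistics` module. This is a minimal sketch on hypothetical data. It treats the data as a full population, and the quartiles follow the library's default convention, which may differ slightly from a textbook formula.

```python
from statistics import mean, pstdev, pvariance, quantiles

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

# Range: maximum minus minimum.
data_range = max(data) - min(data)

# Quartiles (library default convention), interquartile range and quartile deviation.
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1
quartile_deviation = iqr / 2

# Mean deviation: average absolute deviation from the mean.
m = mean(data)
mean_deviation = sum(abs(x - m) for x in data) / len(data)

# Standard deviation and variance (population versions).
sd = pstdev(data)
variance = pvariance(data)

print(data_range, iqr, quartile_deviation, mean_deviation, sd, variance)
```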
Q.10. Discuss various partition values such as median, quartiles, deciles and percentiles
and their uses.
Ans) Partition values divide the same set of observations in different ways, so that the
observations are split into several equal parts.
Median – It is that value of the variable which divides the group into two equal parts, one part
comprising all values greater than the median, and the other part all values less than the median.
Deciles are those values that divide a given set of observations into ten equal parts.
Therefore, there are nine deciles in total. They are represented as D1, D2,
D3, D4, ……… D9.
Percentiles divide a given set of observations into 100 equal parts. They are
represented as P1, P2, P3, P4, ……… P99. A quartile is a type of quantile. The first
quartile (Q1) is defined as the middle number between the smallest number and the median of the data set.
The second quartile (Q2) is the median of the data. The third quartile (Q3) is the middle value between the
median and the highest value of the data set.
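The partition values described above can be obtained with `statistics.quantiles`. This is a sketch on hypothetical data (the marks 1 to 100); the library's default interpolation rule is one of several conventions, so individual cut points may differ slightly from hand calculations.

```python
from statistics import median, quantiles

marks = list(range(1, 101))  # hypothetical data: 1, 2, ..., 100

quartiles = quantiles(marks, n=4)      # Q1, Q2, Q3
deciles = quantiles(marks, n=10)       # D1 ... D9
percentiles = quantiles(marks, n=100)  # P1 ... P99

# Q2, D5 and P50 all coincide with the median.
print(quartiles[1], deciles[4], percentiles[49], median(marks))
```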
Ans) Skewness indicates lack of symmetry in a distribution. When a frequency distribution is elongated
to the right, that is , having a longer tail to the right, it is said to be positively skewed. If the distribution
has a longer tail to the left, it is said to be negatively skewed.
If mean>median>mode , skewness is positive
If mean<median<mode , skewness is negative
Symmetrical distribution is not skewed. (when mean=median=mode)
Kelly's measure: skewness = $\frac{D_1 + D_9 - 2\,\text{Median}}{D_9 - D_1}$
Moment's measure (e.g. $\mu_1 = \frac{\sum (x_i - \text{mean})}{N}$)
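Kelly's measure, (D1 + D9 - 2 × Median) / (D9 - D1), can be sketched as follows. The function name `kelly_skewness` and both data sets are hypothetical, and the deciles follow the default convention of `statistics.quantiles`.

```python
from statistics import median, quantiles

def kelly_skewness(values):
    """Kelly's decile-based skewness: (D1 + D9 - 2*Median) / (D9 - D1)."""
    deciles = quantiles(values, n=10)
    d1, d9 = deciles[0], deciles[8]
    return (d1 + d9 - 2 * median(values)) / (d9 - d1)

# Hypothetical data: one symmetric series and one with a long right tail.
symmetric = list(range(1, 20))
right_tailed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 8, 12, 20, 35, 60]

print(kelly_skewness(symmetric))     # zero for a symmetric series
print(kelly_skewness(right_tailed))  # positive for a longer right tail
```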
Unit – II
Time series analysis: Concept, Additive and Multiplicative models, Components of time
series, Trend analysis: Least Square method - Linear and Non- Linear equations,
Applications in business decision-making. Index Numbers:- Meaning , Types of index
numbers, uses of index numbers, Construction of Price, Quantity and Volume indices:-
Fixed base and Chain base methods.
Secular trend: The tendency of the time series data to increase, decrease or stagnate over a
long period of time. For example: population.
Seasonal component: The variability in the behavioral pattern during different seasons in a year.
For example: sale of ACs and fans.
Cyclical component: Almost synonymous with the business cycle, reflecting the upswing and
downswing of the data over extended periods of time. For example: recession.
Random or irregular component: Irregular variations caused by random factors and sporadic
causes like strikes, natural disasters and so on.
Simple Index Number: A simple index number is a number that measures the relative change in a
single variable with respect to a base. This type of index number is constructed from a single
item only.
Composite Index Number: A composite index number is a number that measures the average
relative change in a group of related variables with respect to a base. A composite index number is
built from changes in a number of different items.
Price index Numbers: Price index numbers measure the relative changes in prices of a commodity
between two periods. Prices can be either retail or wholesale. Price index number is useful to
comprehend and interpret varying economic and business conditions over time.
Quantity Index Numbers: These types of index numbers are considered to measure changes in the
physical quantity of goods produced, consumed or sold of an item or a group of items.
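As an illustration of the index numbers described above, the sketch below computes a simple index for one item and a simple (unweighted) aggregate index for a group of items. The function names and all prices are hypothetical, and the unweighted aggregate is only one of several possible composite formulas.

```python
def simple_index(current, base):
    """Index for a single item: current value relative to the base value, times 100."""
    return current / base * 100

def simple_aggregate_index(current_values, base_values):
    """Unweighted composite index: total of current values over total of base values."""
    return sum(current_values) / sum(base_values) * 100

# Hypothetical prices of three commodities in the base and current periods.
base_prices = [20, 50, 30]
current_prices = [25, 60, 33]

first_item_index = simple_index(current_prices[0], base_prices[0])
group_index = simple_aggregate_index(current_prices, base_prices)

print(f"Simple index of first item: {first_item_index:.1f}")
print(f"Simple aggregate price index: {group_index:.1f}")
```

The same two functions give quantity indices if quantities are supplied in place of prices.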
Proposed for
(i) Measuring current prices or quantities in relation to those of a selected base period.
(ii) Measuring current price or quantity levels relative to those of a selected base period.
Definition
(i) A form of index number where prices, quantities or other units of measure over time are weighted according to their values in a specified base period.
(ii) A ratio that compares the total purchase cost of a specified bundle of current-period commodities (commodities valued at current prices) with the value of those same commodities at base-period prices; this ratio is multiplied by 100.
Formulae
Factor Reversal Test
(i) Not satisfied
(ii) Not satisfied
Q.5. Differentiate between Time reversal and factor reversal tests. (OR)
Ans)
Unit Test: This test requires that the index number should be independent of the unit of measurement.
Except for the simple (unweighted) aggregate index, all other formulae satisfy this test.
Q.7. Discuss the methods of trend analysis in time series. Also elaborate the method of least
squares for linear and non-linear equations.
Sol. The following methods are used for analyzing the trend:
$Y = a + bx$
Normal equations are
$\sum Y = na + b\sum x$
$\sum xY = a\sum x + b\sum x^2$
When n is odd
x = (X - middle year) / Interval in X
When n is even
x = 2 (X - Average of two middle years) / Interval
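The least squares fit of the linear trend Y = a + bx can be sketched for the odd-n case described above. With x coded about the middle year, Σx = 0, so the normal equations ΣY = na + bΣx and ΣxY = aΣx + bΣx² reduce to a = ΣY/n and b = ΣxY/Σx². The function name `linear_trend` and the sales figures are hypothetical, and equally spaced years in ascending order are assumed.

```python
def linear_trend(years, values):
    """Fit Y = a + b*x by least squares, with x coded about the middle year (odd n only)."""
    n = len(years)
    if n % 2 == 0:
        raise ValueError("this sketch handles an odd number of periods only")
    middle = years[n // 2]
    interval = years[1] - years[0]          # assumes equally spaced years
    x = [(year - middle) / interval for year in years]
    # Because sum(x) == 0, the two normal equations decouple.
    a = sum(values) / n
    b = sum(xi * yi for xi, yi in zip(x, values)) / sum(xi * xi for xi in x)
    return a, b, middle, interval

# Hypothetical annual sales.
years = [2001, 2002, 2003, 2004, 2005]
sales = [10, 12, 14, 16, 18]

a, b, middle, interval = linear_trend(years, sales)
forecast_2006 = a + b * (2006 - middle) / interval
print(f"Y = {a} + {b}x, forecast for 2006 = {forecast_2006}")
```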
Method of semi averages
This method divides the data into two parts, preferably equal, and averages the data in each part. In
this way we obtain two points on the graph. The line obtained by joining these points is the required trend
line.
Method of moving averages
It is a series of successive averages of m terms at a time until we exhaust the whole time series.
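The successive averages of m terms described above (commonly called a moving average) can be sketched as follows. The function name `moving_average` and the sales figures are hypothetical.

```python
def moving_average(series, m):
    """Successive averages of m consecutive terms, until the series is exhausted."""
    return [sum(series[i:i + m]) / m for i in range(len(series) - m + 1)]

# Hypothetical sales for seven periods.
sales = [10, 12, 11, 15, 14, 18, 17]
trend = moving_average(sales, 3)

print([round(value, 2) for value in trend])
```

A series of n terms yields n - m + 1 averages, so a longer window m gives a smoother but shorter trend series.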
Parabolic
Quadratic
Hyperbolic
Exponential
Logarithmic
Q8) Write the applications of time series and index numbers in management.
UNIT III
Correlation Analysis: Rank Method & Karl Pearson's Coefficient of Correlation and
Properties of Correlation. Regression Analysis: Fitting of a Regression Line and
Interpretation of Results, Properties of Regression Coefficients and Relationship between
Regression and Correlation.
Multiple Correlations
It comes under multivariate analysis.
Here we establish the relationship between three or more variables simultaneously.
Here we measure the joint effect of X2 and X3 on X1.
R1.23 – Multiple correlation coefficient of X1 on X2 and X3. Here X1 is the dependent variable and X2 & X3 are
independent.
R2.13 – Multiple correlation coefficient of X2 on X1 and X3. Here X2 is the dependent variable and X1 & X3 are
independent.
R3.12 – Multiple correlation coefficient of X3 on X1 and X2. Here X3 is the dependent variable and X1 & X2 are
independent.
The multiple correlation coefficient is never negative.
It always lies between 0 and 1.
The coefficient of multiple correlation R1.23 is at least as large as the absolute value of either of the simple correlations r12 or r13.
Q.2. Define Regression. Give important uses and applications of regression analysis.
Sol. Regression analysis is a statistical process for estimating the relationships among variables. It
includes many techniques for modeling and analyzing several variables, when the focus is on the
relationship between a dependent variable and one or more independent variables. More specifically,
regression analysis helps one understand how the typical value of the dependent variable changes when
any one of the independent variables is varied, while the other independent variables are held fixed. It is
of two types y on x and x on y.
Y on X
Here Y is dependent variable and X is independent variable. Its standard equation is:
Y=a+bX
X on Y
Here X is dependent variable and Y is independent variable. Its standard equation is:
X=a+bY
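The two regression lines can be fitted by least squares as sketched below. The function name `regression_lines` and the data are hypothetical. The slopes are written byx (Y on X) and bxy (X on Y), and the last line uses the standard relation r = ±√(byx · bxy), with r taking the common sign of the two slopes.

```python
from math import sqrt

def regression_lines(x, y):
    """Return (intercept, slope) for Y on X and for X on Y, fitted by least squares."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    byx = sxy / sxx                      # slope of Y on X
    bxy = sxy / syy                      # slope of X on Y
    a_yx = mean_y - byx * mean_x         # intercept of Y on X
    a_xy = mean_x - bxy * mean_y         # intercept of X on Y
    return (a_yx, byx), (a_xy, bxy)

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

(a_yx, byx), (a_xy, bxy) = regression_lines(x, y)
r = sqrt(byx * bxy) if byx > 0 else -sqrt(byx * bxy)
print(f"Y = {a_yx:.2f} + {byx:.2f}X, X = {a_xy:.2f} + {bxy:.2f}Y, r = {r:.4f}")
```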
Uses of regression:
1. To model some phenomenon in order to better understand it and possibly use that understanding
to affect policy or to make decisions about appropriate actions to take. Basic objective: to
measure the extent to which changes in one or more variables jointly affect changes in another.
2. To model some phenomenon in order to predict values for that phenomenon at other places or
other times. Basic objective: to build a prediction model that is consistent and accurate. Example:
where are real estate values likely to go up next year?
3. To test hypotheses. Suppose you are modeling residential
crime in order to better understand it, and hopefully implement policy to prevent it.
4. Decision making
5. Comparison
6. Prediction and Estimation
7. Relationship between variables.
8. Estimating a variable based on other variable
9. Business forecasting
10. Linear and non-linear regression
11. Line of best fit
We can use the technique of correlation to test the statistical significance of the association (or
relationship between variable). In other cases we use regression analysis to describe the relationship
precisely by means of an equation that has predictive value. We deal separately with these two types of
analysis - correlation and regression - because they have different roles.
Q.3. Define the term correlation. Discuss various methods of measuring correlation, types
and their uses.
Sol. Correlation
Measure of Correlation
1. Scatter Diagram Method It is a graphical method to find the correlation between variables.
Here the pairs of observations are plotted in a 2-D space. After plotting these points we can
get an idea about the relationship between the variables.
2. Karl Pearson's coefficient of correlation (r)
The value of r lies between -1 and +1, i.e., -1 ≤ r ≤ +1.
The coefficient of correlation is independent of change of origin and scale.
The coefficient 'r' is symmetric: rxy = ryx.
The probable error of 'r' is used for interpreting its estimated value.
There should be a sufficient number of items in the series.
Correlation does not necessarily mean a cause and effect relationship.
Both the correlated variables may be affected by a third variable. The correlation may be due to
a random or chance factor. There may be a situation of nonsense or spurious correlation between two
variables.
3. Spearman's Rank Correlation
$\rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$
4. Concurrent Deviation Method This is the simplest method in which only the direction of
change is taken into consideration rather than magnitude of variation. It gives a general idea
about the correlation between variables quickly.
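Karl Pearson's coefficient, listed among the methods above, can be sketched as follows. The function name `pearson_r` and the data are hypothetical. The last lines check two of the stated properties: r does not change under a change of origin and (positive) scale, and it is symmetric in the two variables.

```python
from math import sqrt

def pearson_r(x, y):
    """Karl Pearson's coefficient of correlation between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired observations.
advertising = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

r = pearson_r(advertising, sales)

# Change of origin and scale: multiply x by 10 and add 3, subtract 100 from y.
r_shifted = pearson_r([10 * v + 3 for v in advertising], [v - 100 for v in sales])

# Symmetry: swapping the two variables leaves r unchanged.
r_swapped = pearson_r(sales, advertising)

print(round(r, 4), round(r_shifted, 4), round(r_swapped, 4))
```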
Types of Correlation:
Multiple correlation: Under multiple correlation three or more variables are
studied.
Partial correlation: The analysis recognizes more than two variables but considers only two
variables, keeping the others constant.
Linear correlation: Correlation is said to be linear when the amount of change in one
variable tends to bear a constant ratio to the amount of change in the other.
Non Linear correlation: The correlation would be non linear if the amount of change in one
variable does not bear a constant ratio to the amount of change in the other variable.
Uses of correlation
OR
S. No  Correlation                                    Regression
1      It studies the relationship between two        It is used to estimate/forecast the value of
       or more variables.                             one variable based on another variable.
2      Methods are:                                   Methods are:
       Scatter Diagram Method                         Method of least squares
       Karl Pearson's Coefficient of Correlation (r)  Regression coefficient method
       Spearman's Coefficient of Rank Correlation
       Concurrent Deviation Method
3      The value of r lies between -1 and +1,         Both the regression coefficients cannot be
       i.e., -1 ≤ r ≤ +1.                             greater than one simultaneously.
6      'r' is a single numerical value which          The regression coefficient depicts the rate
       depicts the strength of the relationship.      of change.
7
8      A value of r = ±1 means perfect                byx and bxy may be used to derive the
       correlation.                                   regression equations of y on x and x on y.
The relation between r, byx and bxy is r = ±√(byx · bxy)
Sol. Karl Pearson's method measures the relationship between quantitative variables, whereas
Spearman's coefficient is suitable for qualitative variables, for example the ranks given to the participants in a
contest by two judges, where we want to measure the relationship between the ranks given by these
judges.
$R = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$
$R = 1 - \frac{6\left(\sum D^2 + AF\right)}{N(N^2 - 1)}$
This method is simpler to understand and easier to apply compared to Karl Pearson's
correlation method.
This method is useful where we can give the ranks and not the actual data
(qualitative terms).
This method is to be used where the initial data are in the form of ranks.
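The rank correlation formula R = 1 - 6ΣD² / (N(N² - 1)) can be sketched for the two-judges example mentioned above. The ranks and the function name `spearman_rank_correlation` are hypothetical, and the sketch assumes there are no tied ranks.

```python
def spearman_rank_correlation(ranks_1, ranks_2):
    """Spearman's R = 1 - 6*sum(D^2) / (N*(N^2 - 1)), where D is the rank difference."""
    n = len(ranks_1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_1, ranks_2))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

# Hypothetical ranks given to five participants by two judges (no ties).
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [2, 1, 4, 3, 5]

R = spearman_rank_correlation(judge_1, judge_2)
print(f"R = {R:.2f}")
```

Identical rankings give R = +1 and exactly reversed rankings give R = -1.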
UNIT IV
Probability: Theory of Probability, Addition and Multiplication Law, Bayes' Theorem,
Probability Theoretical Distributions: Concept and application of Binomial; Poisson and
Normal distributions.
Multiplication law in probability applies to combination of events. When the events have to
occur together then we make use of the multiplication law of probability. Now two cases arise:
whether the events are independent or dependent.
Bayes' Theorem
Suppose an event E can occur only if one of a set of exhaustive and mutually exclusive events E1,
E2, …, En occurs. The probabilities P(E1), P(E2), …, P(En) and the conditional probabilities P(E/Ei) for
the event E to occur are known. The conditional probability P(Ei/E) is given by
$P(E_i/E) = \frac{P(E_i)\,P(E/E_i)}{\sum_{j=1}^{n} P(E_j)\,P(E/E_j)}$
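Bayes' theorem can be sketched numerically as follows. The scenario is hypothetical: three machines E1, E2, E3 produce 50%, 30% and 20% of total output, with defect rates of 2%, 3% and 5%, and E is the event that a randomly chosen item is defective. The function name `bayes_posteriors` is also an assumption.

```python
def bayes_posteriors(priors, likelihoods):
    """Return P(Ei/E) for each i, given the priors P(Ei) and the likelihoods P(E/Ei)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]   # P(Ei) * P(E/Ei)
    total = sum(joint)                                     # denominator: P(E)
    return [j / total for j in joint]

# Hypothetical inputs: the priors must be exhaustive and mutually exclusive (sum to 1).
priors = [0.5, 0.3, 0.2]           # P(E1), P(E2), P(E3)
likelihoods = [0.02, 0.03, 0.05]   # P(E/E1), P(E/E2), P(E/E3)

posteriors = bayes_posteriors(priors, likelihoods)
for i, p in enumerate(posteriors, start=1):
    print(f"P(E{i}/E) = {p:.4f}")
```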
Q.3. What is meant by theoretical distribution? Define and compare the properties of the
following distribution: a) Binomial Distribution, b) Poisson Distribution, c) Normal
Distribution
a) BINOMIAL DISTRIBUTION
A binomial distribution is very different from a normal distribution, and yet if the sample size is
large enough, the shapes will be quite similar.
The key difference is that a binomial distribution is discrete, not continuous. In other words, it is
NOT possible to find a data value between two consecutive data values.
Properties-
Statistical independence is assumed.
Each trial has a constant probability.
Each trial has only two possible outcomes (dichotomy).
b) POISSON DISTRIBUTION
It is used when p, the probability of occurrence, is very small and n is very large.
The only parameter is λ = np.
The mean and variance of the Poisson distribution (PD) are both λ.
It is a limiting case of the Binomial distribution.
c) NORMAL DISTRIBUTION
The normal (z) distribution is a continuous distribution that arises in many natural processes.
"Continuous" means that between any two data values we could (at least in theory) find another
data value. For example, men's heights vary continuously and are the result of so many tiny
random influences that the overall distribution of men's heights in America is very close to
normal. Another example is the data values that we would get if we repeatedly measured the
mass of a reference object on a pan balance—the readings would differ slightly because of
random errors, and the readings taken as a whole would have a normal distribution.
The bell-shaped normal curve has probabilities that are found as the area between any
two z values.
Not all natural processes produce normal distributions.
Properties
The normal curve is symmetrical about the vertical axis through mean.
Mean (µ) and SD (σ) are known as the parameters of the distribution.
The curve is asymptotic to the X-axis.
The problems related to Normal distribution can be solved by using the properties of
Normal Curve.
The random variable X should be transformed to the Standard Normal Variable 'Z' using
Z = (X-µ)/σ
After the transformation the probability (or area) can be found using the Normal distribution
table.
The total area under the normal curve is 1, which is divided into two equal halves through
vertical axis.
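The transformation Z = (X - µ)/σ and the table lookup can be sketched with `statistics.NormalDist`, whose `cdf` method plays the role of the normal table. The values of µ, σ and X are hypothetical.

```python
from statistics import NormalDist

# Hypothetical problem: X is normal with mean 50 and SD 10; find P(X <= 60).
mu, sigma = 50, 10
x = 60

z = (x - mu) / sigma                    # transform X to the standard normal variable Z
standard_normal = NormalDist()          # mean 0, SD 1

area_below = standard_normal.cdf(z)     # area to the left of z, i.e. P(Z <= z)
area_mean_to_x = area_below - 0.5       # area between the mean and x (each half is 0.5)

print(f"z = {z}, P(X <= {x}) = {area_below:.4f}, area from mean to x = {area_mean_to_x:.4f}")
```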
Distribution   Formula                                             Use
Binomial       $P(x) = {}^{n}C_{x}\, p^{x} q^{n-x}$                Used for smaller values of n
Poisson        $P(x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$        Used when n is large and p is small (limiting case of BD)
Normal         P(x)                                                Used for any value of n and x
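The binomial and Poisson probability formulas, and the sense in which the Poisson is a limiting case of the binomial, can be sketched as follows. The function names and the numbers chosen (n = 1000, p = 0.002) are hypothetical.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """P(x) = nCx * p^x * q^(n-x), with q = 1 - p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """P(x) = e^(-lambda) * lambda^x / x!"""
    return exp(-lam) * lam ** x / factorial(x)

# Small n: probability of exactly 2 successes in 4 trials with p = 0.5.
p_two_of_four = binomial_pmf(2, 4, 0.5)

# Large n, small p: the Poisson with lambda = n*p approximates the binomial.
n, p = 1000, 0.002
lam = n * p
exact = binomial_pmf(3, n, p)
approx = poisson_pmf(3, lam)

print(f"Binomial(2; 4, 0.5) = {p_two_of_four}")
print(f"Binomial(3; 1000, 0.002) = {exact:.5f}, Poisson(3; 2) = {approx:.5f}")
```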
A normal curve is a bell-shaped curve which shows the probability distribution of a continuous
random variable. Moreover, the normal curve represents a normal distribution. The total area
under the normal curve logically represents the sum of all probabilities for a random variable.
The bell-shaped normal curve has probabilities that are found as the area between any
two z values.
Not all natural processes produce normal distributions.
Properties
The normal curve is symmetrical about the vertical axis through mean.
Mean (µ) and SD (σ) are known as the parameters of the distribution.
The problems related to the Normal distribution can be solved by using the properties of the Normal
Curve.
The random variable X should be transformed to the Standard Normal Variable 'Z' using
Z = (X-µ)/σ
After the transformation the probability (or area) can be found using the Normal distribution
table.
The total area under the normal curve is 1, which is divided into two equal halves through
vertical axis.
UNIT V
Hypothesis Testing: Null and Alternative Hypotheses; Type I and Type II errors; Testing of
Hypothesis: Large Sample Tests, Small Sample test, (t, F, Z Test and Chi Square Test)
Concept of Business Analytics - Meaning, types and application of Business Analytics, Use of
Spread Sheet to analyze data - Descriptive analytics and Predictive analytics.
Q.1. Define null hypothesis, critical region and two sided test, used in testing of statistical
hypothesis.
Sol.
Null Hypothesis: In statistical inference of observed data of a scientific experiment, the null hypothesis refers to a
general or default position: that there is no relationship between two measured phenomena, or that a potential
medical treatment has no effect. Rejecting or disproving the null hypothesis – and thus concluding that there are
grounds for believing that there is a relationship between two phenomena or that a potential treatment has a
measurable effect – is a central task in the modern practice of science, and gives a precise sense in which a claim
is capable of being proven false.
Example
Given the test scores of two random samples of men and women, does one group differ from the other? A possible
null hypothesis is that the mean male score is the same as the mean female score:
H0: μ1 = μ2
where μ1 is the mean male score and μ2 is the mean female score.
A stronger null hypothesis is that the two samples are drawn from the same population, such that the variance and
shape of the distributions are also equal.
Critical Region
The critical region CR, or rejection region RR, is a set of values of the test statistic for which the null hypothesis is
rejected in a hypothesis test. That is, the sample space for the test statistic is partitioned into two regions; one region
(the critical region) will lead us to reject the null hypothesis H0, the other will not. So, if the observed value of the
test statistic is a member of the critical region, we conclude "Reject H0"; if it is not a member of the critical region
then we conclude "Do not reject H0".
Two-tail Test
In statistical significance testing, a one-tailed test and a two-tailed test are alternative ways of computing the statistical
significance of a data set in terms of a test statistic, depending on whether only one direction is considered extreme
(and unlikely) or both directions are considered extreme. Alternative names are one-sided and two-sided tests; the
terminology "tail" is used because the extremes of distributions are often small, as in the normal distribution or "bell
curve".
If the test statistic is always positive (or zero), only the one-tailed test is generally applicable, while if the test
statistic can assume positive and negative values, both the one-tailed and two-tailed test are of use.
Figure: A two-tailed test corresponds to both extreme negative and extreme positive directions of the test statistic,
here the normal distribution.
OR
Z-test and t-test are basically the same; they compare between two means to suggest whether both
samples come from the same population. There are however variations on the theme for the t-test. If you
have a sample and wish to compare it with a known mean (e.g. national average) the single sample t-test
is available. If both of your samples are not independent of each other and have some factor in common,
i.e. geographical location or before/after treatment, the paired sample t-test can be applied. There are also
two variations on the two sample t-test, the first uses samples that do not have equal variances and the
second uses samples whose variances are equal.
1. Z-test is a statistical hypothesis test that follows a normal distribution while t-test follows a Student‘s t-
distribution.
2. A t-test is appropriate when you are handling small samples (n < 30) while a Z-test is appropriate when
you are handling moderate to large samples (n > 30).
3. t-tests are more commonly used than Z-tests.
4. Z-tests are preferred to t-tests when standard deviations are known.
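A two-tailed Z-test of a sample mean against a known mean can be sketched as follows. It assumes the population standard deviation is known and the sample is large. The function name `one_sample_z_test` and all figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed Z-test of H0: population mean = mu0, with known population SD sigma."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    standard_normal = NormalDist()
    p_value = 2 * (1 - standard_normal.cdf(abs(z)))    # both tails count as extreme
    critical = standard_normal.inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    reject_h0 = abs(z) > critical                      # z falls in the critical region
    return z, p_value, reject_h0

# Hypothetical case: national average 50, known SD 10, sample of 100 with mean 52.
z, p_value, reject_h0 = one_sample_z_test(sample_mean=52, mu0=50, sigma=10, n=100)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

For a small sample with an unknown standard deviation, the same comparison would be made against Student's t-distribution instead, which this sketch does not cover.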
Pearson's chi-squared test (χ2) is the best-known of many chi-squared tests (Yates, likelihood
ratio, portmanteau test in time series, etc.) – statistical procedures whose results are evaluated by
reference to the chi-squared distribution. Its properties were first investigated by Karl Pearson in 1900. In
contexts where it is important to make a distinction between the test statistic and its distribution, names
similar to Pearson χ-squared test or statistic are used.
Uses of chi-square:
Test of goodness of fit
Test of independence
Test of variance/SD of a single population.
The procedure of the test includes the following steps:
1) Calculate the chi-squared test statistic, χ², which resembles a normalized sum of squared
deviations between observed and theoretical frequencies.
2) Determine the degrees of freedom, d.f., of that statistic, which is essentially the number of
frequencies reduced by the number of parameters of the fitted distribution.
3) Compare χ² to the critical value from the chi-squared distribution with d.f. degrees of freedom,
which in many cases gives a good approximation of the distribution of χ².
Accept H0 Reject H0
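The three steps of the procedure can be sketched for a goodness-of-fit test. The scenario is hypothetical: a die is rolled 60 times and H0 states that it is fair. The statistic is taken as the sum of (observed - expected)² / expected, the degrees of freedom as the number of categories minus one, and the 5% critical value for 5 degrees of freedom (11.070) is quoted from a standard chi-squared table rather than computed.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 60 rolls of a die; H0 says each face is expected 10 times.
observed = [8, 9, 12, 11, 6, 14]
expected = [10] * 6

# Step 1: the test statistic.
chi2 = chi_square_statistic(observed, expected)

# Step 2: degrees of freedom (categories minus one; no parameters were estimated).
df = len(observed) - 1

# Step 3: compare with the tabulated critical value (5% level, 5 d.f.).
critical_5pct = 11.070
reject_h0 = chi2 > critical_5pct

print(f"chi-square = {chi2:.2f}, d.f. = {df}, reject H0: {reject_h0}")
```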
Sol. When the information about the population is known then we use parametric test and if there is no
knowledge about the population or parameters, then we use non-parametric tests.
Sol. ANOVA is a statistical technique used to evaluate the differences between three or more sample
means by analysing variances. This helps to judge whether the samples are from populations having the same
mean or not.
The F-test is based on the F-distribution, so called in honor of the great statistician R. A. Fisher.
This test is suitable for testing the significance of the difference between two sample estimates of variance.
Since the F-test is based on the ratio of two variances, it is also known as the variance ratio test.
This test checks the hypothesis that the dispersions of two random variables X and Y,
which are represented by samples xS and yS, are equal. The test works correctly under the
following conditions:
If X and Y have a normal distribution, the F-statistic will have an F-distribution with NX - 1 and NY - 1 degrees
of freedom. To determine the significance level corresponding to the value of the F-statistic, a high-precision
F-distribution approximation is used.
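The variance ratio and its degrees of freedom can be sketched as follows. Both samples are hypothetical, and the sketch stops at the F-statistic: the critical value or significance level would still have to be read from an F table or obtained from a statistical package.

```python
from statistics import variance

# Hypothetical samples from two populations assumed to be normal.
sample_x = [2, 4, 4, 4, 5, 5, 7, 9]
sample_y = [1, 2, 3, 4, 5]

var_x = variance(sample_x)   # sample variance (divisor N - 1)
var_y = variance(sample_y)

f_statistic = var_x / var_y                        # ratio of the two variance estimates
df_x, df_y = len(sample_x) - 1, len(sample_y) - 1  # NX - 1 and NY - 1

print(f"F = {f_statistic:.4f} with ({df_x}, {df_y}) degrees of freedom")
```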
Q.9. Discuss the concept of business analytics with its meaning, types and applications.
Sol. Business Analytics is the use of data, information technology, statistical analysis,
quantitative methods, and mathematical or computer-based models to help managers gain
improved insight about their business operations and make better, fact-based decisions.
Descriptive analytics or data mining are at the bottom of the big data value chain, but they can
be valuable for uncovering patterns that offer insight. A simple example of descriptive analytics
would be assessing credit risk using past financial performance. Descriptive analytics can be
useful in the sales cycle, for example, to categorize customers by their likely product preferences
and sales cycle.
Finance
It is of utmost importance to the finance sector. Data Scientists are in high demand in
investment banking, portfolio management, financial planning, budgeting, forecasting,
etc.
For example: Companies these days have a large amount of financial data. Use of
intelligent business analytics tools can help use this data to determine the products‘
prices. Also, on the basis of historical information BAs can study the trends on the
performance of a particular stock and advise the client on whether to retain it or sell it.
Marketing
It helps in studying the buying patterns of consumers, analysing trends, identifying the
target audience, employing advertising techniques that can appeal to the consumers,
forecasting supply requirements, etc.
For example: It is used to gauge the effectiveness and impact of a marketing strategy on
the customers. Data can be used to build loyal customers by giving them exactly what
they want as per their specifications.
HR Professional
HR professionals can make use of data to find information about educational background
of high performing candidates, employee attrition rate, number of years of service of
employees, age, gender, etc. This information can play a pivotal role in the selection
procedure of a candidate. For example: HR manager can predict the employee retention
rate on the basis of data given by business analytics.
CRM
It helps one analyse the key performance indicators, which further helps in decision
making and make strategies to boost the relationship with the consumers. The
demographics, and data about other socio-economic factors, purchasing patterns,
lifestyle, etc., are of prime importance to the CRM department.
For example: The company wants to improve its service in a particular geographical
segment. With data analytics, one can predict the customer‘s preferences in that particular
segment, what appeals to them, and accordingly improve relations with customers.
Manufacturing
It can help in supply chain management, inventory management, measuring performance
against targets, risk mitigation plans, improving efficiency on the basis of product data, etc.
For example: The manager wants information on the performance of machinery which has
been used for the past 10 years. The historical data will help evaluate the performance of the
machinery and decide whether the costs of maintaining the machine will exceed the cost of
buying new machinery.
Credit Card Companies
Credit card transactions of a customer can determine many
factors: financial health, lifestyle, purchase preferences, behavioral trends, etc.
For example: Credit card companies can help the retail sector by locating the target
audience. According to the transaction reports, retail companies can predict the choices
of the consumers, their spending patterns, their preference for buying competitors'
products, etc. This historical as well as real-time information helps them direct their
marketing strategies in such a way that they hit the mark and reach the right audience.