Mathematics For Management - Statistics Section
This document is authorized for use only by RAJESH KUMAR NAYAK. Copy or posting is an infringement of copyright.
Welcome to the pre-assessment test for the Mathematics for Management tutorial. This test will allow you to assess your knowledge of Mathematics for Management.
All questions must be answered for your exam to be scored.
To advance from one question to the next, select one of the answer choices or, if applicable, complete with your own choice and click the “Submit” button. After submitting your
answer, you will not be able to change it, so make sure you are satisfied with your selection before you submit each answer. You may also skip a question by pressing the forward
advance arrow. Please note that you can return to “skipped” questions using the “Jump to unanswered question” selection menu or the navigational arrows at any time. Although
you can skip a question, you must navigate back to it and answer it - all questions must be answered for the exam to be scored.
After completion, you can review your answers at any time by returning to the exam.
Good luck!
During your business degree program, you will use mathematics in many situations. In your economics courses, you will have to determine how demand is related to price. You
might even use basic calculus to come up with a profit-maximizing price. In your statistics course, you might be expected to know the basic laws of probability. In your finance
courses, you will need to understand the mathematics behind valuing cash flows. Most of your professors will expect you to know how to solve simple equations and do basic
manipulations of algebraic formulas. You may not have used algebra, calculus, probability, and statistics for five or ten years. If you are an undergraduate English or music major
now entering a graduate business program, you may have never studied calculus or basic probability and statistics.
The purpose of our course is to help level the playing field by giving you the analytic background you need to hit the ground running and complete a top MBA program successfully.
We will try to make the concepts as interesting and easy to learn as possible. You may find it useful to refer to the Mathematics for Management Concept Summary while taking the
course. Let's get started!
The analysis of data is crucial to business. In finance class, you will analyze returns on stocks and other investments. In your operations and marketing classes, you will analyze
monthly demand for products that are being sold. This section of the course begins by introducing you to the basics of data analysis.
Summation Notation
Suppose you want to add up the first 100 even positive integers. You could write a lengthy addition operation that specifies all 100 terms: 2 + 4 + 6 + ... + 198 + 200. A less
cumbersome, more elegant way to represent the operation is with the symbol ∑ (the Greek capital letter sigma), which means summation:
\sum_{i=1}^{100} 2i
The notation dictates that for each of the first 100 positive integers i, find 2i; then add the results together. The only values of i for which you determine 2i are those from 1
through 100; they are, respectively, the lower and upper limits of the summation. The sum of all the 2i calculations (2 + 4 + 6 + ... + 198 + 200) is your
answer: 10,100.
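As a quick check of the summation above, here is a minimal Python sketch. Python is not part of the tutorial, and the variable names are illustrative assumptions.

```python
# Sum 2i for i = 1, 2, ..., 100, mirroring the sigma notation.
lower, upper = 1, 100  # lower and upper limits of the summation

# range() excludes its end point, so add 1 to include the upper limit.
total = sum(2 * i for i in range(lower, upper + 1))

print(total)  # 10100
```

The generator expression plays the role of the term 2i, and the range plays the role of the limits written below and above the sigma.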
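Summation notation also gives a compact formula for the average (mean) of n numbers x1, x2, ..., xn:
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
It means that to find the average of n numbers, add up the n numbers and divide the sum by n. For example, if x1 = 3, x2 = 5, and x3 = 4, then
\bar{x} = \frac{3 + 5 + 4}{3} = 4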
3/10/22, 10:38 AM Mathematics for Management: Statistics Section
(1) Evaluate ∑
(2) Smalltown Bagels bakes n types of bagels. Today the shop is planning to bake xi type i bagels, which each cost ci dollars to produce.
a. In summation notation, write an expression for the total cost of baking today's bagels.
c1 = $1.20, x1 = 100
c2 = $1.50, x2 = 50
c3 = $2.00, x3 = 50
Summarizing data often yields important managerial insights. There are two main ways to summarize data:
1. using a bar graph, or histogram, that gives a graphical summary of the data
2. using descriptive statistics such as mean, median, mode, and standard deviation
First, divide the data into 5 to 10 categories, or bin ranges, of equal size. In this case, you might create seven bins: one for girls up to and including 58 inches tall, another for
girls over 58 inches up to and including 60 inches tall, a third for girls over 60 inches up to and including 62 inches tall, and so on. For example, 13 girls fall in the range of
heights over 60 inches up to and including 62 inches (see the pink shaded cells). Note: There are other ways to treat bin ranges; we are using the convention used by the
Histogram tool in Microsoft Excel.
Next, create a frequency table that identifies how many data points, or observations, fall into each bin range.
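The binning convention described above (each bin holds values over the previous ceiling up to and including its own ceiling) can be sketched in Python. The heights below are hypothetical, not the tutorial's data, and the ceilings 58 through 68 plus an open-ended seventh bin are an assumption based on the "and so on" in the text.

```python
from bisect import bisect_left

# Hypothetical heights in inches (illustrative only).
heights = [57, 58, 59, 60, 61, 61, 62, 63, 64, 66, 68, 71]

# Upper limits of the first six bins; the seventh bin catches everything above 68.
ceilings = [58, 60, 62, 64, 66, 68]


def frequency_table(data, ceilings):
    """Count how many values fall into each bin.

    A value belongs to the first bin whose ceiling is greater than or
    equal to it, so a value equal to a ceiling stays in that bin.
    """
    counts = [0] * (len(ceilings) + 1)
    for value in data:
        # bisect_left gives the index of the first ceiling >= value.
        counts[bisect_left(ceilings, value)] += 1
    return counts


counts = frequency_table(heights, ceilings)
labels = [f"<= {c}" for c in ceilings] + [f"> {ceilings[-1]}"]
for label, count in zip(labels, counts):
    print(f"{label:>6}: {count}")
```

Using bisect_left rather than bisect_right is what makes a height of exactly 60 inches land in the "over 58 up to and including 60" bin.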
Please download the file histogramdata.xlsx.
(1) The Salaries worksheet contains the annual salaries (in thousands of dollars) for the employees of the Smalltown tourist bureau. With bin ceilings of 40, 50, 60, 70, 80, and
90, construct a bar graph of employee salaries.
(2) The Microsoft worksheet gives a sample of daily percentage returns on Microsoft stock. Use Excel to summarize these data with a histogram. For your bin ranges, use upper
boundaries of -20%, -15%, -10%, -5%, 0%, 5%, 10%, and 15%.
It's often practical to summarize data with a single number that typifies the data set. For example,
What is the typical number of ounces in a can of Coca-Cola?
What is a typical family income in Smalltown?
What is the typical number of points that a team scores in a game?
This section focuses on three measures of central tendency for a data set: mean, median, and mode.
The mean is simply the average of the n numbers. It is usually written as x-bar and expressed as
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
To compute the mean, simply add up all of your observations and divide by the number of observations.
The median is the middle value of a data set once its numbers are ordered. To find the median, first order the numbers from smallest to
largest. If n is odd, the median is the ((n + 1)/2)th-smallest number. For example, if the data set includes 9 numbers, calculate (9 + 1)/2 = 5 to find that the median is the
fifth-smallest number (the one in the middle). Of the eight other numbers, four are smaller than the median and four are larger. If n is even, the median is the average of
the (n/2)th-smallest and ((n + 2)/2)th-smallest numbers. For example, if the data set includes 10 numbers, calculate 10/2 = 5 and (10 + 2)/2 = 6. The median is,
therefore, the average of the fifth- and sixth-smallest numbers. Five of the numbers are smaller than the median and five are larger; the median sits in between these two groups.
The mode is the most frequently occurring number in a data set. A data set can have more than one mode when several numbers tie for the highest frequency.
If no number occurs more than once in a data set, the data set has no mode.
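The three definitions above can be sketched in Python. The data sets are hypothetical, and the helper name modes is an assumption. It follows this tutorial's convention that a data set with no repeated number has no mode, which differs from some library functions that return every value in that case.

```python
from collections import Counter
from statistics import mean, median


def modes(data):
    """Return all most-frequent values, or an empty list if nothing repeats."""
    counts = Counter(data)
    highest = max(counts.values())
    if highest == 1:
        return []  # no number occurs more than once: no mode
    return sorted(value for value, count in counts.items() if count == highest)


odd_set = [3, 5, 4]      # n = 3, so the median is the 2nd-smallest number
even_set = [7, 1, 3, 9]  # n = 4, so the median averages the 2nd and 3rd smallest

print(mean(odd_set))     # 4
print(median(odd_set))   # 4
print(median(even_set))  # 5.0

# Hypothetical salaries in thousands of dollars.
print(modes([30, 30, 40, 80]))      # [30]
print(modes([30, 30, 40, 80, 80]))  # [30, 80]
print(modes([30, 40, 80]))          # []
```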
If another employee were hired at a salary of $80,000, there would be two modes: $30,000 and $80,000.
If, instead, one of the two employees making $30,000 were to leave, there would be no mode.
For large data sets, calculating the mean, median, and mode is difficult to do manually. Excel's AVERAGE, MEDIAN, and MODE functions make it simple.
Please download the file Colts.xlsx. The file contains the number of yards gained on all passing plays attempted by the 2006 Super Bowl Champion Indianapolis Colts. You can
use Excel to compute the mean, median, and mode of the number of yards gained on a passing play.
(1) Ten geography majors at the University of North Carolina had the following starting salaries (in thousands of dollars): 20, 25, 30, 28, 35, 20, 20, 25, 40, and 757. Find the
mean, median, and mode of these salaries. Which seems to be the best measure of a typical geography major's starting salary?
(2) Find the mean, median, and mode for both data sets in the file Histogramdata.xlsx.
The mode is rarely used as a measure of central location. If a shoe store could only stock one size, it would probably stock the modal shoe size. In most situations, however, we
use the mean or median as a measure of central location for a data set. In general, we use the mean as a measure of central location unless extreme values greatly distort the
mean. The U.S. government reports family income for the country as a whole as a median, not a mean. A football team's offense is assessed in terms of the average, not the
median, points scored per game. Why use the median in the first situation and the mean in the second? The answer is that people with large incomes distort, or skew, the mean
family income; the median is not subject to that distortion. Before identifying precisely when to use the mean or median as a measure of central tendency, let's return to the topic
of histograms and define the concept of skewness.
A data set is symmetric if the data set's histogram has a single peak at the center and "looks the same" to the left and right of the most likely value of the data. The following
histogram displays IQs of students at Smalltown High School.
A symmetric data set's mean, median, and mode are approximately equal because the peak is at the center and the declines to the left and to the right of the peak occur at the
same rate. For example, nearly as many people have IQs around 95 as have IQs around 115.
The histogram shows that the most common income range is $30,000 to $50,000. Some people earn more than $300,000, whereas some earn $10,000 or less. Because the data
extend farther to the right of the peak than to the left, family incomes in Smalltown are positively skewed.
The most common category is "more than 280 days." Because the data extend much farther to the left of the highest bar than to the right, days from conception to birth is
negatively skewed.
If the data you are analyzing are not skewed, use the mean as the measure of central tendency. In cases of great skewness, use the median as the measure of central tendency to
avoid distortion by extreme values.
You can usually assess skewness by simply eyeballing a histogram. To be precise about measuring skewness, apply the Excel SKEW function to a data set.
If SKEW > +1, the data are positively skewed and the median is the better measure of central tendency.
If SKEW < -1, the data are negatively skewed and the median is again the better measure of central tendency.
If SKEW is between -1 and +1, the data are relatively symmetric and the mean is the better measure of central tendency.
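The decision rule above can be sketched in Python. The skewness formula used here, n/((n - 1)(n - 2)) times the sum of cubed standardized deviations, is the one Excel documents for SKEW, but treat the match as an assumption rather than a guarantee. The function names and data are hypothetical.

```python
from statistics import mean, stdev


def sample_skewness(data):
    """Adjusted sample skewness (intended to mirror Excel's SKEW)."""
    n = len(data)
    if n < 3:
        raise ValueError("need at least three data points")
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    cubed = sum(((x - x_bar) / s) ** 3 for x in data)
    return n / ((n - 1) * (n - 2)) * cubed


def recommended_measure(data):
    """Apply the rule: median if skewness is beyond +/-1, otherwise mean."""
    skew = sample_skewness(data)
    return "median" if skew > 1 or skew < -1 else "mean"


symmetric = [1, 2, 3, 4, 5]          # hypothetical symmetric data
incomes = [20, 25, 30, 35, 40, 300]  # hypothetical incomes with one extreme value

print(recommended_measure(symmetric))  # mean
print(recommended_measure(incomes))    # median
```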
Please download file Skewness.xlsx. Let's compare the mean and the median as measures of central tendency for the IQ, income, and conception-to-birth data sets. In the cell
range D3:F3, the skewness for each data set has been computed — e.g., using the formula =SKEW(D8:D657) for IQs in cell D3. The median, mode, and mean for each data set
have been computed using the MEDIAN, MODE, and AVERAGE functions, respectively.
The results show which measure of central tendency is best for each data set.
IQs are symmetric, and the mean (100.04) and the median (100) are virtually identical.
Income is positively skewed, and the mean (67.745) is larger than the median (48).
Days from conception to birth is negatively skewed, and the mean (259.9) is smaller than the median (269.5).
(1) Please download file Income.xlsx. The file contains data that are representative of the income of U.S. families (adjusted for inflation) during the years 1975, 1985, 1995, and
2005. Does it appear that Americans were becoming better off as the decades passed?
(2) For the Colts' passing data, what measure of central location would you use?
Measures of Variability
Sarah Lopez Clooney is trying to determine in which of two stocks to invest a client's money. For each of the last six years, the annual percentage returns (expressed as a
decimal) for the stocks were as follows:
For each stock, the mean and median return for the last six years is .2. Therefore, the stocks are identical with respect to "typical" value. If you assume (naively) that the past is a
good predictor of the future, these two stocks seem to be equally good investments. Most investors, however, would choose Stock 1, because its annual returns are more
consistent than those on Stock 2. In this segment, you will learn how to use variance and standard deviation to measure the dispersion, or spread, of the data set about its mean.
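The sample variance S^2 of n data points x1, x2, ..., xn with mean x-bar is
S^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2
and the sample standard deviation S is the square root of the sample variance. If you divide by n instead of n - 1, the result is the average squared deviation of each data point from the mean of the data. The reasons for dividing by n - 1
instead of n are complex enough to defer to your statistics class.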
The kind of data set that has the least spread about its mean is, not surprisingly, one in which all points have the same value and, thus, all equal the mean. Such a data set has a
sample variance of zero.
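A minimal Python sketch of the sample variance and standard deviation, dividing by n - 1 as discussed above. The returns are hypothetical, and the function names are assumptions. The standard library's statistics.variance is used only as a cross-check.

```python
from math import sqrt
from statistics import mean, variance


def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = mean(data)
    return sum((x - x_bar) ** 2 for x in data) / (len(data) - 1)


def sample_std_dev(data):
    """Square root of the sample variance."""
    return sqrt(sample_variance(data))


returns = [0.10, 0.30, 0.20, 0.25, 0.15]  # hypothetical annual returns
constant = [5, 5, 5, 5]                   # every point equals the mean

print(sample_variance(returns))
print(sample_std_dev(returns))
print(sample_variance(constant))  # zero spread about the mean
```

The constant data set illustrates the point made above: when every value equals the mean, each squared deviation is zero, so the variance is zero.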
For any data set, Excel makes it easy to find the sample variance and standard deviation. Use the VAR function to find the sample variance and the STDEV function to compute
the sample standard deviation.
Please download the file Samplevariance.xlsx. In the file, the VAR and STDEV functions have been used to determine the sample variance and standard deviation for each stock.
For example, for Stock 1 in cell C13, the sample variance has been computed with the formula =VAR(C7:C12), and in cell C14 the sample standard deviation has been computed
with the formula =STDEV(C7:C12).
(1) The heights (in inches) of the members of Smalltown High School girls' basketball team are 68, 70, 64, 62, and 68. Compute the sample variance and sample standard
deviation of these heights. Use Excel to verify your computations.
William Edwards Deming (1900-1993) was an American quality-control guru who stressed the importance of understanding "normal variation" in a business process. When a
data set has a symmetric histogram (skewness between -1 and +1), you can usually gain insight into the "normal range of variation for a data set" by relying on the following rule
of thumb involving the sample mean x-bar and sample standard deviation S:
68% of the data points are within S of the mean (between x-bar − S and x-bar + S).
95% of the data points are within 2S of the mean (between x-bar − 2S and x-bar + 2S).
99.7% of the data points are within 3S of the mean (between x-bar − 3S and x-bar + 3S).
Any data point that is more than 2S from the mean is designated an unusual observation or outlier. Deming showed how identifying the cause of "unfavorable" outliers can help
you prevent them from occurring again. Let's now apply these ideas to the distribution of IQs. The graph would look like this:
Cells E3:E5 reflect computations of the mean (0.055), standard deviation (0.122), and skewness for the monthly returns. The skewness of .104 indicates that the Cisco returns
are symmetric, so you would expect the rule of thumb to be approximately valid for this data set.
Computing the limits for the rule of thumb in cells E7:E12 reveals that
99.7% of the monthly returns should be between .055 ± 3 (.122) — i.e., between -31% and 42%.
95% of the monthly returns should be between .055 ± 2 (.122) — i.e., between -19% and 30%.
68% of the monthly returns should be between .055 ± .122 — i.e., between -7% and 18%.
According to the example, "normal variation" for monthly Cisco returns is between -19% and 30%. Therefore, a month in which Cisco returned, say, 28% or -15% would not be
surprising. Any month during which Cisco returned less than -19% or more than 30% would be an outlier.
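The limits above can be reproduced with a short Python sketch, using the mean (0.055) and standard deviation (0.122) quoted for the Cisco returns. The function names are illustrative assumptions.

```python
def rule_of_thumb_limits(mean, sd):
    """Return (low, high) limits for 1, 2, and 3 standard deviations."""
    return {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}


def is_outlier(value, mean, sd):
    """An outlier is more than 2 standard deviations from the mean."""
    return abs(value - mean) > 2 * sd


cisco_mean, cisco_sd = 0.055, 0.122  # values quoted in the text

limits = rule_of_thumb_limits(cisco_mean, cisco_sd)
for k, (low, high) in limits.items():
    print(f"within {k}S: {low:.0%} to {high:.0%}")

print(is_outlier(0.28, cisco_mean, cisco_sd))   # False
print(is_outlier(-0.15, cisco_mean, cisco_sd))  # False
print(is_outlier(0.35, cisco_mean, cisco_sd))   # True
```

The first two checks correspond to the 28% and -15% months mentioned above, which fall inside the 2S range.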
Highlighted in gray are Cisco monthly returns that fell within one standard deviation of the mean. In light orange are returns that fell between one and two standard deviations from
the mean in either direction. Finally, the dark orange bars represent returns that fell more than 2S from the mean. No returns fell more than 3S from the mean. Of the 130
monthly returns that constitute our data set, 9, or 6.9% (close to the rule of thumb prediction of 5%), deviated from the mean by more than 2S.
Of the 130 returns, 43, or 33% (close to the rule of thumb prediction of 32%), were more than S from the mean.
The file Cisco.xlsx also contains monthly stock returns for GM and Microsoft. Use these data to answer the following questions.
(1) You would expect 95% of the Microsoft monthly returns to be between __________ and ________.
(2) You would expect 68% of the GM monthly returns to be between __________ and __________.
So far in our study of statistics we have discussed how to use measures of central tendency and variability to summarize a data set. We now turn our attention to studying how
to measure the strength of the relationship between two data sets. For example, how is the price of a house related to the size of the house? How are the returns on two stocks
related? How is a high school senior's SAT score related to his college GPA? The relationship between two data sets is usually measured by the covariance and correlation
between the two data sets.
Given n points (x1, y1), (x2, y2), ..., (xn, yn), the covariance between data sets X and Y is given by
\mathrm{Covariance}(X,Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}
Suppose that X and Y tend to go up and down together. That is, when X is larger than average, then Y is usually larger than average and when X is smaller than average, then Y is
usually smaller than average. Then most of the terms in the numerator of our covariance formula will be positive and the covariance will be positive. Conversely, suppose that
when X is larger than average, then Y is usually smaller than average and when X is smaller than average, then Y is usually larger than average. Then most of the terms in the
numerator of our covariance formula will be negative and the covariance will be negative. Therefore, if X and Y "covary" in the same direction, their covariance will be positive,
whereas if X and Y covary in opposite directions, their covariance will be negative.
In summary, a positive covariance indicates that X and Y tend to go up or down together whereas a negative covariance indicates that X and Y tend to move in opposite directions
(relative to their averages). Note that covariance only measures the strength of a linear relationship and is not useful for detecting nonlinear relationships
between variables. Therefore, covariance is a measure of linear association between two variables.
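To illustrate the sign behavior just described, here is a minimal Python sketch of the sample covariance (dividing by n - 1, consistent with the sample covariance used later in this section). The data and the function name are hypothetical.

```python
from statistics import mean


def sample_covariance(xs, ys):
    """Sum of (x - x_bar)(y - y_bar) over all points, divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length data sets with n >= 2")
    x_bar, y_bar = mean(xs), mean(ys)
    products = ((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sum(products) / (len(xs) - 1)


x = [1, 2, 3, 4, 5]
moves_together = [2, 4, 5, 4, 6]  # tends to be high when x is high
moves_opposite = [6, 4, 5, 4, 2]  # tends to be low when x is high

print(sample_covariance(x, moves_together))  # positive
print(sample_covariance(x, moves_opposite))  # negative
```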
When you graph these five points in the x-y plane it becomes clear that bigger houses tend to sell for a higher price.
Note that for houses 1 and 2, both size and price are below average, whereas for houses 4 and 5, size and price are above average. For house 3, size is average and price is slightly
above average. Therefore, you expect that the covariance between home size and price will be positive.
x-bar = 2500 square feet and y-bar = $299,000
Covariance(X,Y) =
Or Covariance (X,Y) = 80,625,000 sq. ft. dollars.
The positive covariance indicates that home size and home price tend to go up and down together. As you will now see, however, it is difficult to interpret the magnitude of the
covariance, because it depends on the units of measurement. Suppose home size is measured in thousands of square feet and home price in units of $100,000.
For example, 1500 square feet is 1.5 thousand square feet, whereas the $140,000 home price is 1.4 (in units of $100,000). You can see that in each term of the numerator of the
covariance, each home size will be divided by 1000 and each home price will be divided by 100,000. This means that each term in the numerator of the covariance is divided by
(1000)(100,000), or 100 million. Therefore, the covariance will now be the original covariance of 80,625,000 divided by 100,000,000. That yields a covariance of .80625,
measured in units of (thousands of square feet) × (hundreds of thousands of dollars). Since covariance depends on the units in which the data are measured, interpreting the
magnitude of a covariance is difficult. We now turn our attention to developing the correlation coefficient (called r), which is a unit-free measure of the strength of a linear
relationship between two variables.
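The effect of units on covariance can be checked with a small Python sketch. The four homes below are hypothetical (they are not the tutorial's five-house data set), and the function name is an assumption.

```python
from statistics import mean


def sample_covariance(xs, ys):
    """Sample covariance: divide the sum of cross-deviations by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)


# Hypothetical homes: size in square feet, price in dollars.
sizes_sqft = [1200, 1800, 2400, 3000]
prices_usd = [150_000, 210_000, 260_000, 340_000]

# The same homes in thousands of square feet and units of $100,000.
sizes_k = [s / 1000 for s in sizes_sqft]
prices_100k = [p / 100_000 for p in prices_usd]

cov_original = sample_covariance(sizes_sqft, prices_usd)
cov_rescaled = sample_covariance(sizes_k, prices_100k)

print(cov_original)
print(cov_rescaled)
print(cov_original / (1000 * 100_000))  # matches the rescaled covariance
```

The data describe the same homes in both cases, yet the covariance changes by a factor of 100 million, which is why its magnitude is hard to interpret.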
The Pearson correlation (usually denoted by r) is a unit-free measure of the degree of linear association between two data sets X and Y. Given n points (x1, y1), (x2, y2), ..., (xn, yn),
the correlation between data sets X and Y is given by
r = \mathrm{Correlation}(X,Y) = \frac{\mathrm{Covariance}(X,Y)}{S_X S_Y}
where S_X and S_Y are the sample standard deviations of X and Y. It can be shown that for any set of n points, -1 ≤ r ≤ 1. Values of r may be
interpreted as follows:
Values of r near -1 indicate a strong negative linear relationship between X and Y. When X is larger than average, Y is almost always smaller than average; when X is
smaller than average, Y is almost always larger than average.
Values of r near -.5 indicate a moderate negative linear relationship between X and Y. When X is larger than average, Y tends to be smaller than average; when X is
smaller than average, Y tends to be larger than average.
Values of r near 0 indicate a weak linear relationship between X and Y. When X is larger than average, Y has little or no tendency to be larger or smaller than average.
Similarly, when X is smaller than average, Y has little or no tendency to be larger or smaller than average.
Values of r near +.5 indicate a moderate positive linear relationship between X and Y. When X is larger than average, Y tends to be larger than average; when X is smaller
than average, Y tends to be smaller than average.
Values of r near +1 indicate a strong positive linear relationship between X and Y. When X is larger than average, Y is almost always larger than average; when X is
smaller than average, Y is almost always smaller than average.
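A minimal Python sketch of the correlation coefficient, computed as the sample covariance divided by the product of the two sample standard deviations. The data and function name are hypothetical. The last two lines illustrate that, unlike covariance, r does not change when the units change.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Pearson r: sample covariance divided by the product of sample SDs."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))


x = [1, 2, 3, 4, 5]
perfect_up = [2, 4, 6, 8, 10]    # exactly on a line with positive slope
perfect_down = [10, 8, 6, 4, 2]  # exactly on a line with negative slope
noisy_up = [2, 4, 5, 4, 6]       # rises with x, but not perfectly

print(correlation(x, perfect_up))    # approximately +1
print(correlation(x, perfect_down))  # approximately -1
print(correlation(x, noisy_up))      # between 0 and +1

# Changing units (here, multiplying x by 1000) leaves r unchanged.
x_rescaled = [1000 * value for value in x]
print(correlation(x_rescaled, noisy_up))
```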
The correlation between home size and home price is r = .90, which indicates a strong positive linear relationship between home size and home price. This high positive
correlation is consistent with the fact that the data points are tightly scattered about a straight line that has a positive slope.
Note that the correlation of r = .06 is near 0. That result is reflected in the weak linear relationship shown in the graph.
Let's find Correlation(Home Size, Home Price). Simply compute S_{Home Size} and S_{Home Price}.
Since Mean Size = 2500 sq. ft. and Mean Price = $299,000, we find that
S_{Price} = \sqrt{\frac{(140{,}000 - 299{,}000)^2 + \cdots}{4}} = \$105{,}498.82.
Note that the units of the numerator are sq. ft. dollars. These are also the units of the denominator. Therefore, the correlation is unit-free.
\mathrm{Covariance}(X,Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}
If the values of X are in range 1 of our spreadsheet and the values of Y are in range 2 of our spreadsheet, then the Excel function COVAR(range1, range2) computes \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n}.
This is called the population covariance. In most uses of covariance, you should divide by n - 1, which yields the sample covariance. To convert Excel's covariance to a
sample covariance, multiply the result of the COVAR function by n/(n - 1). In the current example, you can obtain the sample covariance by multiplying the result of
the COVAR function by 5/4 = 1.25.
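The conversion from population covariance to sample covariance can be sketched in Python. The five data points are hypothetical, and the function names are assumptions. The population version divides by n, as the text describes for COVAR.

```python
from statistics import mean


def cross_deviation_sum(xs, ys):
    """Sum of (x - x_bar)(y - y_bar): the numerator of either covariance."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))


def population_covariance(xs, ys):
    """Divide by n (the behavior the text describes for COVAR)."""
    return cross_deviation_sum(xs, ys) / len(xs)


def sample_covariance(xs, ys):
    """Divide by n - 1."""
    return cross_deviation_sum(xs, ys) / (len(xs) - 1)


x = [1, 2, 3, 4, 5]  # hypothetical data with n = 5
y = [2, 4, 5, 4, 6]

n = len(x)
pop = population_covariance(x, y)
converted = pop * n / (n - 1)  # multiply by 5/4 = 1.25 when n = 5

print(pop)
print(converted)
print(sample_covariance(x, y))  # agrees with the converted value
```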
In cell E23, compute the sample covariance (80,625,000 sq. ft. dollars) by entering the formula
(1) Annual returns on Hot Cakes Amalgamated and Bridges Consolidated stocks for the last five years are given below.
Find the covariance and correlation between the Hot Cakes and Bridges annual returns.
(2) Please download the file Nfldata.xlsx. The file contains the points scored by each NFL team and punts attempted during the 2008 season. Compute the covariance and
correlation between punts and points scored.
(3) For the data in the file Nfldata.xlsx, each team played 16 games. Compute points scored per game and punts attempted per game. Now compute the covariance and
correlation between these two data sets.
(4) There is a moderate negative correlation between points scored and punts attempted. Therefore, the SportsCenter anchors will sometimes say that the less you punt, the more
points you score. Hence you should never punt! What is wrong with this argument?
You may find it useful to refer to the Mathematics for Management Concept Summary while taking the course. This .pdf document is available in the Briefcase as well.
Appendix B: Exercise Solutions
As you work through the exercises at the end of each section, you may find it helpful to check your answers for accuracy. Below are links to spreadsheets that contain the
answers to each exercise presented in the tutorial. The answer sheets are organized by chapter for your convenience. You can also download these items from the Briefcase at any time.
Algebra - algebraanswers.xlsx
Calculus - calculusanswers.xlsx
Statistics - statisticsanswers.xlsx
Probability - probabilityanswers.xlsx
Finance - financeanswers.xlsx
Welcome to the final exam for the Mathematics for Management tutorial. This test will allow you to assess your knowledge of Mathematics for Management.
All questions must be answered for your exam to be scored.
To advance from one question to the next, select one of the answer choices or, if applicable, complete with your own choice and click the “Submit” button. After submitting your
answer, you will not be able to change it, so make sure you are satisfied with your selection before you submit each answer. You may also skip a question by pressing the forward
advance arrow. Please note that you can return to “skipped” questions using the “Jump to unanswered question” selection menu or the navigational arrows at any time. Although
you can skip a question, you must navigate back to it and answer it - all questions must be answered for the exam to be scored.
After completion, you can review your answers at any time by returning to the exam.
Good luck!
Copyright Harvard Business School Publishing. Copying or posting is an infringement of copyright. or 617-783-7860.