Econ 1006 Summary Notes 1

Required Reading:
Ref. File 1: Section 1.13
Ref. File 3: Introduction and Sections 3.1 to 3.4, 3.7


(i) Undertake the required reading from the reference
files each week. (It may be necessary to read some
sections more than once.) Approximately 4 hours per
week.
(ii) Carefully study lecture material and take notice of
advice given in lectures.
(iii) Attempt tutorial exercises before tutorials and work
out where you have difficulties, which hopefully can
be resolved in tutorials.
(iv) Make a conscious effort to keep up with the material.


This subject gives an introduction to the basics of statistics
as used in many areas of business and economics. It also
includes an introduction to basic calculus.

1.1 How Can We Define Statistics?

There are various definitions of statistics, to be found for
example in dictionaries or textbooks. However, there are
common elements to be found in most definitions of
statistics.

Statistics, for our purposes, encompasses the following
major activities:

(i) Collection and description of information, or data -
“descriptive statistics”. We will normally be dealing
with a subset of a larger collection or set of data. The
subset is called a sample, the larger set a population.

(ii) Using sample data to make inferences about a
population - “statistical inference”.

1.2 Why Study Statistics?

(i) (Major) It can be useful. It can help us to make
decisions in the face of uncertainty.

(ii) People are bombarded with statistics all the time.
Often statistics is used in ways that are not
warranted. It is important not to be fooled by people
who misuse statistics.

(iii) It is important to have a clear understanding of the
strengths and limitations of statistical analysis.

1.3 Structure of the Unit

• Descriptive Statistics:
How we summarise the characteristics of raw data
(using graphs, summary measures, etc.)

• Probability Theory and Probability Distributions
(“deductive statistics”):

Rules (or axioms) for calculating probabilities of
certain things (called events) happening.

Probability theory can be considered part of
descriptive statistics.

Here we will be concerned about making probability
statements about a given population.

• Sampling Theory and Sampling Distributions (the basis
of “inductive statistics”):

Here we will be concerned with making probability
statements about characteristics of samples, given
assumptions about the population from which the
sample was drawn.

• Point and Interval Estimation:

Point Estimation - Here we will be concerned with
producing a particular estimate (a number), based on
sample data, of a characteristic of a population.

(For example, using the average height of a sample of
women in Australia as an estimate of the average
height of all women in Australia)

Interval Estimation - Here we will not give a single
estimate of a population characteristic, but rather a
range in which we are confident (to some degree) the
true value of the population characteristic lies.

• Hypothesis Testing:
Under this heading we will be looking at ways of
testing hypotheses about characteristics of
populations, based on sample data.

This is clearly an example of decision-making (i.e.
rejecting or accepting the hypothesis) under
uncertainty about the population characteristics.

• Regression Analysis:

(Especially relevant for Accounting, Finance and
Economics students; forms the basis of Econometrics)

In this case we will be concerned with estimating
linear relationships between different variables, i.e.
linear equations.

(For example, the relationship between a firm’s
advertising expenditure and its sales revenue)

• Introduction to Differential Calculus



2.1 Some Basic Definitions Relating to Data

(Required Reading: Ref. File 3 - Introduction and Sections
3.1, 3.2)

(i) Elementary Units and Frames:

Statistical data normally represents measurements or
observations of a certain characteristic or variable (e.g.
height) of interest of each member of a set of objects or
people.

Each object (or person) for which the characteristic is or
can be measured is called an elementary unit (e.g. a person
in Australia).

The set or listing of all possible elementary units is called a
frame.

(ii) Population/Sample:

A statistical population is the set of measurements or
observations of a characteristic of interest for all
elementary units in a frame. (For example, the heights of
all males in Australia, not the males themselves)

A population may comprise a finite or infinite number of
elements (observations), depending on the context.

A statistical sample is a subset of a population.


Note: There is nothing intrinsic about elements in a
population that makes them a population. It is purely a
matter of how we choose to define a population. For
example, say we define a population to be the set of heights
of all people in Australia. Then the set of heights of people
in the university would represent a sample of the
population.

Hence whether we are talking about a population or a
sample depends on how the population has been defined.

(iii) Parameters/Statistics:

For our purposes -

The numerical characteristics which describe a population
(e.g. the average height of all women in Australia) are
called parameters of the population.

The numerical values calculated from sample data are
called sample statistics. These sample statistics can be
thought of as describing or characterizing the sample.
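
The distinction between a parameter and a sample statistic can be illustrated with a small sketch. The ten heights below are invented purely for illustration, and the fixed seed is an assumption made so that the same sample is drawn each time; the point is only that the same calculation (a mean) gives a parameter when applied to the whole population and a statistic when applied to a sample.

```python
import random
import statistics

# Hypothetical population: heights (cm) of ten people, invented for illustration.
population = [152.0, 158.5, 160.0, 163.5, 165.0,
              167.5, 170.0, 172.5, 175.0, 181.0]

# Parameter: a numerical characteristic of the whole population.
population_mean = statistics.mean(population)

# Sample: a subset of the population (4 values drawn without replacement).
rng = random.Random(1006)  # fixed seed, assumed only to make the sketch repeatable
sample = rng.sample(population, 4)

# Sample statistic: the same calculation applied to the sample only.
sample_mean = statistics.mean(sample)

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample drawn:                {sample}")
print(f"sample mean (statistic):     {sample_mean:.2f}")
```

A different sample would generally give a different sample mean, while the population mean stays fixed.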

(iv) Qualitative and Quantitative Variables:

Populations may be quantitative or qualitative. Data from
quantitative populations is called quantitative or interval
data. Data from qualitative populations is called
qualitative, nominal or categorical data.

Data from a quantitative population can be expressed
numerically in a meaningful way. The variable (or
characteristic) associated with a quantitative population is
called a quantitative variable.

Examples of quantitative variables: height of an
individual, income of a household, number of cars owned
by a household.

Data from qualitative populations cannot be expressed
numerically in a meaningful way. The variable (or
characteristic) associated with a qualitative population is
called a qualitative or categorical variable.

Examples of qualitative variables: gender of an individual,
hair colour of an individual, brand of car driven by an
individual.

Note: Just because we assign a numerical code to a
qualitative variable does not mean the variable is
quantitative. (For example, if a variable is gender, we
could code males 0 and females 1, but this coding conveys
no meaning in itself)

(v) Discrete and Continuous Quantitative Variables:

A discrete quantitative variable can assume only certain
discrete numerical values (on the number line); i.e. there
are gaps between the various values. Depending on the
variable, there could be a finite or infinite number of these
discrete values.

Examples: number of children in a family, number of days
an individual works during a year.

A continuous quantitative variable can assume any value
in a specific range or interval. The interval can be of finite
or infinite width.

Example: height or weight of an individual.

Note: By definition there are an infinite number of values
a continuous variable can take.

2.2 Frequency Distributions

(a) Introduction

Suppose we have a set of raw statistical data (i.e.
observations on some variable (or characteristic) for a
collection of elementary units). At this stage we will make
no distinction as to whether we are talking about a
statistical population or sample.

In studying the data it is often useful to initially group the
raw data into different classes or categories. A frequency
distribution for a set of data lists the number of
observations or ‘data points’ in each class used for
grouping (the class frequencies). The classes of a
frequency distribution must be mutually exclusive (an
observation cannot fall into two classes) and exhaustive
(any observation must belong to a class).
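
Grouping raw data into classes can be sketched as follows. The raw observations, the class limits and the helper name `class_frequencies` are all hypothetical, chosen only to show the idea; the check inside the loop enforces that every observation falls into exactly one class, i.e. that the classes are mutually exclusive and exhaustive.

```python
# Hypothetical raw data: number of children in 12 households (invented).
raw = [0, 1, 3, 2, 2, 5, 0, 7, 3, 1, 4, 2]

# Classes as (label, lower limit, upper limit); the upper limit is exclusive
# and None marks an open-ended class.
classes = [
    ("0 to under 2", 0, 2),
    ("2 to under 4", 2, 4),
    ("4 to under 6", 4, 6),
    ("6 or more", 6, None),
]

def class_frequencies(data, classes):
    """Count how many observations fall in each class."""
    counts = {label: 0 for label, _, _ in classes}
    for x in data:
        matches = [label for label, lower, upper in classes
                   if x >= lower and (upper is None or x < upper)]
        # Mutually exclusive and exhaustive: exactly one class per observation.
        if len(matches) != 1:
            raise ValueError(f"{x} falls in {len(matches)} classes")
        counts[matches[0]] += 1
    return counts

freq = class_frequencies(raw, classes)
for label, f in freq.items():
    print(f"{label:<14}{f:>3}")
```

If two classes overlapped, or a value lay outside every class, the function would raise an error rather than silently miscount.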

(b) Frequency Distributions for Quantitative Data

Each class of a frequency distribution of quantitative data
usually has a lower and an upper limit, although
sometimes it is necessary or convenient to have open-ended
classes, i.e. classes which have either an upper or
lower limit but not both.

Example 2.1:
Suppose we have data on the number of children in 100
households as follows:

Class                     Frequency
0 to under 2 children         30
2 to under 4 children         55
4 to under 6 children         13
6 or more children             2

The last class is open-ended. The class width of the other
classes is 2.

The class width is the difference between successive lower
class limits or upper class limits.

Note: An open-ended class has no class width.

General Advice for Forming Frequency Distributions:

• The number of classes should generally be between
5 and 20. (Although there are only 4 in the above
simple example)

• Class widths are ideally equal, but this may not
always be possible, and open-ended classes may be
necessary.

• Class limits should be chosen such that the class
midpoint is close to the average of observations in
the class. This is because in calculating summary
statistics based on grouped data the midpoint is
used as representative of all observations in the
class.

(c) Relative, Cumulative and Cumulative Relative
Frequency Distributions

A relative frequency distribution shows the proportion of
all observations falling in each class. It is obtained by
dividing the class frequencies ($f_i$) by the total number of
observations in the data (‘n’).

A cumulative frequency distribution shows, for each class
i, the total of the first i frequencies.

A cumulative relative frequency distribution shows, for
each class i, the total of the first i relative frequencies.

For the previous example we have

Class (i)       Frequency   Cumulative   Relative    Cumulative
                  (f_i)     Frequency    Frequency   Rel. Freq.
0 to under 2        30          30         0.30         0.30
2 to under 4        55          85         0.55         0.85
4 to under 6        13          98         0.13         0.98
6+ children          2         100         0.02         1.00
Total              100                     1.00
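
The three derived distributions can be computed directly from the class frequencies of Example 2.1. The following is a minimal sketch; the variable names are illustrative only. The cumulative relative frequencies are obtained here by dividing the cumulative frequencies by n, which is equivalent to adding up the relative frequencies but avoids accumulating rounding error.

```python
from itertools import accumulate

labels = ["0 to under 2", "2 to under 4", "4 to under 6", "6 or more"]
freq = [30, 55, 13, 2]   # class frequencies f_i from Example 2.1
n = sum(freq)            # total number of observations

rel_freq = [f / n for f in freq]           # f_i / n
cum_freq = list(accumulate(freq))          # total of the first i frequencies
cum_rel_freq = [c / n for c in cum_freq]   # total of the first i relative frequencies

for row in zip(labels, freq, cum_freq, rel_freq, cum_rel_freq):
    print("{:<14}{:>5}{:>6}{:>7.2f}{:>7.2f}".format(*row))
```

The last cumulative frequency always equals n and the last cumulative relative frequency always equals 1, which gives a quick check on the arithmetic.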

2.3 Histograms

Histograms give us a convenient way of visualising the
distribution of observations over classes. They take the
form of a series of adjacent (contiguous) rectangles, one
for each class, with the base of each rectangle centred over
the corresponding class midpoint.

In a frequency histogram the areas of the rectangles are
proportional to the class frequencies, with the factor of
proportionality the same for all classes. Thus if all the
classes have the same width, each rectangle will have the
same base width and the class frequencies can be
represented by the rectangle heights.

In a relative frequency histogram the areas of the
rectangles are proportional to the relative frequencies.

Similarly cumulative and cumulative relative frequency
histograms can be defined.

Note: Frequency and relative frequency histograms will
have the same shape.
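
When class widths differ, the rectangle heights have to be adjusted so that the areas, not the heights, reflect the frequencies. A minimal sketch follows, using hypothetical classes invented for illustration and taking the factor of proportionality to be 1 (so area equals frequency); other scalings are equally valid.

```python
# Hypothetical classes with unequal widths: (lower limit, upper limit, frequency).
classes = [(0, 10, 20), (10, 20, 30), (20, 40, 40), (40, 80, 10)]

n = sum(f for _, _, f in classes)  # total number of observations

# Frequency histogram: height = frequency / width, so area = frequency.
heights = [f / (upper - lower) for lower, upper, f in classes]

# Relative frequency histogram: area = relative frequency = frequency / n.
rel_heights = [(f / n) / (upper - lower) for lower, upper, f in classes]

for (lower, upper, f), h, r in zip(classes, heights, rel_heights):
    print(f"{lower:>2} to under {upper:<2}  f={f:<3} height={h:<5} rel. height={r}")
```

Each relative height is the corresponding frequency height divided by n, which is why the two histograms have the same shape and differ only in the vertical scale.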

Example 2.2:
Consider the following distribution

Class               Frequ.   Rel. Freq.   Cum. Freq.
0.5 to under 2.5      10        0.1           10
2.5 to under 4.5      30        0.3           40
4.5 to under 6.5      50        0.5           90
6.5 to under 8.5      10        0.1          100

[Figures: Frequency Histogram, Relative Frequency Histogram and Cumulative Frequency Histogram for the above distribution, each with the class boundaries 0.5, 2.5, 4.5, 6.5 and 8.5 marked on the horizontal axis.]


2.4 Shapes of Distributions

The frequency or relative frequency histogram gives us a
representation of the shape of the distribution of the data
being analysed.

(That is, how the data is distributed over the possible
values)

There are several terms commonly used to describe the
shapes of distributions.

A distribution is described as negatively skewed (skewed
to the left) if it has a long tail extending to the left.

[Figure: A Distribution that is Skewed to the Left. Vertical axis: Relative Frequency; horizontal axis: Variable Value.]

A distribution is positively skewed (skewed to the right) if
it has a long tail extending to the right.

[Figure: A Distribution that is Skewed to the Right. Vertical axis: Relative Frequency; horizontal axis: Variable Value.]

A distribution is symmetric if its left and right halves are
mirror images of each other.

[Figure: A Symmetric Distribution. Vertical axis: Relative Frequency; horizontal axis: Variable Value.]

The above are all examples of unimodal distributions. A
bimodal distribution has two peaks.

Note that for a multimodal distribution, the peaks need
not be the same height.

2.5 Bivariate Frequency Distributions

Often it is of interest to classify observations of elementary
units according to two variables (characteristics). This
allows one to gauge the relationship between the two
variables.
Example 2.3:
Consider the final results of 50 students in a particular
subject. Each student’s final grade and gender are
recorded, allowing the derivation of the following
bivariate frequency distribution.

Gender         HD   Dist.   Credit   Pass   Fail   Row total
Male            5     4       10       6      2       27
Female          2     3       11       2      5       23
Column total    7     7       21       8      7       50

Each combination of grade and gender is represented by a
cell in the bivariate frequency distribution, which contains
the frequency of that combination in the data.

The row totals represent, in this example, the marginal
frequencies of males and females in the class (27 and 23,
respectively).

The column totals represent the marginal frequencies of
final grades.

Marginal frequencies, represented by the row and column
totals, each refer to one variable only.

We can express the information in a bivariate frequency
distribution as a relative frequency distribution by
dividing each entry in the distribution by the total number
of observations.

Example 2.4:
For the previous example, the bivariate relative frequency
distribution is given by (dividing each entry by 50)

Gender         HD     Dist.   Credit   Pass   Fail   Row total
Male           0.10   0.08    0.20     0.12   0.04     0.54
Female         0.04   0.06    0.22     0.04   0.10     0.46
Column total   0.14   0.14    0.42     0.16   0.14     1.00

The row and column totals in the above table are called
the marginal relative frequencies.

Note: Knowledge of the row and column totals, i.e. the
respective univariate distributions, does not inform us
about any relationship between the variables.
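
The marginal frequencies and the bivariate relative frequencies of Examples 2.3 and 2.4 can be reproduced with a short sketch. The cell counts are those of Example 2.3; the variable names are illustrative only.

```python
grades = ["HD", "Dist.", "Credit", "Pass", "Fail"]
counts = {
    "Male":   [5, 4, 10, 6, 2],
    "Female": [2, 3, 11, 2, 5],
}

n = sum(sum(row) for row in counts.values())  # total number of students

# Marginal frequencies: each refers to one variable only.
row_totals = {gender: sum(row) for gender, row in counts.items()}
col_totals = [sum(col) for col in zip(*counts.values())]

# Bivariate relative frequencies: every cell divided by n.
relative = {gender: [f / n for f in row] for gender, row in counts.items()}

print(f"{'Gender':<8}" + "".join(f"{g:>8}" for g in grades) + f"{'Row':>8}")
for gender, row in relative.items():
    cells = "".join(f"{p:>8.2f}" for p in row)
    print(f"{gender:<8}{cells}{row_totals[gender] / n:>8.2f}")
print(f"{'Column':<8}" + "".join(f"{t / n:>8.2f}" for t in col_totals)
      + f"{sum(col_totals) / n:>8.2f}")
```

Many different sets of cell counts would give the same row and column totals, which is why the marginal distributions on their own say nothing about the relationship between grade and gender.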



In this section we shall look at important ways of
summarising data from both populations and samples.
We shall be concerned with measures of the

• ‘centre’ of a frequency distribution

• ‘dispersion’ of values in a frequency distribution

3.1 Summation Notation

Suppose we have ‘n’ numbers. By labelling the numbers
(1, 2, 3, ..., n), we can represent the numbers by

\[ x_i, \quad i = 1, \ldots, n \]

The sum of the numbers can be denoted

\[ \sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n \]

$\sum_{i=1}^{n} x_i$ is a shorthand way of writing the sum.

Theorem (Basic Properties of Summation Notation)

Given ‘c’ is some constant and $a_1, a_2, \ldots, a_n$ are ‘n’
numbers:

(i) $\sum_{i=1}^{n} c\,a_i = c \sum_{i=1}^{n} a_i$

(ii) $\sum_{i=1}^{n} (a_i + c) = \left( \sum_{i=1}^{n} a_i \right) + nc$

(iii) $\sum_{i=1}^{n} (a_i + c)^2 = \left( \sum_{i=1}^{n} a_i^2 \right) + 2c \sum_{i=1}^{n} a_i + nc^2$

(iv) $\sum_{i=1}^{n} (a_i - c)^2 = \left( \sum_{i=1}^{n} a_i^2 \right) - 2c \sum_{i=1}^{n} a_i + nc^2$
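
The four properties can be checked numerically. The sketch below reads properties (iii) and (iv) as the expansions of the sums of (a_i + c)^2 and (a_i - c)^2 respectively; the numbers and the constant are hypothetical, chosen only for illustration. A numerical check of one example is not a proof, but it is a quick way to catch a sign error.

```python
a = [4, -1, 7, 2, 5]   # hypothetical numbers a_1, ..., a_n
c = 3                  # hypothetical constant
n = len(a)

sum_a = sum(a)                      # sum of the a_i
sum_a_sq = sum(x ** 2 for x in a)   # sum of the squares a_i^2

# (i) a constant factor can be moved outside the sum
assert sum(c * x for x in a) == c * sum_a

# (ii) adding c to each of the n terms adds n*c in total
assert sum(x + c for x in a) == sum_a + n * c

# (iii) expand (a_i + c)^2 = a_i^2 + 2*c*a_i + c^2 and sum term by term
assert sum((x + c) ** 2 for x in a) == sum_a_sq + 2 * c * sum_a + n * c ** 2

# (iv) the same expansion with -c in place of c
assert sum((x - c) ** 2 for x in a) == sum_a_sq - 2 * c * sum_a + n * c ** 2

print("all four properties hold for this example")
```

Properties (iii) and (iv) are useful because the right-hand sides need only two quantities from the data: the sum of the values and the sum of their squares.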

Example 3.1:
Consider the following four labelled numbers.

$a_1 = 1$, $a_2 = 3$, $a_3 = 2$, $a_4 = 1$

Use property (iii) of the above theorem to calculate

$\sum_{i=1}^{4} (a_i + 1)^2$.

(See video for solution)


3.2 Measures of Central Tendency

For each measure considered there are population and
sample versions. We will suppose here there are N values
in the population and ‘n’ values in a sample.

Note that at this stage we are only concerned with
quantitative variables, and we assume the population
contains a finite number of values.

Definition (Mean of a Finite Quantitative Population)

If $x_1, x_2, x_3, \ldots, x_N$ represents a finite population of ‘N’
quantitative data points, then the mean of this population
is given by

\[ \text{Population mean} = \mu = \frac{x_1 + x_2 + \cdots + x_N}{N} = \frac{\sum_{i=1}^{N} x_i}{N} \]

($\mu$ is the Greek letter ‘mu’)

Definition (Mean of a Sample from a Quantitative
Population)

If $x_1, x_2, x_3, \ldots, x_n$ represents a particular sample of size
‘n’ from a quantitative population, then the mean of this
sample is given by

\[ \text{Sample mean} = \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n} \]

Definition (Mode of a Set of Data)

The mode is the data value that occurs most frequently in
a set of data (population or sample).

Note: The mode need not be unique.

Definition (The Median of a Set of Data)

If quantitative data is arranged in ascending or
descending order, the middle value of data is called the
median. If there is an even number of data points, the
median is typically taken to be the arithmetic average of
the two middle values.

We can of course talk of population and sample medians.

Example 3.2:
Consider the following set of data, which we can assume to
be a sample from a population.

1 1 5 4 12 4
3 1 2 7 6 6
5 1 1 5 8 9
10 2 4 2 6 30

$n = 24$, $x_1 = 1$, $x_3 = 5$, $x_{11} = 6$, etc. (if we label across rows
then down)

(See video for solution)
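
The example as recorded here does not state which measures are to be found. Assuming the task is to find the mean, median and mode of the data, a minimal sketch using Python's standard `statistics` module is given below; it is not the solution from the video.

```python
import statistics

# Data of Example 3.2, labelled across rows then down.
data = [
    1, 1, 5, 4, 12, 4,
    3, 1, 2, 7, 6, 6,
    5, 1, 1, 5, 8, 9,
    10, 2, 4, 2, 6, 30,
]

n = len(data)
mean = sum(data) / n                  # sample mean: sum of the values divided by n
median = statistics.median(data)      # n is even, so the average of the two middle values
modes = statistics.multimode(data)    # list of most frequent values (the mode need not be unique)

print(f"n = {n}")            # n = 24
print(f"mean = {mean}")      # mean = 5.625
print(f"median = {median}")  # median = 4.5
print(f"mode(s) = {modes}")  # mode(s) = [1]
```

With the data sorted, the 12th and 13th values are 4 and 5, giving the median of 4.5; the value 1 occurs five times, more often than any other value.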

Comparison of the Mean, Median and Mode

The mean takes account of all observation values;
therefore it can be affected by extreme values or outliers,
i.e. values which differ greatly from the majority of values.

In the previous example the outlier 30 pushes the mean
(135/24 = 5.625) to the right of the majority of the data. If
it were omitted the mean would be approximately 4.57.

The median and mode are unaffected by extremely high or
low values.

In the previous example, even if $x_{24}$ were 1,000,000 instead
of 30 the median and mode would be unchanged.
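
The effect of the outlier can be checked directly on the data of Example 3.2. The sketch below compares the original data with two variants: one with the value 30 omitted, and one with it replaced by 1,000,000. The helper name `summarise` is illustrative only.

```python
import statistics

data = [
    1, 1, 5, 4, 12, 4,
    3, 1, 2, 7, 6, 6,
    5, 1, 1, 5, 8, 9,
    10, 2, 4, 2, 6, 30,
]

without_outlier = [x for x in data if x != 30]          # outlier omitted
inflated = [1_000_000 if x == 30 else x for x in data]  # outlier made far larger

def summarise(label, values):
    """Print the mean, median and mode(s) of a list of values."""
    print(f"{label:<16} mean={statistics.mean(values):>12.2f} "
          f"median={statistics.median(values)} "
          f"mode(s)={statistics.multimode(values)}")

summarise("original", data)
summarise("30 omitted", without_outlier)
summarise("30 -> 1,000,000", inflated)
```

Replacing 30 by 1,000,000 changes the mean enormously but leaves the median and mode exactly as they were, because neither depends on how large the largest value is.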

The mode may not represent a “central” value in the
distribution, as in the above example, but it may be useful,
for example, for qualitative data.

If the frequency (or relative frequency) distribution is
perfectly symmetric and unimodal, the mean, median and
mode will coincide.

[Figure: Symmetric Distribution. Vertical axis: Relative Frequency; horizontal axis: Variable Value.]

If the distribution is skewed to the right (positively
skewed) and unimodal, mode < median < mean.

[Figure: Distribution that is Skewed to the Right, with the Mode and Mean marked on the horizontal axis. Vertical axis: Relative Frequency; horizontal axis: Variable Value.]

If the distribution is skewed to the left (negatively skewed)
and unimodal, mean < median < mode.

[Figure: Distribution that is Skewed to the Left, with the Mean and Mode marked on the horizontal axis. Vertical axis: Relative Frequency; horizontal axis: Variable Value.]

This gives us a way of deciding whether a distribution is
skewed to the left or right.
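
The comparison can be sketched as a simple rule of thumb. The function name `skew_direction` and the wording of its results are hypothetical, and comparing only the mean and the median is a simplifying assumption: it gives an indication of the direction of skew, not a formal test.

```python
import statistics

def skew_direction(values):
    """Indicate the direction of skew by comparing the mean with the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "skewed to the right (positively skewed)"
    if mean < median:
        return "skewed to the left (negatively skewed)"
    return "no skew indicated"

# Data of Example 3.2: mean 5.625, median 4.5, mode 1.
example_3_2 = [
    1, 1, 5, 4, 12, 4,
    3, 1, 2, 7, 6, 6,
    5, 1, 1, 5, 8, 9,
    10, 2, 4, 2, 6, 30,
]

print(skew_direction(example_3_2))       # mean > median
print(skew_direction([1, 2, 3, 4, 5]))   # mean == median
```

For the data of Example 3.2 the ordering is mode < median < mean (1 < 4.5 < 5.625), which matches the pattern described above for a distribution skewed to the right.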


• A statistical population is a set of measurements or
characteristics of elementary units of interest.

• Once a population is defined, a sample is a subset
of the population.

• Parameters are numerical characteristics of a
population.

• Sample statistics are numerical characteristics of a
sample.

• A frequency or relative frequency distribution
describes how data is distributed over different
classes or categories.

• A histogram shows graphically a frequency, relative
frequency or cumulative frequency distribution (the
areas of the ‘contiguous’ rectangles are proportional
to the frequencies or relative frequencies).

• The mean is affected by ‘extreme’ values; the median
and the mode are not affected by ‘extreme’ values.

• The population mean is denoted $\mu$; the sample mean
is denoted $\bar{x}$.

• The median divides a set of quantitative data into two
equal halves.

