Ecs Notes
Statistics
Statistics is a science that involves collecting, summarising, analysing and interpreting data for the
purpose of making informed decisions.
Descriptive Statistics
Descriptive statistics deals with methods of collecting, organizing, characterising, summarizing, and
presenting data in a convenient and informative way, e.g. the mean, the range and the median.
A descriptive measure of a population is called a parameter.
Inferential Statistics
Inferential statistics is a body of methods (an estimation process) used to draw conclusions or
inferences about characteristics of populations based on sample data.
PAGE: 7, 8, 9
A population is the collection of all items, elements, or objects that we wish to study. A population does
not necessarily mean a group of people living in a specific area, for instance the population of South Africa.
In statistics, a population refers to the group of all individuals, elements, or items of interest to a
statistics practitioner. A population is the entire set of observations under study.
A sample is a fraction or a subset of a population.
When drawing conclusions about a population based on a sample, the conclusions and estimates may not be
perfect. To minimise the level of uncertainty when making decisions, two reliable measures are used in
statistical inference: the confidence interval and the level of significance.
A sample is a set of data drawn from the studied population. A descriptive measure of a sample is
called a statistic.
We use statistics to make inferences about parameters.
A parameter is a numerical value that describes or summarises a population, that is, it describes the
characteristics of a population, while a statistic is a numerical value that describes or summarises the
characteristics of a sample.
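The distinction between a parameter and a statistic can be illustrated with a minimal Python sketch. The
population below is made up for illustration (household incomes generated at random); it is an assumption,
not data from these notes.

```python
import random
import statistics

# Hypothetical population: monthly incomes of 1,000 households (made-up values).
random.seed(42)  # fixed seed so the sketch is reproducible
population = [random.randint(5_000, 50_000) for _ in range(1_000)]

# Parameter: a descriptive measure computed from the whole population.
population_mean = statistics.mean(population)

# Statistic: the same measure computed from a sample drawn from that population.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): {population_mean:.2f}")
print(f"Statistic (sample mean):     {sample_mean:.2f}")
```

The sample mean will usually be close to, but not equal to, the population mean, which is why inference
carries some uncertainty.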
Types of variable and information
There are two types of data, namely quantitative and qualitative data.
Quantitative data generate numerical variables and are usually reported numerically.
A quantitative variable can be either discrete or interval. A discrete variable is countable and takes
whole-number values, while an interval variable can assume any value within a given interval.
Qualitative data generate categorical variables (options, varieties). Qualitative data are usually
summarized in graphs such as bar charts.
Statistics practitioner
A person who uses statistical techniques properly.
Examples of statistics practitioners include the following:
1. a financial analyst who develops stock portfolios based on historical rates of return;
2. an economist who uses statistical models to help explain and predict variables such as inflation
rate, unemployment rate, and changes in the gross domestic product; and
3. a market researcher who surveys consumers and converts the responses into useful information.
Hierarchy of Data
Levels of Measurement
Interval
Data are real numbers, such as heights, weights, incomes, and distances. We also refer to this type
of data as quantitative or numerical.
Values are real numbers.
All calculations are valid.
Data may be treated as ordinal or nominal.
Ordinal
Data appear to be nominal, but the order of their values has meaning. The difference between nominal
and ordinal data is that the order of the values of the latter carries information, for example
indicating a higher rating.
The ordinal level of measurement presumes that one classification is ranked higher than another. The
items or objects differ from one another in that each has more or less of a characteristic than
another. At this level of measurement the order of the values is meaningful.
Values must represent the ranked order of the data.
Calculations based on an ordering process are valid.
Data may be treated as nominal but not as interval.
Nominal pg:4 (21)
Data are categories. For example, responses to questions about marital status produce nominal
data. The values of this variable are single, married, divorced, and widowed. Nominal data are also
called qualitative or categorical.
For nominal data, the only permitted calculations are counting the occurrences of each category and
computing the percentage that each category represents; that is, determining frequencies is permitted.
The nominal scale applies to names. This measurement scale is used for objects or elements that are
identified by names.
Values are the arbitrary numbers that represent categories.
Only calculations based on the frequencies or percentages of occurrence are valid.
Data may not be treated as ordinal or interval.
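As a small illustration of the calculations that are valid for nominal data, the Python sketch below counts
frequencies and percentages for the marital-status example. The list of responses is hypothetical.

```python
from collections import Counter

# Hypothetical survey responses to a marital-status question (nominal data).
responses = ["single", "married", "married", "divorced", "single",
             "married", "widowed", "single", "married", "married"]

# Frequency of each category.
frequencies = Counter(responses)

# Percentage of occurrences of each category.
total = len(responses)
percentages = {category: 100 * count / total
               for category, count in frequencies.items()}

for category, count in frequencies.most_common():
    print(f"{category:<10} {count:>3} {percentages[category]:>6.1f}%")
```

Note that no mean or ordering is computed: the categories are only counted.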
Macroeconomics
Macroeconomics is a major branch of economics that deals with the behaviour of the economy as a whole.
Macroeconomists develop mathematical models that predict variables such as gross domestic
product, unemployment rates, and inflation.
The range
The range is the difference between the largest value and the smallest value. To compute the range,
we need to identify the lowest and the highest values in the distribution
The variance
The variance and its related measure, the standard deviation, measure the amount of variation of the
data around the mean. They measure how far each data value is from the mean.
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
The standard deviation
The sample standard deviation, symbolised by s, is simply the positive square root of the variance:
$$s = \sqrt{s^2}$$
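The sample variance and standard deviation can be computed step by step, as in the minimal Python sketch
below. The data values are hypothetical, and the result is cross-checked against Python's standard
statistics module, which also divides by n - 1 for a sample.

```python
import math
import statistics

# Hypothetical sample of five observations.
data = [4, 8, 6, 5, 7]

n = len(data)
mean = sum(data) / n

# Squared deviation of each value from the mean.
squared_deviations = [(x - mean) ** 2 for x in data]

# Sample variance: sum of squared deviations divided by n - 1.
variance = sum(squared_deviations) / (n - 1)

# Sample standard deviation: positive square root of the variance.
std_dev = math.sqrt(variance)

print(f"mean = {mean}, variance = {variance}, standard deviation = {std_dev:.4f}")

# Cross-check against the standard library.
print(statistics.variance(data), statistics.stdev(data))
```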
The coefficient of variation
The coefficient of variation of a distribution is the standard deviation of the data set divided by its
mean. It is a relative measure of dispersion, which is expressed as a percentage and symbolised by CV:
$$CV = \frac{\text{standard deviation}}{\text{mean}} \times 100\%$$
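A short Python sketch of the coefficient of variation, using a hypothetical sample. It assumes the sample
standard deviation (divisor n - 1) and a non-zero mean.

```python
import statistics

# Hypothetical sample of five observations.
data = [4, 8, 6, 5, 7]

mean = statistics.mean(data)
std_dev = statistics.stdev(data)  # sample standard deviation (divisor n - 1)

# Coefficient of variation: standard deviation relative to the mean, as a percentage.
cv_percent = std_dev / mean * 100

print(f"CV = {cv_percent:.2f}%")
```

Because the CV is a relative measure, it can be used to compare the spread of data sets that have
different means or units.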
Measures of central tendency are measures of location. They are typical values that describe or
summarise the distribution. The most used measures of location are the arithmetic mean, the
median and the mode.
The arithmetic mean (the average) is the sum of all the values in the data set divided by the
number of observations.
The median is the centre of the distribution when the data are arranged in ascending or descending
order. It divides an ordered data set into two halves, one below it and one above it. It
is the value with 50% of the observations less than or equal to it and 50% of the observations
greater than or equal to it.
Steps to follow when computing the median
1. Arrange the data in numerical order, starting from the lowest value to the highest.
2. Count the number of observations (n).
3. Determine the median position: $(n+1)/2$
4. Read the median value from the ordered data.
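The steps above can be sketched in Python as follows. One assumption goes beyond the notes: when n is even
the median position falls between two observations, and this sketch averages the two middle values, which
is the usual convention.

```python
def median(values):
    """Compute the median by following the four steps for an ordered data set."""
    # Step 1: arrange the data in numerical order.
    ordered = sorted(values)

    # Step 2: count the number of observations.
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")

    # Step 3: determine the median position (1-based).
    position = (n + 1) / 2

    # Step 4: read the median value from the ordered data.
    if position.is_integer():
        return ordered[int(position) - 1]

    # Assumption: for an even n, average the two values around the position.
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2


print(median([7, 3, 9]))     # odd number of observations
print(median([4, 1, 3, 2]))  # even number of observations
```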
The mode is the most frequent value in the data set.
NB: The mode is seldom the best measure of central location. For ordinal and nominal data, however, the
mean is not an appropriate measure of location, whereas the mode is.
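A minimal Python sketch of finding the mode for categorical data. The ratings are hypothetical; the point
is that counting works for such data even though a mean cannot be computed.

```python
from collections import Counter

# Hypothetical ratings collected from six respondents.
ratings = ["good", "fair", "good", "poor", "good", "fair"]

# Count how often each value occurs and pick the most frequent one.
counts = Counter(ratings)
mode_value, mode_count = counts.most_common(1)[0]

print(f"mode = {mode_value} (occurs {mode_count} times)")
```

If two values are tied for the highest frequency, the data set has more than one mode; this sketch reports
only one of them.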
The measures of variability describe the amount of variation or spread in a data set. They are also
called measures of dispersion. The most used measures of variability are the range, the variance, the
standard deviation and the coefficient of variation.
The range is the difference between the largest value and the smallest value. To compute the
range we need to identify the lowest and the highest values in the distribution:
$$\text{Range} = X_{\text{maximum}} - X_{\text{minimum}}$$
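In Python the range is a one-line calculation; the data values below are hypothetical.

```python
# Hypothetical data set.
data = [12, 7, 21, 15, 9]

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

print(f"range = {data_range}")
```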
Defining events
A simple event is an individual outcome of a sample space.
An event is a list or set of one or more simple events in a sample space.
The probability of an event is the sum of the probabilities of the simple events that constitute
the event.
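As an illustration, the Python sketch below computes the probability of an event by adding the
probabilities of its simple events. The experiment (one roll of a fair six-sided die) and the event
(an even number) are assumptions chosen for the example.

```python
from fractions import Fraction

# Sample space for one roll of a fair die: each simple event has probability 1/6.
simple_event_probabilities = {face: Fraction(1, 6) for face in range(1, 7)}

# Event: the roll shows an even number, made up of three simple events.
even_roll = {2, 4, 6}

# Probability of the event: sum of the probabilities of its simple events.
p_even = sum(simple_event_probabilities[face] for face in even_roll)

print(f"P(even) = {p_even}")
```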
Marginal probability
Marginal probabilities, computed by adding across the rows or down the columns of a table of joint
probabilities, are so named because they are calculated in the margins of the table.
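A minimal Python sketch of computing marginal probabilities from a table of joint probabilities. The
table values are hypothetical and chosen so that they add up to 1.

```python
# Hypothetical joint probabilities for two events, A and B, laid out as a 2x2 table.
joint = {
    ("A", "B"): 0.20,
    ("A", "not B"): 0.30,
    ("not A", "B"): 0.10,
    ("not A", "not B"): 0.40,
}

row_marginals = {}     # P(A) and P(not A): add across each row
column_marginals = {}  # P(B) and P(not B): add down each column
for (row, column), probability in joint.items():
    row_marginals[row] = row_marginals.get(row, 0.0) + probability
    column_marginals[column] = column_marginals.get(column, 0.0) + probability

for label, probability in {**row_marginals, **column_marginals}.items():
    print(f"P({label}) = {probability:.2f}")
```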
Conditional probability
The conditional probability is the probability of an event A given information about the
occurrence of the event B.
$$P(A/B) = \frac{P(A \text{ and } B)}{P(B)}$$
where P(B) > 0.
The probability of event B given event A is given by:
$$P(B/A) = \frac{P(A \text{ and } B)}{P(A)}$$
where P(A) > 0.
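The two formulas can be sketched in Python as below. The probabilities are hypothetical, and the helper
function name is made up for this example.

```python
def conditional_probability(p_joint, p_condition):
    """Return P(A/B) = P(A and B) / P(B); requires P(B) > 0."""
    if p_condition <= 0:
        raise ValueError("the conditioning event must have probability > 0")
    return p_joint / p_condition


# Hypothetical values: P(A and B) = 0.20, P(A) = 0.50, P(B) = 0.30.
p_a_and_b, p_a, p_b = 0.20, 0.50, 0.30

p_a_given_b = conditional_probability(p_a_and_b, p_b)  # P(A/B)
p_b_given_a = conditional_probability(p_a_and_b, p_a)  # P(B/A)

print(f"P(A/B) = {p_a_given_b:.4f}")
print(f"P(B/A) = {p_b_given_a:.4f}")
```

Note that P(A/B) and P(B/A) are generally different, because they divide the same joint probability by
different marginal probabilities.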
If the probability of a particular outcome is equal to 0, that outcome is impossible, while a
probability of 1 implies that the outcome is certain.
Independent events
Two events A and B are independent if P (A/B) = P (A) or P (B/A) = P (B)
Read the statement above as follows: if the probability of an event A, given that another event B has
taken place, is the same as the direct probability of the event A, then the event B has no effect on the
occurrence of A. The probability of event A is independent of whether event B took place or not.
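A minimal Python sketch of checking independence by comparing P(A/B) with P(A). The function name and the
probabilities are hypothetical; a small tolerance is used because of floating-point rounding.

```python
import math


def are_independent(p_a, p_b, p_a_and_b, tolerance=1e-9):
    """Return True if P(A/B) equals P(A), within a small tolerance."""
    if p_b <= 0:
        raise ValueError("P(B) must be greater than 0")
    p_a_given_b = p_a_and_b / p_b
    return math.isclose(p_a_given_b, p_a, abs_tol=tolerance)


# Hypothetical case 1: P(A/B) = 0.15 / 0.30 = 0.50 = P(A), so A and B are independent.
print(are_independent(p_a=0.50, p_b=0.30, p_a_and_b=0.15))

# Hypothetical case 2: P(A/B) = 0.20 / 0.30, which differs from P(A) = 0.50.
print(are_independent(p_a=0.50, p_b=0.30, p_a_and_b=0.20))
```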
Example
If the probability of passing an assignment is 0.7, the probability of failing it is 1 – 0.7 = 0.3.
Multiplication rule
The joint probability of any two events A and B is given by
$$P(A \text{ and } B) = P(B)\,P(A/B)$$
or, equivalently,
$$P(A \text{ and } B) = P(A)\,P(B/A)$$
Random variable
A random variable is a function or rule that assigns a number to each outcome of an experiment. There
are two types of random variables: discrete and continuous.
A discrete random variable is a variable that can take on a countable number of values; in other
words, it can assume a countable number of possible outcomes.
Example
The number of accidents that occur on the N1 highway every hour is a discrete random variable.
A continuous random variable is a random variable that can take on any value over a given interval
of values.
Example
The delivery time of parcels to clients is a continuous random variable.
A probability distribution is a table or a formula that describes the values of a random variable and
the probabilities associated with those values.
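To make these definitions concrete, the Python sketch below builds the probability distribution of a
discrete random variable as a table. The experiment (two tosses of a fair coin) and the random variable
(the number of heads) are assumptions chosen for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Experiment: toss a fair coin twice; the four outcomes are equally likely.
outcomes = list(product("HT", repeat=2))


def number_of_heads(outcome):
    """Random variable: assigns a number (the count of heads) to each outcome."""
    return outcome.count("H")


# Probability distribution: each value of the random variable with its probability.
value_counts = Counter(number_of_heads(outcome) for outcome in outcomes)
distribution = {value: Fraction(count, len(outcomes))
                for value, count in sorted(value_counts.items())}

print("x   P(X = x)")
for value, probability in distribution.items():
    print(f"{value}   {probability}")
```

The probabilities in the table add up to 1, as they must for any probability distribution.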