Lecture - MODULE 1 LESSON 1

Module 1

What this module is about

This module deals with the definition of statistics and terms used in the study of
statistics. It will also discuss the basic statistical concepts. The history and importance
of the study of statistics, summation rule, rounding off numbers, ratios, frequencies,
proportions and percentages will also be discussed.
As we go along in the discussion and exercises, you will appreciate more the
importance of statistics in daily life. Enjoy learning this module and appreciate the
discussion and examples.
What you are expected to learn
This module is designed for you to:
1. Define statistics, sample, and population.
2. Give the history and importance of the study of statistics.
3. Use the rules of summation to find sums.
4. Enumerate the rules in rounding numbers.
5. Compute ratios, frequencies, proportions and percentages.
What you will do

Lesson 1
Definition of Terms Related to Statistics

Statistics is a branch of mathematics that deals with the collection, classification,
description, and interpretation of data obtained by the conduct of surveys and
experiments. Its fundamental purpose is to describe and draw inferences about the
numerical properties of a population.
STATISTICS refer to numerical observations of almost any kind.
- refers to the science that deals with the collection, tabulation or presentation,
analysis, and interpretation of numerical or quantitative data.
Collection of data - refers to the process of obtaining numerical measurements.
Tabulation or presentation of data - refers to the organization of data into tables, graphs
or charts, so that logical and statistical conclusions can be derived from the collected
data.
Analysis of data - pertains to the process of extracting from the given data relevant
information from which numerical description can be formulated.
Interpretation of data - refers to the task of drawing conclusions from the analyzed data.
- normally involves the formulation of forecasts or predictions about larger groups
based on the data collected from small groups.
Two important terms that you should understand in studying statistics are population
and sample.
In statistics, population does not only mean a group of people. A population may also
be a defined group or aggregate of objects, animals, materials, measurements,
“things”, “events” or “happenings” of any kind. Thus, a sack of rice, a whole pizza pie, or
a set of weights and heights may each be considered a population.
Since it would be impractical to study the whole population, as in the case of a sack of
rice, it is necessary to take just a sample of the population. A handful of rice
is a sample of the population in a sack of rice. A sample is thus defined as any
subgroup of the population drawn by some appropriate method from the population. It
should be representative of the population; that is, the sample should show the properties
of the population.
Originally, statistical data took the forms of

1. Figures on tax returns

2. Population
3. Births
4. Deaths
5. Trade
6. Others which were considered important information to a political state
Today, the use of statistics has extended to such things as:

1. Theater attendance
2. Basketball results
3. Car sales in a month
4. Heights
5. Weights
6. So many others that can be expressed numerically


Statistics has general applicability:

1. It is an essential tool in education, government, business, economics,
medicine, psychology, sociology, sports and others.

2. One of the most common exposures of the youth today to statistics is in the
existing world of sports- in basketball, for instance. After each quarter of a game,
the newscaster would report numerical figures and their averages to millions of
thrilled basketball fans watching the game on television. These figures normally
consist of points made out of so many attempts from the field or from the foul
line. These “statistics” would eventually decide whether a player deserves to be
paid more or is being paid more than he deserves.

3. In education, statistical tools are used to get information on enrollment,
finance, physical facilities, and so on. Such data are needed for intelligent
administration and management.

4. Statistics are gathered for the purpose of providing government heads with
data necessary to guide them in managing the affairs of the State. From earlier
times, most civilized countries have compiled large-scale “statistics” in order to
ascertain the manpower and material strength of the nation. These data are
needed for military and fiscal reasons. A large amount of organized records on
the movement of population, cost of living, taxes, wages, and material resources
is necessary for intelligent policy-making and administration.
Methods for the statistical design of experiments are valuable to
researchers in medicine and the physical sciences. Causes and effects of factors
which affect experiments are best evaluated using statistical techniques.

5. Psychologists are able to understand the human person better if they
are able to systematize, analyze, and interpret data on intelligence scores,
aptitudes, personality trait ratings and attitudes.

6. In sociology, statistics is used in the study of the conditions of the
society in which man lives. Observations, when properly analyzed and
interpreted, may effect positive action toward the improvement of society.

7. In business and economics, statistics plays an important role in the
exploration of new markets for a product, forecasting of business trends, control
of the quality of goods produced, and improvement of personnel relations.
Decisions and policies for efficient business and economic management must
be based on data which have been properly analyzed and interpreted.
Everyday life is influenced more and more by decisions based on
quantitative information.

A good survey research paper relies on the precision of the methods and
procedures used in conducting the study. This includes the reliability of the selected
subjects or respondents of the study, the validity of the information gathered from the
distributed questionnaires, and the accuracy of the measurements used in answering the
research questions and other observations.

A study conducted on the entire population assures us of 100%
reliability since the responses are obtained from all members of the population.
This means that the data were collected by the complete enumeration method, the
so-called census taking. However, for many types of research it is impossible to
survey all members of the population, especially if the population
size is infinite, or finite but very large. To minimize the time and cost involved in
surveying a large population, it has been accepted that
information about the population will be based only on a small portion of the
population, called a sample. On the other hand, considering only the responses of a
small portion of the population may result in biases due to
improper selection of the sample and errors due to the manner of measuring the
desired observations, since the selected sample may not equally
represent the characteristics of the entire population.

It also normally involves the formulation of forecasts or predictions about larger
groups based on the data collected from small groups.

1. Descriptive statistics
2. Inferential statistics
Descriptive Statistics is concerned with the gathering, classification, and presentation of
data and the collection of summarizing values to describe group characteristics of the
data. The summarizing values most commonly used in descriptive statistics are the
measures of central tendency, of variability, and of skewness and kurtosis.
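As a rough illustration of these summarizing values, the sketch below uses Python's standard statistics module on the ten scores listed in Example 1 of this lesson. The variable names and the moment-based formulas for skewness and kurtosis are assumptions made for illustration only; textbooks differ in the exact formulas they adopt.

```python
import statistics

# Scores of the ten students in Example 1 of this lesson
scores = [36, 28, 46, 65, 26, 38, 52, 47, 39, 35]

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)

# Measure of variability (population standard deviation)
std_dev = statistics.pstdev(scores)

# Moment-based skewness and kurtosis (one common convention, assumed here)
n = len(scores)
m2 = sum((x - mean) ** 2 for x in scores) / n
m3 = sum((x - mean) ** 3 for x in scores) / n
m4 = sum((x - mean) ** 4 for x in scores) / n
skewness = m3 / m2 ** 1.5
kurtosis = m4 / m2 ** 2

print("mean:", mean)
print("median:", median)
print("standard deviation:", round(std_dev, 2))
print("skewness:", round(skewness, 2))
print("kurtosis:", round(kurtosis, 2))
```

A positive skewness here would indicate that a few high scores pull the distribution to the right of the mean.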
Inferential Statistics demands a higher order of critical judgment and mathematical
methods. It aims to give information about large groups of data without dealing with
each and every element of these groups. It uses only a small portion of the total set of
data in order to draw conclusions or judgments regarding the entire set.
Topics included in the study of Inferential Statistics:

1. The testing of hypotheses using the z-test
2. The t-test
3. Simple linear correlation
4. Analysis of variance
5. The chi-square test
6. Regression analysis
7. Time series analysis
Statistics dates back to the beginnings of recorded history. As early as 3800
B.C. there were records on population in Babylonia. The same was true of China in
3000 B.C. Almost five thousand years ago the Sumerians counted their citizens for
taxation purposes, and at various times later the Egyptians conducted inquiries into
the occupations of their people.
In Biblical times, censuses were undertaken by:

1. Moses in 1491 B.C.
2. David in 1017 B.C.
3. Indian literature dating back to the reign of the northern Hindustan King Asoka (270
- 230 B.C.) also describes methods of taking censuses.
4. The Athenians and other classical Greeks took censuses in times of stress, carefully
counting the adult male citizens in wartime and the general populace when the food
supply was endangered.
5. The Romans registered adult males and their property for military and administrative
purposes.
6. Servius Tullius, who ruled as the sixth King of Rome from 578 to 534 B.C., is given
credit for instituting the gathering of population data. Two thousand years ago, each
male in the Roman empire had to return to the city of his birth to be counted and taxed.
Thus the Bible gives an account of the return of Joseph and Mary to Bethlehem for such
a purpose.
In the Middle Ages, registrations on land ownership and on manpower for wars were
made. In the 13th century, tax lists of Paris included the registration of those who were
subject to tax.
In England, William the Conqueror required the compilation of information on population
and resources. This compilation, “The Domesday Book” is the first landmark in British
statistics. Later births, deaths, baptisms, and marriages had to be registered.
It was Achenwall (1719 - 1772) who first introduced the word “statistiks” in a
preface to a statistical work. Zimmerman and Sinclair introduced and popularized the
name “statistics” in their books.
In the 16th century, European mathematicians and gamblers suspected that
games of chance such as rolled dice, playing cards, and tossed coins followed certain

Girolamo Cardano, an Italian mathematician, physician, and gambler, wrote
“Liber de Ludo Aleae,” in which appeared the first known study of the principles of
probability.
Another gambler, the Chevalier de Méré, made a proposal to Blaise Pascal in the
famous “Problem of Points,” a work which marked the beginning of the mathematics of
probability.
Laplace’s “Théorie analytique des probabilités” of 1812 further supported and
stabilized the said theory.
In the 18th century, statistics was used in the study entitled “Political Arrangement
of the Modern States of the Known World.” The description of the work was at first
verbal. Gradually, an increasing proportion of numerical data was used in the
description of the work.
In the 19th century, a Belgian astronomer named Quetelet applied the theory of
probability to anthropological measurements and expanded the same principle to the
physiological and psychological, physical, and chemical fields. After studying with the
best known mathematicians of his day, Quetelet established a central commission for
Statistics which became a model for similar organizations in other countries.
Francis Galton (1822 - 1911) and Karl Pearson (1857 - 1936) also contributed
much to the field of statistics. Galton developed the use of percentiles. A cousin of
Charles Darwin, Galton became deeply interested in the problem of heredity, to which he
also applied statistical tools.
Pearson made many statistical discoveries, too. Both Galton and Pearson
contributed greatly to the development of the correlation theory.
In the twentieth century, the most prominent figure in the field of Statistics was
Sir Ronald Fisher (1890 - 1962). Fisher made contributions from 1912 to 1962, and many
of these contributions have had great impact on contemporary statistical procedures. One of
these is the F-test (named after Fisher), used in the analysis of variance in Inferential Statistics.
Shortly before the second world war, the number of applications of statistical
methods in the social sciences began to increase. The number of surveys of all kinds
increased, and the need to interpret data in mathematics, business and the social
sciences made it necessary for workers to have at least a basic understanding of
Statistics. Today, students, housewives, policy makers, businessmen, and workers in
other fields of human behavior are expected to have at least a basic knowledge of
Statistics. Statistical literacy has become a necessity in today’s modern world.
Although crude and incomplete, several estimates of the population of the
Philippines were made during the Spanish period, the earliest dating back to about 1570

when Legazpi conquered the islands. The people were estimated to be a million in
In 1576, Hernando Riquel, a government notary, also attempted to estimate the
population in connection with the list of encomiendas he prepared. An account of the
socio-economic conditions of the people was written in 1582 by Miguel de Loarca in
“Relacion de las Islas Filipinas.” This included some details about the size of the
islands, the encomiendas therein, the officials in the Spanish settlements, and the
tributes collected.
By order of Governor and Captain General Gomez Dasmariñas, a second
estimate was made in the year 1591 which was based on the number of encomiendas.
There were approximately 667,617 people as there were 166,904 encomiendas, each
encomienda representing four persons. No further estimates based on the encomienda
were made because the system closed in 1600.
Other estimates of the population were based mostly on church records because
people were distributed among religious orders by parishes. Births, deaths and
marriages were made the bases of population estimates. In 1799, the Christian
population was 1,502,574 as compiled by Buzeta.
Another source of information concerning the population was the number of
“cedulas” sold. A “cedula” was a per capita tax which was obligatory upon all males
between 18 and 60 years old.
In 1877, civil censuses were taken by the Spanish authorities. This formed the
basis of the estimated population for 1896.
During the American regime, data collection became more systematized. This
was marked by the creation of a statistical unit in the Bureau of Customs to collect,
tabulate and disseminate statistics on imports and exports. Although no statistical units
were formally created in other government offices during that time, informal data were
collected and compiled for administrative purposes.
The Bureau of Agriculture, which was created in 1902, compiled data on

1. the number of farms,

2. the extent of irrigated areas, and
3. land put into cultivation
The Bureau of Labor, which was created in 1908, furnished data on the number of labor
organizations and members. It also compiled statistics on labor cases.
Vital registration likewise improved during this period. Section 961 of the Revised
Ordinance of the City of Manila provided for the registration of births and deaths.
Section 2214 of the Revised Administrative Code of the Philippines required physicians
to report the births and deaths they had attended. They made these reports to the
municipal secretary.
In 1925, a survey on the educational system was made by a board of
distinguished educators headed by Dr. Paul Monroe.
During the Commonwealth regime, all statistical activities were centralized in the
Bureau of Census and Statistics. This agency which was created on August 19, 1940
had the following functions:

1. To prepare and conduct periodic censuses on population, housing, agriculture,
fisheries, industry, business, and other sectors of the economy.
2. To prepare and conduct statistical surveys, researches, and studies on all
aspects of socioeconomic conditions.
3. To collect and process for statistical purposes data and records from the
different departments, bureaus, offices and agencies of the government.
4. To conduct researches and studies on census in cooperation with national or
local statistical organizations.
5. To develop a well-integrated, consolidated, and coordinated program of up-to-date
statistical collection, production, analysis, and publication for the use of the government
and the public.
6. To maintain an efficient system of civil registration.

At present, statistics is a reliable means of describing accurately the values of
economic, political, social, psychological, biological, and physical data. Statistics serves
as a tool to correlate and analyze collected data. It is no longer confined to gathering
and tabulating data. Now, it is also a process of interpreting the information that serves
as a basis for preparing plans.
Population - refers to the totality of objects, individuals, or reactions that can be
described as having a unique combination of qualities.
- in statistical investigations, it is defined by naming its unique properties.
- consists of numerical value associated with objects or individuals
- ordinarily, we use the term population to refer to the people within a specified
time and geographical area, or more precisely, to facts and figures associated with
these people.

1. The graduating students of a particular school
2. The employees of a company

3. The cars produced by a particular manufacturer
4. The depositors in a bank
5. The ages of graduating students
6. The IQ scores of employees
7. The selling prices of cars
8. The amounts of savings of depositors
9. Births
10. Deaths
11. Sex
12. Religion
13. Language
14. Occupation
15. Family size
The process of collecting, tabulating, compiling, and publishing data pertaining to
each and every unit of a whole set of objects or persons is called census-taking.
When the population is too big, census-taking is impractical, if not impossible.

1. If the population being considered is the length of life of all the batteries produced by
a company, then we would have to use up all the batteries in order to get data on their length
of life.
2. Imagine that we are interested in the daily food consumption of all families in
Makati. So much time, energy, and money would be spent just to get the needed
information from each and every family.
This method is time consuming and requires too much effort and money.
This is where statistical methods and techniques come in. When the mass
of data is too great to be handled in its entirety, the sampling method is used. This is the
method of getting facts from a small but representative cross-section of the population.
This representative part of the population is called a sample.
The sample is used to describe the population from which it was taken.
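The sampling idea can be sketched in Python as follows. This is only an illustration: the population of 1,000 battery serial numbers, the sample size of 50, the function name draw_sample, and the seed are all hypothetical, and simple random sampling is just one of several possible sampling methods.

```python
import random

def draw_sample(population, size, seed=None):
    """Draw a simple random sample (without replacement) from the population."""
    rng = random.Random(seed)  # seeded generator so the draw can be repeated
    return rng.sample(population, size)

# Hypothetical population: serial numbers of 1,000 batteries
population = list(range(1, 1001))

# Test only 50 batteries instead of using up all 1,000
sample = draw_sample(population, 50, seed=1)

print("population size:", len(population))
print("sample size:", len(sample))
```

Because every unit has the same chance of being chosen, a sample drawn this way is more likely to be representative than one picked by convenience.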

A subscript is a number or a letter representing several numbers placed at the
lower right of a variable. It is used to specify the item referred to.
If we have five numbers representing the ages of five students, and let X represent the
age, we will let X1 stand for the age of the first student, X2 for the age of the second
student, X3 for the third student, X4 for the fourth student, and X5 for the fifth student.
Sometimes we would like to summarize in just one term the idea that there are
five students with their corresponding ages. Here, instead of a numerical value, we may
use a letter subscript. We would then write the symbol as Xi (read “X sub i”), where i
stands for the numbers 1, 2, 3, ..., n.
Xi stands for X1, X2, X3, ..., Xn
The summation symbol ∑ (Greek capital letter “sigma”) is used to denote that
the subscripted variables are to be added.
The Summation Process

The study of statistics involves the collection of data or measurements. Thus, there is
always a need to add several numbers. The Greek capital letter sigma, Σ, is used in the
process. The symbol Σ, read as “the sum of,” tells you to add certain numerical values.

Example 1: Consider the scores obtained by 10 students in a 50-item mathematics
test.

Student No.    Score
1              36
2              28
3              46
4              65
5              26
6              38
7              52
8              47
9              39
10             35

For convenience, variables will be used to present the data.

Let x = score obtained by each student
xi = different values or observations of x
xi is read as “x sub i” where i is a subscript which indicates the position of each
value in the series.
In the given data, there are 10 observations denoted as x1, x2, x3, x4, x5, x6, x7, x8, x9,
x10.

Hence, $\sum_{i=1}^{10} x_i = x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8 + x_9 + x_{10}$.

The symbol $\sum_{i=1}^{10} x_i$ is read as “the sum of the 10 observations x1 to x10”.

To substitute the data: $\sum_{i=1}^{10} x_i = 36 + 28 + 46 + 65 + 26 + 38 + 52 + 47 + 39 + 35 = 412$

For a large number of observations, say 50, the summation will be expressed as:
$\sum_{i=1}^{50} x_i = x_1 + x_2 + x_3 + \dots + x_{50}$. In general, $\sum_{i=1}^{n} x_i = x_1 + x_2 + x_3 + \dots + x_n$.

If all the given values of a variable are to be used in finding the sum, the limits of the
summation are usually omitted, as ∑xi = ∑x
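The summation process can also be checked with a short Python sketch. The function name summation and its lower and upper parameters are hypothetical, chosen only to mirror the limits of the ∑ notation; the scores are those of Example 1.

```python
def summation(values, lower=1, upper=None):
    """Return x_lower + ... + x_upper, using 1-based subscripts as in the lesson."""
    if upper is None:
        upper = len(values)  # limits omitted: add all the observations
    return sum(values[lower - 1:upper])

# Scores x1 to x10 from Example 1
scores = [36, 28, 46, 65, 26, 38, 52, 47, 39, 35]

print(summation(scores))        # sum of all 10 observations: 412
print(summation(scores, 1, 3))  # x1 + x2 + x3 = 36 + 28 + 46 = 110
```

Omitting the limits in the call corresponds to writing ∑x, where every observation is included in the sum.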

Example 2: Given are the ages of the first 4 shoppers at a newly opened convenience
store in the neighborhood: 12, 24, 30, 45.

1. What will x represent in the information given?
2. What will the subscript i represent?
3. Write an expression for the sum.
4. What are the lower and upper limits of the expression?
5. Write the formula for the summation and find the sum of the given information.

The following rules are to be observed in rounding off numbers:

Rule 1: When the digits to be eliminated or replaced by 0 are greater than 5, 50, 500
and so forth, add 1 to the last digit to be retained.
672.52 to the nearest unit (ones)
Locate the digit in the units position, which is 2. The digits to be dropped
are 52, which is greater than 50; therefore add 1 to the preceding digit.
= 673
25,387 to the nearest hundred
Locate the digit in the hundreds position, which is 3. The digits to be
dropped are 87. Since 87 is greater than 50, add 1 to the preceding digit
and put zeros as place holders.
= 25,400
Rule 2: When the digits to be eliminated or replaced by 0 are less than 5, 50, 500 and so
forth, do not change the value of the last digit to be retained.
5,412 to the nearest ten
Locate the digit in the tens position, which is 1. The digit to be eliminated
is 2, which is less than 5, so retain 1 and place 0 as place holder.
= 5,410
Rule 3: When the digits to be eliminated or replaced by zero are exactly 5, 50, 500 and
so forth, do not change the last digit to be retained if it is even; add 1 if it is odd.
935 to the nearest ten
Locate the digit in the tens position, which is 3. The digit to be eliminated
is exactly 5. Since the tens digit is odd, add 1 to it and place 0 as place holder.
= 940
7,250 to the nearest hundred
The digit in the hundreds position is 2 and the digits to be eliminated are
exactly 50. Since the hundreds digit is even, retain 2 and place zeros as
place holders.
= 7,200
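The three rules together describe what is commonly called "round half to even." A minimal Python sketch follows, assuming the standard decimal module and a hypothetical helper named round_off whose place argument gives the position to round to ("1" for ones, "10" for tens, "100" for hundreds).

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_off(value, place):
    """Round value to the nearest multiple of place, sending exact halves
    to the even digit (Rule 3)."""
    unit = Decimal(place)
    # Express the value in units of `place`, round to a whole number, scale back
    scaled = Decimal(str(value)) / unit
    return scaled.quantize(Decimal("1"), rounding=ROUND_HALF_EVEN) * unit

print(round_off("672.52", "1"))   # Rule 1: 673
print(round_off(25387, "100"))    # Rule 1: 25400
print(round_off(5412, "10"))      # Rule 2: 5410
print(round_off(935, "10"))       # Rule 3, odd digit: 940
print(round_off(7250, "100"))     # Rule 3, even digit: 7200
```

Decimal values are used instead of binary floating-point numbers so that a value such as 672.52 is represented exactly before the rule is applied.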
A variable is an observable characteristic of a person or object which is capable of
taking several values or of being expressed in several different categories.
Kinds of Variables:
1. Continuous Variable
A variable which may take any value within a specified range of values.
Examples: weight, height
2. Discontinuous/discrete variable
A variable that can take specific values only.
Values have breaks, gaps or jumps.
Examples: number of BSA students enrolled in Statistical Analysis, family size
Four levels or types of measurement:
1. Nominal measurements
Most limited type of measurement.
Merely used to differentiate classes or categories for purely classification or
identification purposes.
Sex (male, female): the two groups formed can be identified by using numbers
like 1 for the male group and 0 for the female group, or vice versa. These numbers are merely
used for identification purposes. We cannot give meaning to the magnitude or size of
such numbers.
Although numbers may be used to designate categories or groups, these
numbers have very few of the usual properties of numbers. We cannot use the four
fundamental operations on these numbers because these numbers are merely labels or
codes for categories.

2. Ordinal Measurements
These do not only classify but also order the classes.
Expressed in ranks is possible if different degrees of an attribute or property are
Ranks 1, 2, 3 given by judges to the 3 finalists in a beauty contest
However we are usually unable to determine the degree of difference between
any consecutive ordinal measures.
We cannot determine by just how much the beauty contest winner is more
beautiful than the second place winner and to what degree the second place winner is
more beautiful than the third placer.
3. Interval Measurements
Has the attributes of ordinal measure plus one more: it can differentiate between
any two classes in terms of degrees of differences.
Mental ability scores
Achievement scores
Temperatures in degrees Celsius
82º C is higher than 80º C
68º C is lower than 72º C by 4º C
Addition and subtraction have meanings
Zero point of the interval scale is arbitrary and does not reflect the absence of the
property being measured.
4. Ratio Measurement
Differs from the interval measurement only in one aspect. It has a true zero point
which indicates a total absence of the property being measured.
Length (0 length means no length at all)
Number of children in a family

Ratios of the numbers assigned in the type of measurement reflect ratios in the
amounts of the property being measured.
If Lea is 180 centimetres tall and Lyka is 90 centimetres tall, we say that Lea is
twice as tall as Lyka. Their heights can be expressed in the ratio 2:1 (two is to one).
Multiplication and division have meanings.
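The difference between ratio and interval measurements can be illustrated with a short Python sketch. The helper name ratio_pair and the two temperatures (80º C and 40º C) are hypothetical; the heights are those of Lea and Lyka above.

```python
from fractions import Fraction

def ratio_pair(a, b):
    """Reduce the ratio a:b to lowest terms and return it as a pair."""
    f = Fraction(a, b)
    return f.numerator, f.denominator

# Ratio scale: height has a true zero, so the ratio is meaningful
lea_cm, lyka_cm = 180, 90
print(ratio_pair(lea_cm, lyka_cm))  # (2, 1), read "two is to one"

# Interval scale: 0 degrees Celsius is an arbitrary zero point
hot_c, cool_c = 80, 40  # hypothetical temperatures
celsius_ratio = hot_c / cool_c
# The same temperatures on the Kelvin scale, which has a true zero
kelvin_ratio = (hot_c + 273.15) / (cool_c + 273.15)
print(celsius_ratio, round(kelvin_ratio, 3))
```

The Celsius ratio of 2 does not survive the change of scale, which is why we cannot say that 80º C is "twice as hot" as 40º C, while Lea remains twice as tall as Lyka in any unit of length.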

Work activity:
Define frequency, percentage and proportion and give examples

