Basic Concepts of Statistics and Data Collection


Basic Concepts of Statistics and Data

Statistics is the branch of mathematics that deals with the collection, organization,
presentation, analysis, and interpretation of data.


• Descriptive Statistics – focuses on the task of collecting, processing,
and presenting data.

• Inferential Statistics – focuses on the analysis and interpretation of
data; draws conclusions (predict, estimate, compare).
POPULATION – a complete collection of all elements to be studied. This usually
represents all subjects under study.

SAMPLE – a sub-collection of elements drawn from a population. The sample will
be the basis of generalization on behalf of the population.

Note: In obtaining the sample, you should consider every element in the
population and the scope of the study.
Example: An aspiring politician wants to know how many people in his city
would vote for him if he ran in the next election. It would be nearly
impossible to ask all the voters in his city for their opinion. Therefore, he asks a
small group of around 500 people, chosen at random, for their opinion.

What is considered a population in this situation?

Answer: The population is the registered voters in the city. Since
the situation is about an election, the specific population is the registered
voters of that specific area.

What is considered a sample in this situation?

Answer: The sample is the group of 500 people, chosen at random, who are
registered voters in the city. Interviewing every registered voter in the
city would be too time-consuming. The researchers cannot ask for the opinion
of a citizen who is not yet registered, because that person cannot vote and
the opinion will not count in the election proper.
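
The relationship between the population and the sample in this example can be sketched in Python. This is only an illustration: the population size (100,000 registered voters) and the 40% support rate are invented figures, not values from the example above.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 registered voters.
# True means the voter would vote for the candidate.
# The 40% support rate is an invented figure for illustration only.
population = [random.random() < 0.40 for _ in range(100_000)]

# Sample: 500 registered voters chosen at random, without replacement.
sample = random.sample(population, 500)

population_share = sum(population) / len(population)
sample_share = sum(sample) / len(sample)

print(f"Share of supporters in the population: {population_share:.3f}")
print(f"Share of supporters in the sample:     {sample_share:.3f}")
```

The two shares will usually be close but not identical, which is why the sample is used only as a basis for generalizing to the population.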

Data refers to any facts or information a researcher works with.

QUALITATIVE DATA – represents differences in quality, character, or kind.
Things are grouped according to some common properties and the number of
members of the group is recorded.
Example: Gender of a person (categorized as Male or Female), Civil Status
(Single, Married, etc.), Color of skin, eyes, hair

QUANTITATIVE DATA – numerical in nature; yields numerical values. These are
the data that can be measured and counted.
Example: Height (in numerical values), Age, Daily Allowance (in Php)
Quantitative data can be further classified as Discrete or Continuous.

a. DISCRETE – Quantitative values that can be counted using integer values.

They result from either a finite number of possible values or a countable
number of possible values.

Example: Number of students in the class

b. CONTINUOUS – Quantitative values that can assume any value over an
interval or intervals. They result from infinitely many possible values that
can be associated with points on a continuous scale in such a way that there
are no gaps or interruptions.
Example: Money in the Bank (decimal values are considered in this case.)
NOMINAL
• Categorical data and numbers that are simply
used as identifiers.
• Classifies data into names, labels or
categories in which no order or ranking can be
imposed.
• Gender. Categorized as Male or Female.
• Jersey Number. The jersey number is only
used to identify the player. A higher jersey
number does not mean that the player is more
valuable on the court.
• ID Number. The ID number is used to assign
the identity of an individual in a certain
• Political Party
ORDINAL
• Classifies data into categories that
can be ordered or ranked, but precise
differences between the ranks do not
exist.
• Performance Evaluation.
(Excellent, Very Good, Good, Poor)
Excellent represents the highest rating
while Poor is the lowest.
• Socio-economic status. (Rich,
Middle, Poor)
• Pain scale.
INTERVAL
• Has precise differences between measures, but the zero value is arbitrary and
does not imply an absence of the characteristic being measured.
• Temperature. If the temperature falls to zero degrees, it does not imply that
there is no temperature in the area; zero is still a measured value.

RATIO
• Based on a standard scale which has a fixed zero point, in which the zero value
denotes the complete absence of the characteristic being measured.
• Money. If a person declares that he has only Php 0 in his pocket, it simply
implies that the person has no money at all.
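
The difference between an arbitrary zero and a fixed zero can be checked with simple arithmetic. The sketch below is only an illustration with made-up values. It uses the Kelvin scale, which is not mentioned above, as an example of a temperature scale whose zero is fixed.

```python
def celsius_to_kelvin(celsius):
    """Convert degrees Celsius to kelvins (a scale with a fixed zero)."""
    return celsius + 273.15

# Illustrative temperatures in degrees Celsius (arbitrary zero).
cool, warm = 10.0, 20.0

# Differences are meaningful: warm is 10 degrees higher than cool.
difference = warm - cool

# The ratio of the Celsius readings is 2.0, but that does not mean
# "twice as hot", because zero degrees Celsius is an arbitrary point.
naive_ratio = warm / cool

# On a scale with a fixed zero the same two temperatures differ by
# only about 3.5 percent.
kelvin_ratio = celsius_to_kelvin(warm) / celsius_to_kelvin(cool)

# Money has a fixed zero, so ratios are meaningful:
# Php 100 really is twice as much as Php 50.
money_ratio = 100 / 50

print(difference, naive_ratio, round(kelvin_ratio, 3), money_ratio)
```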
Complete the table by identifying whether each of the following is qualitative or
quantitative, and its corresponding level of measurement (Nominal, Ordinal, Interval,
Ratio). If the data is quantitative, identify whether it is DISCRETE or CONTINUOUS.


A teacher rates some project as


The club to which student is a


Average income of middle-class



Ranks of personnel in the


When evaluating a program, there are alternative ways to get the information you
need in addition to collecting the data yourself. Data that you retrieve first-hand is
known as primary data. Alternatively, data that is retrieved from pre-existing
sources is known as secondary data.

• Primary data sources include information collected and processed directly by
the researcher, such as observations, surveys, interviews, and focus groups.

• Secondary data sources include information that you retrieve through
pre-existing sources such as research articles and Internet or library searches.
Pre-existing data may also include existing records and data within the
program, such as publications and training materials, financial records,
student/client data, and performance reviews of staff.
There are different methods used to collect or obtain data for statistical analysis:
1. SURVEYS – This method solicits information from
the respondents.
a. Interview – This method is referred to as the direct
method of gathering data because it requires a
face-to-face inquiry with the respondents.
b. Questionnaires – This method is referred to as the
indirect method of gathering data because it makes
use of written questions to be answered by the
respondents.

2. OBSERVATION – This method is done by using the five
senses. It can produce qualitative data (e.g., narrative data) and
quantitative data (e.g., frequency counts, mean length of
interactions, and instructional time).
3. DOCUMENTS AND RECORDS – Consists of examining
existing data in the form of databases, meeting minutes, reports,
attendance logs, financial records, newsletters, etc. This can be an
inexpensive way to gather information but may be an incomplete
data source.

4. EXPERIMENTS – Experiments involve collecting
data on groups of people (from small groups to
samples of communities or larger units) that have
experienced differential exposure to some variable
or variables (the "condition"). Experiments are of
several kinds.
• In data collection, we gather data from the desired sample in our study. There are different
methods of drawing a sample from a population. There are two types of sampling
techniques: Probability and Non-Probability Sampling.

• PROBABILITY SAMPLING means that each population element has a known
(non-zero) chance of being chosen for the sample.

a. Simple random sampling. The sample is obtained by
assigning a number to each member of the
population and randomly picking some of these
numbers, as in a lottery.
Example: All names are written on pieces of paper and
placed inside a box. The researcher picks n
pieces of paper as members of the sample.
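
The "names in a box" procedure can be sketched with Python's standard random module. The function name simple_random_sample and the list of 40 student names are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Pick n distinct members so that every member is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)  # draws without replacement

# Hypothetical population: 40 names written on pieces of paper.
names = [f"Student {i}" for i in range(1, 41)]

# Draw n = 5 pieces of paper from the box.
chosen = simple_random_sample(names, 5, seed=1)
print(chosen)
```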
b. Systematic random sampling. With
systematic random sampling, we create a list of
every member of the population. From the list,
we randomly select the first sample element from
the first k elements on the population list.
Thereafter, we select every kth element on the
list.
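
A minimal sketch of systematic random sampling, assuming the population is already listed in order. The function name systematic_sample and the roster of 100 numbered members are hypothetical.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k elements, then every kth one."""
    rng = random.Random(seed)
    start = rng.randrange(k)      # index of one of the first k elements
    return population[start::k]   # that element and every kth after it

# Hypothetical population list: members numbered 1 to 100.
roster = list(range(1, 101))

# With k = 10, the sample contains 10 of the 100 members.
picked = systematic_sample(roster, k=10, seed=3)
print(picked)
```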

c. Stratified sampling. With stratified sampling, the
population is divided into groups, based on some
characteristic. Then, within each group, a probability
sample (often a simple random sample) is selected.
In stratified sampling, the groups are called strata.

As an example, suppose we conduct a survey of Grade 7
students. We might divide the population into groups or
strata, based on sections. Then, within each stratum, we
randomly select survey respondents.
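
The Grade 7 example can be sketched as follows. The section names, the section sizes, and the choice of drawing the same number of respondents from every stratum are assumptions made for illustration; other allocation rules are possible.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of the same size from every stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, n_per_stratum)
        for name, members in strata.items()
    }

# Hypothetical strata: three Grade 7 sections with 30 students each.
sections = {
    "Section A": [f"A{i}" for i in range(1, 31)],
    "Section B": [f"B{i}" for i in range(1, 31)],
    "Section C": [f"C{i}" for i in range(1, 31)],
}

# Every section contributes 5 randomly selected respondents.
respondents = stratified_sample(sections, 5, seed=7)
for section, students in respondents.items():
    print(section, students)
```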
d. Cluster sampling. With cluster sampling,
every member of the population is assigned to
one, and only one, group. Each group is called
a cluster. A sample of clusters is chosen, using
a probability method (often simple random
sampling). Only individuals within sampled
clusters are surveyed.
This type of sampling technique is used with a large
population (e.g., the population of a whole
city). The whole population is divided
into clusters (e.g., the barangays of a certain city).
Cluster sampling works when the clusters are
selected randomly from the population.

(Note: Cluster is larger than a stratum/strata)
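
A minimal sketch of cluster sampling, in which whole clusters are selected at random and only the people inside the selected clusters are surveyed. The barangay names and resident labels are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters; keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # pick cluster names
    return {name: clusters[name] for name in chosen}

# Hypothetical city divided into four barangays (the clusters).
barangays = {
    "Barangay 1": ["resident 1", "resident 2", "resident 3"],
    "Barangay 2": ["resident 4", "resident 5"],
    "Barangay 3": ["resident 6", "resident 7", "resident 8"],
    "Barangay 4": ["resident 9", "resident 10"],
}

# Two barangays are selected; everyone in them is surveyed.
surveyed = cluster_sample(barangays, 2, seed=5)
print(surveyed)
```

Unlike the stratified sketch, members of the clusters that were not selected never enter the sample.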

• NON-PROBABILITY SAMPLING offers two potential advantages:
convenience and cost. The main disadvantage is that non-probability
sampling methods do not allow you to estimate the extent to which
sample statistics are likely to differ from population parameters. Only
probability sampling methods permit that kind of analysis. We do not know
the probability that each population element will be chosen, and/or we
cannot be sure that each population element has a non-zero chance of
being chosen.
1. Voluntary sample. A voluntary sample is made up of
people who self-select into the survey. Often, these
individuals have a strong interest in the main topic of the
survey.
Suppose, for example, that a news show asks viewers to
participate in an online poll. This would be a voluntary
sample. The sample is chosen by the viewers, not by the
survey administrator.
2. Convenience sample. A convenience
sample is made up of people who are easy
to reach.
Consider the following example. A pollster
interviews shoppers at a local mall. If the
mall was chosen because it was a
convenient site from which to solicit survey
participants and/or because it was close to
the pollster's home or business, this would
be a convenience sample.
