Quantitative Data Analysis


Quantitative Data

Presented by:
Sheu-Tijani Aminu Opeyemi & Dona Kuswoyo

• Data preparation and organisation: data scoring, the codebook, and types of score
• Prerequisites for choosing a statistical programming tool
• Procedure for inputting data into the system
• Descriptive statistics: mean, median, and mode (measures of centre); variance and standard deviation (measures of dispersion); Z score and percentile rank
• Inferential statistics
• Reporting findings or results
• Conclusion
Meaning of data preparation and organisation
• Data preparation involves scoring the data and creating a codebook, determining the types of score to use, selecting a computer program, inputting the data into the statistical programming tool for analysis, and cleaning the data.
Scoring data

• Data scoring is the act of assigning a numerical value or score to each response category for each question on the checklist or instrument used in collecting data.
• For instance, a survey item stating that “school feeding should be retained and maintained to improve students’ learning participation” requires a participant to choose from a set of Likert scale options:
• Strongly Agree
• Agree
• Undecided
• Disagree
• Strongly Disagree
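
As a minimal sketch of scoring, the five options can be mapped to numbers. The 5-to-1 direction matches the range mentioned later for Likert data; the dictionary name, the function name, and the sample responses are hypothetical and chosen only for illustration.

```python
# Assumed scoring scheme: 5 = Strongly Agree ... 1 = Strongly Disagree.
LIKERT_SCORES = {
    "Strongly Agree": 5,
    "Agree": 4,
    "Undecided": 3,
    "Disagree": 2,
    "Strongly Disagree": 1,
}


def score_response(response):
    """Return the numerical score for one Likert response category."""
    return LIKERT_SCORES[response]


# Made-up responses from four participants to the school feeding item.
responses = ["Agree", "Strongly Agree", "Undecided", "Disagree"]
scores = [score_response(r) for r in responses]
print(scores)  # [4, 5, 3, 2]
```

A response that is not one of the five categories raises a KeyError here, which is one simple way to notice a mistyped entry during scoring.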
Types of score

• Single-item score: an individual score assigned to each question for each participant in the research.
• Sum score: the sum of an individual's scores over several questions that measure the same variable.
• Net or difference score: a score in a quantitative study that represents a difference or change for each individual. It is commonly used in experimental research.
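
The three score types can be illustrated with a short sketch. All participant labels and numbers below are made up, and the pre-test/post-test framing of the difference score is an assumption about a typical experimental design.

```python
# Single-item scores: three questions measuring the same variable (made-up data).
item_scores = {
    "participant_1": [4, 5, 3],
    "participant_2": [2, 3, 2],
}

# Sum score: total of one participant's scores across the related questions.
sum_scores = {pid: sum(items) for pid, items in item_scores.items()}

# Net (difference) score: post-test minus pre-test for each participant.
pre_test = {"participant_1": 10, "participant_2": 14}
post_test = {"participant_1": 15, "participant_2": 13}
difference_scores = {pid: post_test[pid] - pre_test[pid] for pid in pre_test}

print(sum_scores)         # {'participant_1': 12, 'participant_2': 7}
print(difference_scores)  # {'participant_1': 5, 'participant_2': -1}
```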
What is a codebook?

• A codebook is a list of questions that indicates how a researcher intends to code or score the responses from the instrument.
Inputting data
• Inputting data means transferring the data from the responses on instruments to a computer file for analysis.
Which statistical tools do I need?

The following guidelines help in choosing an appropriate statistical tool for data analysis:
• Choose a program with details on how to use it, such as a tutorial guide for learning the key features and practising them on sample data sets
• Flexibility: choose a program that is easy to use
• Inclusivity: choose a program that has many statistical packages and includes the tools you need for your analysis
• Capacity: choose one that can hold and run large data sets
• Choose one that can display output as graphs and tables for reports
• Cost consideration: find a free one or one with an affordable cost
• Choose one that is widely used; this may help you find someone to guide you if you get stuck along the way.
Cleaning and Accounting for Missing Data
• Cleaning the data is the process of inspecting the data for scores (or values) that are outside the accepted range. Examples are a 6 in Likert scale response data, which usually run from 5 to 1, or a participant entering 3 for a categorical scale such as gender, which usually has legitimate values of 1 for female and 2 for male.
• Such errors can be found and corrected by inspecting the data grid or, if the data set is large, by using a frequency distribution.
• Missing data are data that are not found in the database because participants did not supply them.
• To handle this, one should have a good instrument to which participants can easily and eagerly respond.
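
The checks described on this slide could be sketched as follows. The legitimate ranges come from the examples above; the data columns, the use of None to mark a missing response, and the function names are assumptions made for illustration.

```python
from collections import Counter

# Legitimate values taken from the examples above; the data are made up.
VALID_LIKERT = {1, 2, 3, 4, 5}
VALID_GENDER = {1, 2}  # 1 = female, 2 = male

likert_column = [5, 4, 6, 3, None, 1]   # None marks a missing response
gender_column = [1, 2, 3, 2, 1, None]


def frequency_distribution(values):
    """Count how often each value (including None for missing) occurs."""
    return Counter(values)


def out_of_range(values, valid):
    """Return (row index, value) pairs that are present but not legitimate."""
    return [(i, v) for i, v in enumerate(values)
            if v is not None and v not in valid]


def missing_rows(values):
    """Return the row indexes where no response was supplied."""
    return [i for i, v in enumerate(values) if v is None]


print(frequency_distribution(likert_column))
print(out_of_range(likert_column, VALID_LIKERT))  # [(2, 6)]
print(out_of_range(gender_column, VALID_GENDER))  # [(2, 3)]
print(missing_rows(likert_column))                # [4]
```

In a frequency distribution an illegitimate value such as 6 stands out as an unexpected category, which is why this approach scales better than reading the data grid row by row.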
Descriptive statistics

• Descriptive statistics are used to summarize the overall trends or tendencies in the data, provide an understanding of how varied the scores may be, and provide insight into where one score stands in comparison with others. These three ideas are central tendency, variability, and relative standing (Creswell, 2012).
• Descriptive statistics are used to describe the characteristics of the sample or population as a whole (Koul, 2019).
Measures of Central Tendency

• Measures of central tendency are summary numbers that represent a single value in a distribution of scores (Vogt, 2005, cited in Creswell, 2012).
• They are expressed as an average score (the mean), the middle of a set of scores (the median), and the most frequently occurring score (the mode) (Creswell, 2012).

• The mean is the total of the scores divided by the number of scores. To calculate it, we add all of the scores and then divide the sum by the number of scores.
• It is the most popular statistic used to describe the responses of all participants to items on an instrument.

• The median divides the scores, rank-ordered from top to bottom or vice versa, in half.
• Fifty percent of the scores lie above the median and fifty percent lie below it.
• To find it, we arrange all the scores in sequential order and then determine which score lies halfway between all of the scores.
• The usefulness of the median in research is limited, but it is usually reported by researchers (Creswell, 2012).

• The mode is the score that appears most frequently in a list of scores. It is used when researchers want to know the most common score in an array of scores on a variable.
• It is commonly used with categorical variables or data.
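
The three measures of central tendency can be computed with Python's standard statistics module. This is only a sketch on made-up scores; with an even number of scores, the median is taken as the average of the two middle values.

```python
import statistics

scores = [2, 3, 3, 4, 5, 5, 5, 7]  # made-up scores from eight participants

mean_score = statistics.mean(scores)      # sum of scores (34) divided by 8
median_score = statistics.median(scores)  # average of the two middle scores, 4 and 5
mode_score = statistics.mode(scores)      # most frequently occurring score

print(mean_score, median_score, mode_score)  # 4.25 4.5 5
```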
Measures of Variability
• Measures of variability indicate the spread of the scores in a distribution (Creswell, 2012). The range, variance, and standard deviation all indicate the amount of variability in a distribution of scores.
• Variability helps us to see how dispersed the responses to items on an instrument are.

• The range of scores is the difference between the highest and the lowest scores for items on an instrument.

• Variance is the dispersion of scores around the mean. To calculate the variance, we follow these steps:
• Find the difference between the mean and the raw score for each individual
• Square this value for each individual
• Sum these squared scores for all individuals
• Divide by the total number of individuals (Creswell, 2012).
• Variance does not provide precise information, unlike the standard deviation (Creswell, 2012).
Standard Deviation

• The standard deviation is the square root of the variance.
• It provides precise information and is the most commonly used indicator of the dispersion or spread of the scores (Creswell, 2012).
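
The range, the four variance steps, and the standard deviation can be followed in a short sketch on made-up scores. Dividing by the total number of individuals, as the steps state, gives the population variance; statistical packages often divide by N - 1 instead when working with a sample, so their output may differ slightly.

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up scores

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Variance, following the steps listed above.
mean = sum(scores) / len(scores)
deviations = [score - mean for score in scores]  # step 1: difference from the mean
squared = [d ** 2 for d in deviations]           # step 2: square each difference
sum_of_squares = sum(squared)                    # step 3: sum the squared values
variance = sum_of_squares / len(scores)          # step 4: divide by N

# Standard deviation: square root of the variance.
standard_deviation = math.sqrt(variance)

print(score_range, variance, standard_deviation)  # 7 4.0 2.0

# Cross-check against the standard library's population formulas.
assert math.isclose(variance, statistics.pvariance(scores))
assert math.isclose(standard_deviation, statistics.pstdev(scores))
```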
Percentile rank

• A percentile rank is the percentage of participants in the distribution with scores at or below a particular score. It is used to determine where in a distribution of scores an individual’s score lies in comparison with other scores. Standardized exams such as the GRE use percentiles to report candidates' performance. To calculate any required percentile, we follow a procedure similar to that for the median:
• Multiply the required percent by the number of participants (or scores, or frequency); this gives the location of the required score.
• Then arrange the scores in sequential order, top-down or vice versa.
• Identify the score occupying that location, counting from the lowest score. If the location is a decimal value, round it up to locate the required percentile score.
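
Both ideas on this slide, the percentile rank of a given score and the score found at a required percentile, can be sketched as below. The function names and the ten scores are hypothetical; the location is counted from the lowest score and rounded up when it is a decimal, as described above. Other percentile conventions exist and may give slightly different answers.

```python
import math


def percentile_rank(scores, score):
    """Percentage of scores in the distribution at or below the given score."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)


def score_at_percentile(scores, percent):
    """Score located at the required percentile, using the slide's procedure."""
    ordered = sorted(scores)                     # arrange in sequential order
    location = percent * len(ordered) / 100      # percent times number of scores
    location = max(math.ceil(location), 1)       # round a decimal location up
    return ordered[location - 1]                 # locations are counted from 1


# Made-up exam scores for ten candidates.
exam_scores = [55, 60, 62, 70, 71, 75, 80, 85, 90, 95]

print(percentile_rank(exam_scores, 80))      # 70.0
print(score_at_percentile(exam_scores, 75))  # 85
```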
Z Score

• A Z score is a standard score used to compare scores from different scales. It involves transforming a raw score into a score with relative meaning.
• Z scores have a mean of 0 and a standard deviation of 1.
• To calculate it, we subtract the mean score from each individual raw score, then divide the result by the standard deviation.
• So the formula is:
• Z score = (raw score - mean) / sd, where sd is the standard deviation.
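
A minimal sketch of the Z score calculation on made-up scores follows. It assumes the population standard deviation (dividing by N), in line with the variance steps given earlier; the function name is hypothetical.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up raw scores

mean = statistics.mean(scores)   # 5
sd = statistics.pstdev(scores)   # population standard deviation, 2.0


def z_score(raw_score):
    """Subtract the mean from the raw score, then divide by the sd."""
    return (raw_score - mean) / sd


z_scores = [z_score(s) for s in scores]
print(z_score(9))  # 2.0  (two standard deviations above the mean)
print(z_score(2))  # -1.5 (one and a half standard deviations below the mean)

# The transformed scores have a mean of 0 and a standard deviation of 1.
print(round(statistics.mean(z_scores), 6), round(statistics.pstdev(z_scores), 6))
```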
Inferential Statistics
• Inferential statistics allow us to make predictions and generalizations about a population based on sample data. They play a crucial role in hypothesis testing and help to determine the reliability of results.


• t-tests: Used to compare the means of two groups, t-tests determine if the observed differences are statistically significant.
• ANOVA: ANOVA expands on t-tests to compare means among three or more groups, useful for identifying variations within groups.
• Chi-square: Chi-square tests assess relationships between categorical variables, validating if distributions differ from expectations.
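
To make the idea of comparing two group means concrete, here is a rough sketch of an independent-samples t statistic. It assumes the pooled-variance (Student's) form, which presumes the two groups have equal variances; the function name and the data are made up, and the slides do not say which variant the presenters had in mind.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for two independent samples.

    Assumes equal population variances (pooled-variance Student's t-test).
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    # Sample variances (dividing by n - 1), as is usual for inference.
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Made-up scores for two groups of five participants.
group_a = [4, 5, 6, 7, 8]
group_b = [1, 2, 3, 4, 5]
t_value, df = pooled_t_statistic(group_a, group_b)
print(round(t_value, 3), df)  # 3.0 8
```

Whether the difference is statistically significant is then judged by comparing the t value with the critical value for the given degrees of freedom, which a t table or a statistical program provides.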


• Parameters are numerical values representing characteristics of a population, while statistics are numerical values derived from samples. These concepts are foundational for inferential statistics.
• A population encompasses all members of a defined group, while a sample refers to a subset of that population selected for analysis. Understanding the distinction is vital for accurate inference.
The table summarizes key types of inferential statistical tests,
their purposes, and the nature of the data they handle. This
comparison aids in selecting appropriate tests.


Test         Purpose                                                 Data type
t-Test       Compare means of two groups                             Continuous
ANOVA        Analyze variances across several groups                 Continuous
Chi-Square   Evaluate relationships between categorical variables    Categorical
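
For the categorical case in the table, a chi-square statistic for a contingency table can be sketched by hand. The 2 x 2 counts below are invented for illustration and the function name is hypothetical; expected counts are computed under the assumption that the two variables are unrelated.

```python
def chi_square_statistic(observed):
    """Return (chi-square, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_square += (count - expected) ** 2 / expected

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi_square, df


# Made-up counts: rows are two groups, columns are "agree" / "disagree".
observed = [[30, 10],
            [20, 40]]
chi_square, df = chi_square_statistic(observed)
print(round(chi_square, 2), df)  # 16.67 1
```

The larger the statistic relative to its degrees of freedom, the more the observed distribution differs from what would be expected if the variables were unrelated.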

Reporting the statistical findings

• When reporting statistical findings or results, the following tools are often used:
• Tables that summarize statistical information
• Figures (charts, pictures, drawings) that portray variables and their relationships
• Detailed verbal explanations of the statistical results
Guides for using tables to present results
• A table is a summary of quantitative data organized into rows and columns.
• Use one table for each statistical test.
• Present the data in rows and columns with simple and clear headings.
• Indicate the level of the statistical findings, results, or values for descriptive and inferential statistics in the table.
• Report notes that qualify, explain, or provide additional information about the tables, which can be helpful to readers.

• A figure is a summary of quantitative information presented as a chart, graph, or picture that shows relations among scores or variables (Creswell, 2012).
• Tables are preferred to figures (APA, 2010) because they convey more information in a simple form.
Types of figure

• Bar charts: depict trends and distributions of data
• Scatterplots: show the comparison of two different scores and how the scores regress or differ from the mean; they are useful for identifying outliers
• Line graphs: display the interaction between two variables in an experiment
• Charts: portray the complex relationships among variables in correlational research
Interpreting the results

• Interpreting the results involves a detailed summary of the major findings and a presentation of the broader implications of the research for distinct audiences.
• A summary is a statement that reviews the major conclusions for each of the research questions or hypotheses.
• Implications are suggestions about the importance of the study for different audiences.



APPLICATIONS
• Inferential statistics are applied in fields like healthcare for clinical trials, marketing research for consumer behavior, and social sciences for survey analysis. These applications guide informed decisions.


• 40%: Understanding population parameters is crucial for accurate inference.
• 20%: Basic statistical tests provide insights into data relationships.
• 10%: Real-world applications enhance the relevance of statistical findings.
• 30%: Choosing a representative sample is vital for reliable inferences.

