Quantitative Data Analysis

Presented by:
Sheu-Tijani Aminu Opeyemi & Dona Kuswoyo
Overview
 Data preparation and organisation: data scoring, the codebook, and types of score
 Prerequisites for statistical programming tools
 Procedure for inputting data into the system
 Descriptive statistics: mean, median, mode (measures of centre); variance and standard deviation (measures of dispersion); Z score and percentile rank
 Inferential statistics
 Reporting findings or results
 Conclusion
Meaning of data preparation and
organization
 Data preparation covers scoring the data and creating a codebook, determining the types of score to use, selecting a computer program, inputting the data into the statistical programming tool for analysis, and cleaning the data.
Scoring data
 Data scoring is the act of assigning a numerical value or score to each response category for each question on the checklist or instrument used to collect data.
 For instance, a survey question such as “School feeding should be retained and maintained to improve students’ learning participation” requires a participant to choose from a set of Likert scale options:
 Strongly Agree
 Agree
 Undecided
 Disagree
 Strongly Disagree
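The scoring step can be sketched in a few lines of Python. This is only an illustration: the name `LIKERT_SCORES`, the helper `score_response`, and the 5-to-1 numbering are assumptions, and some instruments reverse the direction.

```python
# Hypothetical scoring key for the five options above.
# The 5-to-1 direction is an assumption; some instruments reverse it.
LIKERT_SCORES = {
    "strongly agree": 5,
    "agree": 4,
    "undecided": 3,
    "disagree": 2,
    "strongly disagree": 1,
}


def score_response(response):
    """Return the numeric score for one response, or None if it is blank or not a listed option."""
    if response is None:
        return None
    return LIKERT_SCORES.get(response.strip().lower())


# Made-up responses from four participants; the last one left the item blank.
responses = ["Agree", "Strongly Disagree", "undecided", ""]
scores = [score_response(r) for r in responses]
print(scores)  # [4, 1, 3, None]
```

Returning `None` for a blank or unrecognised answer keeps missing data visible instead of silently turning it into a number.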
Types of score
 Single-item score: an individual score assigned to each question for each participant in the research.
 Sum score: the sum of an individual's scores over several questions that measure the same variable.
 Net or difference score: a score in a quantitative study that represents a difference or change for each individual. It is commonly used in experimental research.
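The three types of score can be illustrated with a small Python sketch. All numbers and names (`item_scores`, `pre_test`, `post_test`) are made up for the example.

```python
# Hypothetical scores for one participant on three items that measure the same variable.
item_scores = {"q1": 4, "q2": 5, "q3": 3}

# Single-item score: the score on one question.
single_item = item_scores["q1"]

# Sum score: the total over all items measuring the same variable.
sum_score = sum(item_scores.values())

# Net (difference) score: change between two measurements of the same individual,
# here a made-up pre-test and post-test total.
pre_test, post_test = 12, 17
net_score = post_test - pre_test

print(single_item, sum_score, net_score)  # 4 12 5
```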
What is a codebook?
 A codebook is a list of the various questions that indicates how a researcher intends to code or score responses from instruments.
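One possible way to hold a codebook in a program is a nested dictionary, sketched below. The structure and the names `CODEBOOK` and `label_for` are assumptions made for illustration; the gender codes (1 for female, 2 for male) and the school feeding item come from these slides, while the question wording for gender is invented.

```python
# Hypothetical codebook: one entry per variable, with the question and its codes.
CODEBOOK = {
    "gender": {
        "question": "What is your gender?",  # invented wording
        "codes": {1: "Female", 2: "Male"},
    },
    "school_feeding": {
        "question": "School feeding should be retained and maintained "
                    "to improve students' learning participation.",
        "codes": {
            5: "Strongly Agree",
            4: "Agree",
            3: "Undecided",
            2: "Disagree",
            1: "Strongly Disagree",
        },
    },
}


def label_for(variable, code):
    """Look up the response label for a numeric code, or None if the code is not in the codebook."""
    return CODEBOOK[variable]["codes"].get(code)


print(label_for("gender", 1))          # Female
print(label_for("school_feeding", 5))  # Strongly Agree
```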
Inputting data
Inputting data means transferring the data from the responses on instruments to a computer file for analysis.
Which statistical tools do I need?
The following guidelines can help in choosing an appropriate statistical tool for data analysis:
 Choose a program that comes with details on how to use it, such as a tutorial guide for learning the key features and practising them on sample data sets
 Flexibility: choose a program that is easy to use
 Inclusivity: choose a program that includes many statistical packages and has the tools you need for your analysis
 Capacity: choose one that can hold and run large data sets
 Choose one that can display output as graphs and tables for reports
 Cost consideration: find a free one or one with an affordable cost
 Choose one that is widely used (this may help you find someone to guide you if you get stuck along the way).
Cleaning and Accounting for Missing
Data
 Cleaning the data refers to the process of inspecting the data for scores (or values) that are outside the accepted range. Examples include a 6 in Likert scale response data that usually runs from 5 to 1, or a participant entering 3 for a categorical scale such as gender, which usually has legitimate values of 1 for female and 2 for male.
 Such errors can be found by inspecting the data grid or, if we have large data, by using a frequency distribution.
 Missing data are data that are not found in the database because participants did not supply them.
 To handle this, one should have a good instrument to which participants can easily and eagerly respond.
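A frequency distribution and a range check can be sketched as follows. The records, the `LEGIT_VALUES` table, and the helper names are hypothetical; the legitimate ranges follow the examples given on this slide.

```python
from collections import Counter

# Accepted values per variable, taken from the examples on this slide.
LEGIT_VALUES = {
    "gender": {1, 2},
    "school_feeding": {1, 2, 3, 4, 5},
}

# Made-up data grid: one dictionary per participant.
records = [
    {"gender": 1, "school_feeding": 5},
    {"gender": 3, "school_feeding": 4},     # 3 is not a legitimate gender code
    {"gender": 2, "school_feeding": 6},     # 6 is outside the 1-5 scale
    {"gender": 2, "school_feeding": None},  # missing response
]


def frequency(records, variable):
    """Count how often each value of a variable occurs, including missing (None)."""
    return Counter(r.get(variable) for r in records)


def find_problems(records, variable):
    """Return (row index, value) pairs that are missing or outside the accepted range."""
    allowed = LEGIT_VALUES[variable]
    return [(i, r.get(variable)) for i, r in enumerate(records)
            if r.get(variable) not in allowed]


print(frequency(records, "gender"))
print(find_problems(records, "gender"))          # [(1, 3)]
print(find_problems(records, "school_feeding"))  # [(2, 6), (3, None)]
```

The frequency table is the quicker check for large data, because an illegitimate value such as 3 for gender shows up as an extra category.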
Descriptive statistics
 Descriptive statistics are used to summarize the overall trends or tendencies in the data, provide an understanding of how varied the scores may be, and provide insight into where one score stands in comparison with others. These three ideas are central tendency, variability, and relative standing (Creswell, 2012).
 Descriptive statistics are used to describe the characteristics of the sample or population in totality (Koul, 2019).
Measure of Central Tendency
 Measures of central tendency are summary numbers that represent a single value in a distribution of scores (Vogt, 2005, cited in Creswell, 2012).
 They are expressed as an average score (the mean), the middle of a set of scores (the median), and the most frequently occurring score (the mode) (Creswell, 2012).
Average/Mean
The mean is the total of the scores divided by the number of scores. To calculate it, we add all of the scores and then divide the sum by the number of scores.
It is the most popular statistic used to describe the responses of all participants to items on an instrument.
Median
 The median score divides the scores, rank-ordered from top to bottom or vice versa, in half.
 Fifty percent of the scores lie above the median and fifty percent lie below it.
 To find it, we arrange all the scores in sequential order and then determine which score lies halfway between all of the scores.
 The usefulness of the median in research is limited, but researchers usually report it (Creswell, 2012).
Mode
 The mode is the score that appears most frequently in a list of scores. It is used when researchers want to know the most common score in an array of scores on a variable.
 It is commonly used with categorical variables or data.
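The three measures of central tendency can be computed with Python's standard `statistics` module, as in this sketch. The scores are made-up data chosen so that the three measures differ.

```python
import statistics

# Made-up scores of nine participants on one item.
scores = [1, 2, 2, 3, 4, 5, 5, 5, 9]

# Mean: add all of the scores, then divide by the number of scores.
mean_by_hand = sum(scores) / len(scores)

mean = statistics.mean(scores)      # same value as mean_by_hand
median = statistics.median(scores)  # middle score of the ordered list
mode = statistics.mode(scores)      # most frequently occurring score

print(mean_by_hand, median, mode)  # 4.0 4 5
```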
Measure of Variability
 Measures of variability describe the spread of the scores in a distribution (Creswell, 2012). The range, variance, and standard deviation all indicate the amount of variability in a distribution of scores.
 Variability helps us to see how dispersed the responses are to items on an instrument.
Range
 The range of scores is the difference between the highest and the lowest scores for items on an instrument.
Variance
 Variance is the dispersion of scores around the mean. To calculate the variance, we follow these steps:
 Find the difference between the mean and the raw score for each individual
 Square this value for each individual
 Sum these squared scores for all individuals
 Divide by the total number of individuals (Creswell, 2012).
 Unlike the standard deviation, the variance on its own does not give precise information about the spread (Creswell, 2012).
Standard Deviation
The standard deviation is the square root of the variance. It provides precise information and is the most commonly used indicator of the dispersion or spread of the scores (Creswell, 2012).
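The range, variance, and standard deviation can be worked through step by step in Python. The sketch below uses made-up scores and follows the steps listed under Variance, which divide by the total number of individuals (the population formula). Sample formulas that divide by N - 1 are also common and would give slightly larger values.

```python
import math
import statistics

# Made-up scores of eight participants.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Variance, following the four steps:
mean = sum(scores) / len(scores)
differences = [s - mean for s in scores]   # 1. difference from the mean
squared = [d ** 2 for d in differences]    # 2. square each difference
total = sum(squared)                       # 3. sum the squared differences
variance = total / len(scores)             # 4. divide by the number of individuals

# Standard deviation: square root of the variance.
sd = math.sqrt(variance)

print(score_range, variance, sd)  # 7 4.0 2.0

# The standard library gives the same population values.
assert abs(variance - statistics.pvariance(scores)) < 1e-12
assert abs(sd - statistics.pstdev(scores)) < 1e-12
```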
Percentile rank
 A percentile rank is the percentage of participants in the distribution with scores at or below a particular score. It is used to determine where in a distribution of scores an individual’s score lies in comparison with other scores. Standardized exams such as the GRE use percentile ranks to report candidates’ performance. To find the score at any required percentile, we follow a procedure similar to that for the median:
 Multiply the required percent by the number of participants (or scores, or the total frequency); this gives the location of the required score.
 Arrange the scores in sequential order, top-down or vice versa.
 Identify the score occupying that location. If the location is a decimal, round it up to locate the required percentile score.
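Both directions can be sketched in Python: the percentile rank of a given score, and the score at a required percentile using the round-up procedure above. The function names and scores are made up, and other percentile conventions (for example, ones that interpolate between scores) can give slightly different answers.

```python
import math


def percentile_rank(scores, value):
    """Percentage of scores at or below the given value."""
    at_or_below = sum(1 for s in scores if s <= value)
    return 100 * at_or_below / len(scores)


def score_at_percentile(scores, percent):
    """Score at the required percentile: location = percent x N, rounded up."""
    ordered = sorted(scores)
    location = math.ceil(percent * len(ordered) / 100)  # 1-based position
    location = max(location, 1)                          # guard for percent = 0
    return ordered[location - 1]


# Made-up exam scores of ten candidates.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

print(percentile_rank(scores, 80))      # 60.0
print(score_at_percentile(scores, 25))  # 65  (location 2.5 rounds up to 3)
print(score_at_percentile(scores, 90))  # 95  (location 9)
```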
Z Score
 A Z score is a standard score used to compare scores from different scales. It involves transforming a raw score into a score with relative meaning.
 Z scores have a mean of 0 and a standard deviation of 1.
 To calculate it, we subtract the mean score from each individual score, then divide the result by the standard deviation.
 So the formula is:
 Z score = (raw score - mean) / sd, where sd is the standard deviation.
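The Z score calculation can be sketched in Python as below. The scores are made up, and the use of the population standard deviation (`statistics.pstdev`) is an assumption chosen to match a variance that divides by the total number of individuals.

```python
import statistics


def z_scores(scores):
    """Convert raw scores to Z scores: (raw score - mean) / standard deviation."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # population SD; fails if all scores are equal (sd = 0)
    return [(s - mean) / sd for s in scores]


# Made-up raw scores with mean 5 and standard deviation 2.
raw = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_scores(raw)
print(z)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

A raw score of 9 becomes a Z score of 2.0, meaning it lies two standard deviations above the mean, which is what makes scores from different scales comparable.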
Inferential Statistics
 Inferential statistics allow us to make predictions and generalizations about a population based on sample data. They play a crucial role in hypothesis testing and help to determine the reliability of results.
TYPES OF TESTS
• T-TESTS
• Used to compare the means of two groups, t-tests determine if the observed differences are statistically significant.
• ANOVA (ANALYSIS OF VARIANCE)
• ANOVA expands on t-tests to compare means among three or more groups by weighing the variation between groups against the variation within groups.
• CHI-SQUARE TESTS
• Chi-square tests assess relationships between categorical variables, validating if distributions differ from expectations.
KEY CONCEPTS
• PARAMETERS AND STATISTICS
• Parameters are numerical values representing characteristics of a population, while statistics are numerical values derived from samples. These concepts are foundational for inferential analysis.
• POPULATION VS. SAMPLE
• A population encompasses all members of a defined group, while a sample refers to a subset of that population selected for analysis. Understanding the distinction is vital for accurate inference.
SUMMARY OF TESTS
The table summarizes key types of inferential statistical tests,
their purposes, and the nature of the data they handle. This
comparison aids in selecting appropriate tests.
TEST TYPE     PURPOSE                                                 DATA TYPE
t-Test        Compare means of two groups                             Continuous
ANOVA         Analyze variances across several groups                 Continuous
Chi-Square    Evaluate relationships between categorical variables    Categorical
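To make two of these tests concrete, the sketch below computes their test statistics by hand with the Python standard library. It is an illustration under assumptions: the function names and data are made up, the t statistic uses the pooled-variance (equal variances) form, and only the statistic is computed. Judging significance also needs the matching t or chi-square distribution, which a statistical package would normally supply. ANOVA is not covered here.

```python
import math
import statistics


def t_statistic(group_a, group_b):
    """Pooled-variance t statistic for the difference between two independent group means."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (divides by n - 1)
    var_b = statistics.variance(group_b)
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Made-up continuous scores for two groups.
t = t_statistic([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])

# Made-up counts: rows are two groups, columns are two response categories.
chi2 = chi_square_statistic([[10, 20], [20, 10]])

print(round(t, 3), round(chi2, 3))  # -2.0 6.667
```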
Reporting the statistical findings
 When reporting statistical findings or results, the following tools are often used:
 Tables that summarize statistical information
 Figures (charts, pictures, drawings) that portray variables and their
relationships
 Detailed explanations about the statistical results (verbal)
Guides for using tables to present
results
 A table is a summary of quantitative data organized into rows and columns.
 Use one table for each statistical test.
 Present the data in rows and columns with simple and clear headings.
 Indicate the statistical values or results, both descriptive and inferential, in the table.
 Report notes that qualify, explain, or provide additional information about the tables, which can be helpful to readers.
Figures
 A figure is a summary of quantitative information presented as a chart, graph, or picture that shows relations among scores or variables (Creswell, 2012).
 Tables are preferred to figures (APA, 2010) because they convey more information in a simple form.
Types of figure
 Bar charts depict trends and distributions of data.
 Scatterplots show a comparison of two different scores and how the scores regress or differ from the mean. They are useful for identifying outliers.
 Line graphs display the interaction between two variables in an experiment.
 Charts portray the complex relationships among variables in correlational research.
Interpreting the results
 Interpreting the results involves a detailed summary of the major findings and a presentation of the broader implications of the research for distinct audiences.
 A summary is a statement that reviews the major conclusions for each of the research questions or hypotheses.
 Implications are suggestions about the importance of the study for different audiences.
DEFINITION & PURPOSE
• UNDERSTANDING INFERENTIAL STATISTICS
APPLICATIONS
• REAL-WORLD USES
• Inferential statistics are applied in fields like healthcare for clinical trials, marketing research for consumer behavior, and social sciences for survey analysis. These applications guide informed decisions and policies.
CONCLUSION & INSIGHTS
• POPULATION INSIGHTS (40%)
• Understanding population parameters is crucial for accurate inferential statistics.
• TESTING IMPORTANCE (20%)
• Basic statistical tests provide insights into data relationships.
• APPLICATION IMPACT (10%)
• Real-world applications enhance the relevance of statistical findings.
• SAMPLE SELECTION (30%)
• Choosing a representative sample is vital for reliable inferences.
