
Psyc 103 (Stats)



Going by the Numbers:

Statistics in Psychology
Experimental vs. Non-Experimental

• Non-Experimental:
• Correlation: allows us to determine association
• Cannot determine causality!

• Experimental:
• Variables are manipulated so that we can determine
the effect of one variable on another (e.g. drug trials)
Experimental Terminology

• Experimental Group:
• The group of people who are given a treatment (a
variable is manipulated for this sample)
• Control Group:
• The group in an experiment who are not given a treatment
Experimental Terminology

• Independent Variable:
• The variable that is manipulated for the experimental group
• Dependent Variable:
• The variable that is being measured (the variable that
is affected by changes in the other)
More Basic Concepts

• Variable: A condition or characteristic that can have different values
• Value: A possible number or category that a variable can have
• Score: A particular person’s value on a variable

Some Basic Terminology

Term | Definition | Examples
Variable | Condition or characteristic that can have different values | Stress level, age, gender, ...
Value | Number or category | 0, 1, 2, 3, 4, 25, 85, female, ...
Score | A particular person’s value on a variable | 0, 1, 2, 3, 4, 25, 85, female, ...
Module 56:
Descriptive Statistics

Statistics: branch of mathematics that focuses

on organization, analysis, and interpretation of
a group of numbers (i.e., data)
Method of distinguishing patterns from random chance
E.g., lottery, coin flip, etc.
i.e., Is the difference between groups an effect or just random chance?
What does Statistics help us to do?

• Allows us to determine the relationship between

variables with some degree of certainty
• Variable: a condition or characteristic that can have
different values; something of interest

• Research begins as a question about the relationship

between two variables
Alcohol consumption → car crashes
The Two Branches of Statistical Methods

• Descriptive statistics
• Summarize/organize a group of numbers from a research study

• Inferential statistics
• Draw conclusions/make inferences that go beyond the numbers from a research study
• Use sample to make general conclusions
More on Inferential Statistics

• Goal is to draw conclusions about a population of

interest, by collecting data on a sample
• Population: Entire group of individuals of interest
• University Students
• Turkish people
• Men who brush their teeth with their non-dominant hand
• Sample: the particular participants selected to be
studied from the population

External Validity: how well the sample represents

the population
Ideal Research Design

1. Participants in the experimental and control groups

are identical
2. Both groups are exposed to identical situations
(except for manipulation of the independent variable)
• If not, there may be a confounding variable
• Coke vs Pepsi
3. Sample studied represents the intended population
4. Measurement of dependent variable is accurate and
appropriate for what it’s supposed to be measuring
Types of Measurement

• Numeric (quantitative) variables:

1. Equal-interval variables
• e.g., GPA, age, class size
• Discrete
• Continuous
2. Ratio Scale → absolute zero
3. Ordinal (rank order) variables
• e.g., position finished in a race
Types of Measurement

• Nominal (qualitative) variables

4. Categorical
• e.g., gender, religion
Types of Measurement
Type | Definition | Example
Equal-interval | Numeric variable in which differences between values correspond to differences in the underlying thing being measured | Stress level, age
Ordinal | Numeric variable in which values correspond to the relative position of things measured | Class standing, position finished in a race
Nominal | Variable in which the values are categories | Gender, religion
Frequency Distributions

• After collecting data, first task:

• Organize and simplify the data so that it is possible
to get a general overview of the results

• This is the goal of descriptive “statistics”

• One method for simplifying and organizing data:

• Frequency distribution
Organizing Data: Frequency Tables

• Provide a listing of the number of individuals having each of the different values for a particular variable
• e.g., stress ratings of 151 students:
Organizing Data: Frequency Tables

• Frequency: Number of scores with a particular value


• Frequency Distribution: The pattern of frequencies

over different values
Organizing Data: Steps for Making a Frequency Table
1. Make a list down the page of each possible
value, from highest to lowest
2. Go one by one through the scores, making a
mark for each next to its value on the list
3. Make a table showing how many times each
value on your list is used
4. Figure the percentage of scores for each value
Organizing Data: A Frequency Table

Rating Frequency Percent
10 14 9.3
9 15 9.9
8 26 17.2
7 31 20.5
6 13 8.6
5 18 11.9
4 16 10.6
3 12 7.9
2 3 2.0
1 1 0.7
0 2 1.3
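The four steps for making a frequency table can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `frequency_table` is hypothetical, and the raw `ratings` list is rebuilt from the frequencies in the table above because the 151 individual scores are not shown.

```python
from collections import Counter

def frequency_table(scores):
    """Return (value, frequency, percent) rows, highest value first."""
    counts = Counter(scores)          # step 2: tally each score
    n = len(scores)
    return [(value, counts[value], round(100 * counts[value] / n, 1))
            for value in sorted(counts, reverse=True)]  # steps 1, 3 and 4

# Assumption: raw scores reconstructed from the frequencies in the table above.
table_frequencies = {10: 14, 9: 15, 8: 26, 7: 31, 6: 13, 5: 18,
                     4: 16, 3: 12, 2: 3, 1: 1, 0: 2}
ratings = [value for value, f in table_frequencies.items() for _ in range(f)]

rows = frequency_table(ratings)
print("Rating Frequency Percent")
for value, freq, pct in rows:
    print(f"{value:>6} {freq:>9} {pct:>7}")
```

Note that this sketch lists only values that actually occur; step 1 asks for every possible value, so a value with zero scores would need to be added by hand.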
Steps for Making a

1. Make frequency table

2. Put the values along the bottom of the page, from
left to right, from lowest to highest
3. Make a scale of frequencies along the left edge of
the page that goes from 0 at the bottom to the
highest frequency for any value
4. Make a bar above each value with a height for the
frequency of that value
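As a rough sketch of these steps, the snippet below draws a text version of the graph from a frequency table. The function name `text_histogram` is hypothetical, and the bars run sideways (one row per value) rather than upward, which is an assumption made only to keep the example short.

```python
def text_histogram(frequencies):
    """Return one line per value, lowest to highest, with a bar of # marks."""
    lines = []
    for value in sorted(frequencies):
        bar = "#" * frequencies[value]   # bar length equals the frequency
        lines.append(f"{value:>2} | {bar}")
    return lines

# Frequencies taken from the stress-rating frequency table.
stress = {10: 14, 9: 15, 8: 26, 7: 31, 6: 13, 5: 18,
          4: 16, 3: 12, 2: 3, 1: 1, 0: 2}
for line in text_histogram(stress):
    print(line)
```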
Organizing Data: Frequency Graphs

Shapes of Frequency Distributions
• Unimodal, Bimodal, and
Shapes of Frequency Distributions

• Symmetrical and Skewed Distributions


• Skewed to the left = Negatively skewed; tail (side with fewer scores) to the left
• Ceiling effect
• Skewed to the right = Positively skewed; tail (side with fewer scores) to the right
• Floor effect
Normal Distributions

Normal Curve: normal distributions often approximate a

bell-shaped curve that is unimodal and symmetrical

• We talked about how to begin to make sense of

a group of scores
• Frequency Tables and Graphs

• What are the main statistical techniques for

describing a group of scores with numbers?
How can we do this?

1. Describe group of scores in terms of a

typical/average/most representative/etc. value

2. Describe how spread out the numbers are in

a group of scores
Central Tendency

• Measures of Central Tendency are those which

tell us the typical value in a group of scores
• e.g. What is the typical height of a Bilkent student?
• Mean
• Median
• Mode
Central Tendency: Benefits
Capture a great deal of
information in a single score

Example 2:

45°F = Average high temperature in December (Ankara, Turkey)

83°F = Average high temperature in December (Kona, HI)
Central Tendency: Mean
• Sum of all the scores divided by the number of scores

M = ΣX / N

• Σ = the sum of
• X = each individual score
• ΣX = the sum of all scores
• N = # of scores
Central Tendency: Mean - Example

• Mean # of dreams per week: 7,8,8,7,3,1,6,9,3,8

• ΣX = 7+8+8+7+3+1+6+9+3+8 = 60
• N = 10
• Mean = 60/10 = 6
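The same calculation can be written as a short Python sketch. The function name `mean` and the variable `dreams` are illustrative names, not part of the slides.

```python
def mean(scores):
    """M = (sum of all scores) / (number of scores)."""
    return sum(scores) / len(scores)

dreams = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]   # dreams per week, from the example
print(sum(dreams), len(dreams), mean(dreams))  # 60 10 6.0
```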
Central Tendency: Mode

• Most common single number in a distribution

• Bimodal distribution
• Mode of 7,8,8,7,3,1,6,9,3,8 = 8
• What type of variable makes the mode a good measure of central tendency?
• Nominal variables
• Why?
Mode: Common Error
• The most frequent score
• Single score that is most common

X f
4 4
3 10
2 14
1 6
0 6

Common error: mistakenly reporting the frequency of the most common score (14), rather than the score itself (2)
Central Tendency: Mode

The mode as the high point in a distribution’s histogram,

using the # of dreams/week example
The Mode

n = 50
Cat = 17
Dog = 28
Fish = 3
Other = 2

What is the mode for this distribution?

The answer isn’t a number
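A small Python sketch can make the point about the mode concrete, including the common error and the nominal case. The helper name `modes` and the variable names are hypothetical; the data come from the dreams example, the X/f table, and the pet counts on the slides.

```python
from collections import Counter

def modes(scores):
    """Return every value that shares the highest frequency."""
    counts = Counter(scores)
    highest = max(counts.values())
    return sorted(value for value, f in counts.items() if f == highest)

dreams = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]
print(modes(dreams))    # [8]

# X/f table from the "common error" slide: the mode is the score, not its frequency.
error_table = {4: 4, 3: 10, 2: 14, 1: 6, 0: 6}
scores = [x for x, f in error_table.items() for _ in range(f)]
print(modes(scores))    # [2]  (14 is only how often 2 occurs)

# Nominal data: the mode is a category, not a number.
pets = ["Cat"] * 17 + ["Dog"] * 28 + ["Fish"] * 3 + ["Other"] * 2
print(modes(pets))      # ['Dog']
```

Returning a list rather than a single value is a design choice here so that bimodal distributions are reported with both modes.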
Central Tendency: Median

The middle score when all scores are arranged from

lowest to highest
Median of 1, 8, 3, 4, 7
→ 1, 3, 4, 7, 8
→ Median = 4
Median of 7,8,8,7,3,1,6,9,3,8
1 3 3 6 7 7 8 8 8 9

Median is the average (mean) of the 5th and 6th
scores, so the median is 7
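Both median examples can be checked with a short Python sketch. The function name `median` is illustrative; Python's standard `statistics.median` follows the same rule.

```python
def median(scores):
    """Middle score of the sorted list; mean of the two middle scores if N is even."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([1, 8, 3, 4, 7]))                  # 4
print(median([7, 8, 8, 7, 3, 1, 6, 9, 3, 8]))   # 7.0
```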
The Median

• Midpoint of the distribution, not the midpoint of the scale (e.g. 0-100)
• Divides the set of scores into two equal-sized groups
• Not halfway between the lowest and highest
When To Use Each Measure

• Mean is most commonly used in research

• Vulnerable to distortion when sample size is small or
has extreme values
• Mode is used for nominal variables
• Not that vulnerable to extreme scores
• Median is preferable to mean when extreme scores
are in data set
• All three converge for large samples that follow a symmetrical, unimodal distribution
• e.g., the normal distribution
Consider the Following…

1st group of #s
1, 2, 3
2nd group of #s
-3, 0, 9
3rd group of #s
-3, -2, 0, 1, 4, 4, 9

Rank order the 3 measures of central tendency (CT) for a distribution of scores involving a floor effect
• Mode: Most frequent score
• Always corresponds to an actual score
• Can have multiple modes
• Nominal data
• Median: Middle score
• Not influenced by extreme scores
• Ordinal data
• Mean: Average score
• Represents every score in the distribution
• Distorted by extreme scores
• CT + variability = foundation for inferential statistics
Variability or Dispersion

• In addition to being able to describe the typical

score, we also want to be able to describe how
much the scores differ or vary from each other

• How much do the scores in our sample vary from the measure of central tendency on average?

• What’s our next step?


• Range is the simplest measure of variability

• Difference between highest and lowest score
• Age at Bilkent
• Based entirely on extreme scores
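A brief Python sketch shows why the range depends entirely on the extreme scores. The function name `score_range` is hypothetical, the data reuse the dreams-per-week example, and the added score of 30 is an invented outlier for illustration only.

```python
def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)

dreams = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]
print(score_range(dreams))          # 8

# Assumption: one made-up extreme score is added to show its effect.
print(score_range(dreams + [30]))   # 29
```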
Standard Deviation
• SD: average deviation of a set of scores from
the center of the distribution
Sum of Deviation Scores
• Deviation Score is the extent to which each individual’s score deviates
from the mean

We want to know the average or the standard deviation score

Σ(X − M) ?
Average Deviation Scores?

Σ(X − M) / N = 0 / N = 0
• The sum of all deviation scores will ALWAYS be 0!!!
• We need to square each deviation score and obtain
Sum of Squares (SS)
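A short derivation shows why the deviation scores always sum to zero. It uses only the definition of the mean, M = ΣX / N, with the same symbols as the slides.

```latex
\begin{aligned}
\sum (X - M) &= \sum X - \sum M \\
             &= \sum X - N M \\
             &= \sum X - N \cdot \frac{\sum X}{N} \\
             &= \sum X - \sum X = 0
\end{aligned}
```

For the dreams-per-week scores (M = 6), the deviations are 1, 2, 2, 1, −3, −5, 0, 3, −3, 2, which add up to 0. Squaring removes the signs, so the squared deviations no longer cancel.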
Sum of Squared Deviation Scores

• Sum of all squared deviations from the mean:

Σ(X − M)² = SS
Variance: Average Sum of Squared
Deviation Scores

• Formula for the Variance

SD² = Σ(X − M)² / N = SS / N

Variance: Measure of Spread

• The average of each score’s squared

difference from the mean
• Steps for computing the variance:
1. Subtract the mean from each score
2. Square each of these deviation scores
3. Add up the squared deviation scores
4. Divide the sum of squared deviation scores by
the number of scores
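The four steps can be followed line by line in a Python sketch. The function name `variance` is illustrative, and the data reuse the dreams-per-week example. The sketch divides by N, as the steps above do; Python's `statistics.pvariance` also divides by N, whereas `statistics.variance` divides by N − 1.

```python
def variance(scores):
    """Variance as the average squared deviation from the mean (divides by N)."""
    m = sum(scores) / len(scores)
    deviations = [x - m for x in scores]      # step 1: subtract the mean
    squared = [d ** 2 for d in deviations]    # step 2: square each deviation
    ss = sum(squared)                         # step 3: sum of squares (SS)
    return ss / len(scores)                   # step 4: divide by N

dreams = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]
print(variance(dreams))   # SS = 66, so the variance is 66 / 10 = 6.6
```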
Standard Deviation: Measure of Spread

• Most common way of describing the

spread of a group of scores
• Approximately the average amount that
scores deviate from the mean
• Steps for computing the standard deviation:
1. Figure the variance
2. Take the square root
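These two steps translate directly into Python. This is a sketch under the same assumption as the slides (dividing by N); the function name `standard_deviation` is hypothetical, and the data are the dreams-per-week scores.

```python
import math

def standard_deviation(scores):
    """Square root of the variance, where the variance divides by N."""
    m = sum(scores) / len(scores)
    var = sum((x - m) ** 2 for x in scores) / len(scores)  # step 1: variance
    return math.sqrt(var)                                  # step 2: square root

dreams = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]
sd = standard_deviation(dreams)
print(round(sd, 2))   # 2.57
```

So a typical score in this example lies roughly 2.6 dreams away from the mean of 6.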
Measures of Spread
The Standard Deviation
• Formula for the standard deviation:

SD = √SD² = √( Σ(X − M)² / N )


• Correlation: the association (or co-relation) between scores on two or more variables

• For example:
• Optimistic people are healthier
• Babies look longer at more beautiful women
• More attractive people earn more money
• Students with higher attendance get higher grades
Scatter Diagrams (Scatterplots)

• Draw the axes and decide which variable goes on which axis
• The predictor or “causal” variable goes on the x-axis

• Mark the values on each axis

• Choose the range of values that covers all of the possible scores

• Mark a dot for each person’s pair of scores

Conservative/Liberal   Government Services
2 5
5 6
4 7
6 4
5 4
3 2
6 7
4 5
5 4
6 5
4 5
5 6
4 3
5 2
3 2
6 4
7 6
5 6
6 6
6 6
4 6
4 4
4 5
[Figure: scatterplot of the data above, with the Conservative/Liberal Scale (0 to 8) on the x-axis and Government Services (0 to 8) on the y-axis]
Conservative/Liberal   Social Responsibility
4 5
5 6
4 7
6 4
5 4
3 2
6 7
4 5
5 4
6 5
2 5
5 6
4 3
5 2
3 2
6 4
7 6
5 6
6 6
6 6
4 6
4 4
4 5

[Figure: scatterplot with the Conservative/Liberal Scale (0 to 8) on the x-axis; the y-axis (0 to 8) is labeled Government Services]
Conservative/Liberal   Standard of Living
4 5
5 3
4 3
6 4
5 4
3 3
6 2
4 5
5 4
6 3
2 4
5 5
4 3
5 4
3 4
6 3
7 2
5 4
6 1
6 1
4 3
4 5
4 5
[Figure: scatterplot of the data above, with the Conservative/Liberal Scale (0 to 8) on the x-axis and Standard of Living (0 to 8) on the y-axis]
Conservative/Liberal   Gun Control
4 7
5 3
4 6
6 6
5 4
3 6
6 7
4 7
5 5
6 5
2 4
5 6
4 4
5 4
3 5
6 6
7 4
5 5
6 6
6 6
4 4
4 6
4 6

[Figure: scatterplot with the Conservative/Liberal Scale (0 to 8) on the x-axis and Gun Control (0 to 8) on the y-axis]
Bivariate Correlation

• Correlation Coefficient: a statistic that indicates the degree to which two variables are related to one another

How is bivariate correlation measured?
• Pearson Correlation Coefficient (r): −1.00 ≤ r ≤ +1.00
1. Valence (sign, direction)
Tells us the direction of the relationship.
(+): When V1 increases, V2 increases OR When V1 decreases, V2 decreases
(-): When V1 increases, V2 decreases OR When V1 decreases, V2 increases

2. Magnitude (size, strength): The numerical value ignoring the sign, written |r|; it indicates the strength of the relationship
Cohen (1988, 1992):
Large if |r| > .50, Moderate if .50 > |r| > .30, Small if |r| < .30
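The size guidelines above can be sketched as a small Python function. The name `cohen_label` is hypothetical. The slide does not say which label applies when the magnitude is exactly .30 or .50, so the handling of those two boundary values here is an assumption.

```python
def cohen_label(r):
    """Label the strength of a correlation by its magnitude, ignoring the sign."""
    size = abs(r)
    if size > 0.50:
        return "large"
    if size > 0.30:       # assumption: exactly .50 is treated as moderate
        return "moderate"
    return "small"        # assumption: exactly .30 is treated as small

for r in (0.53, -0.46, -0.27):
    print(r, cohen_label(r))
```

Because the sign is dropped before comparing, a negative correlation is labeled by the same rule as a positive one.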

Strong Positive Association
r = .53

• There is a positive
association between
reading and writing
scores of students.
• As students' writing
scores increase,
their reading scores
also increase.

Strong Negative Association

• There is a negative
association between
age and weekly internet
usage in hours
• Younger people use
the internet more each week,
whereas older people
use it less

Zero Association or Zero Correlation
• There is zero association between the grades students get on their Psyc 103 assignments and the number of friends they have

[Figure: scatterplot of grades on Psyc 103 assignment against number of close friends]

Correlation Coefficients
• r Pearson’s correlation coefficient
• Direction of the correlation
−1 ≤ r < 0 negative linear correlation
0 < r ≤ +1 positive linear correlation
• Degree of the correlation
The further the r value is from 0 the stronger the correlation
• Which correlation is stronger?
r = 0.32 or r = -0.46
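The slides do not give a formula for r, so the sketch below uses the standard definition of Pearson's coefficient: the sum of the products of the paired deviation scores, divided by the square root of the product of the two sums of squares. The function name `pearson_r` and the attendance and grade numbers are hypothetical and serve only as an illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # paired deviations
    ssx = sum((x - mx) ** 2 for x in xs)                    # SS for X
    ssy = sum((y - my) ** 2 for y in ys)                    # SS for Y
    return sxy / math.sqrt(ssx * ssy)

# Assumption: made-up attendance (classes) and grade scores for five students.
attendance = [10, 12, 8, 14, 11]
grades = [70, 80, 65, 90, 72]
print(round(pearson_r(attendance, grades), 2))

# Strength depends on distance from 0, so compare absolute values.
stronger = max([0.32, -0.46], key=abs)
print(stronger)   # -0.46
```

The last two lines answer the question above: r = −0.46 is the stronger correlation because it lies further from 0, even though it is negative.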
Interpreting Correlations

• Changes in one variable relate to changes in the other

• Correlation does NOT mean causation!

• Three possible directions of causality:

• X causes Y
• Y causes X
• A third factor causes X and Y
• Example: Prejudice and Between Group Contact
Prejudice and Contact

• Measures for prejudice and between group contact

• Calculate the correlation (e.g. r = -0.27).

• Explanations?
• prejudice causes people to limit their between group contact
• people who have low between group contact become prejudiced
• fear of the unfamiliar causes both prejudice and contact level
Correlation does not mean causation
