Analysis of Variance: Randomized Blocks: Farrokh Alemi Ph.D. Kashif Haqqi M.D


Analysis of Variance: Randomized Blocks

Farrokh Alemi Ph.D. Kashif Haqqi M.D.

Go to Table of Content

Additional Reading
For additional reading, see Chapter 13 in Michael R. Middleton's Data Analysis Using Excel, Duxbury Thompson Publishers, 2000. The example described in this lecture is based in part on Chapter 14, Sections 3 through 5 of Keller and Warrack's Statistics for Management and Economics, Fifth Edition, Duxbury Thompson Learning Publisher, 2000. You can also read any introductory statistics book on Analysis of Variance.

Which Approach Is Appropriate When?


The Analysis of Variance described here extends single factor ANOVA to multiple factors and to the analysis of more than 2 matched groups. Choosing the right method for the data is the key statistical expertise that you need to have. You might want to review a decision tool that we have organized to help you choose the right statistical method.

Do I Need to Know the Formulas?


You do not need to know the exact formulas. You do need to know where they are in your reference book. You do need to understand the concept behind them and the general statistical concepts embedded in the use of the formulas. You do not need to be able to do Analysis of Variance by hand. You must be able to do it on a computer using Excel or other software.

Table of Content
Objectives
Randomized Block Design
Repeated Measure Design
Sources of Variance
Test Statistic
An Example
Assumptions
Results of ANOVA
Understanding Blocking
ANOVA with replication
Factorial Experimental Design
An Example
How to Analyze Data From Factorial Designs?
Take Home Lesson


Objectives
To learn the assumptions and the interpretation of Analysis of Variance for randomized block design.
To learn the assumptions and the interpretation of Analysis of Variance for multifactor models.
To use Excel to do Analysis of Variance.

Single and Multiple Factors


The ANOVA we have discussed so far applies to a single factor and one quantitative response variable. We have seen in matched pair studies how making sure that the same or similar subjects receive the treatments reduces variation and allows more informative tests. We now extend the ANOVA model described earlier to situations where more than 2 populations are matched or, in our new terminology, to situations where there is a randomized block design.

Randomized Block Design


If the subjects who receive the treatments are the same, or essentially the same, across treatments, then we have a randomized block design. For example, if different treatments are provided to patients of low, medium and high severity, then severity is used to create the blocks. A block design removes the effect of differences among the experimental subjects within a particular treatment and therefore reduces the variation in the response variable.

Repeated Measure Design


A repeated measure design is a special form of randomized block design in which the same subjects receive different treatments. For example, surveying the same patients at monthly intervals is a repeated measure design. The same patients receive different treatments. Repeated measures reduce variation due to differences among subjects across treatment programs.


Sources of Variance
In randomized block design we partition the total variation in the data (i.e. the difference between each observation and the grand mean) into three sources:
Sum of squares for treatments, SST
Sum of squares for blocks, SSB
Sum of squares for errors, SSE

SS(Total) = SST + SSB + SSE

Calculation of Sources of Variance


Source      Degrees of freedom   Formula
SS(total)   n-1                  Sum across all observations of the squared difference between each observation and the grand mean
SST         k-1                  Sum across treatments of (b * squared difference between the treatment mean and the grand mean)
SSB         b-1                  Sum across blocks of (k * squared difference between the block mean and the grand mean)
SSE         n-k-b+1              SS(total) - SST - SSB

b is number of blocks, k is number of treatments, n is number of observations. With one observation per block and treatment combination, n = b*k, so n-k-b+1 = (k-1)(b-1).
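The partition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `randomized_block_ss` and the toy data are hypothetical, there is exactly one observation per block and treatment combination, and the error degrees of freedom are taken as n-k-b+1, which equals (k-1)(b-1) when n = b*k.

```python
def randomized_block_ss(data):
    """Sums of squares for a randomized block design.

    data[i][j] is the single observation for block i (row) and
    treatment j (column).
    """
    b = len(data)       # number of blocks
    k = len(data[0])    # number of treatments
    n = b * k           # number of observations, one per cell

    grand_mean = sum(x for row in data for x in row) / n
    treatment_means = [sum(row[j] for row in data) / b for j in range(k)]
    block_means = [sum(row) / k for row in data]

    # Total variation around the grand mean
    ss_total = sum((x - grand_mean) ** 2 for row in data for x in row)
    # Variation explained by treatments (columns)
    sst = sum(b * (m - grand_mean) ** 2 for m in treatment_means)
    # Variation explained by blocks (rows)
    ssb = sum(k * (m - grand_mean) ** 2 for m in block_means)
    # Whatever is left over is error
    sse = ss_total - sst - ssb

    return {
        "SS(total)": ss_total, "SST": sst, "SSB": ssb, "SSE": sse,
        "df_total": n - 1, "df_treatment": k - 1,
        "df_block": b - 1, "df_error": n - k - b + 1,
    }


# Hypothetical toy data: 3 blocks (rows) by 3 treatments (columns).
toy = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 18],
]
result = randomized_block_ss(toy)
print(result)
```

For the toy data, the degrees of freedom for treatments, blocks and error (2, 2 and 4) add up to the total of 8, mirroring the way the sums of squares add up to SS(total).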


Calculation of Mean Sources of Variance


Mean square   Formula           Degrees of freedom
MST           SST/(k-1)         k-1
MSB           SSB/(b-1)         b-1
MSE           SSE/(n-k-b+1)     n-k-b+1

b is number of blocks, k is number of treatments, n is number of observations


Test Statistic
The test statistic for treatments is MST/MSE, distributed as an F distribution with k-1 and n-k-b+1 degrees of freedom. The test statistic for the effect of blocks is MSB/MSE, distributed as an F distribution with b-1 and n-k-b+1 degrees of freedom.


An Example in Health Care


200 patients at a nursing home were followed for seven months. Each month we recorded their daily living activity score (measured on an interval scale). A sample of the data is shown below; the full data set is available for download. Did patients' daily living activity change over time?

Patient   Month 1   Month 2   ...   Month 7
1         65        40        ...   110
2         90        85        ...   100
3         30        30        ...   70
4         72        52        ...   94
...
196       75        58        ...   69
197       90        67        ...   84
198       67        25        ...   31
199       60        48        ...   83
200       80        95        ...   120


Displaying the data


We need to see if the apparent changes in some months are real or due to random chance.

[Chart: Daily Living Activity Score (vertical axis, 0 to 100) by month (horizontal axis, Month 1 through Month 7)]


Components of ANOVA
The response variable is the daily living activity score. The treatments are the months. The experimental plan is a randomized block design. We use two factor ANOVA without replication. (ANOVA with replication is used when there is more than one observation for each combination of block and treatment.)


What Are the Null Hypotheses?


Means for all patients are the same, and means for all 7 months are equal: μ1 = μ2 = μ3 = μ4 = μ5 = μ6 = μ7


What Are the Alternative Hypotheses?


At least two months have different means. At least two patients have different means.


Assumptions
The variable of interest is quantitative.
The problem is to compare 2 or more means.
The experimental plan is a randomized block design.
Treatment observations are distributed according to a Normal distribution.
The variances of the samples are equal.

Verifying Assumptions
The response variable is quantitative.
The problem is a comparison of seven means.
The assumption of a blocked design is appropriate because repeated measures are used: the same subjects are rated across the seven months.


Verifying Assumptions (Continued)


Samples have a Normal distribution. Month 1 data are shown. Other months were also Normal but are not displayed.

[Figure: Histogram for Month 1. Vertical axis: Frequency (0 to 50). Horizontal axis: Bin (0, 19, 37, 56, 75, 94, 112, More).]


Verifying Assumptions (Continued)


Equality of variances will be examined after the ANOVA is done.


Excel Setup For ANOVA


Prepare the data so that columns correspond to treatments and rows to blocks. Select Tools, Data Analysis, then Anova: Two-Factor Without Replication. Include as input the column corresponding to blocks and all treatment columns.


Results of ANOVA
The first part shows averages and variances for each block (in this case, patients). The first 10 patients are shown in this slide. There are 7 observations per patient over the 7 months. Means differ, but are the differences significant?

SUMMARY   Count   Sum   Average    Variance
1         7       430   61.42857   677.2857
2         7       638   91.14286   230.8095
3         7       265   37.85714   365.4762
4         7       527   75.28571   281.5714
5         7       501   71.57143   163.619
6         7       498   71.14286   523.4762
7         7       440   62.85714   321.8095
8         7       652   93.14286   326.4762
9         7       560   80         310.3333
10        7       508   72.57143   269.9524


Results of ANOVA (Continued)


Next, treatment data are described. The assumption of equal variances is met, as the variances are in the same range. Means differ, but are the differences significant?

SUMMARY   Count   Sum     Average   Variance
Month 1   200     13825   69.125    462.9742
Month 2   200     13919   69.595    502.1718
Month 3   200     14075   70.375    506.2758
Month 4   200     13559   67.795    540.7065
Month 5   200     14123   70.615    483.7455
Month 6   200     15104   75.52     484.6227
Month 7   200     16347   81.735    481.6128
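One way to make "in the same range" concrete is to compare the largest and smallest monthly variances. The sketch below uses the variances from the summary table; the cutoff of 2 is an assumed rule of thumb for illustration only, not something stated in this lecture.

```python
# Monthly variances copied from the treatment summary table.
variances = {
    "Month 1": 462.9742,
    "Month 2": 502.1718,
    "Month 3": 506.2758,
    "Month 4": 540.7065,
    "Month 5": 483.7455,
    "Month 6": 484.6227,
    "Month 7": 481.6128,
}

largest = max(variances.values())
smallest = min(variances.values())
ratio = largest / smallest

# ASSUMPTION: a ratio below 2 is treated as "roughly equal" here.
# This threshold is a hypothetical rule of thumb, not from the slides.
ASSUMED_MAX_RATIO = 2.0
roughly_equal = ratio < ASSUMED_MAX_RATIO

print(f"largest/smallest variance ratio: {ratio:.3f}")
print(f"roughly equal under the assumed cutoff: {roughly_equal}")
```

A formal test of equal variances is another option; the ratio is only a quick screen.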


Result of ANOVA (Continued)


Next, the sum of squares table is shown. Rows correspond to patients (blocks), columns to months (treatments). Note that total variation = SST+SSB+SSE. Note that each mean square is calculated by dividing the sum of squares by its degrees of freedom.

Source of Variation   SS         df     MS
Rows                  209834.6   199    1054.445
Columns               28673.73   6      4778.955
Error                 479125.1   1194   401.2773
Total                 717633.5   1399
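The degrees of freedom and mean squares in this table can be reproduced from the design itself. The sketch below assumes b = 200 patients (blocks) and k = 7 months (treatments) with one observation per patient per month, and takes the sums of squares directly from the table; the variable names are illustrative.

```python
b = 200      # number of blocks (patients)
k = 7        # number of treatments (months)
n = b * k    # number of observations, one per patient per month

# Sums of squares copied from the ANOVA table.
ss = {"Rows": 209834.6, "Columns": 28673.73, "Error": 479125.1}

# Degrees of freedom for blocks, treatments and error.
df = {"Rows": b - 1, "Columns": k - 1, "Error": n - k - b + 1}
df_total = n - 1

# Mean square = sum of squares / degrees of freedom.
ms = {source: ss[source] / df[source] for source in ss}

for source in ss:
    print(f"{source:8s} SS={ss[source]:10.2f} df={df[source]:5d} MS={ms[source]:9.3f}")
print(f"Total df = {df_total}")
```

The error degrees of freedom come out as 1194, matching the table, and the three component degrees of freedom add up to the total of 1399.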


Result of ANOVA (Continued)


The test statistic for rows is 2.6, which is larger than the critical value of 1.19. The probability of observing an F value this high if patients had the same means is essentially 0. Reject the hypothesis that patients had the same means. Similarly, reject the hypothesis of the same means across the months.

ANOVA
Source of Variation   F          P-value    F crit
Rows                  2.627722   1.04E-23   1.187531
Columns               11.90936   5.14E-13   2.106162
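The F values in this table are the mean squares for rows and columns divided by the mean square for error. A minimal sketch of that arithmetic and of the reject decision follows; the mean squares and critical values are copied from the Excel output, and the variable names are illustrative.

```python
# Mean squares from the sum of squares table.
msb = 1054.445   # rows (patients, the blocks)
mst = 4778.955   # columns (months, the treatments)
mse = 401.2773   # error

# Critical values reported by Excel in the ANOVA table.
f_crit_rows = 1.187531
f_crit_columns = 2.106162

f_rows = msb / mse
f_columns = mst / mse

# Reject the null hypothesis when F exceeds its critical value.
reject_rows = f_rows > f_crit_rows
reject_columns = f_columns > f_crit_columns

print(f"Rows:    F = {f_rows:.4f}, reject = {reject_rows}")
print(f"Columns: F = {f_columns:.4f}, reject = {reject_columns}")
```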


Understanding Blocking
Blocking is the extension of matched pair design to more than 2 populations.
Blocking reduces variation and improves our ability to detect differences in treatment.
You can see this in the formula for total sum of squares = SST+SSB+SSE.
In the absence of blocking, SSB will be added to SSE.
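The effect of blocking can be illustrated with the numbers from the nursing home example. The sketch below recomputes the F statistic for months as if patients had not been used as blocks, so that SSB and its degrees of freedom are folded into the error term. It is an illustration of the bookkeeping, using values copied from the ANOVA tables; the variable names are assumptions.

```python
# Values from the ANOVA tables of the nursing home example.
ssb, df_blocks = 209834.6, 199      # rows (patients)
sse, df_error = 479125.1, 1194      # error
mst = 4778.955                      # mean square for months

# With blocking: error excludes the patient-to-patient variation.
mse_with = sse / df_error
f_with = mst / mse_with

# Without blocking: SSB is added to SSE, and so are its degrees of freedom.
sse_without = sse + ssb
df_without = df_error + df_blocks
mse_without = sse_without / df_without
f_without = mst / mse_without

print(f"F for months with blocking:    {f_with:.2f}")
print(f"F for months without blocking: {f_without:.2f}")
```

The larger error term without blocking gives a smaller F for months, which is the sense in which blocking improves the ability to detect treatment differences.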

ANOVA with replication


It is possible to have multiple blocks. For each possible block and treatment combination there may be multiple observations (replicated measures). How would we use ANOVA for these circumstances?


Factorial Experimental Design


In designing data collection it is important to create as much efficiency as possible. An optimal design is a factorial experimental design (typically analyzed using ANOVA with replication or multiple regression).


How to Create Factorial Designs?


For each factor (or block), take two levels: the maximum and the minimum. Examine all possible combinations of the factors. For a 3 factor model, this will lead to 2 to the power of 3, or 8, possible combinations. For a 4 factor model, this leads to 2 to the power of 4, or 16, possible combinations. Measure the response variable for all possible combinations with replication.

A Factorial Design for 3 Factors


Note there are eight unique cases. No two cases have the same combination of levels across the three factors. The combinations were created by switching the level every 4 cases for factor one, every 2 cases for factor two and every case for factor three.

Factor 1   Factor 2   Factor 3   Response
Minimum    Minimum    Minimum
Minimum    Minimum    Maximum
Minimum    Maximum    Minimum
Minimum    Maximum    Maximum
Maximum    Minimum    Minimum
Maximum    Minimum    Maximum
Maximum    Maximum    Minimum
Maximum    Maximum    Maximum
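The eight cases can also be generated rather than typed by hand. The sketch below uses Python's standard `itertools.product`, which varies the last factor fastest and therefore produces the same ordering as the table; the names `levels` and `design` are illustrative.

```python
from itertools import product

levels = ["Minimum", "Maximum"]
number_of_factors = 3

# All combinations of two levels across three factors: 2 ** 3 = 8 cases.
# product() changes the last factor on every case, the middle factor
# every 2 cases, and the first factor every 4 cases.
design = list(product(levels, repeat=number_of_factors))

print("Factor 1   Factor 2   Factor 3")
for case in design:
    print("   ".join(f"{level:8s}" for level in case))
```

Changing `number_of_factors` to 4 yields the 16 combinations of a four factor design.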


An Example In Health Care


Three factors are assumed to affect consumer satisfaction: waiting time, travel time and bed side manner. Design an experiment to understand the relative influence of the three factors.
Waiting time   Travel time   Bed side manner   Satisfaction ratings of n patients
Short          Short         Poor
Short          Short         Good
Short          Long          Poor
Short          Long          Good
Long           Short         Poor
Long           Short         Good
Long           Long          Poor
Long           Long          Good


How to Analyze Data From Factorial Designs?


Data can be analyzed using ANOVA with replication if, for each combination of factors, there are repeated measures. Excel provides a method for analyzing 2 factors with all levels of the factors specified. This is a limited method of analysis. An easier, more general approach is to analyze the data using multiple regression, a concept we introduce later.


Take Home Lesson


Experimental design affects the method of analysis. An effective approach is the randomized block design (an extension of the matched pair t-test). In these circumstances we use two factor ANOVA without replication. An optimal design is the factorial experimental design. In these circumstances an ANOVA with replication is appropriate.
