Repeated Measures Designs

An experimental design in which the same participants are tested in all conditions is referred to as a repeated measures design or a randomized block design; it is also sometimes called a within-subjects design. See the table below.

| Subject | Treatment 1 | Treatment 2 | Treatment 3 | Mean |
|---------|-------------|-------------|-------------|------|
| 1 | \(X_{11}\) | \(X_{12}\) | \(X_{13}\) | \(\mu_{1.}\) |
| 2 | \(X_{21}\) | \(X_{22}\) | \(X_{23}\) | \(\mu_{2.}\) |
| \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) |
| \(n\) | \(X_{n1}\) | \(X_{n2}\) | \(X_{n3}\) | \(\mu_{n.}\) |
| Mean | \(\mu_{.1}\) | \(\mu_{.2}\) | \(\mu_{.3}\) | \(\mu\) |

Because the same subjects take the tests in all conditions, the subjects themselves are also a source of variation in the scores. Thus, we include the subject as a factor when building the linear model. There are two types of linear models for analyzing data from a repeated measures design. The first is called the additive model. In this model, the treatment and the subject variable are treated as independent of each other.

Additive Model

Additive Model: \(X_{ij}=\mu+\alpha_{j}+\pi_{i}+\epsilon_{ij}\)

With \(\alpha_{j}=\mu_{.j}-\mu\) and \(\pi_{i}=\mu_{i.}-\mu\), the total sum of squares can be partitioned into components corresponding to the treatment effect, the subject effect, and the residual.

\(SST=SS_{treat}+SS_{subject}+SS_{residual}\)

The total degrees of freedom \(df_{total}\) can be decomposed into \(df_{treat}+df_{subject}+df_{residual}\). For a treatment with \(p\) levels and \(n\) subjects, each measured at every level,

\(np-1=(p-1)+(n-1)+(p-1)(n-1)\).
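This decomposition can be checked directly with the design sizes of the numerical example that follows (a treatment with \(p=5\) levels and \(n=9\) subjects):

```r
# Degrees-of-freedom decomposition for p = 5 treatment levels, n = 9 subjects
p <- 5; n <- 9
n * p - 1                              # total df: 44
(p - 1) + (n - 1) + (p - 1) * (n - 1)  # 4 + 8 + 32 = 44
```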

The expected value of \(MS\) for each term in the additive model can be seen in the table below.

| Source | Numerator \(E(MS)\) | \(df_{1}\) | Denominator \(E(MS)\) | \(df_{2}\) | \(F\) |
|--------|---------------------|------------|------------------------|------------|-------|
| \(A\) | \(E(MSA)=\sigma^2_{\epsilon}+n\sigma^2_{\alpha}\) | \(p-1\) | \(E(MS_{residual})=\sigma^2_{\epsilon}\) | \((p-1)(n-1)\) | \(F=MSA/MS_{residual}\) |
| \(S\) | \(E(MSS)=\sigma^2_{\epsilon}+p\sigma^2_{\pi}\) | \(n-1\) | \(E(MS_{residual})=\sigma^2_{\epsilon}\) | \((p-1)(n-1)\) | \(F=MSS/MS_{residual}\) |

Numerical Example

See Table 14.3 as an example. The data are created as follows. Note that we use a variable s to encode the subject number. The command s=gl(9,1,45) creates a factor by repeating the series 1, 2, 3, \(\cdots\), 9 five times. The variable w is the independent variable, and h is obviously the dependent variable, as it is numeric. With the variable s, R knows which scores come from the same subject. For example, we can selectively show the first subject’s scores in the 5 conditions with subset( ). You will see that the scores on rows 1, 10, 19, 28, and 37 all belong to Subject #1.

dta<-data.frame(s=gl(9,1,45),
                w=gl(5,9,45),
                h=c(21,20,17,25,30,19,26,17,26,
                    22,19,15,30,27,27,16,18,24,
                    8,10,5,13,13,8,5,8,14,
                    6,4,4,12,8,7,2,1,8,
                    6,4,5,17,6,4,5,5,9))
subset(dta,s==1)
##    s w  h
## 1  1 1 21
## 10 1 2 22
## 19 1 3  8
## 28 1 4  6
## 37 1 5  6
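The spacing of those row numbers follows directly from how gl( ) cycles the subject codes: with s=gl(9,1,45), the subject numbers repeat every 9 rows, so one subject’s rows sit 9 apart.

```r
# Subject #1 occupies every 9th row because gl(9, 1, 45) cycles 1..9
seq(1, 45, by = 9)
# [1]  1 10 19 28 37
```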

As usual, let’s plot the means of all experimental conditions first.

means<-with(dta,tapply(h,w,mean))
ses<-with(dta,tapply(h,w,sd))/sqrt(table(dta$w))
library(ggplot2)
ggplot(data.frame(means=means,ses=ses,x=1:5),aes(x=x,y=means))+
  geom_point()+geom_line()+
  geom_errorbar(aes(ymin=means-ses,ymax=means+ses),width=0.1)+
  scale_x_continuous(name="Week",breaks=1:5,labels=c("B-week1","B-week2",
                                          "B-week3","B-week4",
                                          "B-week5"))+
  ylab("Duration")

Visual inspection suggests a significant main effect of the treatment. We use aov() to test this hypothesis. According to the linear model, we partition the total sum of squares (\(SST\)) into three parts: experimental treatment (\(SS_{treat}\)), subject (\(SS_{subject}\)), and error (\(SS_{error}\)).

res.add.aov<-aov(h~w+s,dta)
summary(res.add.aov)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## w            4 2449.2   612.3   85.04  < 2e-16 ***
## s            8  486.7    60.8    8.45 4.25e-06 ***
## Residuals   32  230.4     7.2                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Compute SSTotal
sd(dta$h)^2*44
## [1] 3166.311
# Compute the sum of the squares of treatment effect, subject effect, and error
sum(summary(res.add.aov)[[1]][,2])
## [1] 3166.311
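The F ratios in the table can likewise be recovered by hand from the printed mean squares; for instance, for the treatment effect:

```r
# F for the treatment effect, using the mean squares from the aov() table above
MS_treat <- 612.3
MS_resid <- 7.2
MS_treat / MS_resid   # about 85.04, matching the printed F value
```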

Nonadditive Model

The second model is called the nonadditive model, in which the subject variable is not thought to be independent of the treatment. Thus, in addition to the subject effect, we also add the interaction of the treatment and the subject variable into the linear model.

Nonadditive Model: \(X_{ij}=\mu+\alpha_{j}+\pi_{i}+(\pi\alpha)_{ij}+\epsilon_{ij}\).

However, the interaction cannot actually be estimated separately. This is because each cell crossed by treatment and subject contains only one score, and the variance of a single score cannot be computed. Under this model, the residual term therefore consists of the random error together with the treatment-by-subject interaction.

The expected value of \(MS\) for each term in the nonadditive model can be seen in the table below.

| Source | Numerator \(E(MS)\) | \(df_{1}\) | Denominator \(E(MS)\) | \(df_{2}\) | \(F\) |
|--------|---------------------|------------|------------------------|------------|-------|
| \(A\) | \(E(MSA)=\sigma^2_{\epsilon}+n\sigma^2_{\alpha}+\sigma^2_{\pi\alpha}\) | \(p-1\) | \(E(MS_{residual})=\sigma^2_{\epsilon}+\sigma^2_{\pi\alpha}\) | \((p-1)(n-1)\) | \(F=MSA/MS_{residual}\) |
| \(S\) | \(E(MSS)=\sigma^2_{\epsilon}+p\sigma^2_{\pi}\) | \(n-1\) | \(E(MS_{residual})=\sigma^2_{\epsilon}+\sigma^2_{\pi\alpha}\) | \((p-1)(n-1)\) | |

Therefore, in the nonadditive model, the subject variable’s effect cannot be tested: its expected mean square has no matching error term. When using R to run the ANOVA for the nonadditive model, we have to define the error term ourselves. For instance, with a treatment x, a subject variable s, and a dependent variable y, the analysis under the nonadditive model can be done with aov(y~x+Error(s/x),data). See the demonstration below. The subject variable’s effect is now not tested: as shown below, \(SS_{subject}\) is isolated from the other effects and no statistical test is reported for it. In most cases, the nonadditive model is assumed.

res.aov<-aov(h~w+Error(s/w),dta)
summary(res.aov)
## 
## Error: s
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals  8  486.7   60.84               
## 
## Error: s:w
##           Df Sum Sq Mean Sq F value Pr(>F)    
## w          4 2449.2   612.3   85.04 <2e-16 ***
## Residuals 32  230.4     7.2                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We can also compute the effect size of the treatment effect \(\omega^2\).

stats1<-summary(res.aov)[[1]]
SSBL<-unlist(stats1)[2]
stats2<-summary(res.aov)[[2]]
SSA<-stats2[[1]][["Sum Sq"]][1]
SSR<-stats2[[1]][["Sum Sq"]][2]
SST<-SSBL+SSA+SSR
DFm<-stats2[[1]][["Df"]][1]
MSr<-stats2[[1]][["Mean Sq"]][2]
W2.A<-(SSA-DFm*MSr)/(SST+MSr)
W2.A
##    Sum Sq 
## 0.7626884
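The same value can be recomputed by hand from the printed ANOVA table, which serves as a sanity check on the extraction code above:

```r
# omega^2 for the treatment, recomputed from the printed sums of squares
SSA <- 2449.2; SSBL <- 486.7; SSR <- 230.4
MSr <- 7.2; DFm <- 4
SST <- SSA + SSBL + SSR
(SSA - DFm * MSr) / (SST + MSr)   # about 0.763
```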

Also, according to the visual inspection, whether relaxation training has begun appears to influence the duration of headaches. We can test this hypothesis with a contrast between the first two means (baseline weeks) and the last three (training weeks).

dta$c1<-rep(c(-3,-3,2,2,2),each=9)
dta$c2<-rep(c(1,-1,0,0,0),each=9)
dta$c3<-rep(c(0,0,1,-1,0),each=9)
dta$c4<-rep(c(0,0,1,1,-2),each=9)
summary(aov(h~c1+c2+c3+c4+Error(s/(c1+c2+c3+c4)),dta))
## 
## Error: s
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals  8  486.7   60.84               
## 
## Error: s:c1
##           Df Sum Sq Mean Sq F value  Pr(>F)    
## c1         1 2388.2  2388.2   320.6 9.7e-08 ***
## Residuals  8   59.6     7.4                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Error: s:c2
##           Df Sum Sq Mean Sq F value Pr(>F)
## c2         1    0.5     0.5   0.038  0.849
## Residuals  8  104.0    13.0               
## 
## Error: s:c3
##           Df Sum Sq Mean Sq F value  Pr(>F)   
## c3         1  56.89   56.89   18.88 0.00246 **
## Residuals  8  24.11    3.01                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Error: s:c4
##           Df Sum Sq Mean Sq F value Pr(>F)
## c4         1   3.63   3.630    0.68  0.433
## Residuals  8  42.70   5.338
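The four contrasts used above are mutually orthogonal, so their sums of squares partition \(SS_{treat}\) exactly: 2388.2 + 0.5 + 56.89 + 3.63 = 2449.22, matching the SS for w up to rounding. Orthogonality can be verified from the five-level weight vectors:

```r
# The five-level contrast weights defined above
C <- cbind(c1 = c(-3, -3, 2,  2,  2),
           c2 = c( 1, -1, 0,  0,  0),
           c3 = c( 0,  0, 1, -1,  0),
           c4 = c( 0,  0, 1,  1, -2))
colSums(C)     # each contrast sums to zero
crossprod(C)   # off-diagonal entries are zero: the contrasts are orthogonal
```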

A Numerical Example of a Repeated Measures Design with Two Factors

A company has created a new training program for their customer service staff. To test the effectiveness of the program, they took a sample of 10 employees and assessed their performance in three areas: Product (knowledge of the company’s products and services), Client (the ability to relate to the customer with politeness and empathy), and Action (the ability to take action to help the customer). The data are listed below.

dta<-data.frame(s=gl(10,1,60,labels=1:10),
                time=gl(2,30,60,labels=c("pre","post")),
                knowledge=gl(3,10,60,labels=c("product","client","action")),
                score=c(13,12,17,12,19,6,17,18,23,18,
                        12,19,19,25,27,12,18,29,30,12,
                        17,18,24,25,19,6,30,36,24,24,
                        18,6,21,18,18,6,24,22,18,24,
                        30,18,31,39,28,18,36,36,38,25,
                        34,30,32,40,27,23,38,40,32,34))

Let’s check the means of all conditions. Note that in the code below, gl( ) creates the number series 1, 1, 2, 2, 3, and 3 for the x variable, so ggplot( ) plots the group means following that order. However, in gl( ) we set labels=c(“product”,“client”,“action”), which replaces the numeric codes with character labels while keeping the original level order. This is why the x axis does not show the labels in alphabetical order.

means<-with(dta,tapply(score,list(time,knowledge),mean))
ses<-with(dta,tapply(score,list(time,knowledge),sd))/sqrt(table(dta[,-c(1,4)]))
ggplot(data.frame(means=c(means),ses=c(ses),x=gl(3,2,6,labels=c("product","client","action")),
                  g=rep(c("pre","post"),3)),
       aes(x=x,y=means,fill=g))+
  geom_bar(stat="identity",position=position_dodge(0.9))+
  geom_errorbar(aes(ymin=means-ses,ymax=means+ses),width=0.2,
                position=position_dodge(0.9))

We conduct an omnibus F test of the main effects and the interaction. Suppose we use the nonadditive model. See the table below.

| Source | Numerator | \(df_{1}\) | Denominator | \(df_{2}\) | \(F\) |
|--------|-----------|------------|-------------|------------|-------|
| \(A\) | \(MSA\) | \(p-1\) | \(MS_{A/S}\) | \((p-1)(n-1)\) | \(F=MSA/MS_{A/S}\) |
| \(B\) | \(MSB\) | \(q-1\) | \(MS_{B/S}\) | \((q-1)(n-1)\) | \(F=MSB/MS_{B/S}\) |
| \(AB\) | \(MSAB\) | \((p-1)(q-1)\) | \(MS_{AB/S}\) | \((p-1)(q-1)(n-1)\) | \(F=MSAB/MS_{AB/S}\) |

res.aov<-aov(score~knowledge*time+Error(s/(knowledge*time)),dta)
summary(res.aov)
## 
## Error: s
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals  9   1598   177.6               
## 
## Error: s:knowledge
##           Df Sum Sq Mean Sq F value   Pr(>F)    
## knowledge  2 1365.2   682.6   26.96 3.85e-06 ***
## Residuals 18  455.8    25.3                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Error: s:time
##           Df Sum Sq Mean Sq F value   Pr(>F)    
## time       1  828.8   828.8   33.85 0.000254 ***
## Residuals  9  220.3    24.5                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Error: s:knowledge:time
##                Df Sum Sq Mean Sq F value   Pr(>F)    
## knowledge:time  2  224.4  112.22   12.63 0.000373 ***
## Residuals      18  159.9    8.88                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
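As in the one-factor example, the sums of squares of all strata partition the total sum of squares; summing the seven printed values recovers \(SS_{total}\), which equals sd(dta$score)^2 * 59 up to rounding.

```r
# The seven SS printed above (subjects, the three within-subject effects,
# and their residuals) partition the total sum of squares
sum(c(1598, 1365.2, 455.8, 828.8, 220.3, 224.4, 159.9))
# 4852.4
```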

Since the interaction effect is significant, we can test the simple effects of each factor. Together with the pattern of mean performance, we conclude that the interaction arises because the training does not improve product knowledge but does improve the other two kinds of knowledge.

# Simple effect of "time"
res.t.k1.aov<-aov(score~time+Error(s/time),subset(dta,knowledge=="product"))
summary(res.t.k1.aov)
## 
## Error: s
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals  9    489   54.33               
## 
## Error: s:time
##           Df Sum Sq Mean Sq F value Pr(>F)
## time       1     20   20.00     1.8  0.213
## Residuals  9    100   11.11
res.t.k2.aov<-aov(score~time+Error(s/time),subset(dta,knowledge=="client"))
summary(res.t.k2.aov)
## 
## Error: s
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals  9  793.8    88.2               
## 
## Error: s:time
##           Df Sum Sq Mean Sq F value  Pr(>F)   
## time       1  460.8   460.8   21.47 0.00123 **
## Residuals  9  193.2    21.5                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
res.t.k3.aov<-aov(score~time+Error(s/time),subset(dta,knowledge=="action"))
summary(res.t.k3.aov)
## 
## Error: s
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals  9    771   85.67               
## 
## Error: s:time
##           Df Sum Sq Mean Sq F value   Pr(>F)    
## time       1  572.4   572.4   59.19 3.02e-05 ***
## Residuals  9   87.0     9.7                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
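A useful check on these simple effects: the three simple-effect sums of squares of time partition \(SS_{time}+SS_{knowledge:time}\) from the omnibus analysis (20 + 460.8 + 572.4 = 1053.2 = 828.8 + 224.4).

```r
# Simple-effect SS of "time" partition SS_time + SS_knowledge:time
stopifnot(all.equal(20 + 460.8 + 572.4, 828.8 + 224.4))
```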
# Simple effect of "knowledge"
res.k.t1.aov<-aov(score~knowledge+Error(s/knowledge),subset(dta,time=="pre"))
summary(res.k.t1.aov)
## 
## Error: s
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals  9    871   96.77               
## 
## Error: s:knowledge
##           Df Sum Sq Mean Sq F value Pr(>F)  
## knowledge  2  244.3  122.13   5.882 0.0108 *
## Residuals 18  373.7   20.76                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
res.k.t2.aov<-aov(score~knowledge+Error(s/knowledge),subset(dta,time=="post"))
summary(res.k.t2.aov)
## 
## Error: s
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals  9  947.5   105.3               
## 
## Error: s:knowledge
##           Df Sum Sq Mean Sq F value   Pr(>F)    
## knowledge  2 1345.4   672.7   50.05 4.44e-08 ***
## Residuals 18  241.9    13.4                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Suppose instead we test the effects with an additive model. Then the error term for testing all effects is the within-cell error. See the table below.

| Source | Numerator \(E(MS)\) | \(df_{1}\) | Denominator \(E(MS)\) | \(df_{2}\) | \(F\) |
|--------|---------------------|------------|------------------------|------------|-------|
| \(S\) | \(E(MSS)=\sigma^2_{\epsilon}+pq\sigma^2_{\pi}\) | \(n-1\) | \(E(MS_{error})=\sigma^2_{\epsilon}\) | \((pq-1)(n-1)\) | \(F=MSS/MS_{error}\) |
| \(A\) | \(E(MSA)=\sigma^2_{\epsilon}+nq\sigma^2_{\alpha}\) | \(p-1\) | \(E(MS_{error})=\sigma^2_{\epsilon}\) | \((pq-1)(n-1)\) | \(F=MSA/MS_{error}\) |
| \(B\) | \(E(MSB)=\sigma^2_{\epsilon}+np\sigma^2_{\beta}\) | \(q-1\) | \(E(MS_{error})=\sigma^2_{\epsilon}\) | \((pq-1)(n-1)\) | \(F=MSB/MS_{error}\) |
| \(AB\) | \(E(MSAB)=\sigma^2_{\epsilon}+n\sigma^2_{\alpha\beta}\) | \((p-1)(q-1)\) | \(E(MS_{error})=\sigma^2_{\epsilon}\) | \((pq-1)(n-1)\) | \(F=MSAB/MS_{error}\) |

Therefore, the results of the omnibus F test, with all factors treated as fixed effects, are listed as follows.

res.nadd.aov<-aov(score~knowledge*time+s,dta)
summary(res.nadd.aov)
##                Df Sum Sq Mean Sq F value   Pr(>F)    
## knowledge       2 1365.2   682.6  36.743 3.47e-10 ***
## time            1  828.8   828.8  44.612 3.05e-08 ***
## s               9 1598.1   177.6   9.558 5.79e-08 ***
## knowledge:time  2  224.4   112.2   6.040  0.00475 ** 
## Residuals      45  836.0    18.6                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
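Note how the additive fit relates to the nonadditive one: its single residual pools the three error strata of the nonadditive model (455.8 + 220.3 + 159.9 = 836.0), and its residual df are likewise the pooled \((p-1)(n-1)+(q-1)(n-1)+(p-1)(q-1)(n-1)=18+9+18=45\), which equals \((pq-1)(n-1)\).

```r
# The additive model's residual pools the three error strata of the
# nonadditive fit above
stopifnot(all.equal(455.8 + 220.3 + 159.9, 836.0))
stopifnot(18 + 9 + 18 == 45)
```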