Contrasts

When the omnibus \(F\) test is significant and you already have some idea about where the differences might lie, it is recommended to test only the comparisons of interest. These are called a priori comparisons, planned comparisons, or contrasts. In the language of linear algebra, any comparison of means can be written as the product of a vector of coefficients and the vector of means. Suppose there are three conditions in an experiment. The comparison \(\psi_1=\bar{x}_1-\bar{x}_2\) can be represented as \(\psi_1=\bar{x}_1-\bar{x}_2+0\cdot\bar{x}_3\), which is the product of the coefficient vector [1 -1 0] and the means vector [\(\bar{x}_1\) \(\bar{x}_2\) \(\bar{x}_3\)]. Similarly, the comparison \(\psi_2=\bar{x}_2-\bar{x}_3\) can be represented as \(\psi_2=0\cdot\bar{x}_1+\bar{x}_2-\bar{x}_3\), where the coefficient vector changes to [0 1 -1]. In general, a comparison is a linear combination of the means vector \(\bar{X}=[\bar{x}_1\ \bar{x}_2\ \cdots\ \bar{x}_k]\) weighted by the coefficient vector \(C=[c_1\ c_2\ \cdots\ c_k]\). That is, \(\psi=C\cdot\bar{X}^T\).
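As a minimal sketch (the means here are made up for illustration), computing a contrast in R is just a dot product of the coefficient vector and the means vector:

```r
# Hypothetical means for three conditions
xbar <- c(4, 6, 5)
# Coefficients for psi1 = xbar1 - xbar2
C1 <- c(1, -1, 0)
sum(C1 * xbar)
## [1] -2
```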

The coefficients must satisfy the constraint \(\sum_i c_i=0\). For a simple pairwise comparison of two means, the coefficient of one mean is set to 1, the coefficient of the other to -1, and the remaining coefficients to 0. This coding scheme is called effect coding rather than dummy coding.

A set of coefficients defines a contrast. In principle, the number of possible contrasts is extremely large. By the principles of linear algebra, orthogonal contrasts are required in order to correctly partition the sum of squares when testing the comparisons. Suppose you have 3 conditions in an experiment. The coefficient vector for comparing the first two means (\(\psi_1\)) is [1 -1 0], whereas the vector for comparing the last two means (\(\psi_2\)) is [0 1 -1]. These two coefficient vectors are not orthogonal, as their inner product is not 0. However, the vector for \(\psi_1\) and the vector [0.5 0.5 -1], which compares \(\bar{x}_3\) with the mean of \(\bar{x}_1\) and \(\bar{x}_2\), are orthogonal, as their inner product is 0. Likewise, [0 1 -1] and [1 -0.5 -0.5] form another orthogonal pair. The contrasts in an orthogonal set can be tested without problem. In fact, if you have \(k\) means, you can get at most \(k-1\) mutually orthogonal contrasts, and there are many possible sets of orthogonal contrasts. Which set you use depends on your research interest.
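Orthogonality of two coefficient vectors can be checked directly with an inner product. For the three-condition vectors just described:

```r
c1 <- c(1, -1, 0)       # compare xbar1 with xbar2
c2 <- c(0, 1, -1)       # compare xbar2 with xbar3
c3 <- c(0.5, 0.5, -1)   # compare the mean of xbar1, xbar2 with xbar3
sum(c1 * c2)
## [1] -1
sum(c1 * c3)
## [1] 0
```

A nonzero inner product (c1 with c2) means the two contrasts share variability; a zero inner product (c1 with c3) means they do not.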

Example 1

The data are listed below. There are four conditions, and in each condition 8 scores were collected from different subjects. Let us start analyzing these data.

dta<-data.frame(G=gl(4,8,4*8,labels=1:4),
                Y=c(3,6,3,3,1,2,2,2,
                    4,5,4,3,2,3,4,3,
                    7,8,7,6,5,6,5,6,
                    7,8,9,8,10,10,9,11))
head(dta)
##   G Y
## 1 1 3
## 2 1 6
## 3 1 3
## 4 1 3
## 5 1 1
## 6 1 2

Describe Data

First, we make a bar plot of the means in all conditions, with error bars of one standard error. As shown in the figure, it looks like the response scores in the first two conditions do not differ, whereas the scores in the remaining conditions are quite different.

library(ggplot2)
means<-with(dta,tapply(Y,G,mean))
ses<-with(dta,tapply(Y,G,sd))/sqrt(8)
fg1<-ggplot(data=data.frame(Resp=means,ses=ses,Cond=as.factor(1:4)),
            aes(x=Cond,y=Resp))+theme_bw()+
       theme(plot.background=element_blank(),
                  panel.grid.major=element_blank(),
                  panel.grid.minor=element_blank())+
       geom_bar(stat="identity",color="grey",fill="blanchedalmond")+
       geom_errorbar(aes(ymin=Resp-ses,ymax=Resp+ses),width=0.2,
                position=position_dodge(0.9))
fg1

ANOVA

Following the visual inspection of the above figure, the experimental treatment is expected to be significant. As this experiment used a between-subjects design, a between-subjects one-way ANOVA is conducted. The result supports our inspection.

result.aov1<-aov(Y~G,dta)
summary(result.aov1)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## G            3  194.5   64.83   44.28 9.32e-11 ***
## Residuals   28   41.0    1.46                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Contrasts

In addition to knowing whether or not the experimental treatment affects the dependent variable, we want to go a step further and examine where the effect comes from. According to the visual inspection of the bar plot, the difference between the first two means is not expected to be significant, whereas the difference between the last two means should be. We can design contrasts to test these differences. Each contrast is determined by its coefficient vector. For instance, the contrast between the first two means has a coefficient vector with 1 for condition 1, -1 for condition 2, and 0 for the other conditions. We create this coefficient vector as a new column c1 in dta. Similarly, we create the second contrast, for the last two means, as c2. In order to make all contrasts orthogonal to each other, the coefficients for the third contrast are set as [0.5 0.5 -0.5 -0.5]. Printing the data frame dta shows that the data now include three columns of coefficients for the contrasts.

# Coefficient vector for the first contrast
dta$c1<-ifelse(dta$G==1,1,ifelse(dta$G==2,-1,0))
# Coefficient vector for the second contrast
dta$c2<-ifelse(dta$G==3,1,ifelse(dta$G==4,-1,0))
# Coefficient vector for the third contrast
dta$c3<-ifelse(as.numeric(dta$G)<3,0.5,-0.5)
dta
##    G  Y c1 c2   c3
## 1  1  3  1  0  0.5
## 2  1  6  1  0  0.5
## 3  1  3  1  0  0.5
## 4  1  3  1  0  0.5
## 5  1  1  1  0  0.5
## 6  1  2  1  0  0.5
## 7  1  2  1  0  0.5
## 8  1  2  1  0  0.5
## 9  2  4 -1  0  0.5
## 10 2  5 -1  0  0.5
## 11 2  4 -1  0  0.5
## 12 2  3 -1  0  0.5
## 13 2  2 -1  0  0.5
## 14 2  3 -1  0  0.5
## 15 2  4 -1  0  0.5
## 16 2  3 -1  0  0.5
## 17 3  7  0  1 -0.5
## 18 3  8  0  1 -0.5
## 19 3  7  0  1 -0.5
## 20 3  6  0  1 -0.5
## 21 3  5  0  1 -0.5
## 22 3  6  0  1 -0.5
## 23 3  5  0  1 -0.5
## 24 3  6  0  1 -0.5
## 25 4  7  0 -1 -0.5
## 26 4  8  0 -1 -0.5
## 27 4  9  0 -1 -0.5
## 28 4  8  0 -1 -0.5
## 29 4 10  0 -1 -0.5
## 30 4 10  0 -1 -0.5
## 31 4  9  0 -1 -0.5
## 32 4 11  0 -1 -0.5

Of course, we could test these contrasts with modified \(t\) tests, just like post hoc tests. Because these contrasts are orthogonal to each other, each corresponds to a distinct part of the variability caused by the treatment effect: \(SS_{treat}=SS_{\psi_1}+SS_{\psi_2}+SS_{\psi_3}\). As when testing the main effect of treatment, the error term in the \(F\) test for each contrast is the same \(MS_{error}\). The independent variable now becomes these three orthogonal contrasts, so the same aov() is applied with a different formula. In the summary table below, the three contrasts are tested against the same error term. Note that the df and SS (hence MS) of the residuals are exactly those of the error term in the summary table for the omnibus \(F\) test. Also, if you add up the sums of squares of c1, c2, and c3, you get the same value as the SS for G in the omnibus summary table. The test results are consistent with our inspection: there is no difference between the first two conditions, whereas the difference between the last two conditions and the difference between the average of the first two means and the average of the last two means are both significant.

contrasts.aov<-aov(Y~c1+c2+c3,dta)
summary(contrasts.aov)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## c1           1   2.25    2.25   1.537    0.225    
## c2           1  30.25   30.25  20.659 9.61e-05 ***
## c3           1 162.00  162.00 110.634 3.12e-11 ***
## Residuals   28  41.00    1.46                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
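We can verify this partition by hand with the usual formula \(SS_{\psi}=n\hat{\psi}^2/\sum_i c_i^2\), where \(n\) is the group size. A minimal sketch, using the group means from the data above:

```r
n <- 8                                # scores per group
gmeans <- c(2.75, 3.50, 6.25, 9.00)   # with(dta, tapply(Y, G, mean))
# Sum of squares for one contrast: n * psi^2 / sum of squared coefficients
SSpsi <- function(cf) n * sum(cf * gmeans)^2 / sum(cf^2)
c(SSpsi(c(1, -1, 0, 0)),              # SS for c1
  SSpsi(c(0, 0, 1, -1)),              # SS for c2
  SSpsi(c(0.5, 0.5, -0.5, -0.5)))     # SS for c3
## [1]   2.25  30.25 162.00
```

Their sum, 2.25 + 30.25 + 162 = 194.5, matches the SS for G in the omnibus summary table.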

Example 2

A study would like to examine which treatment is better for reducing stress. Three treatments were administered to different participants and the data are contained in stress.txt. Please answer the following questions: (1) is there an effect of treatment on stress reduction, and (2) are the comparison between “medical” and “physical”, and the comparison between “mental” and the average of “medical” and “physical”, statistically significant?

Describe Data

Now let us import the data and describe them. The two columns in this data frame represent the treatment and the stress-reduction score, in that order.

stress.dta<-read.table("stress.txt",header=T,sep=",")
head(stress.dta)
##   Treatment StressReduction
## 1    mental               2
## 2    mental               2
## 3    mental               3
## 4    mental               4
## 5    mental               4
## 6    mental               5

Since there is no quantitative relation among the three treatments, a bar plot of the means is appropriate. The visual inspection of this figure suggests that the medical treatment reduces stress the least among all treatments.

# Compute the group means
stress.means<-with(stress.dta,tapply(StressReduction,Treatment,mean))
# Count the size of each group
nums<-table(stress.dta$Treatment)
# Compute the standard errors
stress.ses<-with(stress.dta,tapply(StressReduction,Treatment,sd))/sqrt(nums)
stress.fg<-ggplot(data=data.frame(cond=names(nums),
                                  SR=stress.means,
                                  ses=as.numeric(stress.ses)),
                  aes(x=cond,y=SR))+
             theme_bw()+theme(plot.background=element_blank(),
                  panel.grid.major=element_blank(),
                  panel.grid.minor=element_blank())+
             geom_bar(stat="identity",fill="deepskyblue2")+
             geom_errorbar(aes(ymin=SR-ses,ymax=SR+ses),width=0.2,
                   position=position_dodge(0.9))
stress.fg

ANOVA

Since this is a between-subjects design, a one-way between-subjects ANOVA is suitable here. The \(F\) test for the treatment effect is significant, supporting our inspection.

stress.aov<-aov(StressReduction~Treatment,stress.dta)
summary(stress.aov)
##             Df Sum Sq Mean Sq F value Pr(>F)  
## Treatment    2  11.67   5.833   5.164 0.0126 *
## Residuals   27  30.50   1.130                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Now we can test the two comparisons in the question. Since these comparisons were planned in advance, we create contrasts for them. Apparently, the physical treatment is significantly better than the medical treatment. Also, the mental treatment is significantly better than the average of the other two treatments.

stress.dta$c1<-with(stress.dta,ifelse(Treatment=="physical",1,
                                      ifelse(Treatment=="medical",-1,0)))
stress.dta$c2<-with(stress.dta,ifelse(Treatment=="mental",-1,0.5))
stressC.aov<-aov(StressReduction~c1+c2,stress.dta)
summary(stressC.aov)
##             Df Sum Sq Mean Sq F value Pr(>F)  
## c1           1  5.000   5.000   4.426 0.0448 *
## c2           1  6.667   6.667   5.902 0.0221 *
## Residuals   27 30.500   1.130                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
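A quick check (the vector names cc1 and cc2 are just for illustration) confirms that these two planned contrasts are orthogonal; note that the factor levels are in alphabetical order: medical, mental, physical.

```r
cc1 <- c(-1, 0, 1)       # medical vs. physical
cc2 <- c(0.5, -1, 0.5)   # mental vs. the average of the other two
sum(cc1 * cc2)
## [1] 0
```

Accordingly, their sums of squares (5.000 + 6.667 = 11.667) add up to the Treatment SS in the omnibus summary table.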

Application of Contrast: Trend Analysis

As shown in the previous introduction of contrasts, a contrast is not limited to the comparison between two means. Generally speaking, it is a linear combination of all the means. With properly chosen coefficients, different relations among the condition means can be defined. For instance, for an independent variable with 3 equally spaced levels, when the coefficients are set as [-1 0 1], the contrast refers not only to the comparison between the first and third means but also to a linear relation among the means. Similarly, when the coefficients are set as [-1 2 -1], the contrast refers to a quadratic relation among these three means. It is common for experiments to have a continuous independent variable, such as the dose of a drug, the duration of sleep deprivation, or the amplitude of noise. The subjects’ performance across the levels of such a variable reveals the effectiveness of the independent variable, which can be described mathematically as a trend. The analysis that tests for different kinds of trends in the treatment effect is therefore called trend analysis.
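To see why [-1 2 -1] captures curvature, note that it evaluates to 0 for any three means lying exactly on a straight line. A small sketch with made-up means:

```r
xbar <- c(1, 2, 3)        # hypothetical means lying exactly on a line
sum(c(-1, 0, 1) * xbar)   # linear contrast picks up the slope
## [1] 2
sum(c(-1, 2, -1) * xbar)  # quadratic contrast: zero, no curvature
## [1] 0
```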

Mathematically speaking, a polynomial of degree 1 (i.e., \(y=ax+b\)) is linear, one of degree 2 (i.e., \(y=ax^2+bx+c\)) is quadratic, and one of degree 3 (i.e., \(y=ax^3+bx^2+cx+d\)) is cubic. We can apply linear contrasts to test the trend of the treatment effect across the means. See the example below.

Example 3

Giancola and Corman (2007) tested the relationship between alcohol and aggression in the presence of a distracting task. The data are listed as follows. We can easily recognize the experimental design by reading the code. First, there are 5 conditions, as the first argument of gl() is 5. Second, there are 12 subjects in each condition, as the second argument of gl() specifies that each of the 5 levels repeats 12 times. Third, this is a between-subjects design, as there is no column for subject ids.

alc.dta<-data.frame(cond=gl(5,12,12*5,labels=1:5),
                score=c(1.28,1.35,3.31,3.06,2.59,3.25,2.98,1.53,-2.68,2.64,1.26,1.06,
                        -1.18,0.15,1.36,2.61,0.66,1.32,0.73,-1.06,0.24,0.27,0.72,2.28,
                        -0.41,-1.25,-1.33,-0.47,-0.60,-1.72,-1.74,-0.77,-0.41,-1.2,-0.31,-0.74,
                        -0.85,0.14,-1.38,1.28,1.85,-0.59,-1.3,0.38,-0.35,2.29,-0.25,0.51,
                        2.01,0.4,2.34,-1.8,5,2.27,6.47,2.94,0.47,3.22,0.01,-0.66))
dim(alc.dta)
## [1] 60  2

Describe Data

Since the levels of the independent variable in this example are amounts of alcohol consumed, a line plot makes more sense than a bar plot for showing the trend of the influence of alcohol on aggressive behavior in a distracting task. As shown in the figure below, a quadratic trend is expected.

alc.means<-with(data=alc.dta,tapply(score,cond,mean))
alc.nums<-with(data=alc.dta,tapply(score,cond,length))
alc.ses<-with(data=alc.dta,tapply(score,cond,sd))/sqrt(alc.nums)
alcohol.fg<-ggplot(data=data.frame(Condition=1:5,
                                   Agg=alc.means,Agg.ses=alc.ses),
                   aes(x=Condition,y=Agg))+
                   geom_point()+geom_line()+
                   geom_errorbar(aes(ymin=Agg-Agg.ses,ymax=Agg+Agg.ses),
                                 width=0.2)
alcohol.fg

ANOVA

As usual, the omnibus \(F\) test is conducted first, as it verifies the hypothesis that alcohol influences aggressive behavior in a distracting task. Obviously, the main effect of alcohol is significant.

alc.aov1<-aov(score~cond,alc.dta)
summary(alc.aov1)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## cond         4  66.45  16.612   7.341 8.22e-05 ***
## Residuals   55 124.46   2.263                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Trend Analysis

Now we can test the trends among these 5 means. Since a trend among the means is just a linear contrast, the trends to be tested must be orthogonal to each other. Note that for \(k\) means, there are \(k-1\) orthogonal trends to test. Just as when testing contrasts in R, we need to create new columns for the trend coefficients. Since we have 5 means in this case, there are 4 trends to test, from degree 1 to degree 4. The results clearly support the quadratic trend, as expected from the visual inspection of the line plot. That is, aggressive behavior decreased to its lowest level when the subjects consumed an intermediate amount of alcohol. The other trends are not significant.

alc.dta$C1<-rep(c(-2,-1,0,1,2),each=12)
alc.dta$C2<-rep(c(2,-1,-2,-1,2),each=12)
alc.dta$C3<-rep(c(-1,2,0,-2,1),each=12)
alc.dta$C4<-rep(c(1,-4,6,-4,1),each=12)
alc.trends.aov<-aov(score~C1+C2+C3+C4,alc.dta)
summary(alc.trends.aov)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## C1           1   0.15    0.15   0.068    0.796    
## C2           1  60.32   60.32  26.658 3.45e-06 ***
## C3           1   1.58    1.58   0.699    0.407    
## C4           1   4.39    4.39   1.940    0.169    
## Residuals   55 124.46    2.26                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

You may be wondering how to set up coefficients that yield orthogonal trends. This webpage provides some examples, listing the coefficients for trends up to degree 6.
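In R you can also generate such coefficients with the built-in contr.poly(), whose columns are the orthogonal polynomial contrasts for a given number of levels. Each column is scaled to unit length, so it is proportional to, not identical with, the integer coefficients used above, but it yields the same \(F\) tests.

```r
# Columns .L, .Q, .C, ^4 correspond to the linear, quadratic, cubic,
# and quartic trends; e.g. .L is proportional to c(-2, -1, 0, 1, 2)
round(contr.poly(5), 3)
```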

Exercise:

The pain-relieving effect of morphine is known to gradually decrease as the number of injections increases, suggesting that the nervous system becomes resistant to morphine. In a study of the effect of morphine, five different groups of rats were tested for pain tolerance, measured as how long they could stand the heat of the cage floor. The first group was injected with morphine for the first four days and saline on the fifth day. The second group was injected with morphine on all five days. The third group was injected with saline throughout the five days. The fourth group was injected with saline for the first four days and morphine on the fifth day. The last group was injected with morphine throughout the five days; however, on the fifth day they were moved to a new cage. The data simulated for this study are listed below. Please analyze these data and try to answer whether the resistance to morphine is physically caused or not.

morphine.dta<-data.frame(G=gl(5,8,5*8,labels=c("MS","MM","SS","SM","McM")),
                         Y=c(3,5,1,8,1,1,4,9,
                             2,12,13,6,10,7,11,19,
                             14,6,12,4,19,3,9,21,
                             29,20,36,21,25,18,26,17,
                             24,26,40,32,20,33,27,30))