Testing a Sample Mean When Population Standard Deviation is Known

In Chapter 7, the author showed a simple case of testing a sample mean when the population standard deviation is known. In that case, 166 children from homes in which at least one parent had a history of depression completed the Youth Self-Report; the sample mean was 55.71 with a standard deviation of 7.35. We want to test the null hypothesis that these children come from a normal population with a mean of 50 and a standard deviation of 10. Let \(\mu\) be the mean of the population from which these children are drawn. Thus, \(H_o: \mu=\mu_o\) and \(H_1: \mu\neq\mu_o\), where \(\mu_o=50\). Since the mean is what we want to test, we need to determine the parameters of the sampling distribution of the mean when \(H_o\) is true.

In this case, the mean of the sampling distribution under \(H_o\) is obviously 50. What about the standard deviation of that sampling distribution? That is the standard error \(\frac{\sigma}{\sqrt{N}}=\frac{10}{\sqrt{166}}\). The code below computes the probability of obtaining a sample mean at least as extreme as 55.71 when \(H_o\) is true, which is 1.88e-13. We can therefore reject \(H_o\) and conclude that these children do not come from that normal population. Note that this is a two-tailed test, whereas the function pnorm( ) only computes the probability below a particular value and thus suits a one-tailed test. Therefore, we double the probability of getting a value equal to or larger than the sample mean on the \(H_o\) distribution in order to cover the opposite direction.

(1-pnorm(55.71,50,10/166^0.5))*2
## [1] 1.882938e-13
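Equivalently, we can convert the sample mean to a \(z\) score first and take the lower tail of the standard normal distribution with 2*pnorm(-abs(z)), which is sign-safe regardless of the direction of the difference. This is a small sketch of the same test, not part of the original analysis:

```r
# Same two-tailed z test, written via an explicit z score
xbar <- 55.71; mu0 <- 50; sigma <- 10; N <- 166
z <- (xbar - mu0) / (sigma / sqrt(N))   # standardized sample mean
p <- 2 * pnorm(-abs(z))                 # two-tailed p value
c(z = z, p = p)
```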

Testing a Sample Mean when Population Standard Deviation is Unknown

In general, we rarely know the population standard deviation \(\sigma\) and usually have to estimate it with the sample standard deviation \(s\). However, \(s\) is more likely to underestimate \(\sigma\) than to overestimate it, especially for small samples. The relevant description can be seen in Section 7.3, and the code below illustrates how Figure 7.3 is created.

library(ggplot2)
library(gridExtra)
set.seed(1234)
# Create population
p<-rnorm(10000,5,50^0.5)
sd.p<-sd(p) # Population SD
# Generate random samples of size = 5
s5<-sapply(1:10000,function(i)sample(p,5,replace=T))
# Compute the s of each sample
s5.sds<-apply(s5,2,sd)
# Generate random samples of size = 30
s30<-sapply(1:10000,function(i)sample(p,30,replace=T))
# Compute the s of each sample
s30.sds<-apply(s30,2,sd)
# Plot histograms
fg1<-ggplot(data=data.frame(x=s5.sds^2),aes(x))+
  geom_histogram(color="white",fill="deepskyblue2",bins=30)+
  geom_vline(xintercept=50)
fg2<-ggplot(data=data.frame(x=s30.sds^2),aes(x))+
  geom_histogram(color="white",fill="tomato",bins=30)+
  geom_vline(xintercept=50)
grid.arrange(fg1,fg2,ncol=2)

Because the sample variance tends to underestimate the population variance when the sample size is small, we would get too many “significant” test results if the raw score were transformed into a “\(z\)” score using the sample SD as the standard deviation. Thus, the \(z\) distribution is not a good choice for testing the mean when \(\sigma\) is unknown and the sample size is small. Gosset (1908), publishing under the pseudonym “Student”, showed that if the data are sampled from a normal distribution, using \(s^2\) in place of \(\sigma^2\) leads to a particular sampling distribution, now generally known as Student’s \(t\) distribution or simply the \(t\) distribution.
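The tendency of \(s\) to fall below \(\sigma\) can also be checked numerically rather than visually. The sketch below repeats the simulation idea in a self-contained form and compares the average sample SD with the “population” SD; the seed and population are arbitrary illustrative choices:

```r
set.seed(1234)
p <- rnorm(10000, 5, 50^0.5)           # treat these 10000 scores as the population
sd.p <- sd(p)                          # "population" SD, about sqrt(50)
# Average sample SD over 10000 samples, for n = 5 and n = 30
mean.s5  <- mean(replicate(10000, sd(sample(p, 5,  replace = TRUE))))
mean.s30 <- mean(replicate(10000, sd(sample(p, 30, replace = TRUE))))
c(sd.p, mean.s5, mean.s30)             # both averages fall below sd.p
```

The underestimation is clearly larger for n = 5 than for n = 30, matching the histograms above.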

Similar to \(z\), \(t=\frac{\bar{x}-\mu}{s/ \sqrt{N}}\). The probability density function of \(t\) is \(f(t)=\frac{\Gamma(\frac{\nu+1}{2})}{\sqrt{\nu\pi}\,\Gamma(\frac{\nu}{2})}(1+\frac{t^2}{\nu})^{-\frac{\nu+1}{2}}\). Unlike \(z\), the shape of the \(t\) distribution varies with a single parameter \(\nu\), namely the degrees of freedom or \(df\), while the \(t\) statistic itself is computed from \(\bar{x}\), \(s\), and \(N\). Let us see how the \(t\) distribution varies with \(df\). The code below plots the probability density of \(t\) for three \(df\)’s together with the probability density of \(z\). As can be seen, the larger the \(df\), the more similar \(t\) is to \(z\). Thus, as \(n\rightarrow\infty\), \(t\rightarrow z\). Conventionally, n = 30 is regarded as sufficiently large. For the one-sample case, \(df=n-1\).

# Create a series of numbers from -3 to 3 with interval = 0.1
x<-seq(-3,3,0.1)
# Generate probability density of t for three df's and z
dta<-data.frame(x=rep(x,4),y=c(dt(x,5),dt(x,15),dt(x,30),dnorm(x,0,1)),
           g=rep(c("t5","t15","t30","z"),each=length(x)))
ggplot(data=dta,aes(x,y,color=g))+
  geom_line()
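The convergence of \(t\) to \(z\) also shows up in the critical values. As a quick numerical sketch, the two-tailed .05 critical value of \(t\) shrinks toward the \(z\) critical value of 1.96 as \(df\) grows:

```r
# Two-tailed .05 critical values of t approach qnorm(0.975) = 1.96 as df grows
dfs <- c(5, 15, 30, 1000)
crit.t <- qt(0.975, dfs)
crit.z <- qnorm(0.975)
round(rbind(df = dfs, crit.t = crit.t), 3)   # compare each column with crit.z
```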

Example of one-sample t test

Nurcombe et al. (1984) reported on an intervention program for the mothers of low-birthweight (LBW) infants. This program was designed to make mothers more aware of their infants’ signals and more responsive to their needs. One of the dependent variables used in this program was the Psychomotor Development Index (PDI) of the Bayley Scales of Infant Development. The researchers were interested in whether LBW infants in general differed significantly from the normative population mean of 100 usually found with this index. Therefore, we have the hypothesis that the LBW infants differ from the normal population with a mean of 100, namely \(H_1: \mu\neq100\) and \(H_o: \mu=100\). As we do not know the standard deviation of the normative population distribution on the PDI, we use the \(t\) distribution to test our hypothesis. In the code below, I first import the data from a text file “7_1.txt” with the function scan( ), because the data in this file are not structured and we cannot use a function such as read.table( ), which is designed for importing spreadsheet-like data.

# Import data
scores<-scan("7_1.txt",what=double(),sep=",")
# Describe data
fg3<-ggplot(data=data.frame(x=rep("LBW",length(scores)),y=scores),aes(x,y))+
  geom_boxplot()+coord_flip()
fg4<-ggplot(data=data.frame(x=scores),aes(sample=x))+
  geom_qq(shape=1,color="red")+geom_qq_line()
grid.arrange(fg3,fg4,ncol=2)

Although the Q-Q plot here does not suggest that the distribution of the scores is normal, the sample size (56) is large enough for us to assume that the sampling distribution of the mean is reasonably normal. Then we can use the \(t\) distribution to test our hypothesis. Below are two ways to do a \(t\) test in R. The first is to transform the sample mean into a \(t\) score and compute the probability of values more extreme than that \(t\) value on the \(t\) distribution when \(H_o\) is true. Since this is a two-tailed test, the probability beyond the \(t\) score for the sample mean is doubled to cover the opposite direction. The second way is to use the function t.test( ) to run a one-sample \(t\) test directly, with the argument mu set to the mean in \(H_o\). The report shows that \(df\) is 55 and the \(p\) value is smaller than .05. Thus, \(H_o\) is rejected, and the LBW infants do perform differently from the mean of the normative distribution.

xt<-(mean(scores)-100)/(sd(scores)/sqrt(length(scores)))
(1-pt(xt,length(scores)-1))*2
## [1] 0.01736831
t.test(scores,mu=100)
## 
##  One Sample t-test
## 
## data:  scores
## t = 2.4529, df = 55, p-value = 0.01737
## alternative hypothesis: true mean is not equal to 100
## 95 percent confidence interval:
##  100.7549 107.4951
## sample estimates:
## mean of x 
##   104.125
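The object returned by t.test( ) can also be queried programmatically, which is handy when the test is embedded in a larger script. The sketch below uses simulated scores (a stand-in, not the 7_1.txt data) just to show the mechanics; the sign-safe manual form 2*pt(-|t|, df) matches the \(p\) value t.test( ) reports:

```r
set.seed(42)
fake.scores <- rnorm(56, 104, 13)                # hypothetical stand-in data
res <- t.test(fake.scores, mu = 100)
# Manual t and p, using the sign-safe two-tailed form 2*pt(-|t|, df)
xt <- (mean(fake.scores) - 100) / (sd(fake.scores) / sqrt(length(fake.scores)))
p.manual <- 2 * pt(-abs(xt), length(fake.scores) - 1)
c(res$statistic, p.t.test = res$p.value, p.manual = p.manual)
```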

Another example comes from Kaufman and Rock’s (1962) study of the moon illusion. The moon illusion refers to the fact that when we see the moon near the horizon, it appears considerably larger than when we see it high in the sky.

Kaufman and Rock used a special apparatus to measure subjects’ estimated size of the moon appearing on the horizon. The dependent variable was the ratio of the estimated size of the horizon moon to that of the standard (zenith) moon. Thus, a ratio of 1.00 would indicate no illusion, so \(H_1: \mu\neq1\) and \(H_o: \mu=1\). For the 10 subjects, the mean of the ratios was 1.463 and the standard deviation was 0.341. Since the \(p\) value computed below is smaller than .05, we can reject \(H_o\) at \(\alpha=.05\).

xt<-(1.463-1)/(0.341/sqrt(10))
xt
## [1] 4.29365
(1-pt(xt,10-1))*2
## [1] 0.002009281
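Incidentally, pt( ) has a lower.tail argument, so the upper-tail probability can be requested directly instead of subtracting from 1. A small equivalent sketch of the same test:

```r
# Two-tailed p for the moon illusion data, without the 1 - ... subtraction
xt <- (1.463 - 1) / (0.341 / sqrt(10))
2 * pt(xt, df = 9, lower.tail = FALSE)
```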

Confidence Interval on the Mean

Confidence intervals are a useful way to convey the meaning of an experimental result that goes beyond a simple hypothesis test. Take the moon illusion study as an example. The mean ratio estimated by the 10 subjects is 1.463. Due to sampling error and measurement error, we would be quite unlikely to get exactly the same number if we ran the experiment with another 10 subjects. Thus, we can set up a range intended to encompass the population mean, consisting of an upper limit and a lower limit on \(\mu\). By convention, we construct the interval so that 95% of intervals computed this way would contain the true \(\mu\). Let us start from the \(t\) distribution, whose mean is always 0. The lower \(t\) limit is the value below which 2.5% of the \(t\) scores lie; similarly, the upper \(t\) limit is the value above which 2.5% of the \(t\) scores lie. The code below first computes these two limits on the \(t\) distribution with \(df=9\). Rearranging \(t=\frac{\bar{x}-\mu}{s/ \sqrt{N}}\), the upper limit for the population mean is \(\bar{x}+t_{.975}\,s/\sqrt{N}\), which is 1.707, and the lower limit is \(\bar{x}+t_{.025}\,s/\sqrt{N}\), which is 1.219. That is, the 95% confidence interval for the population mean is \(1.219\leq\mu\leq1.707\).

# Upper and lower limits on t with df = 9
ut<-qt(0.975,9)
lt<-qt(0.025,9)
c(lt,ut)
## [1] -2.262157  2.262157
1.463+ut*0.341/sqrt(10)
## [1] 1.706937
1.463+lt*0.341/sqrt(10)
## [1] 1.219063
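The same computation can be wrapped in a small helper that builds a confidence interval from summary statistics alone. This is an illustrative sketch; the function name ci.mean is my own, not from the text:

```r
# Hypothetical helper: CI for a mean from summary statistics
ci.mean <- function(xbar, s, n, level = 0.95) {
  alpha <- 1 - level
  tcrit <- qt(c(alpha/2, 1 - alpha/2), df = n - 1)  # lower and upper t limits
  xbar + tcrit * s / sqrt(n)                        # c(lower, upper)
}
round(ci.mean(1.463, 0.341, 10), 3)   # reproduces the limits computed above
```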