In the spring of 2000, two Applied Statistics students, Ben Jones and Emily Zabor, studied the relationship between student backpack weight and other factors. They used a convenience sample of students encountered at several locations on campus, asking passing students who were carrying backpacks to spend a few minutes filling out a survey and having their packs weighed.
To obtain the data:
(1) Select and copy the data set from the web (including the header).
(2) Inside R, type:
backpack.df <- read.table(file='clipboard', header=T)
This stores the data set in R in a structure that R calls a "data frame." (The .df suffix in the name is a reminder of this.)
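Reading from the clipboard may not work the same way on every platform. As a minimal sketch of an alternative, the pasted text can be handed to read.table through its text argument. The example uses only the first five rows of the data set, and the names backpack_text and sample.df are made up for illustration.

```r
# First five rows of the backpack data, held in a character string
# (backpack_text and sample.df are illustrative names, not part of the handout)
backpack_text <- "ID Weight Sex Height Year SocSci Science Humanities Major Live Age Location
1 5.0 M 68.0 4 2 0 1 ss S 0.5 ARH
2 3.0 M 75.0 1 2 1 1 ss N 1.0 ARH
3 2.0 F 61.0 1 2 0 2 unknown S 1.0 ARH
4 20.0 F 63.0 1 2 1 2 s N 3.0 ARH
5 7.0 F 62.0 1 1 2 1 unknown N 3.0 ARH"

# Parse the string exactly as read.table would parse a file with a header row
sample.df <- read.table(text = backpack_text, header = TRUE)

str(sample.df)   # one line per column: name, type, first few values
head(sample.df)  # first rows of the data frame
```

Running str() right after reading is a quick way to confirm that all 12 columns arrived and that Weight, Height, and Age were read as numbers.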
ID Weight Sex Height Year SocSci Science Humanities Major Live Age Location
1 5.0 M 68.0 4 2 0 1 ss S 0.5 ARH
2 3.0 M 75.0 1 2 1 1 ss N 1.0 ARH
3 2.0 F 61.0 1 2 0 2 unknown S 1.0 ARH
4 20.0 F 63.0 1 2 1 2 s N 3.0 ARH
5 7.0 F 62.0 1 1 2 1 unknown N 3.0 ARH
6 9.0 M 72.0 1 2 0 2 h S 7.0 ARH
7 10.0 M 74.0 2 3 0 1 ss S 2.0 ARH
8 6.0 F 64.0 1 2 1 1 unknown S 1.0 ARH
9 9.0 F 63.0 2 2 0 2 ss S 2.0 ARH
10 6.0 M 69.0 2 0 2 1 s/h N 3.0 ARH
11 18.5 F 64.0 1 2 0 2 ss S 4.0 ARH
12 9.5 M 69.0 1 2 1 1 s S 1.0 ARH
13 8.5 M 75.0 1 1 2 1 s/h S 1.5 ARH
14 9.0 M 75.0 1 2 1 1 ss S 0.1 ARH
15 9.5 F 65.0 1 1 0 2 ss/h N 3.0 ARH
16 16.0 F 64.0 1 1 1 2 s S 6.0 Burl
17 24.0 F 64.0 2 1 0 3 h N 0.3 Burl
18 15.0 F 63.0 3 2 2 0 s N 3.0 Burl
19 8.0 F 67.0 3 2 0 2 h N 3.0 Burl
20 10.0 F 67.5 1 3 1 0 s S 4.0 Burl
21 17.0 F 64.5 1 1 2 1 s N 0.5 Burl
22 9.0 F 64.0 3 2 0 2 h S 3.0 Burl
23 8.0 F 64.0 3 0 1 3 ss/h N 0.7 Burl
24 10.0 F 59.0 3 2 1 1 ss N 0.6 Burl
25 11.0 M 70.0 2 2 0 2 h S 2.0 Burl
26 11.0 M 70.0 4 2 0 1 h S 2.0 Burl
27 10.0 F 64.0 2 1 0 3 h O 1.0 Burl
28 11.0 F 63.5 3 0 1 3 h O 4.0 Burl
29 5.0 F 68.0 4 1 0 2 h S 2.0 Burl
30 15.0 M 72.0 3 4 1 0 ss N 4.0 Burl
31 15.0 M 74.0 3 0 3 1 s O 1.0 Burl
32 20.0 F 67.0 2 2 2 0 s S 1.0 Burl
33 18.0 F 67.0 2 2 2 0 s S 0.5 Burl
34 12.0 F 69.0 3 1 1 2 ss/h N 0.3 Burl
35 17.0 M 69.0 3 1 2 2 ss/h S 1.5 Burl
36 13.0 M 72.0 1 0 2 2 h S 1.5 Beach
37 10.0 M 67.0 2 2 0 2 h O 1.5 Beach
38 12.0 M 65.0 1 1 2 1 s S 0.5 Beach
39 13.0 F 62.0 1 1 0 3 h S 3.0 Beach
40 5.0 M 71.0 1 1 1 1 h N 1.0 Beach
41 14.0 F 62.0 4 0 0 4 h S 4.0 Beach
42 5.0 M 70.0 4 2 0 2 ss S 10.0 Beach
43 5.0 F 67.0 4 2 0 1 ss S 2.0 Beach
44 18.5 M 68.0 2 0 1 2 h N 3.0 Beach
45 12.0 F 67.0 3 2 0 2 h O 3.0 Beach
46 5.0 F 65.0 3 2 0 2 ss S 20.0 Beach
47 28.0 M 70.0 4 1 0 2 h S 0.3 Beach
48 9.0 M 71.0 4 3 1 1 s S 1.5 Beach
49 14.0 F 70.0 4 1 0 2 ss/h O 1.0 Beach
50 15.0 M 70.0 3 1 2 1 s O 4.0 Beach
51 8.0 M 68.0 3 0 0 3 h S 4.0 ARH
52 11.0 M 74.0 4 2 0 1 ss O 4.0 ARH
53 6.0 F 68.0 2 1 1 2 h O 0.5 ARH
54 4.0 F 67.0 3 2 1 1 s S 3.0 ARH
55 11.0 M 72.0 1 1 2 1 h S 3.0 ARH
56 3.0 F 68.0 2 2 1 1 ss S 1.0 ARH
57 13.0 M 68.0 1 2 1 1 ss N 3.0 ARH
58 10.0 F 67.0 3 3 0 0 ss S 0.5 ARH
59 5.0 M 75.0 1 1 2 1 ss/h S 2.0 ARH
60 3.0 F 62.0 1 2 0 2 ss S 3.5 ARH
61 12.0 F 66.0 4 1 0 3 ss O 5.0 ARH
62 15.0 M 70.0 2 2 2 0 s S 0.5 ARH
63 18.0 M 71.0 2 2 1 1 ss S 3.0 ARH
64 7.0 F 61.0 4 0 1 2 s N 0.3 ARH
65 5.0 F 67.0 1 1 1 2 h S 3.0 ARH
66 5.0 M 70.0 3 1 2 1 s S 5.0 ARH
67 5.0 F 68.0 1 1 1 2 h S 5.0 ARH
68 2.0 F 67.0 3 2 1 1 ss N 3.0 ARH
69 15.0 F 67.0 4 4 0 0 ss/h S 6.0 ARH
70 7.0 M 71.0 3 2 1 1 h N 0.5 ARH
71 9.0 M 71.0 3 4 1 2 ss N 2.3 ARH
72 5.0 F 65.0 3 1 2 1 s N 2.0 ARH
73 2.0 F 66.0 1 1 2 1 s N 1.0 ARH
74 6.0 F 66.0 2 2 2 0 s S 4.0 ARH
75 5.0 F 70.0 2 0 1 3 h S 2.0 ARH
76 17.0 M 66.0 4 0 2 2 s/h O 3.0 ARH
77 10.0 F 69.0 4 2 2 0 s/ss N 3.0 ARH
78 10.0 F 67.0 1 0 3 0 s N 0.8 ARH
79 9.0 M 69.0 3 1 1 2 h S 5.0 ARH
80 17.0 M 79.0 1 1 1 2 h N 4.5 ARH
81 15.0 M 70.0 3 0 2 2 s N 4.0 ARH
82 10.0 M 72.0 2 3 0 1 h N 5.0 ARH
83 9.0 M 72.0 2 1 0 3 h N 0.3 ARH
84 22.0 M 75.0 1 1 2 1 s N 2.0 ARH
85 4.0 F 69.0 4 1 0 3 h O 0.0 ARH
Try the following R commands.
backpack.df[,2:12] # to see the data frame, minus ID variable
attach(backpack.df) # attach the data frame
tapply(Weight,Sex,mean)
tapply(Weight,Sex,sd) # get mean and sd
boxplot(Weight~Sex) # side-by-side boxplots of Weight by Sex
t.test(Weight ~ Sex) # two-sample t-test (Welch's version)
t.test(Weight ~ Sex, var.equal=TRUE) # two-sample t-test (assume equal variances)
NOTE: At this point, exit from R, BUT SAVE THE WORKSPACE IMAGE.
Now, re-enter R and find backpack.df still there.
Re-attach backpack.df.
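Re-attaching is needed because the saved workspace holds the objects you created, not the list of attached data frames. The round trip can be sketched without leaving R by using save() and load() on a temporary file. The three-row sample.df and the file name ws.file are illustrative stand-ins, not part of the exercise.

```r
# A small stand-in for backpack.df (first three Weight/Sex values from the data)
sample.df <- data.frame(Weight = c(5.0, 3.0, 2.0), Sex = c("M", "M", "F"))

# Save the object to a temporary .RData file, much as quitting with
# "save workspace image" writes the workspace to a .RData file
ws.file <- tempfile(fileext = ".RData")
save(sample.df, file = ws.file)

# Simulate a fresh session by removing the object, then restore it
rm(sample.df)
load(ws.file)

exists("sample.df")            # the data frame is back
mean(sample.df$Weight)         # columns are reachable with $ ...
with(sample.df, mean(Weight))  # ... or with(), without attaching anything
```

Using with() or the $ operator avoids having to remember to re-attach the data frame in each new session.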
Alternatively, you can call the t.test function on two separate vectors, one for each sample. For example:
menpacks <- split(Weight, Sex)$M
womenpacks <- split(Weight, Sex)$F
t.test(menpacks,womenpacks)
t.test(menpacks,womenpacks,var.equal=T)
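To see what the default (Welch) version of t.test computes from two vectors, the statistic and its degrees of freedom can be reproduced by hand. This is a minimal sketch using only the first ten students in the data set. The names se2.m, se2.w, t.welch, and df.welch are made up for illustration.

```r
# Weight and Sex for the first ten students in the data set
Weight <- c(5.0, 3.0, 2.0, 20.0, 7.0, 9.0, 10.0, 6.0, 9.0, 6.0)
Sex    <- c("M", "M", "F", "F", "F", "M", "M", "F", "F", "M")

# split() returns a list with one vector of weights per group
packs      <- split(Weight, Sex)
menpacks   <- packs$M
womenpacks <- packs$F

# Squared standard error of each sample mean
se2.m <- var(menpacks) / length(menpacks)
se2.w <- var(womenpacks) / length(womenpacks)

# Welch t statistic: difference in means over the combined standard error
t.welch <- (mean(menpacks) - mean(womenpacks)) / sqrt(se2.m + se2.w)

# Welch-Satterthwaite approximation to the degrees of freedom
df.welch <- (se2.m + se2.w)^2 /
  (se2.m^2 / (length(menpacks) - 1) + se2.w^2 / (length(womenpacks) - 1))

# Compare with what t.test reports
fit <- t.test(menpacks, womenpacks)
all.equal(unname(fit$statistic), t.welch)
all.equal(unname(fit$parameter), df.welch)
```

Note that t.test(menpacks, womenpacks) takes the difference as men minus women, whereas the formula version t.test(Weight ~ Sex) orders the groups alphabetically (F before M), so the two t statistics have opposite signs while the p-value is the same.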