A study by Zhong, Bohns, and Gino (2010) was designed to test the following research hypothesis:
Darkness may conceal identity and encourage moral transgressions, and if so, participants completing a test in a dimly lit room should self-report a test score that is higher than what they actually earned.
Independent variable (X): Eighty-four college student participants were randomly assigned to one of two testing conditions: 1) a small laboratory room that was normally illuminated or 2) a small laboratory room with dimmed lighting. The dimmed lighting was sufficient for participants to see the behavioral task, but it was visibly dimmer than the well-lit room.
Behavioral Task: Upon entering the testing room, participants received an envelope that contained ten dollars, which could be earned during the behavioral task. For the task, participants received 20 number matrices on separate pieces of paper, each of which consisted of 12 three-digit numbers in a grid format (see below). Participants had 5 minutes to find two numbers in each matrix that added up to 10.0. At the end of 5 minutes, participants scored their own matrices and collected 50 cents from the envelope for each correct matrix.
Dependent variable (Y): Participants' dishonesty was
measured as the discrepancy between their self-reported scores and their
actual test performance (scored by the researcher after participants
left the room). For example, a participant who reported 16/20 correct
and actually scored 16/20 would have a discrepancy score = 0 (no
dishonesty), whereas a participant who reported a score of 16/20 but
actually scored 10/20 would have a discrepancy score = 6, so higher
discrepancy scores indicate more dishonesty.
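The scoring rule can be sketched in a few lines of R. The vector names `reported` and `actual` are hypothetical, and the values are the two worked examples from the paragraph above, not study data.

```r
# Hypothetical vectors holding the two worked examples from the text
reported <- c(16, 16)  # self-reported number of correct matrices
actual   <- c(16, 10)  # number correct as scored by the researcher

# Discrepancy = self-reported score minus actual score
discrepancy <- reported - actual
discrepancy  # first participant: no over-reporting; second: over-reported by 6
```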
Load and view the data in the console:
honest.df = read.csv("https://andrewebrandt.github.io/object/datasets/honest.csv", header = TRUE)
head(honest.df)
##   light.cond Gender Discrep
## 1        Dim      F       3
## 2       Norm      M       5
## 3       Norm      F       2
## 4        Dim      M       4
## 5        Dim      F       4
## 6       Norm      M       2
Use the describeBy()
function in the psych
package to create a descriptive statistics summary
table for the discrepancy scores in the normal and dim light
conditions.
library(psych)
summary.df <- describeBy(Discrep ~ light.cond, # quantitative variable ~ categorical variable
                         data = honest.df, mat = TRUE, skew = FALSE, digits = 2) # data frame, matrix format, omit skew and kurtosis, round
summary.df # show the results
##          item group1 vars  n mean   sd min max range   se
## Discrep1    1    Dim    1 42 4.05 1.51   1   7     6 0.23
## Discrep2    2   Norm    1 42 3.00 1.48   0   7     7 0.23
Use the ggplot()
function in the ggplot2
package to create a strip chart for the discrepancy scores across
conditions. Add a symbol to indicate the mean and error bars to indicate
the SEM.
library(ggplot2)
ggplot(honest.df, aes(x = light.cond, y = Discrep, color = light.cond)) +
geom_jitter(na.rm = TRUE, position = position_jitter(0.1),
size = 3, alpha = 0.7) +
stat_summary(fun = mean, na.rm = TRUE, geom = "point",
shape = 18, size = 3, color = "black") +
stat_summary(fun.data = mean_se, na.rm = TRUE, geom = "errorbar",
width = .1, color = "black") +
scale_x_discrete(name = "Light Condition") +
scale_y_continuous(name = "Mean Discrepancy Score") +
theme_minimal() +
theme(text = element_text(size = 16), legend.position = "none") +
scale_color_brewer(palette="Dark2")
Whenever you plot your data, it is good practice to check your work
for accuracy by comparing the descriptive values reported in the summary
table to those in the plots (e.g., Are the means at the correct
location? Do the error bars appear to be the correct size?).
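As one such check, the SEM values in the summary table can be recomputed from the reported SDs and group sizes, since SEM = SD / sqrt(n). This is only a minimal sketch that types in the rounded values from the table above; the object names are made up for illustration.

```r
# Rounded values copied from the describeBy() summary table
sd_dim  <- 1.51; n_dim  <- 42
sd_norm <- 1.48; n_norm <- 42

# Standard error of the mean: SD divided by the square root of n
sem_dim  <- sd_dim  / sqrt(n_dim)
sem_norm <- sd_norm / sqrt(n_norm)

# Compare these with the "se" column of the summary table
round(c(Dim = sem_dim, Norm = sem_norm), 2)
```

If the recomputed values agree with the table, the error bars in the plot should extend about that far above and below each mean.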
Student’s t-statistic (William Gosset, 1908) was built on three assumptions about the populations from which the sample data were collected: independent observations, normally distributed scores, and equal variances. The degree to which these assumptions are met affects your statistical conclusion validity.
Welch’s t-statistic (Bernard Welch, 1938) is a modified version of Student’s t in which the pooled variance is replaced by the variance of each sample and a more conservative (smaller) degrees of freedom value is used. It performs similarly to Student’s t when population variances are equal, and is less prone to Type I error when they are not. It comes with only two assumptions (independent observations and normally distributed scores), because equal variances are not required.
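For reference, the standard textbook form of Welch’s statistic and its approximate degrees of freedom (the Welch-Satterthwaite equation) can be written as follows. The subscripts 1 and 2 are generic group labels and \(M\), \(s^2\), and \(n\) stand for each sample's mean, variance, and size; this notation is an assumption of the sketch rather than notation used elsewhere in this chapter.

```latex
\[
t_{Welch} = \frac{M_1 - M_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
df \approx \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}
{\dfrac{\left(s_1^2 / n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2 / n_2\right)^2}{n_2 - 1}}
\]
```

When the two sample sizes are equal and the sample variances are similar, this df should come out close to the Student's value of \(n_1 + n_2 - 2\).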
Student’s t-statistic is the ratio between the sample mean difference (numerator) and the estimated standard error of the mean difference (denominator), or simply the ratio of the treatment effect to estimated error.
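The ratio can be worked through by hand from the descriptive summary table. The sketch below assumes the rounded means, SDs, and group sizes from that table, so the result should land close to, but not exactly on, the value that t.test() computes from the raw data. All object names are illustrative.

```r
# Rounded values copied from the describeBy() summary table
m_dim  <- 4.05; sd_dim  <- 1.51; n_dim  <- 42
m_norm <- 3.00; sd_norm <- 1.48; n_norm <- 42

# Numerator: sample mean difference (treatment effect)
mean_diff <- m_dim - m_norm

# Denominator: estimated standard error of the mean difference,
# built from the pooled variance of the two samples
df_total   <- n_dim + n_norm - 2
pooled_var <- ((n_dim - 1) * sd_dim^2 + (n_norm - 1) * sd_norm^2) / df_total
se_diff    <- sqrt(pooled_var * (1 / n_dim + 1 / n_norm))

# Student's t: effect divided by estimated error
t_by_hand <- mean_diff / se_diff
t_by_hand
```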
Find t and its associated p-value using the t.test()
function:
t.test(Discrep ~ light.cond, data = honest.df, var.equal = TRUE) # use FALSE for Welch t-test
##
## Two Sample t-test
##
## data: Discrep by light.cond
## t = 3.2057, df = 82, p-value = 0.001921
## alternative hypothesis: true difference in means between group Dim and group Norm is not equal to 0
## 95 percent confidence interval:
## 0.3975129 1.6977252
## sample estimates:
## mean in group Dim mean in group Norm
## 4.047619 3.000000
Make a statistical decision: The results show that \(p \le .05\) so reject the null hypothesis, \(H_0:\mu_{normal} = \mu_{dim}\)
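To see where the p-value and confidence interval come from, they can be rebuilt from the t, df, and group means printed in the output above. This is a rough sketch using base R's pt() and qt(); the object names and the alpha level of .05 are assumptions made for the illustration.

```r
# Values copied from the t.test() output above
t_obs     <- 3.2057
df_error  <- 82
mean_diff <- 4.047619 - 3.000000

# Two-tailed p-value: area beyond |t| in both tails of the t distribution
p_two_tailed <- 2 * pt(-abs(t_obs), df_error)

# Standard error recovered from t = mean difference / SE
se_diff <- mean_diff / t_obs

# 95% confidence interval: mean difference +/- critical t * SE
t_crit <- qt(0.975, df_error)
ci <- mean_diff + c(-1, 1) * t_crit * se_diff

alpha <- 0.05
p_two_tailed <= alpha  # TRUE means reject the null hypothesis
round(ci, 2)           # compare with the interval in the output
```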
Cohen’s d expresses the magnitude of the treatment effect in terms of standard deviation (pooled).
Use the cohens_d()
function in the effectsize
package
to find d:
library("effectsize")
cohens_d(Discrep ~ light.cond, data = honest.df)
## Cohen's d | 95% CI
## ------------------------
## 0.70 | [0.26, 1.14]
##
## - Estimated using pooled SD.
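The pooled-SD version of d can also be approximated by hand from the summary table. The sketch below assumes the rounded means and SDs reported earlier, so it should agree with cohens_d() to about two decimal places; the object names are illustrative.

```r
# Rounded values copied from the describeBy() summary table
m_dim  <- 4.05; sd_dim  <- 1.51; n_dim  <- 42
m_norm <- 3.00; sd_norm <- 1.48; n_norm <- 42

# Pooled SD: square root of the df-weighted average of the two variances
pooled_sd <- sqrt(((n_dim - 1) * sd_dim^2 + (n_norm - 1) * sd_norm^2) /
                    (n_dim + n_norm - 2))

# Cohen's d: mean difference expressed in pooled-SD units
d_by_hand <- (m_dim - m_norm) / pooled_sd
round(d_by_hand, 2)
```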
The following statements show how the results may be reported in an APA-style report.
An independent-samples t test showed that discrepancy scores were
significantly higher in the dim light condition (M = 4.05,
SD = 1.51) than in the normal light condition (M =
3.00, SD = 1.48), t(82) = 3.21, p = .002,
d = 0.70.
A study by Stephens, Atkins, and Kingston (2009) was designed to test the following research hypothesis:
Yelling a swear word during a painful experience may reduce the pain sensation, and if so, pain tolerance should be higher when yelling swear words during a painful laboratory task compared to when yelling neutral words.
Behavioral Task: The cold-pressor task is a measure of pain tolerance. Participants are asked to submerge their hand in icy water and to keep it there as long as they can before removing it from the water.
Independent variable (X): Sixty-seven college student participants completed two conditions with the cold-pressor task. During the “Neutral Words” condition, participants were instructed to yell non-swear words during the cold-pressor task. During the “Swear Words” condition, the same participants were instructed to yell swear words during the cold-pressor task. Condition order was counterbalanced across participants (e.g., half the participants experienced the Neutral -> Swear condition order and the other half experienced the reverse order).
Dependent variable (Y): Several measures were
collected, but we will focus on latency, which is the number of
seconds participants kept their hand in the icy water.
Load and view the data in the console:
SBS11.2 = read.csv("https://andrewebrandt.github.io/object/datasets/SBS11.2.csv", header = TRUE)
head(SBS11.2)
##   P gender neutral swear
## 1 1      M      87   102
## 2 2      M     124   159
## 3 3      F     147   162
## 4 4      M     201   224
## 5 5      M     105   150
## 6 6      F     110   125
The data in SBS11.2 are in wide format, i.e., latency scores from the
neutral and swear conditions are in separate columns, so we will begin
by reshaping them to a long data frame called “pain.df” using the melt()
function in the reshape2 package:
library(reshape2)
pain.df <- melt(SBS11.2, # stack this data frame
                id.vars = c("P", "gender"), # don't stack these variables
                measure.vars = c("neutral", "swear"), # stack these variables
                variable.name = "X.cond", # name new X variable
                value.name = "Y.latency") # name new Y variable
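To see what the stacking step does, here is a small base R sketch that builds the same long layout without reshape2. It is not the melt() call itself; it uses only the six rows displayed by head(SBS11.2), typed in by hand, and the names `wide.demo` and `long.demo` are hypothetical.

```r
# The six rows displayed by head(SBS11.2), typed in by hand for illustration
wide.demo <- data.frame(
  P       = 1:6,
  gender  = c("M", "M", "F", "M", "M", "F"),
  neutral = c(87, 124, 147, 201, 105, 110),
  swear   = c(102, 159, 162, 224, 150, 125)
)

# Stack the two latency columns: every participant appears once per condition
long.demo <- data.frame(
  P         = rep(wide.demo$P, times = 2),
  gender    = rep(wide.demo$gender, times = 2),
  X.cond    = factor(rep(c("neutral", "swear"), each = nrow(wide.demo))),
  Y.latency = c(wide.demo$neutral, wide.demo$swear)
)

# Check the X - Y pairs for one participant against the wide data
long.demo[long.demo$P == 1, ]
```

The long data frame has twice as many rows as the wide one, and each participant's two latency scores can be matched back to the original columns.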
Once you have checked that the new data frame contains the correct X - Y pairs in each row, you can save the information to a new .csv file for later use (be sure to set your own file path).
write.csv(pain.df, "D:/My Drive/RStudio Working Directory/pain.df.csv", row.names = FALSE)
Load the psych
package, then calculate and save a
descriptive summary of latency scores in the neutral and swear word
conditions:
library(psych)
painStats.df <- describeBy(
  pain.df[4], # data frame, Y scores in fourth column
  pain.df$X.cond, # grouping variable
  mat = TRUE, # matrix format
  digits = 2) # round values to 2 digits
painStats.df # show descriptives
##            item  group1 vars  n   mean    sd median trimmed   mad min max range
## Y.latency1    1 neutral    1 67 105.39 42.41    105  103.73 35.58  17 210   193
## Y.latency2    2   swear    1 67 134.94 41.58    131  134.15 38.55  51 239   188
##            skew kurtosis   se
## Y.latency1 0.33     0.07 5.18
## Y.latency2 0.19    -0.01 5.08
Load the ggplot2
package and plot the means and SEMs
for each condition:
library(ggplot2)
ggplot(painStats.df, aes(x = group1, y = mean)) + # plot X and Y
geom_col( # bar graph
width = 0.5,
color = "black",
fill = hsv(0.3, 0.5, 0.7)) +
geom_errorbar(aes(ymin = mean-se, ymax = mean+se),# calculate error bar
color = "black", # error bar color
width = .1) + # error bar size
xlab("Word Condition") +
scale_y_continuous(name = "Mean Latency Score", limits = c(0, 150)) +
ggtitle("Pain tolerance across word conditions")
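The paired-samples t-statistic is based on two assumptions: the pairs of scores are independent of one another, and the difference scores are normally distributed in the population.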
The paired-samples t-statistic is a one-sample t-test applied to difference scores (D); in this case, the difference between a participant’s latency scores in the neutral and swear conditions: \(D = Y_{neutral} - Y_{swear}\)
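The equivalence can be checked on a small scale. The sketch below uses only the six participants displayed by head(SBS11.2), so its t value will not match the full-sample result; it is meant to show that the by-hand ratio, the one-sample test on D, and the paired test all give the same statistic. It uses the two-vector form of t.test() because, as far as we are aware, recent R releases may reject paired = TRUE when it is combined with a formula.

```r
# The six rows displayed by head(SBS11.2), typed in by hand for illustration
neutral <- c(87, 124, 147, 201, 105, 110)
swear   <- c(102, 159, 162, 224, 150, 125)

# Difference score for each participant
D <- neutral - swear

# One-sample t on the difference scores: mean of D over its standard error
t_by_hand <- mean(D) / (sd(D) / sqrt(length(D)))

# The same statistic from t.test(), computed two ways
t_one_sample <- unname(t.test(D, mu = 0)$statistic)
t_paired     <- unname(t.test(neutral, swear, paired = TRUE)$statistic)

c(by_hand = t_by_hand, one_sample = t_one_sample, paired = t_paired)
```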
t.test(
  Y.latency ~ # data, DV scores
    X.cond, # grouping variable, IV condition labels
  data = pain.df, # data frame
  paired = TRUE) # paired t-test
##
## Paired t-test
##
## data: Y.latency by X.cond
## t = -22.962, df = 66, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -32.12184 -26.98264
## sample estimates:
## mean difference
## -29.55224
Make a statistical decision: The results show that \(p \le .05\) so reject the null hypothesis, \(H_0:\mu_{neutral} = \mu_{swear}\)
Cohen’s d expresses the magnitude of the treatment effect in terms of standard deviation (pooled).
library(effectsize)
cohens_d(
  Y.latency ~ # data, DV scores
    X.cond, # grouping variable, IV condition labels
  data = pain.df) # data frame
## Cohen's d | 95% CI
## --------------------------
## -0.70 | [-1.05, -0.35]
##
## - Estimated using pooled SD.
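The pooled-SD value can be approximated by hand from the descriptive summary. This sketch assumes the rounded means and SDs from that table and relies on the two conditions having the same n, in which case the pooled variance is simply the average of the two variances. The object names are illustrative.

```r
# Rounded values copied from the descriptive summary above
m_neutral <- 105.39; sd_neutral <- 42.41
m_swear   <- 134.94; sd_swear   <- 41.58

# With equal group sizes, the pooled SD is the root mean of the two variances
pooled_sd <- sqrt((sd_neutral^2 + sd_swear^2) / 2)

# Cohen's d in pooled-SD units (neutral minus swear, matching the output sign)
d_pooled <- (m_neutral - m_swear) / pooled_sd
round(d_pooled, 2)
```

Note that this standardizer treats the two conditions as if they were separate groups; it does not use the pairing of scores within participants.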
The following statements show how the results may be reported in an APA-style report.
A paired-samples t test showed that latency scores were significantly higher in the swear word condition (M = 134.94, SD = 41.58) than in the neutral word condition (M = 105.39, SD = 42.41), t(66) = 22.96, p < .001, d = 0.70.