Using the metaphor_data.csv file from https://osf.io/qrc6b/metaphor,
create an object named metaphor which is the result of calling
read_csv() on metaphor_data.csv.
library(tidyverse)
metaphor <- read_csv('https://www.stephenskalicky.com/r_data/metaphor_data.csv')
## Rows: 1304 Columns: 28
## ── Column specification ──────────────────────────────────────────────────────────────────────────
## Delimiter: ","
## chr (6): metaphor_id, response, met_type, sex, hand, language_group
## dbl (22): subject, conceptual, nm, trial_order, met_stim, met_RT, age, colle...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Create a new object named met.small from
metaphor. Using dplyr::select(), choose the
following columns: subject, met_type, met_RT, conceptual, nm, and NFC.
met.small <- metaphor %>%
  dplyr::select(subject, met_type, met_RT, conceptual, nm, NFC)

Next, use mutate to convert the RT (the time spent writing
the metaphor) into seconds. The current measurement is in milliseconds,
so we need to divide met_RT by 1000. Using mutate, create a new
variable in met.small named RT, which is the result of dividing
met_RT by 1000. (Note that I am going to extend the pipe
from the original creation of met.small each time.)
met.small <- metaphor %>%
  dplyr::select(subject, met_type, met_RT, conceptual, nm, NFC) %>%
  mutate(RT = met_RT/1000)
Next, we need to remove outliers. We will define an outlier as a response whose writing time is at least 2.5 standard deviations above the mean writing time. We will use z-scores to help us with this (don’t worry if you do not know what they are).
Using mutate, create a new variable in met.small
named zRT, which is the result of calling the function
scale on met_RT.
met.small <- metaphor %>%
  dplyr::select(subject, met_type, met_RT, conceptual, nm, NFC) %>%
  mutate(RT = met_RT/1000) %>%
  mutate(zRT = scale(met_RT))
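For readers who have not met z-scores before, here is a minimal sketch of what scale() computes by default: each value minus the mean, divided by the standard deviation. The vector rt_ms is made-up illustration data and is not taken from the metaphor dataset.

```r
# Hypothetical writing times in milliseconds (illustration only)
rt_ms <- c(1200, 1500, 1800, 2100, 9000)

# z-score by hand: distance from the mean in standard-deviation units
z_manual <- (rt_ms - mean(rt_ms)) / sd(rt_ms)

# scale() centers and scales by default; it returns a one-column matrix,
# so as.numeric() is used here to get a plain vector back
z_scale <- as.numeric(scale(rt_ms))

# Check that the two approaches agree
all.equal(z_manual, z_scale)
```

A z-score of 2.5 therefore means a value that sits 2.5 standard deviations above the mean.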
Add another mutate call which
creates a new variable named outliers. The value of
outliers will be a 1 if zRT is
greater than or equal to 2.5, otherwise it will be a 0. To do this, we can
use the if_else function in our mutate call
(we could also use the case_when function).
The basic syntax for if_else is
if_else(condition, A, B): if condition is TRUE, return A,
otherwise, return B. You can write your mutate call like this:
mutate(outliers = if_else(condition, A, B))
It is up to you to write the correct values for
condition, A, and B.
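If the if_else(condition, A, B) pattern is new to you, here is a small sketch on made-up data. The toy object and its score and level columns are hypothetical and only illustrate the pattern; they are not part of the exercise.

```r
library(dplyr)

# Hypothetical scores (illustration only)
toy <- tibble(score = c(2, 7, 5, 9))

# if_else(condition, A, B): "high" where score >= 5 is TRUE, otherwise "low"
toy <- toy %>%
  mutate(level = if_else(score >= 5, "high", "low"))

toy$level
```

Note that if_else() requires A and B to be the same type (here both are character strings).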
Below is the final pipe with all the previous commands in one place.
This is again why pipes are cool - you can add each line, step-by-step,
as part of your data cleaning / wrangling process. You could easily put
all the mutate steps into one call to mutate(), but
this method has the advantage of being a bit easier to read and of
showing how the steps link to one another.
met.small <- metaphor %>%
  dplyr::select(subject, met_type, met_RT, conceptual, nm, NFC) %>%
  mutate(RT = met_RT/1000) %>%
  mutate(zRT = scale(met_RT)) %>%
  mutate(outliers = ifelse(zRT >= 2.5, 1, 0))
How many outliers are in met.small?
sum(met.small$outliers)
## [1] 34
Create a new object named met.trim which is the result
of removing the outliers from met.small. Use the
filter() function.
met.trim <- met.small %>%
  dplyr::filter(outliers == 0)
## Warning: Using one column matrices in `filter()` was deprecated in dplyr 1.1.0.
## ℹ Please use one dimensional logical vectors instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
# sanity check
sum(met.trim$outliers)
## [1] 0
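The deprecation warning above appears because scale() returns a one-column matrix rather than a plain vector, so zRT and outliers end up stored as matrix columns. One possible way to avoid the warning, sketched here on made-up data, is to wrap scale() in as.numeric() before using the result. The toy object and its values are assumptions for illustration only.

```r
library(dplyr)

# Hypothetical data (illustration only): nine typical values and one extreme one
toy <- tibble(rt = c(10, 12, 11, 13, 12, 11, 10, 13, 12, 60))

toy_clean <- toy %>%
  # as.numeric() drops the matrix structure that scale() returns
  mutate(z = as.numeric(scale(rt))) %>%
  mutate(outlier = if_else(z >= 2.5, 1, 0)) %>%
  # outlier is now a plain numeric vector, so filter() receives a logical vector
  filter(outlier == 0)

nrow(toy_clean)
```

With a plain vector, if_else() can also be used in place of ifelse(), matching the function described earlier.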
Create a new object named met.violin which is the
result of calling the ggplot() function. Inside the
ggplot() call, set the data argument to
met.trim, and make an aes() call with the
correct x and y axes in order to replicate the chart above. View your
plot by running the name of the object (i.e., run
met.violin by itself). Compare your figure to the one above
- is it correct? What else are we missing?
met.violin <- ggplot(data = met.trim, aes(x = met_type, y = RT)) +
  geom_violin()
met.violin

After geom_violin, add a new line with
only the function theme_base() - you will need the package
ggthemes to do this.
Inside the geom_violin() function, add
aes(fill = met_type).
Use a + sign to add a new function to your ggplot which
is scale_x_discrete(). This function can rename the values
on your x axis using the labels argument. Use the
template scale_x_discrete(labels = c()) to rename the
values of met_type to match the figure above.
Use a + sign to add a new function to your ggplot which
is labs. This function will rename your axes and figure.
Use x =, y =, and title = inside
labs to create new labels. We want NO label for the x axis,
what can you type to make that happen?
Remove the legend by adding theme(legend.position = 'none').
Add the individual data points with a geom_jitter() call. Using this information,
what do violin plots show us? How can you interpret this data?
library(ggthemes)
# with a final geom_jitter() - the original figure does not have these though.
met.violin <- ggplot(data = met.trim, aes(x = met_type, y = RT)) +
  geom_violin(aes(fill = met_type)) +
  theme_base() +
  scale_x_discrete(labels = c('Conventional', 'Novel')) +
  labs(y = 'Production Time (seconds)', x = '', title = 'Metaphor Production Times') +
  theme(legend.position = 'none') +
  # alpha is a fixed value rather than a mapped variable, so it goes outside aes()
  geom_jitter(alpha = .5)
met.violin

Can you create a second plot named ratings which matches the right panel of the figure
above?
ratings <- ggplot(data = met.trim, aes(x = met_type, y = nm)) +
  theme_base() +
  geom_violin(aes(fill = met_type)) +
  scale_x_discrete(labels = c('Conventional', 'Novel')) +
  labs(y = 'Ratings (1-5)', x = '', title = 'Metaphor Novelty/Mirth Ratings') +
  theme(legend.position = 'none') +
  # alpha is a fixed value rather than a mapped variable, so it goes outside aes()
  geom_jitter(alpha = .5)
ratings

To place the two plots side by side, we can use the package
gridExtra. Install the package and then use the
grid.arrange() function to join the two figures.
gridExtra::grid.arrange(met.violin, ratings, nrow = 1)
