logistic regression example - adding random effects

Author

Stephen Skalicky

Published

April 22, 2026

Reminder of the data:

This notebook uses data from Kim et al. (2019)

Kim, Y., Jung, Y., & Skalicky, S. (2019). Linguistic alignment, learner characteristics, and the production of stranded prepositions in relative clauses. Studies in Second Language Acquisition, 41(5), 937–969.

This research compared the degree of primed production of English stranded prepositions among Korean learners of English. Learners completed an alignment session, where half of their input trials included a stranded preposition, and half did not. After each trial, we assessed whether the participants produced (or did not produce) a stranded preposition. If a participant produced a stranded preposition after an input trial containing a stranded preposition, this was taken as evidence of alignment. We also measured the degree of learning of stranded prepositions from the alignment sessions using a pre/immediate/delayed posttest design. These two questions are further nested within a comparison of modality: half of the participants completed the alignment session in a face-to-face (FTF) context, whereas the other half completed the session in a synchronous computer-mediated context (SCMC). A separate control condition only completed the pre/immediate/delayed posttests and did not participate in the alignment sessions.

As such there are two main analyses:

1. What is the degree of linguistic alignment, and are there differences between FTF/SCMC modality?
2. Does the alignment session lead to learning of stranded prepositions, and are there differences between FTF/SCMC modality?

We answered both questions using logistic regression, which models the probability of a binary outcome (yes/no, accurate/inaccurate, etc.) based on independent variables (modality, pre/post, etc.).

Primed production data

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.1     ✔ stringr   1.5.2
✔ ggplot2   4.0.0     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.1.0     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(emmeans) # model posthoc
Welcome to emmeans.
Caution: You lose important information if you filter this package's results.
See '? untidy'
library(performance) # check models
dat <- read_csv('sp-priming.csv') %>%
  mutate(score = factor(score))
Rows: 4608 Columns: 12
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (5): verb, modality, test, session, trial_type
dbl (7): subject, score, trial_order, cloze, wmc, prod_pre, rec_pre

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
glimpse(dat)
Rows: 4,608
Columns: 12
$ subject     <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ verb        <chr> "pet", "organize", "burn", "stack", "toast", "need", "cut"…
$ score       <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ modality    <chr> "FTF", "FTF", "FTF", "FTF", "FTF", "FTF", "FTF", "FTF", "F…
$ test        <chr> "priming1", "priming1", "priming1", "priming1", "priming1"…
$ trial_order <dbl> 12, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23,…
$ session     <chr> "session1", "session1", "session1", "session1", "session1"…
$ trial_type  <chr> "non-prime", "prime", "non-prime", "prime", "prime", "non-…
$ cloze       <dbl> 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40…
$ wmc         <dbl> 74, 74, 74, 74, 74, 74, 74, 74, 74, 74, 74, 74, 74, 74, 74…
$ prod_pre    <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ rec_pre     <dbl> 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6…

Adding random effects

Take a look at this plot, which shows the distribution of accuracy across participants. You should notice that there is considerable variation, with some participants producing no stranded prepositions, some producing a few, and others producing many.

In our prior glm models, we were not taking this information into account - our model assumed each row was an independent observation. But the data are not independent, since each participant provided more than one data point. This is a form of autocorrelation: a person's response on one trial will be related to their responses on other trials.

So we want to give the model this information, but we do not want to make subject a predictor variable (i.e., a fixed effect). We can instead model subject as a random effect - specifically, by giving each subject a random intercept.
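Before fitting anything, one quick way to see this between-subject variation numerically is to tabulate each participant's proportion of stranded preposition production. This is a sketch using the `dat` object loaded above (it is not part of the original analysis):

```r
# per-subject proportion of trials where a stranded preposition was produced
dat %>%
  group_by(subject) %>%
  summarise(n_trials = n(),
            prop_produced = mean(score == '1')) %>%
  arrange(desc(prop_produced))
```

If each row really were independent, these per-subject proportions would all hover around the same value; instead they range widely, which is exactly the clustering the random intercept is meant to capture.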

ggplot(dat, aes(y = score, x = trial_type, shape = modality, color = score)) + 
  facet_wrap(. ~ subject, scales = 'free') + 
  geom_jitter() + 
  theme_bw()

Fitting a random effect glm

First, load a library for fitting multilevel models:

library(lme4)
Loading required package: Matrix

Attaching package: 'Matrix'
The following objects are masked from 'package:tidyr':

    expand, pack, unpack

The syntax to fit a multilevel model is not much different than the glm syntax:

m1 <- glmer(score ~ trial_type * modality + (1|subject), data = dat, family = binomial(link = 'logit'))

The summary output now includes the random + fixed effects information.

summary(m1)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: score ~ trial_type * modality + (1 | subject)
   Data: dat

      AIC       BIC    logLik -2*log(L)  df.resid 
   3467.4    3499.6   -1728.7    3457.4      4603 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.4340 -0.4086 -0.2031 -0.0601 11.3528 

Random effects:
 Groups  Name        Variance Std.Dev.
 subject (Intercept) 1.571    1.254   
Number of obs: 4608, groups:  subject, 64

Fixed effects:
                             Estimate Std. Error z value Pr(>|z|)    
(Intercept)                   -3.4793     0.2574 -13.516  < 2e-16 ***
trial_typeprime                2.2852     0.1526  14.971  < 2e-16 ***
modalitySCMC                   0.2159     0.3742   0.577    0.564    
trial_typeprime:modalitySCMC   0.9110     0.2194   4.151  3.3e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) trl_ty mdSCMC
tril_typprm -0.489              
modaltySCMC -0.682  0.333       
trl_ty:SCMC  0.333 -0.692 -0.470

What does the random intercept of subject show us? The variance and standard deviation terms are the same thing expressed on different scales (the variance is the square of the SD):

1.254^2
[1] 1.572516

The value represents how much variance exists in the intercepts among participants. So then we have to ask: what is the intercept here? Well, it's essentially "everything" in this case - the base probability of producing a stranded preposition after absorbing all other information in the model, a baseline propensity to get a 1 over a 0. Remember that the variance captures the spread of these intercepts - a higher value shows that subjects differed greatly in their baseline chances, which we also saw in the raw data.

We can look at the spread of random intercepts here:

ranef_m1 <- ranef(m1, condVar = T)
lattice::dotplot(ranef_m1)
$subject

Adding a random slope

Great, now let’s add a random slope, specifically for trial type. This is because we saw that people seemed to vary not just in their overall production of stranded prepositions but also in which condition they produced them. So let’s capture that additional variance in the model. The random slope is added alongside the random intercept like this:

m2 <- glmer(score ~ trial_type * modality + (1+trial_type|subject), data = dat, family = binomial(link = 'logit'))

New output to interpret - what do you notice?

summary(m2)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: score ~ trial_type * modality + (1 + trial_type | subject)
   Data: dat

      AIC       BIC    logLik -2*log(L)  df.resid 
   3446.5    3491.5   -1716.2    3432.5      4601 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.7342 -0.3639 -0.2227 -0.1198  6.4811 

Random effects:
 Groups  Name            Variance Std.Dev. Corr
 subject (Intercept)     0.5339   0.7307       
         trial_typeprime 0.6092   0.7805   0.70
Number of obs: 4608, groups:  subject, 64

Fixed effects:
                             Estimate Std. Error z value Pr(>|z|)    
(Intercept)                   -3.1653     0.1953 -16.209  < 2e-16 ***
trial_typeprime                1.9267     0.2149   8.964  < 2e-16 ***
modalitySCMC                   0.2587     0.2630   0.984  0.32532    
trial_typeprime:modalitySCMC   0.9098     0.2916   3.120  0.00181 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) trl_ty mdSCMC
tril_typprm -0.261              
modaltySCMC -0.646  0.102       
trl_ty:SCMC  0.106 -0.653 -0.122

We now have the trial type slope term. The intercept term changes because we’ve added trial type to the random effects structure, and the baseline level of the categorical variable is now in the intercept. So this is now the variation between participants for non-prime trials (lower than the prior intercept, which averaged over everything).

The trial type slope term tells us how much variation there is between participants in the effect of prime vs. non-prime trials - note that this is not the difference between non-prime and prime itself. These are two different variance components.
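Is the random slope worth including? A common check (not part of the original write-up, so treat this as a sketch) is a likelihood ratio test between the two nested models:

```r
# likelihood ratio test: does adding the random slope (m2) improve fit
# over the random-intercept-only model (m1)?
anova(m1, m2)
```

A significant chi-square here, together with the drop in AIC from m1 to m2 visible in the summaries above, supports keeping the random slope.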

The correlation is the correlation between the two random effects:

  • the baseline probability (random intercept)

  • the effect of prime (random slope)

So here, a higher baseline probability of production (in the non-prime condition) is positively correlated with a higher probability in the prime condition (makes sense?).

We can see this, visually:

ranef_m2 <- ranef(m2, condVar = TRUE)
lattice::dotplot(ranef_m2)
$subject
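If you want to pull the variance components and their correlation out programmatically rather than reading them off the summary, lme4's VarCorr() does this; a sketch using the m2 object from above:

```r
# variance/covariance of the random effects as a tidy data frame
as.data.frame(VarCorr(m2))
# grp/var1/var2 identify each component; the row where var1 = "(Intercept)"
# and var2 = "trial_typeprime" holds the covariance (vcov column) and the
# correlation (sdcor column)
```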

Want to confirm what the random effects are showing us? We can reorder the trial type variable and see how the random effects output changes:

dat_recoded <- dat %>%
  mutate(trial_type = factor(trial_type, levels = c('prime', 'non-prime')))


m3 <- glmer(score ~ trial_type * modality + (1+trial_type|subject), data = dat_recoded, family = binomial(link = 'logit'))

summary(m3)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: score ~ trial_type * modality + (1 + trial_type | subject)
   Data: dat_recoded

      AIC       BIC    logLik -2*log(L)  df.resid 
   3446.5    3491.5   -1716.2    3432.5      4601 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.7342 -0.3639 -0.2227 -0.1198  6.4811 

Random effects:
 Groups  Name                Variance Std.Dev. Corr 
 subject (Intercept)         1.9425   1.3937        
         trial_typenon-prime 0.6092   0.7805   -0.93
Number of obs: 4608, groups:  subject, 64

Fixed effects:
                                 Estimate Std. Error z value Pr(>|z|)    
(Intercept)                       -1.2386     0.2500  -4.955 7.25e-07 ***
trial_typenon-prime               -1.9267     0.2150  -8.961  < 2e-16 ***
modalitySCMC                       1.1684     0.3683   3.173  0.00151 ** 
trial_typenon-prime:modalitySCMC  -0.9098     0.2917  -3.119  0.00181 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) trl_t- mdSCMC
trl_typnn-p -0.657              
modaltySCMC -0.677  0.444       
trl_t-:SCMC  0.479 -0.653 -0.705

Now observe the negative correlation: a high baseline in the prime condition is associated with a smaller difference in the non-prime condition.

ranef_m3 <- ranef(m3, condVar = TRUE)
lattice::dotplot(ranef_m3)
$subject

what’s the bloody point?

Compare the two models

base_glm <- glm(score~trial_type*modality, data = dat, 
                family = binomial(link='logit'))

multilevel_glm <- glmer(score~trial_type*modality + (1+trial_type|subject), data = dat, 
                      family = binomial(link = 'logit'))
base_glm_em <- as.data.frame(emmeans(base_glm, ~ trial_type|modality), type = 'response')
multilevel_glm_em <- as.data.frame(emmeans(multilevel_glm, ~ trial_type|modality), type = 'response')
glm_plot <- ggplot(base_glm_em, aes(y = prob, x = trial_type, group = modality, lty = modality)) + 
  geom_point() + 
  geom_line() + 
  geom_errorbar(aes(ymin = asymp.LCL, ymax = asymp.UCL), width = .1) + 
  labs(title = 'base glm', x = '', subtitle = 
         'score~trial_type*modality') + 
  theme(legend.position = 'top') + 
  ylim(0, 1)
multilevel_plot <- ggplot(multilevel_glm_em, aes(y = prob, x = trial_type, group = modality, lty = modality)) + 
  geom_point() + 
  geom_line() + 
  geom_errorbar(aes(ymin = asymp.LCL, ymax = asymp.UCL), width = .1) + 
  labs(title = 'multilevel model', x = '', subtitle = 'score~trial_type*modality+(1+trial_type|subject)') + 
  theme(legend.position = 'bottom') + 
  ylim(0, 1)
library(patchwork)

Compare the predictions from each model:

  • the GLM has smaller error bars, meaning its estimates appear more precise. The multilevel model has wider bars because it has taken the random effects into account

    • the GLM thinks all rows are independent observations - a higher effective n!

  • the estimates are more conservative for the multilevel model

(glm_plot + multilevel_plot) + plot_layout(guides = "collect") & theme_bw()
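The performance package loaded at the top of this notebook offers one more way to line the two models up side by side; a sketch:

```r
# AIC/BIC and pseudo-R2 comparisons for the single-level vs. multilevel model
compare_performance(base_glm, multilevel_glm)
```

The multilevel model should show the better (lower) information criteria here, matching the story the error bars told: once the per-subject clustering is modeled, the apparent precision of the plain GLM turns out to be overconfidence.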