Mixed Effects Models

Goals of the lecture

  • The independence assumption of linear regression.
    • What about when data are non-independent?
  • Introducing mixed effects models.
    • Fixed vs. random effects.
    • Adding random intercepts.
    • Adding random slopes.
  • Mixed models in practice with lme4.
    • Model comparison: the logic of likelihood-ratio tests.

Load libraries

### Loading the tidyverse
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
### Loading lme4
library(lme4)
Loading required package: Matrix

Attaching package: 'Matrix'

The following objects are masked from 'package:tidyr':

    expand, pack, unpack

Part 1: Independence

The independence assumption; why it matters; examples of non-independence.

What is independence?

Independence means that each observation in your dataset is generated by a separate, unrelated causal process.

  • When you flip a coin, each flip is independent of the others
  • The outcome of flip #1 doesn’t influence flip #2
  • No hidden factors connect multiple observations
  • Each data point provides new information

Independence is a fundamental assumption of many statistical tests. When we assume independence, we’re saying that knowing the value of one observation tells us nothing about another observation beyond what our model predicts.

Why does independence matter?

Standard statistical methods (t-tests, OLS regression, ANOVA) assume independence.

When this assumption is violated:

  • Standard errors are underestimated
  • Confidence intervals are too narrow
  • p-values are too small (inflated Type I error)
  • Your conclusions may be wrong!

The problem: Treating non-independent observations as independent makes you overconfident in your results.

Example: Independent data

You test whether a new drug increases happiness:

  • 100 people receive the drug
  • 100 people receive a placebo
  • Each person provides one happiness rating (1-7 scale)
  • You compare the two groups with a t-test

Independence is reasonable here
Each person contributes exactly one data point.

Another independent example

You test whether people respond differently to metaphor type A vs. B:

  • 200 participants, randomly assigned to condition A or B
  • Each participant sees one item
  • Between-subjects design with one item per person
  • Compare responses with OLS regression

Independence is reasonable here
No repeated measures, no item effects to worry about.

Visualizing independent data

Each observation is independent—no nesting structure.
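
A minimal sketch of what such data could look like (the names df, value, and group match the model call below; the simulation parameters are assumptions, chosen only to land in the ballpark of the printed estimates):

# Simulated two-group, between-subjects data: one observation per person
set.seed(42)
df <- tibble(
  group = rep(c("A", "B"), each = 400),
  value = c(rnorm(400, mean = 723, sd = 32),   # group A baseline (assumed)
            rnorm(400, mean = 707, sd = 32))   # group B somewhat lower (assumed)
)

ggplot(df, aes(x = group, y = value)) +
  geom_jitter(width = 0.1, alpha = 0.4) +
  theme_minimal(base_size = 14)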

Analyzing independent data

simple_model <- lm(value ~ group, data = df)
summary(simple_model)$coefficients
             Estimate Std. Error    t value      Pr(>|t|)
(Intercept) 723.22404   1.596112 453.116229 1.419621e-300
groupB      -16.61191   2.257243  -7.359383  4.822485e-12

Interpretation:

  • Intercept (groupA) = 723.22 (mean of group A)
  • groupB coefficient = -16.61 (difference from A)
  • Group B is about 17 units lower than Group A

When independence fails

Non-independence occurs when different observations are systematically related, i.e., they were produced by the same generative process.

Common sources of non-independence in behavioral research:

  • Repeated measures: Same participant tested multiple times
  • Nested data: Students within classrooms within schools
  • Stimulus items: Same stimuli shown to multiple participants
  • Time series: Measurements over time from same units
  • Clustered designs: Participants recruited from same communities

Example: The sleep study

The sleepstudy dataset tracks reaction times for 18 subjects across ten successive days of sleep deprivation:

data(sleepstudy)
head(sleepstudy, 3)
  Reaction Days Subject
1 249.5600    0     308
2 258.7047    1     308
3 250.8006    2     308
💭 Check-in

Based on the structure of this study, what might be a source of non-independence?

Naive analysis (wrong!)

Let’s pretend we don’t know about the nested structure:

sleepstudy %>%
  ggplot(aes(x = Days, y = Reaction)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", color = "red") +
  labs(title = "Naive analysis: treating all 180 observations as independent") +
  theme_minimal(base_size = 14)
`geom_smooth()` using formula = 'y ~ x'

💭 Check-in

What’s not pictured here?

The real structure

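We can see it by faceting the same scatterplot by subject (a minimal sketch):

# Same data, one panel per subject, each with its own fitted line
sleepstudy %>%
  ggplot(aes(x = Days, y = Reaction)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", color = "red", se = FALSE) +
  facet_wrap(~Subject) +
  theme_minimal(base_size = 10)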

Subjects differ in:

  • Overall reaction time (baseline speed)
  • Effect of sleep deprivation (some are more resilient)

Subject-level variability

Some subjects are consistently faster; others are consistently slower.
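
A quick way to see this is to compare each subject’s average reaction time (a minimal dplyr sketch):

# Per-subject mean reaction times, fastest to slowest
sleepstudy %>%
  group_by(Subject) %>%
  summarise(mean_rt = mean(Reaction)) %>%
  arrange(mean_rt)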

Two types of nested variance

  1. Random intercepts: Subjects differ in their average reaction time
    • Some people are just faster/slower overall
    • Even with no sleep deprivation!
  2. Random slopes: Subjects differ in how much sleep deprivation affects them
    • Some people are resilient (shallow slope)
    • Others deteriorate quickly (steep slope)

Mixed effects models let us account for both types of variance.

Part 2: Introduction to mixed effects models

Fixed vs. random effects; random intercepts; random slopes.

What are mixed effects models?

Mixed effects models combine:

  • Fixed effects: Variables we care about (e.g., experimental conditions)
    • These are your hypotheses!
    • Estimated with specific coefficient values
  • Random effects: Grouping variables that create non-independence
    • Sources of variability we need to control for
    • Allow intercepts/slopes to vary by group
    • Estimated as distributions (variance parameters)

Goal: Get better estimates of fixed effects by accounting for nested structure.

Fixed vs. random effects

Determining which factors to include as fixed vs. random effects is not always straightforward. Here’s a rough guide; we’ll revisit the question later:

How do you decide?

  • Fixed effects:
    • Variables you’re theoretically interested in
    • Want to estimate specific coefficient values
    • Often experimental manipulations
    • Examples: treatment condition, time, drug dosage
  • Random effects:
    • Grouping/clustering variables
    • Want to account for variance, not estimate specific values
    • Usually sampling from larger population
    • Examples: participant ID, item ID, school, lab

The mixed model formula

Standard regression:

\[Y = \beta_0 + \beta_1 X + \epsilon\]

Mixed effects model with random intercepts:

\[Y = (\beta_0 + u_0) + \beta_1 X + \epsilon\]

Where:

  • \(\beta_0\) = overall intercept (fixed effect)
  • \(u_0\) = subject-specific deviation from overall intercept (random effect)
  • \(\beta_1\) = fixed effect of X
  • \(\epsilon\) = residual error
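
One piece is implicit in this formula: the subject-specific deviations are not free parameters. The model assumes they are drawn from a normal distribution centered on zero, and estimates only its variance:

\[u_0 \sim \mathcal{N}(0, \sigma^2_{u_0}), \qquad \epsilon \sim \mathcal{N}(0, \sigma^2_\epsilon)\]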

Random intercepts: Intuition

Random intercepts allow each group (e.g., subject) to have its own baseline:

Same slope, different intercepts.

Random intercepts: R syntax

Basic model with random intercepts:

model_intercepts <- lmer(Reaction ~ Days + (1 | Subject),
                         data = sleepstudy)

Breaking down the syntax:

  • Reaction ~ Days: Fixed effect of Days on Reaction
  • (1 | Subject): Random intercept for each Subject
    • 1 = intercept
    • | = “grouped by”
    • Subject = grouping variable

Extracting random intercepts

After fitting the model, you can extract the random effects:

# Extract random intercepts
random_int <- ranef(model_intercepts)$Subject
head(random_int, 4)
    (Intercept)
308   40.783710
309  -77.849554
310  -63.108567
330    4.406442

These are deviations from the overall intercept (fixed effect).

Understanding the intercepts

Each subject’s fitted intercept = fixed intercept + random deviation:

# Overall (fixed) intercept
fixed_intercept <- fixef(model_intercepts)["(Intercept)"]
fixed_intercept
(Intercept) 
   251.4051 
# Subject 308's total intercept
subject_308_deviation <- ranef(model_intercepts)$Subject["308", "(Intercept)"]
subject_308_deviation
[1] 40.78371
subject_308_total <- fixed_intercept + subject_308_deviation
subject_308_total
(Intercept) 
   292.1888 
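
lme4 can do this addition for you: coef() returns each group’s combined (fixed + random) coefficients, so the first row below should match the total computed above.

# coef() = fixed effect + random deviation, per subject
head(coef(model_intercepts)$Subject, 4)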

Visualizing random intercepts

# Create dataframe of random intercepts
intercept_df <- ranef(model_intercepts)$Subject %>%
  rownames_to_column("Subject") %>%
  rename(deviation = `(Intercept)`)

ggplot(intercept_df, aes(x = reorder(Subject, deviation), y = deviation)) +
  geom_point(size = 3, color = "steelblue") +
  geom_hline(yintercept = 0, linetype = "dashed", color = "red") +
  labs(title = "Random intercept deviations by subject",
       x = "Subject", y = "Deviation from fixed intercept (ms)") +
  theme_minimal(base_size = 14) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Random slopes: Intuition

Random slopes allow the effect of X to vary by group:

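A sketch of the idea: fit a separate regression line for each subject and the slopes fan out.

# One fitted line per subject: the slopes clearly differ
sleepstudy %>%
  ggplot(aes(x = Days, y = Reaction, color = Subject)) +
  geom_smooth(method = "lm", se = FALSE) +
  theme_minimal(base_size = 14) +
  theme(legend.position = "none")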

Random slopes: R syntax

Model with random intercepts AND slopes:

model_slopes <- lmer(Reaction ~ Days + (1 + Days | Subject),
                     data = sleepstudy)
summary(model_slopes)$coefficients
             Estimate Std. Error   t value
(Intercept) 251.40510   6.824597 36.838090
Days         10.46729   1.545790  6.771481

Breaking down the syntax:

  • Reaction ~ Days: Fixed effect of Days
  • (1 + Days | Subject): Random effects for Subject
    • 1 = random intercept
    • Days = random slope for Days
    • | = grouped by
    • Subject = grouping variable

Extracting random slopes

After fitting the model, you can extract the random effects:

# Extract random effects (intercepts and slopes)
random_eff <- ranef(model_slopes)$Subject
head(random_eff, 4)
    (Intercept)      Days
308    2.258551  9.198976
309  -40.398738 -8.619681
310  -38.960409 -5.448856
330   23.690620 -4.814350

Now, we have deviations from the intercept and from the Days slope.

Understanding the slopes

Each subject’s fitted slope = fixed slope + random deviation:

# Overall (fixed) slope for Days
fixed_slope <- fixef(model_slopes)["Days"]
fixed_slope
    Days 
10.46729 
# Subject 308's slope deviation and total slope
subject_308_slope_dev <- ranef(model_slopes)$Subject["308", "Days"]
subject_308_slope_dev
[1] 9.198976
subject_308_total_slope <- fixed_slope + subject_308_slope_dev
subject_308_total_slope
    Days 
19.66626 

Subject 308’s RT increases by ~19.67 ms per day (vs. population average of ~10.47 ms).

Visualizing random slopes

# Create dataframe of random effects
slopes_df <- ranef(model_slopes)$Subject %>%
  rownames_to_column("Subject") %>%
  rename(intercept_dev = `(Intercept)`, slope_dev = Days)

ggplot(slopes_df, aes(x = reorder(Subject, slope_dev), y = slope_dev)) +
  geom_point(size = 3, color = "darkgreen") +
  geom_hline(yintercept = 0, linetype = "dashed", color = "red") +
  labs(title = "Random slope deviations by subject",
       x = "Subject", y = "Deviation from fixed slope (ms/day)") +
  theme_minimal(base_size = 14) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Why random slopes matter

Without random slopes:

  • Assumes the effect is the same for everyone
  • Underestimates uncertainty in the fixed effect
  • Can lead to false positives

With random slopes:

  • Acknowledges that effects vary across individuals
  • Provides more conservative (realistic) estimates
  • Better generalization to new subjects

Rule of thumb: If a variable varies within your grouping variable, include it as a random slope.

Multiple random effects

You can include multiple sources of random effects:

# Random effects for both subjects and items
model <- lmer(RT ~ Condition + 
                (1 + Condition | Subject) +
                (1 + Condition | Item),
              data = mydata)

Common in psycholinguistics:

  • Random effects for subjects (people vary)
  • Random effects for items (stimuli vary)
  • Called “crossed random effects”

Conceptual summary

Mixed effects models help when:

  • You have repeated measures
  • You have nested/hierarchical structure
  • You want to generalize beyond your specific sample

Key components:

  • Fixed effects: Your hypotheses
  • Random intercepts: Group-level baseline differences
  • Random slopes: Group-level differences in effects

Part 3: Using lme4

Building and evaluating models; model comparisons; best practices and common issues.

The lme4 package

Most common R package for mixed models:

library(lme4)

Main functions:

  • lmer(): Linear mixed effects models
  • glmer(): Generalized linear mixed models (logistic, Poisson, etc.); see the sketch below
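
For instance, a logistic mixed model for binary accuracy data might look like the following sketch (Accuracy, Condition, Subject, and mydata are hypothetical names, echoing the earlier crossed-effects example):

# Hypothetical logistic mixed model: binary accuracy by condition
model_acc <- glmer(Accuracy ~ Condition + (1 | Subject),
                   data = mydata, family = binomial)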

Building your first model

Let’s fit a model with random intercepts only:

model_intercepts <- lmer(Reaction ~ Days + (1 | Subject),
                         data = sleepstudy,
                         REML = FALSE)

Note: REML = FALSE fits the model by maximum likelihood (ML) rather than restricted maximum likelihood (REML); ML fits are what we need to compare models that differ in their fixed effects.

Viewing model output

summary(model_intercepts)
Linear mixed model fit by maximum likelihood  ['lmerMod']
Formula: Reaction ~ Days + (1 | Subject)
   Data: sleepstudy

      AIC       BIC    logLik -2*log(L)  df.resid 
   1802.1    1814.9    -897.0    1794.1       176 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.2347 -0.5544  0.0155  0.5257  4.2648 

Random effects:
 Groups   Name        Variance Std.Dev.
 Subject  (Intercept) 1296.9   36.01   
 Residual              954.5   30.90   
Number of obs: 180, groups:  Subject, 18

Fixed effects:
            Estimate Std. Error t value
(Intercept) 251.4051     9.5062   26.45
Days         10.4673     0.8017   13.06

Correlation of Fixed Effects:
     (Intr)
Days -0.380

Adding random slopes

Now let’s add random slopes for the Days effect:

model_full <- lmer(Reaction ~ Days + (1 + Days | Subject),
                   data = sleepstudy,
                   REML = FALSE)

This allows both the intercept AND the slope to vary by subject.

Comparing the models

summary(model_full)
Linear mixed model fit by maximum likelihood  ['lmerMod']
Formula: Reaction ~ Days + (1 + Days | Subject)
   Data: sleepstudy

      AIC       BIC    logLik -2*log(L)  df.resid 
   1763.9    1783.1    -876.0    1751.9       174 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.9416 -0.4656  0.0289  0.4636  5.1793 

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Subject  (Intercept) 565.48   23.780       
          Days         32.68    5.717   0.08
 Residual             654.95   25.592       
Number of obs: 180, groups:  Subject, 18

Fixed effects:
            Estimate Std. Error t value
(Intercept)  251.405      6.632  37.907
Days          10.467      1.502   6.968

Correlation of Fixed Effects:
     (Intr)
Days -0.138

Model comparison: The logic

Question: Does adding variable \(X\) improve the model over a model without \(X\)?

Approach: Likelihood ratio test

  • Compare log-likelihood of two nested models
  • Model with more parameters should fit better
  • But: is the improvement “worth it”?
  • Test statistic: \(\chi^2 = -2(LL_{reduced} - LL_{full})\)
  • Compare to a chi-square distribution, with degrees of freedom equal to the difference in the number of parameters

Running model comparison

model_full <- lmer(Reaction ~ Days + (1 + Days | Subject),
                   data = sleepstudy,
                   REML = FALSE)

model_reduced <- lmer(Reaction ~ (1 + Days | Subject),
                   data = sleepstudy,
                   REML = FALSE)

anova(model_full, model_reduced)
Data: sleepstudy
Models:
model_reduced: Reaction ~ (1 + Days | Subject)
model_full: Reaction ~ Days + (1 + Days | Subject)
              npar    AIC    BIC  logLik -2*log(L)  Chisq Df Pr(>Chisq)    
model_reduced    5 1785.5 1801.4 -887.74    1775.5                         
model_full       6 1763.9 1783.1 -875.97    1751.9 23.537  1  1.226e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Interpretation: Adding the fixed effect of Days significantly improves model fit (χ²(1) = 23.54, p < .001).
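
The same statistic can be computed by hand from the two fitted models above (a sketch of what anova() does internally):

# Likelihood-ratio test by hand
ll_reduced <- logLik(model_reduced)
ll_full <- logLik(model_full)
chisq <- as.numeric(-2 * (ll_reduced - ll_full))         # 23.537
df_diff <- attr(ll_full, "df") - attr(ll_reduced, "df")  # 6 - 5 = 1
pchisq(chisq, df = df_diff, lower.tail = FALSE)          # matches Pr(>Chisq) above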

Extracting coefficients

Fixed effects (population-level):

fixef(model_full)
(Intercept)        Days 
  251.40510    10.46729 

Random effects (subject-specific deviations):

head(ranef(model_full)$Subject, 4)
    (Intercept)      Days
308    2.815789  9.075507
309  -40.047855 -8.644152
310  -38.432497 -5.513471
330   22.831765 -4.658665

Visualizing random effects

ranef_df <- ranef(model_full)$Subject %>%
  rownames_to_column("Subject")

ggplot(ranef_df, aes(x = `(Intercept)`, y = Days)) +
  geom_point(size = 3, alpha = 0.6) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  geom_vline(xintercept = 0, linetype = "dashed") +
  labs(title = "Random effects for each subject",
       x = "Random intercept", y = "Random slope for Days") +
  theme_minimal(base_size = 14)

Conclusion

  • Mixed models are a valuable tool for modeling non-independent data.
  • Random intercepts model correlated variance in y; random slopes model correlated variance in y ~ x.
  • Coming up, we’ll discuss:
    • More complex models (e.g., nested variance).
    • Deep dive into fitting random effects (e.g., pooling).
    • Understanding the outputs (e.g., variance partitioning).
    • Best practices (e.g., “keep it maximal”).
    • Diagnosing and fixing common issues (e.g., convergence failures).
    • Generalizing to generalized linear mixed effects models.