Author

Emorie D Beck

Download .Rmd (won’t work in Safari or IE)
See GitHub Repository

For a much more exhaustive tutorial, see this.

Workspace

Packages

First let’s load in our packages. We’re going to load in a couple of extras (brms and tidybayes) that we haven’t used before. These will let us make some pretty plots later.

Code
library(psych)
library(knitr)
library(kableExtra)
library(lme4)
library(broom.mixed)
library(brms)
library(tidybayes)
library(plyr)
library(tidyverse)

data_path <- "https://github.com/emoriebeck/R-tutorials/raw/master"

Read in Data

The National Longitudinal Study of Youths 1979 Child and Young Adult Sample (NLSYCYA) is a longitudinal study conducted by the National Bureau of Labor Statistics. The sample includes the children of the original 1979 sample, of which we will use a small subset. Here, we are going to use a subset of the more than 11,000 variables available that include the following:

Item Name Description Time-Varying?
PROC_CID Participant ID No
Dem_DOB Year of Date of Birth No
groups Jail, Community Service, None No
DemPWeight Weight Percentile at age 10 No
age Age of participant Yes
Year Year of Survey Yes
age0 Age of participant (centered) Yes
SensSeek Sensation-Seeking Composite Yes
CESD CESD Depression Composite Yes
Code
load(url(sprintf("%s/11-ggplot-p3-mlm/data/sample.RData", data_path)))

sample_dat

Restructure Data

These data are already largely cleaned as that is not our focus today. We just need to do a little bit of restructuring.

To run our models using purrr, we need to restrucutre the data to long. While we’re at it, we need to create a time variable centered at zero, so we can interpret our moderators later.

Code
(df1_long <- sample_dat %>%
  gather(key = trait, value = value, CESD, SensSeek, na.rm = T) %>%
   mutate(wave = year - 1996,
          age0 = age-16)) 

The Basic Growth Model

We’ll start with the basic growth model (just looking at change in a single variable over time).

we will use list columns to do it. We’ll start by using the group_by() and nest() functions from dplyr and tidyr to put the data for each trait into a cell of our data frame:

Code
(df1_nested <- df1_long %>%
  group_by(trait) %>%
  nest())

Now, our data frame is 2 x 2, with the elements in the second column each containing the data frame that corresponds to that trait. This makes it really easy to run our models using the map() family of unctions from purrr.

Before we fit the full growth model, we will first fit the unconditional model. Below, we will add a new column to our data frame that will contain the unconditional model for each trait.

In this case the model will be in the form of:
  • Level 1: \(Y_{ij} = \beta_{0j} + \varepsilon{ij}\)
  • Level 2:
    • \(\beta_{0j} = \gamma_{00} + U_{0j}\)
Code
(df1_nested <- df1_nested %>%
  mutate(fit0 = map(data, ~lmer(value ~ 1 + (1 | PROC_CID), data = .))))

Now we can see we have a new list column in our data frame called fit0 that contains an S4 class lmerMod, which simply means your growth model. To understand model, I personally find it easiest to visualize it. What this model is telling us is the mean across all observations as well as the between-person variability in that estimate.

Now, moving on to the growth model:

In this case the model will be in the form of:
  • Level 1: \(Y_{ij} = \beta_{0j} + \beta_{1j}*time_{ij} + \varepsilon{ij}\)
  • Level 2:
    • \(\beta_{0j} = \gamma_{00} + U_{0j}\)
    • \(\beta_{1j} = \gamma_{10} + U_{1j}\)
Code
(df1_nested <- df1_nested %>%
  mutate(fit2 = map(data, ~lmer(value ~ 1 + age0 + (age0 | PROC_CID), data = .))))

Now that we’ve run our model, we are ready to plot the results.

To do so, we’ll write a short function that will get the predicted values both for group level effects, as well as for individual-level estimates. There’s also a hidden function for getting standard errors of our pointwise estimates that we’ll use to create confidence bands in our plots. Don’t worry about that.

Predicted Values

Now we’ll use the predict function to get predicted values and the function I created to get pointwise standard errors for the fixed effects.

Code
# function to get fixed effects
fixed_pred_fun <- function(m){
  frame <- tibble(Intercept = 1, age0 = seq(0, 8, .1)) %>%
    mutate(fixed_pred = predict(m, newdata = ., re.form = NA),
           age = age0 + 16)
  frame$SE <- pv_fun(frame %>% select(Intercept, age0), m)
  frame %>% select(-Intercept)
}

# function to get random effects predictions  
ran_pred_fun <- function(m){
  crossing(age0 = seq(0, 8, 1),
         PROC_CID = m@frame$PROC_CID) %>%
    mutate(ran_pred = predict(m, newdata = .),
           age = age0 + 16)
}

ran_pred_fun <- function(m){
  augment(m)
}

# get fixed and random efects, and also combine them.
(df1_nested <- df1_nested %>%
  mutate(fixed_pred = map(fit2, fixed_pred_fun),
         ran_pred = map(fit2, ran_pred_fun),
         combined_pred = map2(fixed_pred, ran_pred, full_join)))
Code
df1_nested <- df1_nested  %>%
  mutate(ran_pred = map(fit2, augment))

df1_nested %>% unnest(ran_pred) %>%
  mutate(age = age0 + 16) %>%
  ggplot(aes(x = age, y = .fitted, group = PROC_CID)) +
    geom_line(size = .25, alpha = .5, color = "darkgreen") +
    facet_grid(~trait) +
    theme_classic()

Plot

Now let’s

Code
df1_nested %>%
  select(trait, combined_pred) %>%
  unnest(combined_pred) %>%
  ggplot(aes(x = age)) +
    geom_line(aes(y = .fitted, color = trait, group = PROC_CID), size = .25, alpha = .5) +
    geom_line(aes(y = fixed_pred), size = 2) +
    lims(y = c(0,4)) +
    labs(x = "Age", y = "Composite Rating",
         title = "Simple Growth Models") +
    facet_grid(~trait) +
    theme_classic() +
    theme(axis.text = element_text(face = "bold", size = rel(1.2)),
          axis.title = element_text(face = "bold", size = rel(1.2)),
          legend.title = element_text(face = "bold", size = rel(1.2)),
          legend.position = "none",
          plot.title = element_text(face = "bold", size = rel(1.2), hjust = .5))

And with confidence bands:

Code
df1_nested %>% 
  select(trait, fixed_pred) %>%
  unnest(fixed_pred) %>%
  ggplot(aes(x = age, y = fixed_pred, fill = trait)) +
    geom_ribbon(aes(ymin=fixed_pred-1.96*SE,
                    ymax=fixed_pred+1.96*SE),alpha=0.2) +  
    geom_line(size = 2) +
    labs(x = "Age", y = "Predicted Values") +
    ylim(0,4) +
    facet_grid(~trait) +
    theme_classic() +
    theme(axis.text = element_text(face = "bold", size = rel(1.2)),
          axis.title = element_text(face = "bold", size = rel(1.2)),
          legend.title = element_text(face = "bold", size = rel(1.2)),
          legend.position = "none",
          plot.title = element_text(face = "bold", size = rel(1.2), hjust = .5))

Code
df1_nested %>% 
  mutate(df = map_dbl(fit2, df.residual)) %>%
  select(trait, fixed_pred, df) %>%
  unnest(fixed_pred) %>%
  ggplot(aes(x = age, y = fixed_pred, fill = trait)) +
    stat_dist_lineribbon(
      aes(dist = "student_t", arg1 = unique(df), arg2 = fixed_pred, arg3 = SE),
      alpha = 1/4
    ) +
    facet_grid(~trait) +
    theme_classic() +
    theme(axis.text = element_text(face = "bold", size = rel(1.2)),
          axis.title = element_text(face = "bold", size = rel(1.2)),
          legend.title = element_text(face = "bold", size = rel(1.2)),
          legend.position = "none",
          plot.title = element_text(face = "bold", size = rel(1.2), hjust = .5))

Two-Level Categorical Moderator

Let’s start with the basic syntax:

  • Level 1: \(Y_{ij} = \beta_{0j} + \beta_{1j}*time_{1j} + \varepsilon{ij}\)
  • Level 2:
    • \(\beta_{0j} = \gamma_{00} + \gamma_{01}*X_{2j} + U_{0j}\)
    • \(\beta_{1j} = \gamma_{10} + \gamma_{11}*X_{2j} + U_{1j}\)

Now let’s swap that out for a 2 group sample from the present data:

  • Level 1: \(Y_{ij} = \beta_{0j} + \beta_{1j}*age0_{ij} + \varepsilon{ij}\)
  • Level 2:
    • \(\beta_{0j} = \gamma_{00} + \gamma_{01}*groupsNone + U_{0j}\)
    • \(\beta_{1j} = \gamma_{10} + \gamma_{11}*groupsNone + U_{1j}\)
Variable D1
Jail 0
None 1
Code
df1_nested <- df1_nested %>%
  mutate(data_2g = map(data, function(x) x %>% filter(groups != "CommServ")),
         fit3 = map(data_2g, ~lmer(value ~ 1 + age0*groups + (age0 | PROC_CID), data = .)))

When we plot these, we are plotting the simple slopes. Subbing 1 and 0 into the equations above we end up with the following for the groups.

None: \(Y_{ij} = \gamma_{00} + \gamma_{10}*age\)
Jail: \(Y_{ij} = \gamma_{00} + \gamma_{01} + (\gamma_{10} + \gamma_{11})*age\)

Code
fixed_pred_fun <- function(m){
  frame <- crossing(Intercept = 1, age0 = seq(0, 8, .1),
           groups = c("Jail", "None")) %>%
    mutate(fixed_pred = predict(m, newdata = ., re.form = NA),
           age = age0 + 16,
           groupsn = as.numeric(mapvalues(groups, unique(groups), c(1,0))),
           Int = age0*groupsn)
  frame$SE <- pv_fun(frame %>% select(Intercept, age0, groupsn, Int), m)
  frame %>% select(-Intercept, -groupsn, -Int)
}

ran_pred_fun <- function(m){
  crossing(age0 = seq(0, 8, 1),
         PROC_CID = m@frame$PROC_CID) %>%
    left_join(m@frame %>% as_tibble() %>% select(PROC_CID, groups)) %>%
    distinct() %>%
    mutate(ran_pred = predict(m, newdata = .),
           age = age0 + 16)
}

(df1_nested <- df1_nested %>%
  mutate(fixed_pred3 = map(fit3, fixed_pred_fun),
         ran_pred3 = map(fit3, ran_pred_fun),
         combined_pred3 = map2(fixed_pred3, ran_pred3, full_join)))
Code
df1_nested %>%
  select(trait, combined_pred3) %>%
  unnest(combined_pred3) %>%
  ggplot(aes(x = age)) +
    scale_color_manual(values = c("blue", "red")) +
    geom_line(aes(y = ran_pred, color = groups, group = PROC_CID), size = .25, alpha = .1) +
    geom_line(aes(y = fixed_pred, color = groups), size = 2) +
    lims(y = c(0,4)) +
    labs(x = "Age", y = "Composite Rating",
         title = "Simple Growth Models") +
    facet_grid(~trait) +
    theme_classic() +
    theme(axis.text = element_text(face = "bold", size = rel(1.2)),
          axis.title = element_text(face = "bold", size = rel(1.2)),
          legend.title = element_text(face = "bold", size = rel(1.2)),
          legend.position = "none",
          plot.title = element_text(face = "bold", size = rel(1.2), hjust = .5))

Code
df1_nested %>%
  select(trait, fixed_pred3) %>%
  unnest(fixed_pred3) %>%
  ggplot(aes(x = age, y = fixed_pred, group = groups)) +
    geom_ribbon(aes(ymin = fixed_pred - SE * 1.96,
                    ymax = fixed_pred + SE * 1.96, 
                    fill = groups),
                alpha = .5) +
    geom_line(size = 1) +
    facet_grid(~trait) +
    theme_classic()

Three-Level Categorical Moderator

  • Level 1: \(Y_{ij} = \beta_{0j} + \beta_{1j}*age0_{ij} + \varepsilon{ij}\)
  • Level 2:
    • \(\beta_{0j} = \gamma_{00} + \gamma_{01}*D1 + \gamma_{02}*D2 + U_{0j}\)
    • \(\beta_{1j} = \gamma_{10} + \gamma_{11}*D1 + \gamma_{12}*D2 + U_{1j}\)
Variable D1 D2
Jail 0 0
None 1 0
CommServ 0 1
Code
df1_nested <- df1_nested %>%
  mutate(fit4 = map(data, ~lmer(value ~ 1 + age0*groups + (age0 | PROC_CID), data = .)))

When we plot these, we are plotting the simple slopes. Subbing 1 and 0 into the equations above we end up with the following for the groups.

None: \(Y_{ij} = \gamma_{00} + \gamma_{10}*age\)
Jail: \(Y_{ij} = \gamma_{00} + \gamma_{01} + (\gamma_{10} + \gamma_{11})*age\)
Community Service: \(Y_{ij} = \gamma_{00} + \gamma_{02} + (\gamma_{10} + \gamma_{12})*age\)

Code
fixed_pred_fun <- function(m){
  crossing(age0 = seq(0, 8, 1),
           groups = c("Jail", "None", "CommServ")) %>%
    mutate(fixed_pred = predict(m, newdata = ., re.form = NA),
           age = age0 + 16)
}

ran_pred_fun <- function(m){
  crossing(age0 = seq(0, 8, 1),
         PROC_CID = m@frame$PROC_CID) %>%
    left_join(m@frame %>% as_tibble() %>% select(PROC_CID, groups)) %>%
    distinct() %>%
    mutate(ran_pred = predict(m, newdata = .),
           age = age0 + 16)
}

(df1_nested <- df1_nested %>%
  mutate(fixed_pred4 = map(fit4, fixed_pred_fun),
         ran_pred4 = map(fit4, ran_pred_fun),
         combined_pred4 = map2(fixed_pred4, ran_pred4, full_join)))
Code
df1_nested %>%
  select(trait, combined_pred4) %>%
  unnest(combined_pred4) %>%
  ggplot(aes(x = age)) +
    scale_color_manual(values = c("blue", "red", "black")) +
    geom_line(aes(y = ran_pred, color = groups, group = PROC_CID), size = .25, alpha = .1) +
    geom_line(aes(y = fixed_pred, color = groups), size = 2) +
    lims(y = c(0,4)) +
    labs(x = "Age", y = "Composite Rating",
         title = "Simple Growth Models") +
    facet_grid(~trait) +
    theme_classic() +
    theme(axis.text = element_text(face = "bold", size = rel(1.2)),
          axis.title = element_text(face = "bold", size = rel(1.2)),
          legend.title = element_text(face = "bold", size = rel(1.2)),
          legend.position = "none",
          plot.title = element_text(face = "bold", size = rel(1.2), hjust = .5))

Continuous Time-Invariant Moderator

Code
df1_nested <- df1_nested %>%
  mutate(fit5 = map(data, ~lmer(value ~ 1 + age0*DemPweight + (age0 | PROC_CID), data = .)))
Code
fixed_pred_fun <- function(m){
  desc <- Rmisc::summarySE(m@frame, measurevar = "DemPweight")
    crossing(age0 = seq(0, 8, 1),
           DemPweight = c(desc$DemPweight - desc$sd, 
                          desc$DemPweight, 
                          desc$DemPweight + desc$sd)) %>%
    mutate(fixed_pred = predict(m, newdata = ., re.form = NA),
           age = age0 + 16,
           Weight = factor(DemPweight, levels = unique(DemPweight), labels = c("-1SD", "0SD", "1SD")))
}

ran_pred_fun <- function(m){
  crossing(age0 = seq(0, 8, 1),
         PROC_CID = m@frame$PROC_CID) %>%
    left_join(m@frame %>% as_tibble() %>% select(PROC_CID, DemPweight)) %>%
    distinct() %>%
    mutate(ran_pred = predict(m, newdata = .),
           age = age0 + 16)
}

(df1_nested <- df1_nested %>%
  mutate(fixed_pred5 = map(fit5, fixed_pred_fun),
         ran_pred5 = map(fit5, ran_pred_fun),
         combined_pred5 = map2(fixed_pred5, ran_pred5, full_join)))
Code
df1_nested %>%
  select(trait, combined_pred5) %>%
  unnest(combined_pred5) %>%
  ggplot(aes(x = age)) +
    scale_color_manual(values = c("blue", "red", "black")) +
    # geom_line(aes(y = ran_pred, group = PROC_CID), size = .25, alpha = .1) +
    geom_line(aes(y = fixed_pred, color = Weight), size = 1) +
    lims(y = c(0,4)) +
    labs(x = "Age", y = "Composite Rating",
         title = "Simple Growth Models") +
    facet_wrap(~trait, scales = "free_y") +
    theme_classic() +
    theme(axis.text = element_text(face = "bold", size = rel(1.2)),
          axis.title = element_text(face = "bold", size = rel(1.2)),
          legend.title = element_text(face = "bold", size = rel(1.2)),
          legend.position = "none",
          plot.title = element_text(face = "bold", size = rel(1.2), hjust = .5))

Code
df1_nested %>%
  select(trait, fixed_pred5) %>%
  unnest(fixed_pred5) %>%
  ggplot(aes(x = age, y = fixed_pred, group = Weight)) +
    geom_line(aes(color = Weight), size = 1) +
    facet_wrap(~trait, scales = "free_y") + 
    theme_classic()