The linear model: Categorical predictors, curves and diagnostics

Interpret regression models

Height and V̇O2max

A linear model is constructed as

\[\dot{\text{V}}\text{O}_{2\text{max}} = \beta_0 + \beta_1 \times x_{\text{height}}\]

The resulting model fit gives us

\[\begin{align} \beta_0 &= -1945 \\ \beta_1 &= 38.2 \end{align}\]

Note
  1. Create a graphical representation of the model by completing the code below
Code
library(tidyverse)

data.frame(x = c(0, 200), 
           y = c(-2000, 5000)) %>%
        ggplot(aes(x, y)) + 
        
        geom_blank() + # Draws an empty canvas with scales
        
        geom_abline(intercept = -1945, slope = 38.2) + # intercept and slope from the model fit
        
        coord_cartesian(xlim = c(120, 180), 
                        ylim = c(1000, 5000))
  2. Predict V̇O2max when height is 155, 165 and 175 cm. Add predictions to the figure.
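
One possible way to approach the prediction task is sketched below. It assumes the coefficients from the model fit above; the names `b0`, `b1` and `preds` are illustrative and not part of the original exercise. Predictions are simply the intercept plus the slope times height, and the points are drawn on top of the model line.

```r
library(tidyverse)

# Coefficients taken from the model fit above
b0 <- -1945
b1 <- 38.2

# Predicted values at the three heights (cm)
preds <- data.frame(height = c(155, 165, 175)) %>%
        mutate(vo2max = b0 + b1 * height)

# Model line with the three predictions added as points
ggplot(preds, aes(height, vo2max)) +
        geom_abline(intercept = b0, slope = b1) +
        geom_point(size = 3) +
        coord_cartesian(xlim = c(120, 180),
                        ylim = c(1000, 5000))
```

With these coefficients the predictions are 3976, 4358 and 4740 at 155, 165 and 175 cm, so each additional 10 cm of height adds 382 to the predicted value.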

Performance and V̇O2max

A model is constructed as

\[\text{Performance} = \beta_0 + \beta_1 \times x_{\dot{\text{V}}\text{O}_{2\text{max}}}\] where \(\text{Performance}\) is the time, in seconds, needed to complete a running time trial. The resulting model fit gives us

\[\begin{align} \beta_0 &= 1548 \\ \beta_1 &= -12.4 \end{align}\]

Note
  1. Who will win the time trial?
  • Anders: V̇O2max: 78.4
  • Sigvard: V̇O2max: 72.9

Calculate their respective times to complete the time trial.

  2. Plot the relationship between V̇O2max and performance.
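
A minimal sketch of both tasks, assuming the coefficients from the fit above. The data frame `runners`, the column names and the axis limits are illustrative choices, not part of the original exercise. Because performance is a time, the lower predicted value wins.

```r
library(tidyverse)

# Coefficients taken from the model fit above
b0 <- 1548
b1 <- -12.4

# Predicted time-trial duration (seconds) for each runner
runners <- data.frame(name = c("Anders", "Sigvard"),
                      vo2max = c(78.4, 72.9)) %>%
        mutate(time = b0 + b1 * vo2max)

# Model line with the two runners added as labelled points
ggplot(runners, aes(vo2max, time, label = name)) +
        geom_abline(intercept = b0, slope = b1) +
        geom_point(size = 3) +
        geom_text(vjust = -1) +
        coord_cartesian(xlim = c(60, 90),   # axis limits chosen for illustration
                        ylim = c(400, 900))
```

The predicted times are 575.84 s for Anders and 644.04 s for Sigvard, so the model expects Anders to finish first.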

A dummy variable to predict group differences

A model is constructed as

\[\Delta\text{Strength} = \beta_0 + \beta_1 \times x_{\text{group}}\]

Where \(\Delta\text{Strength}\) is the change in strength from pre- to post-intervention, and \(x_{\text{group}}\) is set to 0 when group == "ctrl" and 1 when group == "expr".

The model fit gives us

\[\begin{align} \beta_0 &= 8.2 \\ \beta_1 &= 4.4 \end{align}\]

Note
  1. Create a dummy variable in the data frame (see below) and calculate the expected strength gains
Code
dat <- data.frame(group = c("ctrl", "ctrl", "expr", "expr"))

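dat %>%
        mutate(xgroup = if_else(group == "ctrl", 0, 1)) %>% # create a dummy variable: 0 = ctrl, 1 = expr
        mutate(delta_strength = 8.2 + 4.4 * xgroup) # do predictions from the fitted coefficients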
  2. Make a graph that compares the average of each group.
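
The graph could be sketched as follows, assuming the coefficients from the fit above. The names `group_means`, `b0` and `b1` are illustrative. With a single dummy variable, the intercept is the expected average in the control group and the slope is the difference between the groups.

```r
library(tidyverse)

# Coefficients taken from the model fit above
b0 <- 8.2
b1 <- 4.4

# One row per group: dummy variable and expected average change
group_means <- data.frame(group = c("ctrl", "expr")) %>%
        mutate(xgroup = if_else(group == "ctrl", 0, 1),
               delta_strength = b0 + b1 * xgroup)

# Bar chart comparing the expected average of each group
ggplot(group_means, aes(group, delta_strength)) +
        geom_col() +
        labs(x = "Group", y = "Expected change in strength")
```

The expected averages are 8.2 in the control group and 8.2 + 4.4 = 12.6 in the experimental group.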

Two dummy variables

A model is constructed as

\[\Delta\text{Strength} = \beta_0 + \beta_1 \times x_{\text{group}} + \beta_2 \times x_{\text{sex}}\]

Where \(\Delta\text{Strength}\) is the change in strength from pre- to post-intervention, and \(x_{\text{group}}\) is set to 0 when group == "ctrl" and 1 when group == "expr". \(x_{\text{sex}}\) is 0 when sex == "female" and 1 when sex == "male".

The model fit gives us

\[\begin{align} \beta_0 &= 8.2 \\ \beta_1 &= 4.4 \\ \beta_2 &= -3.9 \end{align}\]

Note
  1. What is the expected average strength gain in females in the control group?
  2. What is the expected average strength gain in males in the experimental group?
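
As a sketch of how these expected averages follow from the model, the code below evaluates the equation for every combination of group and sex. It assumes the coefficients from the fit above; `preds` and the coefficient names are illustrative.

```r
library(tidyverse)

# Coefficients taken from the model fit above
b0 <- 8.2
b1 <- 4.4
b2 <- -3.9

# All four combinations of group and sex, with dummy variables and predictions
preds <- expand_grid(group = c("ctrl", "expr"),
                     sex = c("female", "male")) %>%
        mutate(xgroup = if_else(group == "ctrl", 0, 1),
               xsex = if_else(sex == "female", 0, 1),
               delta_strength = b0 + b1 * xgroup + b2 * xsex)

preds
```

Females in the control group have both dummy variables set to 0, so their expected gain is the intercept, 8.2. Males in the experimental group have both set to 1, giving 8.2 + 4.4 - 3.9 = 8.7.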

Three dummy variables

We could allow the model to describe different expected averages in males and females in response to the intervention. A new model is constructed:

\[\Delta\text{Strength} = \beta_0 + \beta_1 \times x_{\text{group}} + \beta_2 \times x_{\text{sex}} + \beta_3 \times x_{\text{sex}} \times x_{\text{group}}\]

The model fit gives us

\[\begin{align} \beta_0 &= 8.2 \\ \beta_1 &= 4.4 \\ \beta_2 &= -3.9 \\ \beta_3 &= 2.7 \end{align}\]

Note
  1. Calculate predictions by completing the code below
Code
dat <- data.frame(group = c("ctrl", "ctrl", "expr", "expr"), 
                  sex = c("male", "female", "male", "female"))

dat %>%
        mutate(xgroup = if_else(group == "ctrl", 0, 1), 
               xsex = if_else(sex == "female", 0, 1), 
               interaction = xgroup * xsex) %>% # create dummy variables
        
        mutate(delta_strength = 8.2 + 4.4 * xgroup - 3.9 * xsex + 2.7 * interaction) # do predictions from the fitted coefficients
  2. Make a figure that shows differences between groups.
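
One way to draw such a figure is sketched below, assuming the coefficients from the fit above. The name `preds` and the choice of points joined by lines are illustrative. Non-parallel lines indicate that the effect of the intervention differs between the sexes, which is what the interaction term \(\beta_3\) allows.

```r
library(tidyverse)

# Coefficients taken from the model fit above
b0 <- 8.2
b1 <- 4.4
b2 <- -3.9
b3 <- 2.7

# Predictions for every combination of group and sex
preds <- expand_grid(group = c("ctrl", "expr"),
                     sex = c("female", "male")) %>%
        mutate(xgroup = if_else(group == "ctrl", 0, 1),
               xsex = if_else(sex == "female", 0, 1),
               delta_strength = b0 + b1 * xgroup + b2 * xsex + b3 * xgroup * xsex)

# One line per sex, from control to experimental group
ggplot(preds, aes(group, delta_strength, color = sex, group = sex)) +
        geom_point(size = 3) +
        geom_line() +
        labs(x = "Group", y = "Expected change in strength")
```

The predicted difference between groups is 4.4 in females (12.6 versus 8.2) and 4.4 + 2.7 = 7.1 in males (11.4 versus 4.3).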

Prediction with error

An ordinary linear model with normally distributed error (a Gaussian model) can be described as

\(y \sim \operatorname{Normal}(\mu, \sigma)\)

\(\mu\) tells us about an average and \(\sigma\) corresponds to the expected variation around the average.

Our model uses a linear predictor for \(\mu\), with the following estimated parameters:

\[\begin{align} \mu &= \beta_0 + \beta_1 \times x \\ \beta_0 &= -1945 \\ \beta_1 &= 38.2 \\ \sigma &= 72.2 \end{align}\]

Note

Let’s say we want to simulate a set of new observations based on this model. The function rnorm takes the arguments n, mean and sd. Use this function to simulate observations when x is 155, 165 and 175.

Combine this simulation with the first model in this workshop.
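
A minimal sketch of such a simulation, assuming that \(x\) is height in cm (the coefficients match the first model in this workshop). The seed, the number of simulated observations per height (`n_sim`) and the object names are arbitrary choices made for illustration.

```r
library(tidyverse)

set.seed(1) # arbitrary seed, only so the sketch is reproducible

# Parameters taken from the model above
b0 <- -1945
b1 <- 38.2
sigma <- 72.2

n_sim <- 20 # simulated observations per height (arbitrary choice)

# For each height, compute mu and draw observations around it
sim <- data.frame(height = rep(c(155, 165, 175), each = n_sim)) %>%
        mutate(mu = b0 + b1 * height,
               vo2max = rnorm(n(), mean = mu, sd = sigma))

# Model line with simulated observations scattered around it
ggplot(sim, aes(height, vo2max)) +
        geom_abline(intercept = b0, slope = b1) +
        geom_point(alpha = 0.5) +
        coord_cartesian(xlim = c(120, 180),
                        ylim = c(1000, 5000))
```

The line shows \(\mu\) for each height, while the vertical spread of the points at each height reflects \(\sigma\).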

The hypertrophy data set!