library(nlme) #used for the gls() function
3 Marginal Model Interpretation
3.1 Brief Review of What We Learned
- Marginal Models
- GLS
- Define covariance model
- Compare models using AIC, BIC, likelihood ratio test
- GEE
- Compare models using QIC
- Define covariance model and link function
- GLS
3.2 Interpreting Coefficients of Marginal Models (Systematic Component)
First, lets consider a few simple cases in understanding the Marginal Model:
\[\mu_{ij} = E(\textcolor{red}{y}_{ij} \mid {X_{ij},t_{ij}})\]
where \(X_{ij}\) are the explanatory variables and \(t_{ij}\) is time.
- If the mean is constant across time and other variables (\(X_{ij}\)), and the mean is the same for all individuals, then
\[\mu_{ij} = \mu\]
- If the mean changes across time linearly and is not impacted by other variables (\(X_{i}\)), and is the same for all individuals, then
\[\mu_{ij} = \beta_0 + \beta_1 t_{ij}\]
- If the mean changes across time with time-varying variables (\(X_{ij}\)) and is the same for all individuals, then
\[\mu_{ij} = \beta_0 + \beta_1 x_{ij}\]
Now, lets go back to interpreting coefficients of our models
# unstructured covariance
<- gls(measurements ~ factor(group) + time, data = long_TLC, corr = corSymm(form = ~1 |
marg_mod weights = varIdent(form = ~1 | as.factor(level)), method = "REML")
id),
summary(marg_mod)
Generalized least squares fit by REML
Model: measurements ~ factor(group) + time
Data: long_TLC
AIC BIC logLik
2583.433 2635.224 -1278.716
Correlation Structure: General
Formula: ~1 | id
Parameter estimate(s):
Correlation:
1 2 3
2 0.124
3 0.247 0.856
4 0.518 0.455 0.522
Variance function:
Structure: Different standard deviations per stratum
Formula: ~1 | as.factor(level)
Parameter estimates:
lead0 lead1 lead4 lead6
1.000000 1.836722 1.583388 1.437453
Coefficients:
Value Std.Error t-value p-value
(Intercept) 26.053146 0.6930432 37.59238 0.0000
factor(group)Treatment -2.012020 0.9786776 -2.05586 0.0404
time -0.399965 0.0862719 -4.63610 0.0000
Correlation:
(Intr) fct()T
factor(group)Treatment -0.706
time -0.054 0.000
Standardized residuals:
Min Q1 Med Q3 Max
-2.3061017 -0.7995745 -0.3107700 0.3410439 5.5215819
Residual standard error: 5.324253
Degrees of freedom: 400 total; 397 residual
\(\beta_{trt}\) and \(\beta_{T}\) represents the estimated population mean change.
The general interpretation for these coeffiecients when \(X_{ij}\) is continuous, is that a unit difference in \(X\) leads to a \(\beta_1\) difference in overall mean response while holding all other variables constant.
Lets interpret these coefficients in the context of the TLC data.
Interpretation:
\(\beta_{trt}\): Among the treatment group, the average population change of blood lead levels (BLL) decreases by 2 in comparison to the control group.
\(\beta_{T}\): Among the control and treatment groups, blood lead levels (BLL) decreases by 0.40 (\(\beta_{T}\)) for every one-unit increase in time (\(T\)).
\(\beta_{0}\): Among the control and treatment group, the average population change of blood lead levels (BLL) increases by 26 and is constant over time.
How would our interpretation change after adding an interaction term?
Consider the following model:
# unstructured covariance
<- gls(measurements ~ factor(group) + time + factor(group):time, data = long_TLC,
marg_mod_inter corr = corSymm(form = ~1 | id), weights = varIdent(form = ~1 | as.factor(level)),
method = "REML")
summary(marg_mod_inter)
Generalized least squares fit by REML
Model: measurements ~ factor(group) + time + factor(group):time
Data: long_TLC
AIC BIC logLik
2586.646 2642.386 -1279.323
Correlation Structure: General
Formula: ~1 | id
Parameter estimate(s):
Correlation:
1 2 3
2 0.132
3 0.271 0.846
4 0.543 0.406 0.495
Variance function:
Structure: Different standard deviations per stratum
Formula: ~1 | as.factor(level)
Parameter estimates:
lead0 lead1 lead4 lead6
1.000000 1.833984 1.551842 1.439514
Coefficients:
Value Std.Error t-value p-value
(Intercept) 26.039582 0.6940593 37.51780 0.0000
factor(group)Treatment -1.938410 0.9815481 -1.97485 0.0490
time -0.368743 0.1223416 -3.01404 0.0027
factor(group)Treatment:time -0.169559 0.1730172 -0.98001 0.3277
Correlation:
(Intr) fct()T time
factor(group)Treatment -0.707
time -0.077 0.054
factor(group)Treatment:time 0.054 -0.077 -0.707
Standardized residuals:
Min Q1 Med Q3 Max
-2.2991338 -0.7973523 -0.2854752 0.3570825 5.6284768
Residual standard error: 5.310688
Degrees of freedom: 400 total; 396 residual
Interpretation:
\(\beta_{0}\): Among the control and treatment groups, the average population change of blood lead levels (BLL) increases by 26 and is constant over time.
\(\beta_{trt}\): Among the treatment group, the average population change of blood lead levels (BLL) decreases by 1.93 in comparison to the control group.
\(\beta_{T}\): Among the control group, blood lead levels (BLL) decreases by 0.40 (\(\beta_{T}\)) for every one-unit increase in time (\(T\)).
\(\beta_{T} + \beta_{trt:T}\): Among the treatment groups, blood lead levels (BLL) decreases by 2.09 for every one-unit increase in time (\(T\)).
\(\beta_{trt:T}\): Comparing the treatment and control groups, the difference in the mean change of blood lead levels (BLL) is 0.20 less among the patients in the treatment group compared to the control group for every one-unit increase in time (\(T\)).
How would this change if time was categorical?
Consider the Response Profiles Analysis case with no interaction terms first:
# unstructured covariance
<- gls(measurements ~ factor(group) + factor(level), data = long_TLC,
marg_mod_resp_prof corr = corSymm(form = ~1 | id), weights = varIdent(form = ~1 | as.factor(level)),
method = "REML")
summary(marg_mod_resp_prof)
Generalized least squares fit by REML
Model: measurements ~ factor(group) + factor(level)
Data: long_TLC
AIC BIC logLik
2525.171 2584.854 -1247.585
Correlation Structure: General
Formula: ~1 | id
Parameter estimate(s):
Correlation:
1 2 3
2 0.334
3 0.407 0.822
4 0.551 0.512 0.550
Variance function:
Structure: Different standard deviations per stratum
Formula: ~1 | as.factor(level)
Parameter estimates:
lead0 lead1 lead4 lead6
1.000000 1.567427 1.478108 1.484947
Coefficients:
Value Std.Error t-value p-value
(Intercept) 27.412077 0.7104365 38.58484 0.0000
factor(group)Treatment -2.012154 0.9786870 -2.05597 0.0404
factor(level)lead1 -7.315000 0.7993312 -9.15140 0.0000
factor(level)lead4 -6.614000 0.7247826 -9.12549 0.0000
factor(level)lead6 -4.202000 0.6448475 -6.51627 0.0000
Correlation:
(Intr) fct()T fct()1 fct()4
factor(group)Treatment -0.689
factor(level)lead1 -0.222 0.000
factor(level)lead4 -0.205 0.000 0.814
factor(level)lead6 -0.105 0.000 0.437 0.446
Standardized residuals:
Min Q1 Med Q3 Max
-2.23559962 -0.64349371 -0.04593541 0.61603790 5.58341207
Residual standard error: 5.150368
Degrees of freedom: 400 total; 395 residual
Interpretation:
\(\beta_{0}\): Among the control and treatment groups, the average population change of blood lead levels (BLL) increases by 27.41 and is constant over time.
\(\beta_{trt}\): Among the treatment group, the average population change of blood lead levels (BLL) decreases by 2.01 in comparison to the control group.
\(\beta_{T_1}, \beta_{T_4}, \beta_{T_6}\): represent the average change in blood lead levels (BLL) among the population (e.g. both control and treatment groups) relative to baseline which is the reference group for each time variable (dichotomous indicator variable).
For example, the average change in blood lead levels (BLL) among the population is 7.31 less compared to the baseline measure.
How would this change if an interaction term was included in the model?
Consider the Response Profiles Analysis case with no interaction terms first:
# unstructured covariance
<- gls(measurements ~ factor(group) + factor(level) + factor(group):factor(level),
marg_mod_resp_prof data = long_TLC, corr = corSymm(form = ~1 | id), weights = varIdent(form = ~1 |
as.factor(level)), method = "REML")
summary(marg_mod_resp_prof)
Generalized least squares fit by REML
Model: measurements ~ factor(group) + factor(level) + factor(group):factor(level)
Data: long_TLC
AIC BIC logLik
2452.076 2523.559 -1208.038
Correlation Structure: General
Formula: ~1 | id
Parameter estimate(s):
Correlation:
1 2 3
2 0.571
3 0.570 0.775
4 0.577 0.582 0.581
Variance function:
Structure: Different standard deviations per stratum
Formula: ~1 | as.factor(level)
Parameter estimates:
lead0 lead1 lead4 lead6
1.000000 1.325877 1.370471 1.524816
Coefficients:
Value Std.Error t-value p-value
(Intercept) 26.272 0.7102892 36.98775 0.0000
factor(group)Treatment 0.268 1.0045006 0.26680 0.7898
factor(level)lead1 -1.612 0.7919151 -2.03557 0.0425
factor(level)lead4 -2.202 0.8149218 -2.70210 0.0072
factor(level)lead6 -2.626 0.8885198 -2.95548 0.0033
factor(group)Treatment:factor(level)lead1 -11.406 1.1199371 -10.18450 0.0000
factor(group)Treatment:factor(level)lead4 -8.824 1.1524735 -7.65658 0.0000
factor(group)Treatment:factor(level)lead6 -3.152 1.2565568 -2.50844 0.0125
Correlation:
(Intr) fct()T fct()1 fct()4 fct()6
factor(group)Treatment -0.707
factor(level)lead1 -0.218 0.154
factor(level)lead4 -0.191 0.135 0.680
factor(level)lead6 -0.096 0.068 0.386 0.385
factor(group)Treatment:factor(level)lead1 0.154 -0.218 -0.707 -0.481 -0.273
factor(group)Treatment:factor(level)lead4 0.135 -0.191 -0.481 -0.707 -0.272
factor(group)Treatment:factor(level)lead6 0.068 -0.096 -0.273 -0.272 -0.707
f()T:()1 f()T:()4
factor(group)Treatment
factor(level)lead1
factor(level)lead4
factor(level)lead6
factor(group)Treatment:factor(level)lead1
factor(group)Treatment:factor(level)lead4 0.680
factor(group)Treatment:factor(level)lead6 0.386 0.385
Standardized residuals:
Min Q1 Med Q3 Max
-2.1756516 -0.6849999 -0.1515556 0.5294195 5.6327729
Residual standard error: 5.022503
Degrees of freedom: 400 total; 392 residual
Interpretation of Interaction Term Parameters:
\(\beta_{T_1}, \beta_{T_4}, \beta_{T_6}\): represent the average change in blood lead levels (BLL) among the control group relative to baseline which is the reference group in this case for each time variable (dichotomous indicator variable).
\(\beta_{T_1} + \beta_{trt:T1}, \beta_{T_4} + \beta_{trt:T4}, \beta_{T_6} + \beta_{trt:T6}\): represent the average change in blood lead levels (BLL) among the treatment group compared to control group relative to baseline which is the reference group in this case for each time variable (dichotomous indicator variable).
For example, the average change in blood lead levels (BLL) among the treatment group at week 1 compared to baseline is 13.02 less (\(\beta_{T_1} + \beta_{trt:T1}\)) in comparison to the control group at visit 1. Notice that there are two references here in our interpretation: the group (treatment vs control) and the time point (week 1 vs baseline).
\(\beta_{trt:T1}, \beta_{trt:T4}, \beta_{trt:T6}\): represent the average difference in mean blood lead levels (BLL) between treatment groups at the different time points in comparison to baseline which is the reference group.
For example, the average difference in mean blood lead levels (BLL) from baseline to week 1 was 11.40 (\(\beta_{trt:T1}\)) between treatment groups.