surv-extrapolation/AIC-BIC-tests.qmd at main · ICON-in-R/surv-extrapolation · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
---
title: "AIC/BIC tests"
bibliography: references.bib
format:
  html:
    code-copy: true
editor_options:
  chunk_output_type: console
---

## Background

Akaike's Information Criterion (AIC) and the Bayesian Information Criterion (BIC) provide a useful statistical test of the relative fit of alternative parametric models, and they are usually available as outputs from statistical software. Further details on these are available from [@Collett2013]. Measures such as the negative 2 log likelihood are only suitable for comparing nested models, whereby one model is nested within another (for example, one model adds an additional covariate compared to another model). Different parametric models which use different probability distributions cannot be nested within one another. Thus the negative 2 log likelihood test is not suitable for assessing the fit of alternative parametric models. The AIC and BIC allow a comparison of models that do not have to be nested, including a term which penalises the use of unnecessary covariates (these are penalised more highly by the BIC). Generally, it is not necessary to include covariates in survival modelling in the context of an RCT as it would be expected that any important covariates would be balanced through the process of randomisation. However, some parametric models have more parameters than others, and the AIC and BIC take account of these -- for example an exponential model only has one parameter and so in comparative terms two parameter models such as the Weibull or Gompertz models are penalised. The AIC and BIC statistics therefore weigh up the improved fit of models with the potentially inefficient use of additional parameters, with the use of additional parameters penalised more highly by the BIC relative to the AIC.

Suppose that we have a statistical model of some data. Let $k$ be the number of estimated parameters in the model. Let $\hat{L}$ be the maximized value of the likelihood function for the model. Then the AIC value of the model is the following.

$$
AIC = 2k - 2 \ln({\hat{L}})
$$

Given a set of candidate models for the data, the preferred model is the one with the minimum AIC value.

The BIC is formally defined as

$$
BIC = k \ln(n) - 2 \ln(\widehat{L})
$$

where

$\hat{L}$ = the maximized value of the likelihood function of the model $M$, i.e. $\hat{L} = p(x \mid \widehat{\theta}, M)$, where $\widehat{\theta}$ are the parameter values that maximize the likelihood function; $x$ = the observed data; $n$ = the number of data points in $x$, the number of observations, or equivalently, the sample size; $k$ = the number of parameters estimated by the model. For example, in multiple linear regression, the estimated parameters are the intercept, the $q$ slope parameters, and the constant variance of the errors; thus, $k = q + 2$.

::: callout-warning
## Relative values

IC values only really make sense when compared between one another i.e relatively. The absolute value is not helpful and depends on the specifics of the model and data.
:::

## R example

Cox proportional hazard model

```{r warning=FALSE, message=FALSE}
library(survival)

test1 <- list(time=c(4,3,1,1,2,2,3),
              status=c(1,1,1,0,1,1,0),
              x=c(0,2,1,1,1,0,0),
              sex=c(0,0,0,0,1,1,1))
# Fit a stratified model
fit_ph <- coxph(Surv(time, status) ~ x + strata(sex), test1)

AIC(fit_ph)

##TODO: what this difference with AIC()?
extractAIC(fit_ph)

BIC(fit_ph)
```

### Parametric model with `flexsurv`

The package `flexsurv` computes the AIC when it fits a model. This can be accessed by \`.$AIC$ i.e.

```{r warning=FALSE, message=FALSE}
library(flexsurv)

fitg <- flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian, dist = "gengamma")
fitg$AIC
```

For other information criteria statistics it is straightforward to calculate these using the `flexsurvreg` outpuu. For simplicity we can write a function to do them all at the same time, which we shall call `fitstats.flexsurvreg`.

```{r}
#| code-fold: true
#| code-summary: "fitstats.flexsurvreg() function"

# helper function
# from flexsurv github
# https://github.com/chjackson/flexsurv-dev/issues/44
fitstats.flexsurvreg <- function(x) {
  ll <- x$loglik
  aic <- x$AIC
  k <- length(x$coefficients)
  n <- sum(x$data$m["(weights)"])
  aicc <- aic + ((2 * k) * (k + 1) / (n - k - 1))
  bic <- - 2 * ll + (k * log(n))

  data.frame(
   Df = k,
    "n2ll" = -2 * ll,
    AIC = aic,
    AICc = aicc,
    BIC = bic)
}
```

Now if we run this we obtain all of the statistics for the previous generalised gamma model fit.

```{r}
fitstats.flexsurvreg(fitg)
```

### Bayesian model with `survHE` package

First let us fit an example model using `survHE`.

```{r warning=FALSE, message=FALSE}
library(survHE)

data(bc)

mle <- fit.models(formula = Surv(recyrs,censrec) ~ group,
                  data = bc,
                  distr = "exp",
                  method = "mle")
```

Now, the print method for class `survHE` returns several model fitting summaries.

```{r}
print(mle)
```

If we wished to access the values directly, perhaps to use in our own code, then we can use

```{r}
mle$model.fitting
```

Further, if we were to fit several different models to compare the IC statistics, which is really the main point of doing it, then `survHE` also has some nice plotting functions.

```{r}
mle_multi <- fit.models(formula = Surv(recyrs,censrec) ~ group,
                  data = bc,
                  distr = c("exp", "weibull", "gompertz", "lognormal", "loglogistic"),
                  method = "mle")

model.fit.plot(mle_multi)
model.fit.plot(mle_multi, type = "BIC")
model.fit.plot(mle_multi, scale = "relative")
```