Appendix L — Exam formula sheet
L.1 Epi 202: Probability
\[ \begin{aligned} \text{Var}\left(\tilde{a}\cdot \tilde{X}\right) &= \text{Var}\left(\sum_{i=1}^na_i X_i\right) \\ &= \tilde{a}^{\top} \text{Var}\left(\tilde{X}\right) \tilde{a} \\ &= \sum_{i=1}^n\sum_{j=1}^n a_i a_j \text{Cov}\left(X_i,X_j\right) \end{aligned} \]
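The identity above can be checked numerically. The following is a minimal sketch, not part of the original sheet; the weights `a` and the covariance matrix `cov` are made-up illustrative values.

```python
# Check that a^T Var(X) a equals the double sum of a_i a_j Cov(X_i, X_j).
# The weights and covariance matrix are hypothetical illustrative values.
a = [2.0, -1.0, 0.5]
cov = [
    [4.0, 1.0, 0.0],
    [1.0, 9.0, 2.0],
    [0.0, 2.0, 1.0],
]
n = len(a)

# Quadratic form: compute Var(X) a first, then take the dot product with a.
cov_a = [sum(cov[i][j] * a[j] for j in range(n)) for i in range(n)]
var_quadratic = sum(a[i] * cov_a[i] for i in range(n))

# Double sum over all (i, j) pairs, including i == j (the variance terms).
var_double_sum = sum(
    a[i] * a[j] * cov[i][j] for i in range(n) for j in range(n)
)

print(var_quadratic, var_double_sum)
```

Both expressions give the same value because the double sum is just the quadratic form written out entry by entry.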
L.2 Epi 203: Statistical inference
\[\mathscr{L}(\theta) \stackrel{\text{def}}{=}\text{p}(\tilde{X}= \tilde{x}| \Theta = \theta)\]
\[\ell\stackrel{\text{def}}{=}\text{log}\left\{\mathscr{L}(\tilde{x}|\theta)\right\}\]
\[\ell'\stackrel{\text{def}}{=}\frac{\partial}{\partial \theta} \ell(\tilde{x}|\theta)\]
\[\ell''\stackrel{\text{def}}{=}\frac{\partial}{\partial \tilde{\theta}}\frac{\partial}{\partial \tilde{\theta}^{\top}} \ell(\tilde{x}| \tilde{\theta})\]
\[\ell_{ij}''= \frac{\partial}{\partial \theta_i}\frac{\partial}{\partial \theta_j} \ell(\tilde{X}= \tilde{x}| \tilde{\theta})\]
\[I\stackrel{\text{def}}{=}-\ell''(\tilde{x}|\tilde{\theta})\] \[\mathcal{I}\stackrel{\text{def}}{=}\text{E}\left[I(\tilde{X}|\tilde{\theta})\right]\]
\[\hat\theta_{ML}\ \dot \sim\ \text{N}\left(\theta,\left[\mathcal{I}(\tilde{\theta})\right]^{-1}\right)\]
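As an illustration of the likelihood quantities above, here is a hedged sketch for a one-parameter model that the sheet does not itself specify: independent Bernoulli trials with success probability `p`. The counts `n` and `k` are hypothetical, and the interval uses the normal approximation with the observed information evaluated at the maximum likelihood estimate.

```python
import math

# Hypothetical data: k events among n independent Bernoulli(p) trials.
n, k = 50, 20

def log_lik(p):
    # Log-likelihood: k log(p) + (n - k) log(1 - p)
    return k * math.log(p) + (n - k) * math.log(1 - p)

def score(p):
    # First derivative of the log-likelihood with respect to p
    return k / p - (n - k) / (1 - p)

def observed_info(p):
    # Negative second derivative of the log-likelihood
    return k / p**2 + (n - k) / (1 - p) ** 2

def expected_info(p):
    # Expectation of the observed information, using E[K] = n p
    return n / (p * (1 - p))

p_hat = k / n  # solves score(p) = 0
se = 1 / math.sqrt(observed_info(p_hat))

# Approximate 95% interval from the asymptotic normal distribution
ci = (p_hat - 1.96 * se, p_hat + 1.96 * se)
print(p_hat, se, ci)
```

For this model the observed and expected information coincide at `p_hat`, which is not true of every model.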
L.3 Epi 204: Generalized linear models
Generalized linear models have three components:
The outcome distribution family: \(\text{p}(Y|\mu(\tilde{x}))\)
The link function: \(g(\mu(\tilde{x})) = \eta(\tilde{x})\)
The linear component: \(\eta(\tilde{x}) = \tilde{x}\cdot \beta\)
\[ \underbrace{\pi}_{\atop{\Pr(Y=1)} } \overbrace{ \underbrace{ \underset{ \xleftarrow[\frac{\omega}{1+\omega}]{} } { \xrightarrow{\frac{\pi}{1-\pi}} } \underbrace{\omega}_{\text{odds}(Y=1)} \underset{ \xleftarrow[\exp{\eta}]{} } { \xrightarrow{\text{log}\left\{\omega\right\}} } }_{\text{expit}(\eta)} }^{\text{logit}(\pi)} \underbrace{\eta}_{\atop{\text{log-odds}(Y=1)}} \]
\[\theta(\tilde{x},{\tilde{x}^*}) = \text{exp}\left\{(\Delta\tilde{x}) \cdot \tilde{\beta}\right\}\] \[\Delta\tilde{x}\stackrel{\text{def}}{=}\tilde{x}- \tilde{x}^*\]
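The conversions between probability, odds, and log-odds, and the odds ratio between two covariate patterns, can be sketched as follows. The function names, the intercept `beta0`, the coefficients `beta`, and the covariate vectors are all hypothetical values chosen for illustration.

```python
import math

def odds(p):
    # probability -> odds
    return p / (1 - p)

def odds_inv(w):
    # odds -> probability
    return w / (1 + w)

def logit(p):
    # probability -> log-odds
    return math.log(odds(p))

def expit(eta):
    # log-odds -> probability
    return odds_inv(math.exp(eta))

# Hypothetical logistic regression coefficients and covariate patterns
beta0 = -1.0
beta = [0.5, -0.1]
x = [1.0, 3.0]
x_star = [0.0, 1.0]

def linear_predictor(x_vec):
    return beta0 + sum(xj * bj for xj, bj in zip(x_vec, beta))

# Odds ratio from the difference in covariates: exp((x - x*) . beta)
delta_x = [xj - xsj for xj, xsj in zip(x, x_star)]
odds_ratio = math.exp(sum(dj * bj for dj, bj in zip(delta_x, beta)))

# Same quantity computed directly as a ratio of the two odds
odds_ratio_direct = math.exp(linear_predictor(x)) / math.exp(
    linear_predictor(x_star)
)
print(odds_ratio, odds_ratio_direct)
```

The intercept cancels in the ratio, which is why the odds ratio depends only on the covariate difference and the slope coefficients.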
L.3.1 Estimates of odds ratios from 2x2 contingency tables
\[\hat\theta=\frac{ad}{bc}\]
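The sheet does not define the cell labels, so the sketch below assumes a common layout: `a` = exposed with the event, `b` = exposed without the event, `c` = unexposed with the event, `d` = unexposed without the event. The counts are made up.

```python
# Assumed 2x2 layout (not stated on the sheet):
#               event   no event
#   exposed       a        b
#   unexposed     c        d
a, b, c, d = 20, 80, 10, 90  # hypothetical counts

# Cross-product form of the odds ratio estimate
or_cross_product = (a * d) / (b * c)

# Equivalent form: odds of the event in each exposure group, then their ratio
odds_exposed = a / b
odds_unexposed = c / d
or_ratio_of_odds = odds_exposed / odds_unexposed

print(or_cross_product, or_ratio_of_odds)
```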
L.3.2 Survival analysis
Probability distribution functions
Name | Symbols | Definition |
---|---|---|
Probability density function (PDF) | \(\text{f}(t), \text{p}(t)\) | \(\text{p}(T=t)\) |
Cumulative distribution function (CDF) | \(\text{F}(t), \text{P}(t)\) | \(\text{P}(T\leq t)\) |
Survival function | \(\text{S}(t), \bar{\text{F}}(t)\) | \(\text{P}(T > t)\) |
Hazard function | \(\lambda(t), \text{h}(t)\) | \(\text{p}(T=t \mid T\ge t)\) |
Cumulative hazard function | \(\Lambda(t), \text{H}(t)\) | \(\int_{u=-\infty}^t {\lambda}(u)du\) |
Log-hazard function | \(\eta(t)\) | \(\text{log}\left\{{\lambda}(t)\right\}\) |
Diagram of survival distribution function relationships
\[ \text{f}(t) \xleftarrow[\text{S}(t){\lambda}(t)]{-S'(t)} \text{S}(t) \xleftarrow[]{\text{exp}\left\{-{\Lambda}(t)\right\}} {\Lambda}(t) \xleftarrow[]{\int_{u=0}^t {\lambda}(u)du} {\lambda}(t) \xleftarrow[]{\text{exp}\left\{\eta(t)\right\}} \eta(t) \]
\[ \text{f}(t) \xrightarrow[\int_{u=t}^\infty \text{f}(u)du]{\text{f}(t)/{\lambda}(t)} \text{S}(t) \xrightarrow[-\log{\text{S}(t)}]{} {\Lambda}(t) \xrightarrow[{\Lambda}'(t)]{} {\lambda}(t) \xrightarrow[\text{log}\left\{{\lambda}(t)\right\}]{} \eta(t) \]
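The relationships in the two diagrams can be checked numerically for a specific distribution. This sketch assumes a Weibull log-hazard with hypothetical `shape` and `scale` parameters (the sheet does not name a distribution) and uses central finite differences for the derivatives.

```python
import math

# Hypothetical Weibull parameters, chosen only for illustration
shape, scale = 1.5, 2.0

def log_hazard(t):
    return math.log(shape / scale) + (shape - 1) * math.log(t / scale)

def hazard(t):
    # hazard = exp(log-hazard)
    return math.exp(log_hazard(t))

def cum_hazard(t):
    # Closed form of the integral of the Weibull hazard from 0 to t
    return (t / scale) ** shape

def surv(t):
    # survival = exp(-cumulative hazard)
    return math.exp(-cum_hazard(t))

def density(t):
    # density = hazard * survival
    return hazard(t) * surv(t)

t, h = 1.3, 1e-6

# Central finite differences approximating -S'(t) and Lambda'(t)
neg_surv_slope = -(surv(t + h) - surv(t - h)) / (2 * h)
cum_hazard_slope = (cum_hazard(t + h) - cum_hazard(t - h)) / (2 * h)

print(neg_surv_slope, density(t))
print(cum_hazard_slope, hazard(t))
print(-math.log(surv(t)), cum_hazard(t))
```

Each printed pair should agree up to finite-difference and rounding error.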
Survival likelihood contributions, assuming non-informative censoring
\[ \begin{aligned} \text{p}(Y=y,D=d) &= [\text{f}_T(y)]^{d} [\text{S}_T(y)]^{1-d} \\ &= [{\lambda}_T(y)]^{d} [\text{S}_T(y)] \end{aligned} \]
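To make the two forms of the likelihood contribution concrete, the sketch below assumes a constant hazard `rate` (an exponential model, which the sheet does not specify) and a small made-up data set of follow-up times `y` and event indicators `d`.

```python
import math

# Hypothetical data: follow-up times and event indicators (1 = event observed)
y = [2.0, 3.5, 1.0, 4.0, 2.5]
d = [1, 0, 1, 1, 0]

def contribution_density_form(y_i, d_i, rate):
    # [f(y)]^d [S(y)]^(1-d) under a constant hazard
    dens = rate * math.exp(-rate * y_i)
    surv = math.exp(-rate * y_i)
    return dens**d_i * surv ** (1 - d_i)

def contribution_hazard_form(y_i, d_i, rate):
    # [lambda(y)]^d S(y) under a constant hazard
    surv = math.exp(-rate * y_i)
    return rate**d_i * surv

def log_lik(rate):
    # Independent observations: sum the log contributions
    return sum(
        math.log(contribution_hazard_form(y_i, d_i, rate))
        for y_i, d_i in zip(y, d)
    )

# Closed-form maximizer for a constant hazard: events / total follow-up time
rate_hat = sum(d) / sum(y)
print(rate_hat, log_lik(rate_hat))
```

Censored observations contribute only through the survival term, while observed events also contribute the hazard at the event time.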
Nonparametric time-to-event distribution estimators
\[\hat{\lambda}_i = \frac{d_i}{n_i}\]
\[\hat{\text{S}}_{KM}(t) \stackrel{\text{def}}{=}\prod_{\left\{i:\ t_i \le t\right\}} \left[1-\hat{\lambda}_i\right]\]
\[\hat{{\Lambda}}_{NA}(t) \stackrel{\text{def}}{=}\sum_{\left\{i:\ t_i \le t\right\}}\hat{\lambda}_i\]
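A minimal sketch of the Kaplan-Meier and Nelson-Aalen estimators on a made-up data set follows. It assumes that \(t_i\) indexes the distinct observed event times, that \(n_i\) counts subjects still at risk at \(t_i\), and that event times up to and including \(t\) are counted; the estimators are evaluated between event times, where the choice of boundary convention does not matter.

```python
# Hypothetical data: follow-up times and event indicators (1 = event observed)
y = [1, 2, 2, 3, 4, 5]
d = [1, 1, 0, 1, 0, 1]

# Distinct times at which at least one event was observed
event_times = sorted({y_i for y_i, d_i in zip(y, d) if d_i == 1})

def hazard_estimates():
    # For each event time t_i: d_i events divided by n_i subjects at risk
    out = []
    for t_i in event_times:
        n_i = sum(1 for y_k in y if y_k >= t_i)
        d_i = sum(1 for y_k, d_k in zip(y, d) if y_k == t_i and d_k == 1)
        out.append((t_i, d_i / n_i))
    return out

def kaplan_meier(t):
    # Product of (1 - hazard estimate) over event times up to t (assumed <=)
    s = 1.0
    for t_i, lam in hazard_estimates():
        if t_i <= t:
            s *= 1 - lam
    return s

def nelson_aalen(t):
    # Sum of hazard estimates over event times up to t (assumed <=)
    return sum(lam for t_i, lam in hazard_estimates() if t_i <= t)

print(kaplan_meier(3.5), nelson_aalen(3.5))
```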
Proportional hazards model structure
Joint likelihood of data set: \(\mathscr{L}\stackrel{\text{def}}{=}\text{p}(\tilde{Y}= \tilde{y}, \tilde{D}= \tilde{d}| \mathbf{X}= \mathbf{x})\)
Marginal likelihood contribution of observation \(i\): \(\mathscr{L}_i \stackrel{\text{def}}{=}\text{p}(Y_i= y_i, D_i = d_i | \tilde{X}_i = \tilde{x}_i)\)
Independent Observations Assumption: \(\mathscr{L}= \prod_{i=1}^n \mathscr{L}_i\)
Non-Informative Censoring Assumption: \(T_i\perp\!\!\!\perp C_i | \tilde{X}_i\)
\[ \mathscr{L}_i \propto [\text{f}_T(y_i|\tilde{x}_i)]^{d_i} [\text{S}_T(y_i | \tilde{x}_i)]^{1-d_i} = \text{S}_T(y_i|\tilde{x}_i) \cdot [{\lambda}_T(y_i|\tilde{x}_i)]^{d_i} \]
Survival function: \(\text{S}(t | \tilde{x}) \stackrel{\text{def}}{=}\text{P}(T > t|\tilde{X}= \tilde{x}) = \int_{u=t}^{\infty} \text{f}(u|\tilde{x})du = \text{exp}\left\{-{\Lambda}(t|\tilde{x})\right\}\)
Probability density function: \(\text{f}(t| \tilde{x}) \stackrel{\text{def}}{=}\text{p}(T=t|\tilde{X}= \tilde{x}) = -\text{S}'(t|\tilde{x}) = {\lambda}(t| \tilde{x}) \text{S}(t | \tilde{x})\)
Cumulative hazard function: \({\Lambda}(t | \tilde{x}) \stackrel{\text{def}}{=}\int_{u=0}^t {\lambda}(u|\tilde{x})du = -\text{log}\left\{\text{S}(t|\tilde{x})\right\}\)
Hazard function: \({\lambda}(t |\tilde{x}) \stackrel{\text{def}}{=}\text{p}(T=t|T\ge t,\tilde{X}= \tilde{x}) = {\Lambda}'(t|\tilde{x}) = \frac{\text{f}(t|\tilde{x})}{\text{S}(t|\tilde{x})}\)
Log-Hazard function: \(\eta(t|\tilde{x}) \stackrel{\text{def}}{=}\text{log}\left\{\lambda(t|\tilde{x})\right\} = \eta_0(t) + \Delta \eta(t|\tilde{x})\)
Proportional Hazards Assumption:
\[ \begin{aligned} {\lambda}(t |\tilde{x}) &= {\lambda}_0(t) \cdot \theta(\tilde{x}) \\ {\Lambda}(t |\tilde{x}) &= {\Lambda}_0(t) \cdot \theta(\tilde{x}) \\ \eta(t|\tilde{x}) &= \eta_0(t) + \Delta \eta(\tilde{x}) \end{aligned} \]
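A consequence for the survival function follows directly from the cumulative hazard line above. As a sketch, write \(\text{S}_0(t) \stackrel{\text{def}}{=}\text{exp}\left\{-{\Lambda}_0(t)\right\}\) for the baseline survival function (a symbol introduced here for illustration only):

\[ \begin{aligned} \text{S}(t|\tilde{x}) &= \text{exp}\left\{-{\Lambda}(t|\tilde{x})\right\} \\ &= \text{exp}\left\{-{\Lambda}_0(t) \cdot \theta(\tilde{x})\right\} \\ &= \left[\text{S}_0(t)\right]^{\theta(\tilde{x})} \end{aligned} \]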
Logarithmic Link Function Assumption:
Link function: \[\text{log}\left\{{\lambda}(t|\tilde{x})\right\} = \eta(t|\tilde{x})\] \[\text{log}\left\{\theta(\tilde{x})\right\} = \Delta \eta(\tilde{x})\]
Inverse link function: \[{\lambda}(t|\tilde{x}) = \text{exp}\left\{\eta(t|\tilde{x})\right\}\] \[\theta(\tilde{x}) = \text{exp}\left\{\Delta \eta(\tilde{x})\right\}\]
Linear Predictor Component:
\[\eta(t|\tilde{x}) = \eta_0(t) + \Delta \eta(t|\tilde{x})\] \[\Delta \eta(t|\tilde{x}) = \tilde{x}\cdot \tilde{\beta}\]
Linear Predictor Component Functional Form Assumption:
\[\Delta \eta(t|\tilde{x}) = \tilde{x}\cdot \tilde{\beta}\stackrel{\text{def}}{=}\beta_1 x_1 + \cdots + \beta_p x_p\]
Proportional hazards model partial likelihood formula:
\[ \begin{aligned} \mathscr{L}^*_i &= \frac{\theta(\tilde{x}_i)}{\sum_{k \in R(t_i)} \theta(\tilde{x}_k)} \\ \mathscr{L}^* &= \prod_{\left\{i:\ d_i = 1\right\}} \mathscr{L}^*_i \end{aligned} \]
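The partial likelihood can be evaluated by hand on a tiny data set. The sketch below assumes a single covariate, no tied event times, a risk set \(R(t_i)\) made up of all subjects whose follow-up time is at least \(t_i\), and \(\theta(x) = \text{exp}\{\beta x\}\); the data values are hypothetical.

```python
import math

# Hypothetical data: follow-up times, event indicators, one covariate
y = [2.0, 3.0, 5.0, 7.0]
d = [1, 0, 1, 1]
x = [1.0, 0.0, 1.0, 0.0]

def theta(x_i, beta):
    # Relative hazard under the log link with a single covariate
    return math.exp(beta * x_i)

def partial_likelihood(beta):
    total = 1.0
    for i in range(len(y)):
        if d[i] != 1:
            continue  # censored observations contribute no factor
        # Risk set: subjects still under observation at time y[i]
        risk_set = [k for k in range(len(y)) if y[k] >= y[i]]
        denom = sum(theta(x[k], beta) for k in risk_set)
        total *= theta(x[i], beta) / denom
    return total

print(partial_likelihood(0.0), partial_likelihood(math.log(2)))
```

At \(\beta = 0\) every subject has relative hazard 1, so each factor reduces to one over the size of the risk set.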
Proportional hazards model baseline cumulative hazard estimator:
\[\hat {\Lambda}_0(t) = \sum_{t_i \le t} \frac{d_i}{\sum_{k\in R(t_i)} \theta(\tilde{x}_k)}\]
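A hedged sketch of the baseline cumulative hazard estimator, under the same assumptions as a simple hand calculation: one covariate, no tied event times, risk sets defined by follow-up time at least \(t_i\), and a coefficient `beta` supplied by the user rather than estimated. The data are hypothetical, and the estimator is evaluated between event times.

```python
import math

# Hypothetical data: follow-up times, event indicators, one covariate
y = [2.0, 3.0, 5.0, 7.0]
d = [1, 0, 1, 1]
x = [1.0, 0.0, 1.0, 0.0]

def theta(x_i, beta):
    # Relative hazard under the log link with a single covariate
    return math.exp(beta * x_i)

def baseline_cum_hazard(t, beta):
    total = 0.0
    for i in range(len(y)):
        # Only observed events at or before t add an increment
        if d[i] == 1 and y[i] <= t:
            risk_set = [k for k in range(len(y)) if y[k] >= y[i]]
            total += 1 / sum(theta(x[k], beta) for k in risk_set)
    return total

print(baseline_cum_hazard(6.0, 0.0), baseline_cum_hazard(6.0, math.log(2)))
```

With `beta = 0` every relative hazard equals 1, so the estimator reduces to a sum of one over the risk-set size at each event time.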