Appendix B — Mathematics
These lecture notes use:
- algebra
- precalculus
- univariate calculus
- linear algebra
- vector calculus
Some key results are listed here.
B.1 Elementary Algebra
Mastery of Elementary Algebra (a.k.a. “College Algebra”) is a prerequisite for calculus, which is a prerequisite for Epi 202 and Epi 203, which are prerequisites for this course (Epi 204). Nevertheless, each year, some Epi 204 students are still uncomfortable with algebraic manipulations of mathematical formulas. Therefore, I include this section as a quick reference.
B.1.1 Equalities
Theorem B.1 (Equalities are transitive) If \(a=b\) and \(b=c\), then \(a=c\).
Theorem B.2 (Substituting equivalent expressions) If \(a = b\), then for any function \(f(x)\), \(f(a) = f(b)\).
B.1.2 Inequalities
Theorem B.3 If \(a<b\), then \(a+c < b+c\).
Theorem B.4 (negating both sides of an inequality) If \(a < b\), then \(-a > -b\).
Theorem B.5 If \(a < b\) and \(c > 0\), then \(ca < cb\). (If \(c = 0\), then \(ca = cb\).)
Theorem B.6 \[-a = (-1)\times a\]
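These inequality rules are easy to spot-check numerically. The sketch below (plain Python, with arbitrarily chosen random integers so the arithmetic is exact) exercises Theorems B.3 through B.6; a numeric check illustrates the rules, but is of course not a proof:

```python
import random

# Spot-check Theorems B.3-B.6 on random integers (exact arithmetic,
# so no floating-point caveats apply).
random.seed(0)  # arbitrary seed, for reproducibility
checks_passed = True
for _ in range(1000):
    a, b, c = (random.randint(-100, 100) for _ in range(3))
    if a < b:
        checks_passed &= (a + c < b + c)      # Theorem B.3
        checks_passed &= (-a > -b)            # Theorem B.4
        if c > 0:
            checks_passed &= (c * a < c * b)  # Theorem B.5
    checks_passed &= (-a == (-1) * a)         # Theorem B.6
print(checks_passed)
```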
B.1.3 Sums
Theorem B.7 (adding zero changes nothing) \[a+0=a\]
Theorem B.8 (Sums are symmetric) \[a+b = b+a\]
Theorem B.9 (Sums are associative)
When summing three or more terms, the order in which you sum them does not matter:
\[(a + b) + c = a + (b + c)\]
B.1.4 Products
Theorem B.10 (Multiplying by 1 changes nothing) \[a \times 1 = a\]
Theorem B.11 (Products are symmetric) \[a \times b = b \times a\]
Theorem B.12 (Products are associative) \[(a \times b) \times c = a \times (b \times c)\]
B.1.5 Division
Theorem B.13 (Division can be written as a product) For \(b \neq 0\): \[\frac {a}{b} = a \times \frac{1}{b}\]
B.1.6 Sums and products together
Theorem B.14 (Multiplication is distributive) \[a(b+c) = ab + ac\]
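These identities hold exactly for real numbers. One practical caveat when computing with them: for floating-point numbers, commutativity (Theorems B.8 and B.11) still holds exactly, but associativity (Theorems B.9 and B.12) can be off by a rounding error. A minimal Python illustration:

```python
# Commutativity of sums (Theorem B.8) holds exactly for IEEE-754 floats,
# but associativity (Theorem B.9) can fail by a rounding error.
a, b, c = 0.1, 0.2, 0.3
sum_commutes = (a + b == b + a)                 # exact in floating point
sum_associates = ((a + b) + c == a + (b + c))   # fails here by ~1e-16
print(sum_commutes, sum_associates)  # True False
```

This matters in practice when summing many terms of very different magnitudes, such as log-likelihood contributions.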
B.1.7 Quotients
Definition B.1 (Quotients, fractions, rates)
A quotient, fraction, or rate is a division of one quantity by another:
\[\frac{a}{b}\]
In epidemiology, rates typically have a quantity involving time or population in the denominator.
Definition B.2 (Ratios) A ratio is a quotient in which the numerator and denominator are measured using the same unit scales.
Definition B.3 (Proportion) In statistics, a “proportion” typically means a ratio where the numerator represents a subset of the denominator.
See https://en.wikipedia.org/wiki/Population_proportion.
See also https://en.wikipedia.org/wiki/Proportion_(mathematics) for other meanings.
Definition B.4 (Proportional) Two functions \(f(x)\) and \(g(x)\) are proportional if their ratio \(\frac{f(x)}{g(x)}\) does not depend on \(x\). (cf. https://en.wikipedia.org/wiki/Proportionality_(mathematics))
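As a concrete illustration of Definition B.4 (the functions and the constant 3 here are made up for the example): \(f(x) = 3e^x\) and \(g(x) = e^x\) are proportional, because their ratio is the constant 3 at every \(x\):

```python
import math

# f and g are proportional: f(x) / g(x) = 3 for every x.
def f(x):
    return 3 * math.exp(x)

def g(x):
    return math.exp(x)

ratios = [f(x) / g(x) for x in (-2.0, 0.0, 1.5, 10.0)]
constant_ratio = all(math.isclose(r, 3.0) for r in ratios)
print(ratios, constant_ratio)
```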
B.2 Exponentials and Logarithms
Theorem B.15 (logarithm of a product is the sum of the logs of the factors) \[ \log{a\cdot b} = \log{a} + \log{b} \]
Corollary B.1 (logarithm of a quotient)
The logarithm of a quotient is equal to the difference of the logs of the numerator and denominator:
\[\log{\frac{a}{b}} = \log{a} - \log{b}\]
Theorem B.16 (logarithm of an exponential function) \[ \log{a^b} = b \cdot\log{a} \]
Theorem B.17 (exponential of a sum)
The exponential of a sum is equal to the product of the exponentials of the addends:
\[\text{exp}\left\{a+b\right\} = \text{exp}\left\{a\right\} \cdot\text{exp}\left\{b\right\}\]
Corollary B.2 (exponential of a difference)
The exponential of a difference is equal to the quotient of the exponentials of the two terms:
\[\text{exp}\left\{a-b\right\} = \frac{\text{exp}\left\{a\right\}}{\text{exp}\left\{b\right\}}\]
Theorem B.18 (exponential of a product) \[a^{bc} = \left(a^b\right)^c = \left(a^c\right)^b\]
Corollary B.3 (natural exponential of a product) \[\text{exp}\left\{ab\right\} = (\text{exp}\left\{a\right\})^b = (\text{exp}\left\{b\right\})^a\]
Theorem B.19 (exp{} and log{} are mutual inverses) \[\text{exp}\left\{\log{a}\right\} = \log{\text{exp}\left\{a\right\}} = a\]
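The log and exp identities above can be spot-checked numerically (here with arbitrary values \(a = 2.5\), \(b = 4.0\), using `math.isclose` to absorb rounding error):

```python
import math

# Spot-check Theorems B.15-B.19 and Corollaries B.1-B.3 for the
# natural log and exponential, at arbitrary values a and b.
a, b = 2.5, 4.0
identities_hold = all([
    math.isclose(math.log(a * b), math.log(a) + math.log(b)),  # B.15
    math.isclose(math.log(a / b), math.log(a) - math.log(b)),  # Cor. B.1
    math.isclose(math.log(a ** b), b * math.log(a)),           # B.16
    math.isclose(math.exp(a + b), math.exp(a) * math.exp(b)),  # B.17
    math.isclose(math.exp(a - b), math.exp(a) / math.exp(b)),  # Cor. B.2
    math.isclose(math.exp(a * b), math.exp(a) ** b),           # Cor. B.3
    math.isclose(math.exp(math.log(a)), a),                    # B.19
    math.isclose(math.log(math.exp(a)), a),                    # B.19
])
print(identities_hold)
```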
B.3 Derivatives
Theorem B.20 (Derivatives of polynomials) \[\frac{\partial}{\partial x}x^q = qx^{q-1}\]
Theorem B.21 (derivative of natural logarithm) \[\log'\left\{x\right\} = \frac{1}{x} = x^{-1}\]
Theorem B.22 (derivative of exponential) \[\text{exp}'\left\{x\right\} = \text{exp}\left\{x\right\}\]
Theorem B.23 (Product rule) \[(ab)' = ab' + ba'\]
Theorem B.24 (Quotient rule) \[(a/b)' = a'/b - (a/b^2)b'\]
Theorem B.25 (Chain rule) \[\frac{\partial a}{\partial c} = \frac{\partial a}{\partial b} \frac{\partial b}{\partial c}\]
i.e.,
\[(f(g(x)))' = f'(g(x)) g'(x)\]
Corollary B.4 (Chain rule for logarithms) \[ \frac{\partial}{\partial x}\log{f(x)} = \frac{f'(x)}{f(x)} \]
Proof. Apply Theorem B.25 and Theorem B.21.
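Corollary B.4 can also be verified numerically with a central-difference approximation; the test function \(f(x) = x^2 + 1\) below is an arbitrary choice:

```python
import math

# Compare a central-difference derivative of log(f(x)) against the
# analytic result f'(x) / f(x) from Corollary B.4.
def f(x):
    return x ** 2 + 1

def f_prime(x):
    return 2 * x

x, h = 1.7, 1e-6
numeric = (math.log(f(x + h)) - math.log(f(x - h))) / (2 * h)
analytic = f_prime(x) / f(x)
agree = math.isclose(numeric, analytic, rel_tol=1e-6)
print(numeric, analytic, agree)
```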
B.4 Linear Algebra
Definition B.5 (Dot product/linear combination/inner product) For any two real-valued vectors \(\tilde{x}= (x_1, \ldots, x_n)\) and \(\tilde{y}= (y_1, \ldots, y_n)\), the dot-product, linear combination, or inner product of \(\tilde{x}\) and \(\tilde{y}\) is:
\[\tilde{x}\cdot \tilde{y}= \tilde{x}^{\top} \tilde{y}\stackrel{\text{def}}{=}\sum_{i=1}^nx_i y_i\]
Theorem B.26 (Dot product is symmetric) The dot product is symmetric:
\[\tilde{x}\cdot \tilde{y}= \tilde{y}\cdot \tilde{x}\]
Proof. Apply:
- Definition B.5
- symmetry of scalar multiplication
- Definition B.5 again
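In code, the dot product of Definition B.5 is one line; this sketch (plain Python lists, arbitrary values) also confirms the symmetry in Theorem B.26:

```python
# Dot product per Definition B.5, using plain Python lists.
def dot(x, y):
    assert len(x) == len(y), "vectors must have the same length"
    return sum(x_i * y_i for x_i, y_i in zip(x, y))

x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
print(dot(x, y))               # 1*4 + 2*5 + 3*6 = 32.0
print(dot(x, y) == dot(y, x))  # True (Theorem B.26)
```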
B.5 Vector Calculus
(adapted from Fieller (2016), §7.2)
Let \(\tilde{x}\) and \(\tilde{\beta}\) be vectors of length \(p\), or in other words, matrices of dimension \(p \times 1\):
\[ \tilde{x}= \begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{p} \end{bmatrix} \]
\[ \tilde{\beta}= \begin{bmatrix} \beta_{1} \\ \beta_{2} \\ \vdots \\ \beta_{p} \end{bmatrix} \]
Definition B.6 (Transpose) The transpose of a column vector is the row vector with the same sequence of entries:
\[ \tilde{x}' \equiv \tilde{x}^\top \equiv [x_1, x_2, ..., x_p] \]
Example B.1 (Dot product as matrix multiplication) \[ \begin{aligned} \tilde{x}\cdot \tilde{\beta} &= \tilde{x}^{\top} \tilde{\beta} \\ &= [x_1, x_2, ..., x_p] \begin{bmatrix} \beta_{1} \\ \beta_{2} \\ \vdots \\ \beta_{p} \end{bmatrix} \\ &= x_1\beta_1+x_2\beta_2 +...+x_p \beta_p \end{aligned} \]
Theorem B.27 (Transpose of a sum) \[(\tilde{x}+\tilde{y})^{\top} = \tilde{x}^{\top} + \tilde{y}^{\top}\]
Definition B.7 (Vector derivative) If \(f(\tilde{\beta})\) is a function that takes a vector \(\tilde{\beta}\) as input, such as \(f(\tilde{\beta}) = \tilde{x}'\tilde{\beta}\), then:
\[ \frac{\partial}{\partial \tilde{\beta}} f(\tilde{\beta}) = \begin{bmatrix} \frac{\partial}{\partial \beta_1}f(\tilde{\beta}) \\ \frac{\partial}{\partial \beta_2}f(\tilde{\beta}) \\ \vdots \\ \frac{\partial}{\partial \beta_p}f(\tilde{\beta}) \end{bmatrix} \]
Definition B.8 (Row-vector derivative) If \(f(\tilde{\beta})\) is a function that takes a vector \(\tilde{\beta}\) as input, such as \(f(\tilde{\beta}) = \tilde{x}'\tilde{\beta}\), then:
\[ \frac{\partial}{\partial \tilde{\beta}^{\top}} f(\tilde{\beta}) = \begin{bmatrix} \frac{\partial}{\partial \beta_1}f(\tilde{\beta}) & \frac{\partial}{\partial \beta_2}f(\tilde{\beta}) & \cdots & \frac{\partial}{\partial \beta_p}f(\tilde{\beta}) \end{bmatrix} \]
Theorem B.28 (Row and column derivatives are transposes) \[\frac{\partial}{\partial \tilde{\beta}^{\top}} f(\tilde{\beta}) = \left(\frac{\partial}{\partial \tilde{\beta}} f(\tilde{\beta})\right)^{\top}\]
\[\frac{\partial}{\partial \tilde{\beta}} f(\tilde{\beta}) = \left(\frac{\partial}{\partial \tilde{\beta}^{\top}} f(\tilde{\beta})\right)^{\top}\]
Theorem B.29 (Derivative of a linear combination) \[ \frac{\partial}{\partial \tilde{\beta}} \tilde{x}^{\top} \tilde{\beta}= \tilde{x} \]
This looks a lot like non-vector calculus, except that the coefficient \(\tilde{x}^{\top}\) comes back transposed, as the column vector \(\tilde{x}\).
Proof. \[ \begin{aligned} \frac{\partial}{\partial \tilde{\beta}} (\tilde{x}^{\top}\tilde{\beta}) &= \begin{bmatrix} \frac{\partial}{\partial \beta_1}(x_1\beta_1+x_2\beta_2 +...+x_p \beta_p ) \\ \frac{\partial}{\partial \beta_2}(x_1\beta_1+x_2\beta_2 +...+x_p \beta_p ) \\ \vdots \\ \frac{\partial}{\partial \beta_p}(x_1\beta_1+x_2\beta_2 +...+x_p \beta_p ) \end{bmatrix} \\ &= \begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{p} \end{bmatrix} \\ &= \tilde{x} \end{aligned} \]
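Theorem B.29 can also be confirmed with a finite-difference gradient; the vectors below are arbitrary, and `grad_fd` is a small helper written just for this check:

```python
# Finite-difference check of Theorem B.29: the gradient of
# f(beta) = x . beta with respect to beta is x itself.
def dot(x, y):
    return sum(x_i * y_i for x_i, y_i in zip(x, y))

def grad_fd(f, beta, h=1e-6):
    """Central-difference gradient of f at beta."""
    grad = []
    for i in range(len(beta)):
        up, down = beta[:], beta[:]
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

x = [2.0, -1.0, 0.5]
beta = [0.3, 0.7, -0.2]
numeric = grad_fd(lambda b: dot(x, b), beta)
agrees = all(abs(n - x_i) < 1e-6 for n, x_i in zip(numeric, x))
print(numeric, agrees)  # numeric gradient matches x
```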
Definition B.9 (Quadratic form) A quadratic form is a mathematical expression with the structure
\[\tilde{x}^{\top} \mathbf{S} \tilde{x}\]
where \(\tilde{x}\) is a \(p \times 1\) vector and \(\mathbf{S}\) is a \(p \times p\) matrix, so that the dimensions are compatible for the multiplications.
Quadratic forms occur frequently in regression models. They are the matrix-vector generalizations of the scalar quadratic form \(cx^2 = xcx\).
Theorem B.30 (Derivative of a quadratic form) If \(\mathbf{S}\) is a \(p\times p\) matrix that is constant with respect to \(\tilde{\beta}\), then:
\[ \frac{\partial}{\partial \tilde{\beta}} \tilde{\beta}'\mathbf{S}\tilde{\beta} = (\mathbf{S} + \mathbf{S}^{\top})\tilde{\beta} \]
If \(\mathbf{S}\) is symmetric, as it typically is in regression applications, this reduces to \(2\mathbf{S}\tilde{\beta}\).
This is like taking the derivative of \(cx^2\) with respect to \(x\) in non-vector calculus.
Corollary B.5 (Derivative of a simple quadratic form) \[ \frac{\partial}{\partial \tilde{\beta}} \tilde{\beta}'\tilde{\beta}= 2\tilde{\beta} \]
This is like taking the derivative of \(x^2\).
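Both results can be checked with the same central-difference approach; the symmetric matrix \(\mathbf{S}\) and vector \(\tilde{\beta}\) below are arbitrary choices:

```python
# Finite-difference check of the quadratic-form derivative with a
# symmetric S, where the gradient of beta' S beta is 2 S beta.
# (For non-symmetric S, the general result is (S + S^T) beta.)
def matvec(S, v):
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]

def quad_form(S, b):
    return sum(b_i * sb_i for b_i, sb_i in zip(b, matvec(S, b)))

def grad_fd(f, beta, h=1e-6):
    """Central-difference gradient of f at beta."""
    grad = []
    for i in range(len(beta)):
        up, down = beta[:], beta[:]
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

S = [[2.0, 1.0], [1.0, 3.0]]  # symmetric, chosen arbitrarily
beta = [0.5, -1.5]
numeric = grad_fd(lambda b: quad_form(S, b), beta)
analytic = [2 * s for s in matvec(S, beta)]
agrees = all(abs(n - a) < 1e-6 for n, a in zip(numeric, analytic))
print(numeric, analytic, agrees)
```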
Theorem B.31 (Vector chain rule) \[\frac{\partial \tilde{z}}{\partial \tilde{x}} = \frac{\partial \tilde{y}}{\partial \tilde{x}} \frac{\partial \tilde{z}}{\partial \tilde{y}}\]
or in Euler/Lagrange notation:
\[(f(\tilde{g}(\tilde{x})))' = \tilde{g}'(\tilde{x}) f'(\tilde{g}(\tilde{x}))\]
See https://quickfem.com/finite-element-analysis/, specifically https://quickfem.com/wp-content/uploads/IFEM.AppF_.pdf
See also https://en.wikipedia.org/wiki/Gradient#Relationship_with_Fr%C3%A9chet_derivative
This chain rule is like the univariate chain rule (Theorem B.25), but the order matters now. The version presented here is for the gradient (column vector); the total derivative (row vector) would be the transpose of the gradient.
Corollary B.6 (Vector chain rule for quadratic forms) \[\frac{\partial}{\partial \tilde{\beta}}{\left(\tilde{\varepsilon}(\tilde{\beta})\cdot \tilde{\varepsilon}(\tilde{\beta})\right)} = \left(\frac{\partial}{\partial \tilde{\beta}}\tilde{\varepsilon}(\tilde{\beta})\right) \left(2 \tilde{\varepsilon}(\tilde{\beta})\right)\]
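Corollary B.6 is the key step in deriving the least-squares gradient. As a numeric sketch, take \(\tilde{\varepsilon}(\tilde{\beta}) = \tilde{y} - \mathbf{X}\tilde{\beta}\) (the values of \(\mathbf{X}\), \(\tilde{y}\), and \(\tilde{\beta}\) below are made up); then the gradient of the sum of squared residuals is \(-2\mathbf{X}^{\top}\tilde{\varepsilon}\), which a finite-difference gradient reproduces:

```python
# Check Corollary B.6 on the least-squares residual
# epsilon(beta) = y - X beta: the gradient of the sum of squared
# residuals is -2 X^T epsilon.
def matvec(X, v):
    return [sum(x_ij * v_j for x_ij, v_j in zip(row, v)) for row in X]

def resid(X, y, b):
    return [y_i - xb_i for y_i, xb_i in zip(y, matvec(X, b))]

def sse(X, y, b):
    return sum(e ** 2 for e in resid(X, y, b))

def grad_fd(f, beta, h=1e-6):
    """Central-difference gradient of f at beta."""
    grad = []
    for i in range(len(beta)):
        up, down = beta[:], beta[:]
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

X = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0]]  # arbitrary design matrix
y = [1.0, 0.0, 3.0]                        # arbitrary outcomes
beta = [0.2, 0.8]
e = resid(X, y, beta)
analytic = [-2 * sum(X[i][j] * e[i] for i in range(len(e)))
            for j in range(len(beta))]
numeric = grad_fd(lambda b: sse(X, y, b), beta)
agrees = all(abs(n - a) < 1e-5 for n, a in zip(numeric, analytic))
print(numeric, analytic, agrees)
```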