Appendix H — Common Mistakes

Published

Last modified: 2024-07-03: 8:25:30 (AM)



Configuring R

Functions from these packages will be used throughout this document:

library(conflicted) # check for conflicting function definitions
# library(printr) # inserts help-file output into markdown output
library(rmarkdown) # Convert R Markdown documents into a variety of formats.
library(pander) # format tables for markdown
library(ggplot2) # graphics
library(ggeasy) # help with graphics
library(ggfortify) # help with graphics
library(dplyr) # manipulate data
library(tibble) # `tibble`s extend `data.frame`s
library(magrittr) # `%>%` and other additional piping tools
library(haven) # import Stata files
library(knitr) # format R output for markdown
library(tidyr) # Tools to help to create tidy data
library(plotly) # interactive graphics
library(dobson) # datasets from Dobson and Barnett 2018
library(parameters) # format model output tables for markdown
library(latex2exp) # use LaTeX in R code (for figures and tables)
library(fs) # filesystem path manipulations
library(survival) # survival analysis
library(survminer) # survival analysis graphics
library(KMsurv) # datasets from Klein and Moeschberger
library(webshot2) # convert interactive content to static for pdf
library(forcats) # functions for categorical variables ("factors")
library(stringr) # functions for dealing with strings
library(lubridate) # functions for dealing with dates and times

Here are some R settings I use in this document:

rm(list = ls()) # delete any data that's already loaded into R

conflicts_prefer(dplyr::filter) # use `filter()` from the dplyr package by default
ggplot2::theme_set(
  ggplot2::theme_bw() +
    # ggplot2::labs(col = "") +
    ggplot2::theme(
      legend.position = "bottom",
      text = ggplot2::element_text(size = 12, family = "serif")
    )
)

knitr::opts_chunk$set(message = FALSE)
options(digits = 4)

pander::panderOptions("big.mark", ",")
pander::panderOptions("table.emphasize.rownames", FALSE)
pander::panderOptions("table.split.table", Inf)
legend_text_size <- 9

H.1 Parameters versus random variables

The parameters of a probability distribution shouldn’t involve the random variables being modeled:

This is wrong

\[X \sim Pois(\lambda)\] \[\hat{\lambda}_{ML} \rightarrow_D N(\bar{X}, \lambda/n)\]

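Solution. The sample mean \(\bar{X}\) is itself a random variable, so it cannot appear as a parameter of the limiting distribution; the mean should be the fixed parameter \(\lambda\): \[\hat{\lambda}_{ML} \rightarrow_D N(\lambda, \lambda/n)\]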
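
A quick simulation can make the point concrete. The sketch below is only illustrative: the values of `lambda`, `n`, and `n_sims` and the seed are arbitrary assumptions, not taken from the text. It relies on the fact that, for Poisson data, the maximum likelihood estimate of \(\lambda\) is the sample mean.

```r
# Simulation sketch: the sampling distribution of the Poisson MLE is centered
# at the fixed parameter lambda, with variance approximately lambda / n.
set.seed(1)      # arbitrary seed, for reproducibility
lambda <- 3      # hypothetical true parameter
n <- 50          # hypothetical sample size
n_sims <- 10000  # number of simulated datasets

# For Poisson data, the MLE of lambda is the sample mean of each dataset.
lambda_hat <- replicate(n_sims, mean(rpois(n, lambda)))

mean(lambda_hat) # should be close to lambda
var(lambda_hat)  # should be close to lambda / n
lambda / n
```

Each simulated dataset has its own \(\bar{X}\), but all of the estimates scatter around the single fixed value \(\lambda\), which is why \(\lambda\), and not \(\bar{X}\), belongs in the limiting distribution.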

Expectations are means, not sums, despite the visual similarity of the symbols \(\Sigma\) and \(\text{E}\). Really, we should use \(\mu\) instead of \(\text{E}\).

H.2 R

H.2.1 Don’t copy-paste code

Successful programmers don’t use copy-paste! Write functions instead.
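
As a minimal sketch of what this can look like, the example below replaces a repeated standardization formula with one function. The helper name `standardize()` and the choice of columns are hypothetical, chosen only for illustration; `mtcars` is a dataset that ships with R.

```r
# Copy-paste version (avoid): the same formula is repeated for every column,
# so any correction has to be made in several places.
# mpg_z <- (mtcars$mpg - mean(mtcars$mpg)) / sd(mtcars$mpg)
# hp_z  <- (mtcars$hp  - mean(mtcars$hp))  / sd(mtcars$hp)
# wt_z  <- (mtcars$wt  - mean(mtcars$wt))  / sd(mtcars$wt)

# Function version: the formula is written once (hypothetical helper).
standardize <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# Apply the same function to each column of interest.
cars_z <- data.frame(lapply(mtcars[c("mpg", "hp", "wt")], standardize))
head(cars_z)
```

If the formula later needs to change, for example to use a different measure of spread, only the body of `standardize()` has to be edited.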

H.3 Quarto

H.3.1 Separate divs and slide breaks

Don't put a div fence (:::) on the line directly after a slide break (---):

---
::: notes
:::

There needs to be an empty line between them:

---

::: notes
:::

H.3.2 library(printr) currently breaks df-print: paged

See https://github.com/yihui/printr/issues/41

H.4 LaTeX

Double superscript issues: https://www.overleaf.com/learn/latex/Errors/Double_superscript
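
A short illustration of the error, as a sketch (the symbols are arbitrary): TeX raises "Double superscript" when two exponents are attached to the same base without braces, because it cannot tell how they should be grouped.

```latex
% Triggers "Double superscript": the grouping of the exponents is ambiguous.
% $x^a^b$

% Group explicitly with braces instead:
$x^{a^b}$   % x raised to the power a^b
${x^a}^b$   % (x^a) raised to the power b
```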