Literate programming with Quarto

RaukR 2024 • Advanced R for Bioinformatics

Roy Francis

21-Jun-2024

Quarto

An open-source scientific and technical publishing system built on Pandoc

  • Command-line tool enabling weaving code and documentation using Python, R, Julia, and Observable.
  • Supports Knitr (plain text, markdown) or Jupyter engines.
  • Supports many IDEs
    • RStudio
    • JupyterLab
    • VS Code
    • Neovim
  • Numerous output formats
    • Documents (HTML, PDF, MS Word)
    • Presentations (RevealJS, PowerPoint, Beamer)
    • Websites, Blogs, e-Books
    • Interactive (ObservableJS, Shiny)
  • Features and components
    • Scientific markdown
    • Tables & Figures
    • Equations
    • Citations
    • Callouts
    • Layout

Quarto

  • Literate programming: Combining code with narrative
  • Reproducible research
  • Documentation and reporting
    • Websites, Presentations, e-books, PDFs …

Installation

  • Install the latest quarto executable
  • R package quarto is a wrapper that runs quarto from R
  • If using RStudio, you need version v2022.07.1 or newer
  • Visual Studio Code with the Quarto extension is a great option too

Quarto Notebook

  • Create a Quarto document, i.e. a file that ends in .qmd
    • In RStudio, File > New File > Quarto Document
  • Add YAML front matter at the top
---
title: "This is a title"
format: html
---
  • Text & visual editor

PDF

  • Render to PDF format

  • Requires LaTeX installation

    • Default engine is xelatex
    • A lightweight option is R package tinytex
    • Change pdf-engine as needed
  • Change YAML options

    format: pdf
    pdf-engine: pdflatex
  • Typst format

    format: typst

For PDF options, see here

Presentation

  • Create presentations as HTML (RevealJS) or PowerPoint
  • Change format to revealjs

For RevealJS options, see here

Quarto document anatomy


---
title: "Iris report"
author: "John Doe"
date: "4-Mar-2023"
format: html
---

## Iris

Let's explore the **iris** dataset.

```{r}
#| echo: true
head(iris)
```

Quarto document anatomy


  • Metadata (YAML)
  • Text (Markdown)
  • Code (R, Python, Julia, Observable)

Literate programming is natural language interspersed with programming code for the purpose of documentation, reproducibility and accessibility, which is particularly relevant in data science.

YAML metadata

  • YAML Ain't Markup Language (YAML), originally Yet Another Markup Language
---
key: value
---
  • 2 space indentation
format:
  html:
    smooth-scroll: true
  • Strings
description: "This report contains..."
  • Multiline string (Literal/Folded)
description: |
  This is
  a multiline
  string
description: >
  This is
  a multiline
  string
  • Arrays
items: [ 1, 2, 3, 4, 5 ]
names: [ "one", "two", "three" ]
names:
  - "one"
  - "two"
  - "three"
  • Dictionary arrays
items:
  - things:
      thing1: huey
      thing2: dewey
      thing3: louie
  - other things:
      key: value
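
The difference between the literal (|) and folded (>) multiline styles is easy to miss. As a small illustration, the Python sketch below writes out by hand the string values a YAML parser is expected to produce for the two description examples above. No parser is used, and the variable names are made up for this example.

```python
# Hand-written equivalents of the two multiline strings above.
# No YAML parser is involved; these are the values a parser should return.

# description: |   (literal: line breaks are kept)
literal = "This is\na multiline\nstring\n"

# description: >   (folded: single line breaks become spaces)
folded = "This is a multiline string\n"

# Both styles keep one trailing newline by default.
print(repr(literal))
print(repr(folded))

# Folding only changes the line breaks, not the words.
assert literal.split() == folded.split()
```

In practice, the folded style suits long prose such as a description, while the literal style suits text where line breaks matter.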

YAML basics

YAML metadata


---
title: "Iris report"
author: "John Doe"
date: "4-Mar-2023"
format:
  html:
    toc: true
    number-sections: true
execute:
  echo: false
  warning: false
---

## Iris

Let's explore the **iris** dataset.

### Table

```{r}
#| echo: true
iris[1:3,]
```

### Plot

```{r}
#| label: fig-hist-sepal
#| fig-cap: "Distribution of Sepal lengths."
#| fig-height: 3
hist(iris$Sepal.Length)
```

YAML metadata

title: Report
subtitle: Topic
date: today
author: "John Doe"
format:
  html:
    toc: true
    toc-depth: 3
    number-sections: true
    code-fold: true
    df-print: paged

execute:
  eval: true
  echo: false
  warning: false
  message: false
  freeze: true

Markdown

Human readable markup

### Heading 3

#### Heading 4

*italic text*  
**bold text**  
`code text`  

~~strikethrough~~  
2^10^  
2~10~  
$2^{10}$  
$2_{10}$  

Heading 3

Heading 4

italic text
bold text
code text
strikethrough
2¹⁰
2₁₀
\(2^{10}\)
\(2_{10}\)

$\sum\limits_{n=1}^{10} \frac{3}{2}\cdot n$

- bullet point

Link to [this](somewhere.com)

![](https://www.r-project.org/Rlogo.png)

![](https://www.r-project.org/Rlogo.png){width="50%"}

\(\sum\limits_{n=1}^{10} \frac{3}{2}\cdot n\)

  • bullet point

Link to this

RMarkdown

  • Markdown + embedded R chunks
  • RMarkdown mostly uses Pandoc markdown
  • R code can be executed inline

Today’s date is `r date()`
Today’s date is Mon Jun 10 07:34:32 2024

  • R code can be executed in code chunks
```{r}
date()
```
  • By default, input source code and output results are displayed.
date()
[1] "Mon Jun 10 07:34:32 2024"

RMarkdown • Chunk options

  • Setting chunk options
```{r}
#| eval: false
date()
```
  • Chunk options define how chunks behave
    • eval: false to not evaluate a code chunk
    • echo: false to hide input code
    • output: true to show output, asis to skip styling
    • warning: false hides warnings
    • message: false hides messages
    • error: true shows error message and continues code execution
    • include: false suppresses all output
  • Other chunk options include figure options and captions
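
The #| comment lines at the top of a chunk are what carry the chunk options. As a rough illustration of that convention only (this is not how Quarto itself parses chunks), the Python sketch below splits a chunk body into its option lines and its code. The function name split_chunk and the plain key: value handling are assumptions made for this example.

```python
def split_chunk(chunk_body):
    """Separate leading '#| key: value' option lines from the code lines."""
    options = {}
    code_lines = []
    in_header = True
    for line in chunk_body.splitlines():
        if in_header and line.startswith("#|"):
            # Split on the first colon only, so values may contain colons.
            key, _, value = line[2:].partition(":")
            options[key.strip()] = value.strip()
        else:
            # The first non-option line ends the option header.
            in_header = False
            code_lines.append(line)
    return options, "\n".join(code_lines)


chunk = "#| eval: false\n#| echo: true\ndate()"
options, code = split_chunk(chunk)
print(options)
print(code)
```

Option values are kept as plain strings here; a real implementation would read them as YAML.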

Rendering

  • Live preview
    • From R console quarto::quarto_preview("report.qmd")
    • From terminal quarto preview report.qmd
  • Render
    • Interactively using the Render button
    • From R console quarto::quarto_render("report.qmd")
    • From terminal quarto render report.qmd

How it all works

quarto render index.qmd

flowchart LR
  input-md(YAML + Markdown\n.md) --> engine-md([MARKDOWN engine]) --> markdown(Markdown\n.md)
  input-qmd(YAML + Markdown + Code\n.qmd, .rmd) --> engine-knitr-a([KNITR engine\nR, Python, Julia, Bash]) --> markdown
  input-jupyter(JupyterLab\n.ipynb) --> engine-jupyter([JUPYTER engine\nPython]) --> markdown
  input-rnw(YAML + Markdown + Code\n.qmd, .rnw) --> engine-knitr-b([KNITR engine\nR, Python, Julia, Bash]) --> tex(Tex\n .tex)
  markdown --> render([RENDER\nPandoc Lua])
  tex --> render
  render --> output-md(Markdown\nGFM, Docusaurus)
  render --> output-html(Reports, Websites, RevealJS slides, Blogs, Manuscripts ...\n.html)
  render --> output-pdf(PDF, Beamer slides\n.pdf)
  render --> output-docx(Word\n.docx)
  render --> output-pptx(Powerpoint\n.pptx)
  render --> output-wikis(Wikis)
  output-html --> publish([PUBLISH]) --> hosting(Github pages\n Quarto pub\n Netlify)
  publish --> confluence(Confluence)

Parameterized reports

  • Parameters can be passed into a document during rendering
  • Define parameters/defaults in YAML
---
title: "My Document"
params:
  alpha: 0.1
  ratio: 0.1
---
  • Access parameters as such
```{r}
params$alpha
```
  • Pass parameters on the command line or through a params file

quarto render document.qmd -P alpha:0.2 -P ratio:0.3

For more parameter options, see here
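
When the same report is rendered for many parameter sets, it can help to build the command programmatically. The Python sketch below assembles the quarto render call shown above, with one -P flag per parameter. The helper name build_render_command is hypothetical, and the command is only printed, not executed.

```python
import shlex


def build_render_command(document, params):
    """Build the 'quarto render' argument list with one -P flag per parameter."""
    command = ["quarto", "render", document]
    for name, value in params.items():
        command += ["-P", f"{name}:{value}"]
    return command


command = build_render_command("document.qmd", {"alpha": 0.2, "ratio": 0.3})
print(shlex.join(command))

# To actually render (requires the quarto executable on PATH):
# import subprocess; subprocess.run(command, check=True)
```

Passing the arguments as a list, rather than one long string, avoids shell quoting problems when parameter values contain spaces.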

Projects

  • Render all files as a project
  • Share YAML configuration across formats
  • Examples of project: website, book
  • Defined in _quarto.yml
project:
  output-dir: _output

toc: true
number-sections: true
  
format:
  html:
    css: styles.css
  pdf:
    documentclass: report
    margin-left: 30mm
  • Directory level metadata is also allowed
  • Defined in _metadata.yml
format:
  revealjs: 
    menu: false
    progress: false
search: false
  • YAML merging priority:
    root < directory level < document
  • Render all files: quarto render
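
The merging priority above can be pictured as a recursive dictionary merge in which the more specific level wins. The Python sketch below is a simplified illustration under that assumption: it handles nested dictionaries only, whereas Quarto's actual merge rules may treat lists and other cases differently. The function name merge_metadata and the example metadata are made up.

```python
def merge_metadata(base, override):
    """Recursively merge two metadata dicts; values in 'override' win."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge them key by key.
            merged[key] = merge_metadata(merged[key], value)
        else:
            # Otherwise the more specific level replaces the value.
            merged[key] = value
    return merged


# Hypothetical metadata at the three levels, lowest priority first.
project = {"toc": True, "format": {"html": {"css": "styles.css"}}}  # _quarto.yml
directory = {"toc": False, "format": {"html": {"toc-depth": 2}}}  # _metadata.yml
document = {"title": "Iris report", "format": {"html": {"toc-depth": 3}}}  # document YAML

result = merge_metadata(merge_metadata(project, directory), document)
print(result)
```

Here the document-level toc-depth replaces the directory-level one, while css from the project level is kept because no other level sets it.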

Interactive documents

  • Make documents interactive using htmlwidgets, Shiny or ObservableJS
  • ObservableJS and htmlwidgets run in the browser
  • Shiny generally requires a server

Interactive documentation

htmlwidgets

The R package plotly provides R bindings to the JavaScript plotting library plotly.js.

```{r}
library(plotly)
iris %>%
    plot_ly(x = ~Sepal.Length, y = ~Sepal.Width, 
    color = ~Species, width = 550, height = 400) %>%
    add_markers()
```

ObservableJS

Quarto supports OJS for interactive visualizations in the browser

irism <- iris
colnames(irism) <- gsub("[.]","_",tolower(colnames(irism)))
ojs_define(ojsd = irism)
ojsdata = transpose(ojsd)
viewof x = Inputs.select(Object.keys(ojsdata[0]), {value: "sepal_length", multiple: false, label: "X axis"})
viewof y = Inputs.select(Object.keys(ojsdata[0]), {value: "sepal_width", multiple: false, label: "Y axis"})
Plot.plot({
  marks: [
    Plot.dot(ojsdata, {
      x: x, y: y, fill: "species",
      title: (d) => `${d.species} \n Petal length: ${d.petal_length} \n Sepal length: ${d.sepal_length}`})
  ],
  grid: true
})

Publish

Quarto supports directly publishing to several popular services

  • Quarto Pub: Free public publishing for Quarto documents, websites, and books
  • GitHub Pages
  • Netlify
  • Confluence
  • Publish from the terminal: quarto publish quarto-pub
  • _publish.yml records where the project was published
- source: project
  quarto-pub:
    - id: "5f3abafe-68f9-4c1d-835b-9d668b892001"
      url: "https://njones.quarto.pub/blog"

Publishing documentation

Quarto from the terminal

>  quarto --help

Commands:

render          - Render files or projects to various document types.        
preview         - Render and preview a document or website project.          
serve           - Serve a Shiny interactive document.                        
create          - Create a Quarto project or extension                       
create-project  - Create a project for rendering multiple documents          
convert         - Convert documents to alternate representations.            
pandoc          - Run the version of Pandoc embedded within Quarto.          
run             - Run a TypeScript, R, Python, or Lua script.                
add             - Add an extension to this folder or project                 
install         - Installs an extension or global dependency.                
publish         - Publish a document or project. Available providers include:
check           - Verify correct functioning of Quarto installation.         
help            - Show this help or the help of a sub-command. 
>  quarto --version
1.4.549

Extending Quarto

Acknowledgements

Getting to know Quarto, Julia Müller, R-Ladies Freiburg 2022

Welcome to Quarto, Tom Mock, Posit Meetup 2023

Thank you! Questions?

         _                  
platform x86_64-pc-linux-gnu
os       linux-gnu          
major    4                  
minor    3.2                

2024 • SciLifeLab • NBIS • RaukR

Compared to Rmd

  • Quarto is a command-line tool independent of R
  • Supports multiple languages seamlessly (R, Python, Julia, Observable)
  • YAML options may be slightly different in Quarto. Quarto uses hyphens instead of underscores.
    • toc_depth becomes toc-depth
    • number_sections becomes number-sections
    • code_folding becomes code-fold
  • Chunk options are specified inside the chunk, like #| echo: false, rather than in the chunk header, like {r echo=FALSE}
  • Many more chunk options are available including figure captions and layout
  • CSS classes are easier to use using ::: notation
  • Much additional functionality out of the box
    • Page layouts
    • Figure layouts, Figure captions and numbering
    • Callouts
    • Cross referencing, Citation, Bibliography
    • Margin content
  • Quarto supports htmlwidgets in R and Jupyter widgets for Python/Julia
  • Client-side interactivity using ObservableJS
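
When migrating an existing Rmd header, the option renames listed above can be applied mechanically. The Python sketch below covers only the three names mentioned on this slide. The function name to_quarto_key is hypothetical, and the sketch maps key names only: option values can differ between Rmd and Quarto as well, and not every Rmd option has a Quarto counterpart.

```python
# Renames from the list above that are more than an underscore swap.
SPECIAL_CASES = {"code_folding": "code-fold"}


def to_quarto_key(rmd_key):
    """Map an R Markdown YAML option name to a Quarto-style name."""
    if rmd_key in SPECIAL_CASES:
        return SPECIAL_CASES[rmd_key]
    # Default rule from the slide: underscores become hyphens.
    return rmd_key.replace("_", "-")


for key in ["toc_depth", "number_sections", "code_folding"]:
    print(key, "->", to_quarto_key(key))
```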

Output formats

| Rmd | Quarto |
|---|---|
| html_document | html |
| pdf_document | pdf |
| word_document | docx |
| beamer_presentation | beamer |
| powerpoint_presentation | pptx |
| revealjs | revealjs |
| xaringan | |
| distill/tufte | quarto article layout |
| html_document2 | quarto crossref |
| pdf_document2 | quarto crossref |
| word_document2 | quarto crossref |
| blogdown/distill | quarto website/quarto blog |
| bookdown | quarto books |
| shiny documents | quarto interactive documents |
| pagedown | |
| rticles | |
| flexdashboard | |