Format columns as markdown • ftExtra

This vignette shows how to format flextable columns with markdown.

library(flextable)
library(ftExtra)
#> Registered S3 method overwritten by 'ftExtra':
#>   method                  from     
#>   as_flextable.data.frame flextable
#> 
#> Attaching package: 'ftExtra'
#> The following object is masked from 'package:flextable':
#> 
#>     separate_header

Why markdown?

The flextable package is an excellent package that allows fine control over table styling and exports tables to a variety of formats (HTML, MS Word, PDF). When the output format is MS Word in particular, this package is the best solution in R.

On the other hand, styling text with the flextable package often requires considerable effort. The following example subscripts numeric values in chemical formulas.

df <- data.frame(Oxide = c("SiO2", "Al2O3"), stringsAsFactors = FALSE)
ft <- flextable::flextable(df)

ft %>%
  flextable::compose(
    i = 1, j = "Oxide",
    value = flextable::as_paragraph(
      "SiO", as_sub("2")
    )
  ) %>%
  flextable::compose(
    i = 2, j = "Oxide",
    value = flextable::as_paragraph(
      "Al", as_sub("2"), "O", as_sub("3")
    )
  )

Oxide
SiO2
Al2O3

The above example has two problems:

1. This is just a manual rewriting of the table.
   - Users have to state explicitly which characters to subscript.
   - For fine formatting, users have to apply compose to each cell one by one.
2. Users have to learn a lot of functions from the flextable package.
   - compose, as_paragraph, and as_sub in the above example

The first point can be solved with a for loop; however, the code becomes quite complex.

df <- data.frame(Oxide = c("SiO2", "Fe2O3"), stringsAsFactors = FALSE)
ft <- flextable::flextable(df)

for (i in seq_len(nrow(df))) {
  ft <- flextable::compose(
    ft,
    i = i,
    j = "Oxide",
    value = flextable::as_paragraph(
      list_values = df$Oxide[i] %>%
        stringr::str_replace_all("([2-9]+)", " \\1 ") %>%
        stringr::str_split(" ", simplify = TRUE) %>%
        purrr::map_if(
          function(x) stringr::str_detect(x, "[2-9]+"),
          flextable::as_sub
        )
    )
  )
}
ft

Oxide
SiO2
Fe2O3

The ftExtra package provides an easy solution by introducing markdown. Because markdown text describes its own formatting in plain text, all users have to do is manipulate character columns with their favorite tools, such as the dplyr and stringr packages.

1. Preprocess a data frame to decorate text with markdown syntax.
2. Convert the data frame into a flextable object with the flextable function or the as_flextable function.
3. Format markdown columns with colformat_md.

The following example elegantly simplifies the prior example.

df <- data.frame(Oxide = c("SiO2", "Fe2O3"), stringsAsFactors = FALSE)

df %>%
  dplyr::mutate(
    Oxide = stringr::str_replace_all(Oxide, "([2-9]+)", "~\\1~")
  ) %>%
  flextable::flextable() %>%
  ftExtra::colformat_md()

Oxide
SiO2
Fe2O3
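
To see what the mutate step hands to colformat_md, it can help to inspect the intermediate strings. The sketch below uses base R's gsub as a stand-in for stringr::str_replace_all (a substitution made only to keep the snippet free of dependencies); the pattern and replacement are the same as above.

```r
oxides <- c("SiO2", "Fe2O3")

# Wrap every run of the digits 2-9 in tildes, Pandoc's subscript syntax
oxides_md <- gsub("([2-9]+)", "~\\1~", oxides)

writeLines(oxides_md)
#> SiO~2~
#> Fe~2~O~3~
```

These plain-text strings are all that colformat_md needs in order to render the subscripts.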

The colformat_md function is smart enough to detect character columns, so users can start without learning its arguments. Of course, it is also possible to choose which columns to format.
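
As a minimal sketch of choosing columns, the example below assumes that the column selector of colformat_md is the j argument and that it accepts a column name. The Note column is an illustrative addition that should stay as literal text.

```r
library(flextable)
library(ftExtra)

df <- data.frame(
  Oxide = c("SiO~2~", "Fe~2~O~3~"),
  Note = c("~kept as is~", "**kept as is**"),
  stringsAsFactors = FALSE
)

# Only the Oxide column is parsed as markdown (assumes the `j` argument);
# the Note column is left untouched
ft <- colformat_md(flextable(df), j = "Oxide")
ft
```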

Another workflow is to read a markdown-formatted table from an external file. Again, markdown is plain text by design and can easily be embedded in any format, such as CSV or Excel. So users can do something like

readr::read_csv("example.csv") %>%
  flextable::flextable() %>%
  ftExtra::colformat_md()
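
To check that markdown survives a round trip through a file, here is a small self-contained sketch. It uses base R's write.csv and read.csv instead of readr, and a temporary file instead of example.csv; both are substitutions made for illustration.

```r
# Write a small CSV whose cells already contain markdown
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(
    Oxide = c("SiO~2~", "Fe~2~O~3~"),
    Note = c("**major**", "*minor*")
  ),
  csv_path,
  row.names = FALSE
)

# Read it back; the markdown is kept as ordinary character data
df <- read.csv(csv_path, stringsAsFactors = FALSE)
writeLines(df$Oxide)
#> SiO~2~
#> Fe~2~O~3~
```

The resulting df can then be passed to flextable::flextable() and ftExtra::colformat_md() as in the snippet above.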

By default, the ftExtra package employs Pandoc’s markdown, which is also employed by R Markdown. This enables a consistent user experience when using the ftExtra package in R Markdown.

Basic examples

The example below shows that the colformat_md() function parses markdown text in the flextable object.

data.frame(
  a = c("**bold**", "*italic*"),
  b = c("^superscript^", "~subscript~"),
  c = c("`code`", "[underline]{.underline}"),
  d = c(
    "*[**~ft~^Extra^**](https://ftextra.atusy.net/) is*",
    "[Cool]{.underline shading.color='skyblue'}"
  ),
  stringsAsFactors = FALSE
) %>%
  flextable() %>%
  colformat_md()

a	b	c	d
bold	superscript	code	ft Extra is
italic	subscript	underline	Cool

The table header can also be formatted by passing part = "header" or part = "all" to colformat_md().
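
A minimal sketch of header formatting follows. The column name is illustrative, and check.names = FALSE is needed so that data.frame() keeps the markdown characters in the name.

```r
library(flextable)
library(ftExtra)

# check.names = FALSE keeps the asterisks in the column name
df <- data.frame(
  "**Oxide**" = c("SiO~2~", "Al~2~O~3~"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# part = "all" formats both the header and the body
ft <- colformat_md(flextable(df), part = "all")
ft
```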

The supported syntax includes:

- bold
- italic
- code
- superscript
- subscript
- link
- footnote
- image
- line break
- citations
- math
- attributes with Span, Link, Code, and so on
  - to underline: [foo]{.underline}
  - to color: [foo]{color=red}
  - to highlight: [foo]{shading.color=gray}
  - to change font: [foo]{font.family=Roboto}
  - and combinations of the above
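
When many cells need attributes, the span syntax can be generated rather than typed by hand. The helper below, md_span, is hypothetical (it is not part of ftExtra); it only pastes together the bracket-and-brace syntax listed above, with several attributes separated by spaces.

```r
# Hypothetical helper: wrap text in a Pandoc span with the given attributes
md_span <- function(text, attrs) {
  sprintf("[%s]{%s}", text, paste(attrs, collapse = " "))
}

md_span("foo", ".underline")
#> [1] "[foo]{.underline}"

md_span("foo", c(".underline", "color=red", "shading.color=gray"))
#> [1] "[foo]{.underline color=red shading.color=gray}"
```

The returned strings are ordinary character values, so they can be placed in a data frame column and formatted with colformat_md().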

Notes:

- Other syntax may result in unexpected behavior.
- Multiple paragraphs are collapsed into a single paragraph with the separator given to the .sep argument (default: "\n\n").

Footnotes

An easy way to add a footnote is to use an inline footnote.

data.frame(
  package = "ftExtra",
  description = "Extensions for 'Flextable'^[Supports of footnotes]",
  stringsAsFactors = FALSE
) %>%
  flextable() %>%
  colformat_md() %>%
  flextable::autofit(add_w = 0.5)

package	description
ftExtra	Extensions for ‘Flextable’1
1Supports of footnotes

Reference symbols can be configured by footnote_options(). Of course, markdown can be used inside footnotes as well.

data.frame(
  package = "ftExtra^[Short of *flextable extra*]",
  description = "Extensions for 'Flextable'^[Supports of footnotes]",
  stringsAsFactors = FALSE
) %>%
  flextable() %>%
  colformat_md(
    .footnote_options = footnote_options(
      ref = "i",
      prefix = "[",
      suffix = "]",
      start = 2,
      inline = TRUE,
      sep = "; "
    )
  ) %>%
  flextable::autofit(add_w = 0.5)

package	description
ftExtra[ii]	Extensions for ‘Flextable’[iii]
[ii]Short of flextable extra; [iii]Supports of footnotes;

To add multiple footnotes to a cell, use the regular footnote syntax.

data.frame(
  x =
    "foo[^a]^,^ [^b]

[^a]: aaa

[^b]: bbb",
  stringsAsFactors = FALSE
) %>%
  flextable() %>%
  colformat_md()

x
foo1, 2
1aaa
2bbb

Experimentally, reference symbols can be formatted by a user-defined function.

#' Custom formatter of reference symbols
#'
#' @param n n-th reference symbol (integer)
#' @param part where footnote exists: "body" or "header"
#' @param footer whether to format symbols in the footer: `TRUE` or `FALSE`
#'
#' @return a character vector which will further be processed as markdown texts
ref <- function(n, part, footer) {
  # Header uses letters and body uses integers for the symbols
  s <- if (part == "header") {
    letters[n]
  } else {
    as.character(n)
  }

  # Suffix symbols with ": " (a colon and a space) in the footer
  if (footer) {
    return(paste0(s, ":\\ "))
  }

  # Use superscript in the header and the body
  return(paste0("^", s, "^"))
}

# Apply custom function to format a table with footnotes
tibble::tibble(
  "header1^[note a]" = c("x^[note 1]", "y"),
  "header2" = c("a", "b^[note 2]")
) %>%
  flextable() %>%
  # process header first
  colformat_md(
    part = "header", .footnote_options = footnote_options(ref = ref)
  ) %>%
  # process body next
  colformat_md(
    part = "body", .footnote_options = footnote_options(ref = ref)
  ) %>%
  # tweak width for visibility
  flextable::autofit(add_w = 0.2)

header1a	header2
x1	a
y	b2
a: note a
1: note 1
2: note 2

Some notes:

colformat_md() should be applied separately to the header and the body. In other words, part = "all" is not recommended, because it may order footnotes unexpectedly.

The result of footnote_options(ref) should not be shared between the header and the body.

# DO NOT SHARE fopts among header and body
fopts <- footnote_options(ref)
... %>%
  colformat_md(part = "header", .footnote_options = fopts) %>%
  colformat_md(part = "body", .footnote_options = fopts)

Images

Images can be inserted optionally with width and/or height attributes. Specifying one of them changes the other while keeping the aspect ratio.

data.frame(
  R = sprintf("![](%s)", file.path(R.home("doc"), "html", "logo.jpg")),
  stringsAsFactors = FALSE
) %>%
  flextable() %>%
  colformat_md() %>%
  flextable::autofit()

R

The R logo is distributed by The R Foundation with the CC-BY-SA 4.0 license.

Line breaks

By default, soft line breaks become spaces.

data.frame(linebreak = c("a\nb"), stringsAsFactors = FALSE) %>%
  flextable() %>%
  colformat_md()

linebreak
a b

Pandoc’s markdown supports hard line breaks by adding a backslash or double spaces at the end of a line.

data.frame(linebreak = c("a\\\nb"), stringsAsFactors = FALSE) %>%
  flextable() %>%
  colformat_md()

linebreak
a b
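
The double-space form mentioned above can be sketched in the same way. This is an illustrative variant of the previous example rather than output taken from the package documentation.

```r
library(flextable)
library(ftExtra)

# Two trailing spaces before the newline also mark a hard line break
df <- data.frame(linebreak = "a  \nb", stringsAsFactors = FALSE)

ft <- colformat_md(flextable(df))
ft
```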

It is also possible to treat \n as a hard line break by enabling an extension of Pandoc’s Markdown.

data.frame(linebreak = c("a\nb"), stringsAsFactors = FALSE) %>%
  flextable() %>%
  colformat_md(md_extensions = "+hard_line_breaks")

linebreak
a b

Markdown treats consecutive line breaks as a separator between blocks such as paragraphs. However, the flextable package lacks support for multiple paragraphs in a cell. As a workaround, colformat_md collapses them into a single paragraph with the separator given to .sep (default: \n\n).

data.frame(linebreak = c("a\n\nb"), stringsAsFactors = FALSE) %>%
  flextable() %>%
  colformat_md(.sep = "\n\n")

linebreak
a b

Citations

Citations are experimentally supported. Note that colformat_md does not generate a citation list; the list is expected to be produced by R Markdown.

First, create an ftExtra.bib file like the one below.

@Manual{R-ftExtra,
  title = {ftExtra: Extensions for Flextable},
  author = {Atsushi Yasumoto},
  year = {2024},
  note = {R package version 0.6.4},
  url = {https://ftextra.atusy.net},
}

Second, specify it, and optionally a CSL file, within the YAML front matter.

---
bibliography: ftExtra.bib
# csl: https://raw.githubusercontent.com/citation-style-language/styles/master/apa.csl
---

Finally, cite the references within tables.

data.frame(
  Cite = c("@R-ftExtra", "[@R-ftExtra]", "[-@R-ftExtra]"),
  stringsAsFactors = FALSE
) %>%
  flextable() %>%
  colformat_md() %>%
  flextable::autofit(add_w = 0.2)

Cite
Yasumoto (2024)
(Yasumoto 2024)
(2024)

If a citation style such as Vancouver requires citations to be numbered sequentially and consistently with the body text, offset the number manually, for example with colformat_md(.cite_offset = 5).

Math

The rendering of math is also possible.

data.frame(
  math = "$e^{i\\theta} = \\cos \\theta + i \\sin \\theta$",
  stringsAsFactors = FALSE
) %>%
  flextable() %>%
  colformat_md() %>%
  flextable::autofit(add_w = 0.2)

math
eiθ = cos θ + isin θ

Note that the rendering may be imperfect. This feature relies on Pandoc’s HTML writer, which will "render TeX math as far as possible using Unicode characters" (https://pandoc.org/MANUAL.html#math-rendering-in-html).

Emoji

Pandoc’s markdown provides an extension, emoji. To use it with colformat_md(), specify md_extensions="+emoji".

data.frame(emoji = c(":+1:"), stringsAsFactors = FALSE) %>%
  flextable() %>%
  colformat_md(md_extensions = "+emoji")

emoji
👍

Other input formats

colformat_md supports a variety of input formats. The input can even be HTML, despite the name of the function.

data.frame(
  x = "H<sub>2</sub>O",
  stringsAsFactors = FALSE
) %>%
  flextable() %>%
  colformat_md(.from = "html")

x
H2O

Note that multiple paragraphs are not supported if .from is not "markdown". Below is an example with commonmark.

data.frame(
  x = "foo\n\nbar",
  stringsAsFactors = FALSE
) %>%
  flextable() %>%
  colformat_md(.from = "commonmark")

x
foobar