Reviewing The Touchstone R Package
An in-depth review of touchstone, an R tool for benchmarking performance of code
- 1. Introduction
- 2. Concept
- 3. Installation
- 4. Package Usage
- 5. Understanding the script
- 6. Running the script
1. Introduction
touchstone is an R package for benchmarking other R packages. It provides continuous benchmarking with reliable relative measurement and uncertainty reporting.
Its features are most useful when deciding whether to merge a pull request (PR) into a target branch. For that reason it integrates with GitHub continuous integration (CI), which automates the whole process.
2. Concept
For your PR branch and the target branch, touchstone will:
- build two versions of the package in isolated libraries.
- measure the relative difference between the branches accurately. The code under test is run several times in random order.
- post the benchmark results as a comment on the pull request.
- create visualizations to demonstrate the distribution of the timings for both branches.
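The random-order idea in the steps above can be illustrated without touchstone itself. The following base R sketch is not how touchstone is implemented; it only shows, under stated assumptions, how interleaving two versions in random order yields a relative comparison. `f_base` and `f_head` are hypothetical stand-ins for the same function on the target branch and the PR branch.

```r
# Hypothetical stand-ins for one function as it exists on two branches.
f_base <- function(x) sort(x, method = "shell")
f_head <- function(x) sort(x, method = "radix")
funs <- list(base = f_base, head = f_head)

set.seed(1)
x <- runif(1e5)
n <- 20  # number of rounds; each round times both versions once

# In every round, time both versions in a freshly shuffled order.
timings <- do.call(rbind, lapply(seq_len(n), function(i) {
  order_i <- sample(names(funs))
  elapsed <- vapply(
    order_i,
    function(b) system.time(funs[[b]](x))[["elapsed"]],
    numeric(1),
    USE.NAMES = FALSE
  )
  data.frame(round = i, branch = order_i, elapsed = elapsed)
}))

# Relative difference of the mean timings, head versus base.
mean_base <- mean(timings$elapsed[timings$branch == "base"])
mean_head <- mean(timings$elapsed[timings$branch == "head"])
rel_diff <- (mean_head - mean_base) / mean_base
```

A negative `rel_diff` means the head stand-in was faster on average. Because both versions are timed within every round, slow phases of the machine affect both roughly equally.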
3. Installation
Installation can be done in two ways:
- CRAN: install.packages("touchstone")
- GitHub: devtools::install_github("lorenzwalthert/touchstone")
4. Package Usage
Initialize touchstone by running
touchstone::use_touchstone()
This call will:
- Create a touchstone directory in the root of the repository with config.json and script.R. config.json contains the configuration that defines how to run your benchmark. script.R is the script that runs the benchmark. header.R contains the default PR comment header, while footer.R contains the default PR comment footer.
- Populate the workflow files touchstone-receive.yaml and touchstone-comment.yaml in .github/workflows/.
- Add the touchstone directory to the .Rbuildignore file in the root directory of your repository.
Write any workflow files you need to invoke touchstone on new pull requests into .github/workflows/.
Modify the touchstone script touchstone/script.R to run different benchmarks.
5. Understanding the script
There are three default functions inside the script.R file.
(i) touchstone::branch_install()
branch_install installs each branch in a separate library for isolation.
Usage
branch_install(
  branches = c(branch_get_or_fail("GITHUB_BASE_REF"),
               branch_get_or_fail("GITHUB_HEAD_REF")),
  path_pkg = ".",
  install_dependencies = FALSE
)
The arguments:
- branches: the names of the branches, as a character vector.
- path_pkg: the path to the package; defaults to ".".
- install_dependencies: a boolean that controls whether dependencies are installed; defaults to FALSE.
(ii) touchstone::benchmark_run()
benchmark_run runs benchmarks for git branches using function calls from your package.
Usage
benchmark_run(
  expr_before_benchmark = { },
  ...,
  branches = c(branch_get_or_fail("GITHUB_BASE_REF"),
               branch_get_or_fail("GITHUB_HEAD_REF")),
  n = 100,
  path_pkg = "."
)
The arguments:
- expr_before_benchmark: an expression to execute just before the benchmark is run.
- ...: a named expression of length one containing the code to benchmark.
- branches: the names of the branches, as a character vector.
- n: the number of times the benchmark should be run for each branch.
- path_pkg: the path to the package.
(iii) touchstone::benchmark_analyze()
benchmark_analyze creates artifacts used downstream in the GitHub Action to turn raw benchmark results into text and figures.
Usage
benchmark_analyze(
  branches = c(branch_get_or_fail("GITHUB_BASE_REF"),
               branch_get_or_fail("GITHUB_HEAD_REF")),
  names = NULL,
  ci = 0.95
)
The arguments:
- branches: the names of the branches whose benchmarks are analyzed, as a character vector.
- names: the names of the benchmarks to analyze; with the default NULL, all benchmarks recorded for the branches are used.
- ci: the confidence level; defaults to 0.95, i.e. 95%.
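Put together, the three calls form the body of touchstone/script.R. The following is a minimal sketch based only on the signatures shown above, not a copy of the generated template. The benchmark name `my_benchmark` and the call `yourpkg::your_function()` are hypothetical placeholders for a function from the package under test, and `n = 10` is an arbitrary choice.

```r
# touchstone/script.R (sketch)

# Install the base and head branches into separate libraries.
# With no arguments, the branches come from GITHUB_BASE_REF / GITHUB_HEAD_REF.
touchstone::branch_install()

# Benchmark one call from the package under test.
# `my_benchmark` and `yourpkg::your_function()` are hypothetical names;
# any setup code could be passed via `expr_before_benchmark`.
touchstone::benchmark_run(
  my_benchmark = yourpkg::your_function(),
  n = 10
)

# Turn the raw timings into the text and figures used for the PR comment.
touchstone::benchmark_analyze()
```

Each additional benchmark would be another benchmark_run() call with its own named expression, since `...` takes a single named expression.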
6. Running the script
In an interactive R console, run
touchstone::run_script()
Usage
run_script(
  path = "touchstone/script.R",
  branch = branch_get_or_fail("GITHUB_HEAD_REF")
)
The arguments:
- path: the path to the script to run.
- branch: the name of the branch corresponding to the library.
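Outside of GitHub Actions the variables GITHUB_BASE_REF and GITHUB_HEAD_REF are normally not set, so the defaults above have nothing to resolve. The sketch below assumes that branch_get_or_fail() looks up the environment variable of the same name; that is an inference from the default arguments, not something confirmed here. The branch names "main" and "my-feature" are hypothetical.

```r
# Assumption: branch_get_or_fail() reads the environment variable of the
# same name, so setting both mimics what GitHub Actions would provide.
Sys.setenv(
  GITHUB_BASE_REF = "main",       # hypothetical target branch
  GITHUB_HEAD_REF = "my-feature"  # hypothetical PR branch
)

# Run the benchmark script only if touchstone and the script are available.
if (requireNamespace("touchstone", quietly = TRUE) &&
    file.exists("touchstone/script.R")) {
  touchstone::run_script(path = "touchstone/script.R")
}
```

Setting the variables with Sys.setenv() only affects the current R session, so it does not interfere with later CI runs.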
After you commit and push the workflow files to the default branch, GitHub CI will run the benchmarks on every pull request and on each commit while that pull request is open.