From Thalesians Wiki

Debugging R packages written in C++

Suppose you are faced with an R package, the bulk of which is written in C++. Such a package may be monolithic, and without looking at the C++ code it may be difficult to figure out what’s going on.

The R package "specs" (https://github.com/wijler/specs/) is an example of such a package.

First, a word of warning: don’t try to compile this code using Visual Studio. You will run into all sorts of compatibility issues with R headers and libraries and can waste quite a bit of time.

This is what we are going to do instead. First, we are going to fork the R package, thus obtaining https://github.com/sydx/specs/ (sydx is my GitHub username).

Next, we are going to install this package into R, from GitHub. For this we can either use the R package "pak" or the R package "devtools". We will go for the more standard "devtools". It turns out it’s not a trivial task to install "devtools" itself, even on a new AWS instance with a clean Ubuntu installation. "devtools" has various prerequisites and we can get stuck at several steps. Let us consider these prerequisites and what can go wrong during their installation.

install.packages("openssl")

You may get the error message

-------------------------- [ERROR MESSAGE] ---------------------------
tools/version.c:1:10: fatal error: openssl/opensslv.h: No such file or directory
    1 | #include <openssl/opensslv.h>
      |          ^~~~~~~~~~~~~~~~~~~~
compilation terminated.
--------------------------------------------------------------------

This is solved with

sudo apt-get install libssl-dev

Next prerequisite:

install.packages("curl")

You may get the error message

-------------------------- [ERROR MESSAGE] ---------------------------
<stdin>:1:10: fatal error: curl/curl.h: No such file or directory
compilation terminated.
--------------------------------------------------------------------

This is solved with

sudo apt-get install libcurl4-openssl-dev

Next prerequisite:

install.packages("usethis")

Next prerequisite:

install.packages("systemfonts")

You may get the error message

-------------------------- [ERROR MESSAGE] ---------------------------
<stdin>:1:10: fatal error: fontconfig/fontconfig.h: No such file or directory
compilation terminated.
--------------------------------------------------------------------

This is solved with

sudo apt-get install libfontconfig1-dev

Next prerequisite:

install.packages("textshaping")

You may get the error message

-------------------------- [ERROR MESSAGE] ---------------------------
<stdin>:1:10: fatal error: hb-ft.h: No such file or directory
compilation terminated.
--------------------------------------------------------------------

This is solved with

sudo apt-get install libharfbuzz-dev
sudo apt-get install libfribidi-dev

Next prerequisite:

install.packages("ragg")

You may run into the error message

-------------------------- [ERROR MESSAGE] ---------------------------
<stdin>:1:10: fatal error: ft2build.h: No such file or directory
compilation terminated.
--------------------------------------------------------------------

You may also run into a few other problems when installing this package. These are solved with

sudo apt-get install libfreetype6-dev libpng-dev libtiff5-dev libjpeg-dev

and then installing a development version of this package from the source code repository using the R package "pak":

install.packages("pak")
pak::pak("r-lib/ragg")

(In fact, at this point, we could have installed our R package of interest, "specs", using "pak", but we decided to still install it using "devtools".)

Next prerequisites:

install.packages("pkgdown")
install.packages("rcmdcheck")
install.packages("rversions")
install.packages("urlchecker")

Finally, we can install "devtools" itself:

install.packages("devtools")
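
On a fresh Ubuntu machine, the system libraries encountered above could be installed in one pass before starting R. The following is a minimal sketch rather than a definitive recipe: the package names are the ones used in the steps above and may differ on other Ubuntu releases, and the RUN variable is a hypothetical switch introduced here so that the script only prints the command by default.

```shell
#!/bin/sh
# System libraries named in the steps above (Ubuntu package names).
PKGS="libssl-dev libcurl4-openssl-dev libfontconfig1-dev \
libharfbuzz-dev libfribidi-dev libfreetype6-dev \
libpng-dev libtiff5-dev libjpeg-dev"

CMD="sudo apt-get install -y $PKGS"

# RUN is a hypothetical switch for this sketch: by default the command
# is only printed; set RUN=1 to actually execute it.
if [ "${RUN:-0}" = "1" ]; then
    $CMD
else
    echo "$CMD"
fi
```

Printing the command first makes it easy to review the package list before anything is installed.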

Using "devtools", we install the "specs" package from our forked repository:

library("devtools")
devtools::install_github("sydx/specs")

We can then load this package and work with it in R as usual:

library(specs)
unemployment <- Unempl_GT[,1] #Extract the Dutch unemployment levels (x1,000)
GT <- Unempl_GT[,2:11] #Select the first ten Google Trends
my_specs <- specs(unemployment,GT,p=1) #Estimate specs
my_coefs <- my_specs$gammas #store the coefficients
y_d <- my_specs$y_d #Transformed dependent variable
z_l <- my_specs$v #Transformed independent variables

This package does not provide much in terms of diagnostics, so we are going to add some debug output to the package source, which is in C++. First, in R, remove the installed copy of the package:

remove.packages('specs')

From the shell, we clone our fork:

cd ~
mkdir dev
cd dev
git clone git@github.com:sydx/specs.git

(Make sure that we have the correct SSH keys on GitHub and on our operating system – see https://docs.github.com/en/authentication/connecting-to-github-with-ssh/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent)

Now

cd specs/src

and

vim specs.cpp

(Of course, if you prefer emacs, use emacs, etc.) To the top of the file, add

#include <iostream>

To the beginning of the body of the specs_rcpp function, add

std::cout << "In specs_rcpp..." << std::endl;

Now

git add specs.cpp
git commit -m "Added some debug output."
git push

Next time you go back to R

library("devtools")
devtools::install_github("sydx/specs")

and run the same R code, you will see the newly added debug output:

> library(specs)
> unemployment <- Unempl_GT[,1] #Extract the Dutch unemployment levels (x1,000)
> GT <- Unempl_GT[,2:11] #Select the first ten Google Trends
> my_specs <- specs(unemployment,GT,p=1) #Estimate specs
In specs_rcpp...
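
When iterating on debug statements, pushing every change to GitHub and reinstalling from there can be slow. One possible shortcut, assuming the clone lives in ~/dev/specs as above, is to reinstall directly from the local source tree with R CMD INSTALL. The helper name reinstall_pkg below is hypothetical and only illustrates the idea.

```shell
#!/bin/sh
# Reinstall an R package from a local source tree, skipping the
# push-to-GitHub round trip. reinstall_pkg is a hypothetical helper name.
reinstall_pkg() {
    # Default to the clone location used in the steps above.
    pkg_dir="${1:-$HOME/dev/specs}"
    if [ ! -f "$pkg_dir/DESCRIPTION" ]; then
        echo "No DESCRIPTION file in $pkg_dir; not an R package source tree" >&2
        return 1
    fi
    # R CMD INSTALL compiles the C++ sources under src/ and installs
    # the package into the default library.
    R CMD INSTALL "$pkg_dir"
}
```

After defining the function, reinstall_pkg ~/dev/specs rebuilds and reinstalls the package; restart R afterwards so that the freshly compiled code is loaded.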

Now you can use debug output (or indeed attach a debugger, such as gdb) to understand what the package is doing step by step and relate these actions to the corresponding research paper.
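
For the debugger route, R can be started under gdb with its -d option. The sketch below writes a small gdb command file that sets a breakpoint on specs_rcpp and then prints the launch command. The file name specs.gdb and its location are arbitrary choices made here, and stepping through the code is only informative if the package was compiled with debug symbols.

```shell
#!/bin/sh
# Write a gdb command file; the path is an arbitrary choice for this sketch.
GDB_SCRIPT="${TMPDIR:-/tmp}/specs.gdb"
cat > "$GDB_SCRIPT" <<'EOF'
# The package's shared library is loaded only after library(specs),
# so allow the breakpoint to stay pending until then.
set breakpoint pending on
break specs_rcpp
run
EOF

# Print the launch command rather than starting an interactive session here.
echo "R -d gdb --debugger-args=\"-x $GDB_SCRIPT\""
```

Running the printed command starts R under gdb. Calling specs(...) at the R prompt should then stop at the breakpoint, where the usual gdb commands (bt, next, print) can be used to follow the computation.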