Habitat, R and the Art of the Pull Request

by Paul Adams

A brief introduction to packaging R for Chef Habitat.

While at ChefConf I had another of my now-infamous "how hard can it be?" moments. This time I was talking about creating a core/R plan. As I am now accustomed to discovering, the answer is not necesarilly "easy".

This blog post, however, is not necesarilly about R and the work to package it. Instead, it is more about how I am doing it.

The R Project

For the uninitiated:

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.

I used R extensively during my PhD, mostly for performing my statistical calculations. Any visualisation of that data was typically being done inside GnuPlot, a tool which I simply knew better.

Building R from source is not particularly troublesome for Habitat. It's primary dependencies, with the exeception of Cairo are al available as core packages. Building Cairo, a story for another day, is not necesarilly trivial. In the case of Habitat, we also need to build, Harfbuzz and Pango to make it work.

The Pull Request

I always thought "Pull Request" was a funny name. To me it implies "Hey, I'm done working on this thing. Please take it." I think they are really a lot more powerful than this. Ultimately, the PR is Github's mechanism for peer reivew and that is what PR should really stand for. Here is how Github describes them:

Pull requests let you tell others about changes you've pushed to a repository on GitHub. Once a pull request is opened, you can discuss and review the potential changes with collaborators and add follow-up commits before the changes are merged into the repository.

See that? "Discuss", "review"... collaborate! This is what PR's are really for and, yet, it is incredible how rarely they are used this way.

Take a look at my core/R PR. My opening for the request was as basic as it could be: "New core plan for R". This is self explanatory and did not really need to state anything more than that. My branch built, the unit tests passed and so did the build tests on Travis. A good start. But was I done?

Clearly not. R provides a function for letting you know what that installation of R is capable of. As someone who never used R's graphical capabilities, it comes as no surprise that it is those capabilities which were broken:

Interfaces supported:
External libraries:        readline, curl
Additional capabilities:   NLS
Options enabled:           shared BLAS, R profiling

Capabilities skipped:      PNG, JPEG, TIFF, cairo, ICU
Options not enabled:       memory profiling

Recommended packages:      yes

Adding most of the graphical capabilities simply requies core/libtiff, core/libpng etc to be added to the packge dependecies. That's easy enough to fix. Cairo, needs a new package and so does some of its own dependencies.

More work.

Seeing that more work was on the horizon, I updaed my PR with a comment to say what I ws working on. Despite my initial submission technically working I decided it was better to continue with adding the graphical stacks and I asked the PR to be marked "DO NOT MERGE".

I am now tag-teaming this work with Steve Danna who is continuing to coordinate his work and mine through conversation on the PR. For example, Steve has opened his own PR to patch core/pixmap which is a change I will need to merge to make my own PR work.

Steve and I have noticed a couple of quirks in the way pkg-config works in Habitat. I'm not sure if this is bug or simply the way pkg_pconfig_dirs has (not) been configured in various plans. I suspect the latter.

Either way, Steve and I are pretty close to delivering core/R and I'll write more about this then.