Why Verify Sources?
When we package a piece of software, we need to know exactly what we are packaging, and it is important that we actually package what we think we are packaging.
There are many simple solutions for this problem and one of the most common is to provide a hash of the released sources. You’ve probably noticed a checksum provided alongside a source tarball you have downloaded. If you hash the tarball and the result matches the provided hash, then the tarball is what it says it is. Simple!
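To make that check concrete, here is a minimal sketch using sha256sum from GNU coreutils. The file name and its contents are stand-ins invented for illustration; in practice the checksum would come from the project's release page rather than being generated locally.

```shell
# Stand-in for a released tarball; the name and contents are made up.
workdir="$(mktemp -d)"
tarball="$workdir/example-1.0.0.tar.gz"
printf 'pretend this is a source tarball\n' > "$tarball"

# Publisher side: record the SHA-256 checksum next to the release.
sha256sum "$tarball" > "$tarball.sha256"

# Consumer side: re-hash the file and compare against the published value.
check() {
  if sha256sum -c "$tarball.sha256" >/dev/null 2>&1; then
    echo "match"
  else
    echo "mismatch"
  fi
}

untouched="$(check)"

# Any change to the file, however small, changes the hash.
printf 'x' >> "$tarball"
tampered="$(check)"

echo "untouched: $untouched, tampered: $tampered"
rm -rf "$workdir"
```

The second check fails because a single appended byte is enough to produce a completely different hash.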
For packaging, and for many other reasons, it is important that developers provide such a mechanism for verifying that a release tarball really is what it claims to be.
Verification In Habitat
Verification of sources in Habitat could not be simpler. Somewhere near the start of your plan, you are going to have something like this:
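A minimal sketch of those settings, assuming a made-up package: pkg_source and pkg_shasum are standard Habitat plan variables, but the name, version, URL, and checksum below are placeholders rather than a real release.

```shell
# Hypothetical package details; substitute your own.
pkg_name=example
pkg_version=1.0.0

# Where Habitat should fetch the source tarball from (placeholder URL).
pkg_source="https://example.com/releases/${pkg_name}-${pkg_version}.tar.gz"

# SHA-256 checksum of that tarball, as published by the upstream project.
# Placeholder value: replace it with the real 64-character hex digest.
pkg_shasum="0000000000000000000000000000000000000000000000000000000000000000"
```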
After Habitat automagically downloads the source file for you, it will check it against the checksum you provide. If they do not match, it will stop the build cycle.
As a nice feature, Habitat caches the downloaded source file. So long as it has a local copy of the file which matches the provided checksum, it does not bother to download the file again. Why would it? Nice.
So that’s how you verify a source file in Habitat. We’re done here?
Well… not quite.
In Steps GnuPG
Habitat gives us this handy mechanism for checking that a source file is what it claims to be. What we do not know: does this file come from a trusted source? Anybody could have generated that tarball and provided it.
One of the most common solutions for the “can I trust where this came from?” question is for the producer of the source file to create a signature for the file using GnuPG. If you are new to GnuPG and file signing, you can learn some more over here.
Out of the box, Habitat does not provide a mechanism for checking a source’s signature. That said, it is super simple to achieve. Take a look at this partial plan:
Helpfully, Habitat allows us to overload some of its basic functionality. In this case, I have overloaded the default approaches to downloading sources and verifying them. In both cases, I start by actually carrying out the default behaviour. Then the magic kicks in.
In the case of the do_download() overload, I have also called the download_file function in order to get my hands on the signature file that relates to the source I have downloaded. Use of download_file() in this manner is important: it ensures the file will not be downloaded with every single build so long as the checksum is correct. It is, therefore, advisable to use download_file() over, say
It is in the do_verify() overload where things get interesting. After performing the default verification, a temporary folder for stashing GnuPG config/output is created. Once this is done, it is super simple to grab the public key of the signer and test the signature before trashing the temporary config altogether. If either of these GnuPG steps fails, do_verify() fails and, with it, the build.
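The two overrides described above can be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: the key ID, the keyserver, the signature checksum, and the convention that the detached signature lives at the source URL plus .asc are all hypothetical, as are the variable names signing_key_id, keyserver, and signature_shasum. It also assumes gpg is available at build time, for example via a core/gnupg build dependency.

```shell
# Hypothetical values: substitute the real signer's key ID and a keyserver you trust.
signing_key_id="0x0000000000000000"
keyserver="hkps://keys.openpgp.org"
# Hypothetical: SHA-256 of the detached signature file, so that
# download_file can skip the download when a cached copy already matches.
signature_shasum="0000000000000000000000000000000000000000000000000000000000000000"

do_download() {
  # Default behaviour first: fetch (or reuse the cached) source tarball.
  do_default_download
  # download_file <url> <local_file> [<shasum>]
  # Assumes the signature is published next to the tarball as <tarball>.asc.
  download_file "${pkg_source}.asc" "${pkg_filename}.asc" "${signature_shasum}"
}

do_verify() {
  # Default behaviour first: compare the tarball against pkg_shasum.
  do_default_verify

  local status=0
  # Throwaway GnuPG home so the build never touches a real keyring.
  GNUPGHOME="$(mktemp -d)"
  export GNUPGHOME

  # Fetch the signer's public key, then check the detached signature.
  {
    gpg --batch --keyserver "$keyserver" --recv-keys "$signing_key_id" &&
      gpg --batch --verify \
        "$HAB_CACHE_SRC_PATH/${pkg_filename}.asc" \
        "$HAB_CACHE_SRC_PATH/${pkg_filename}"
  } || status=$?

  # Trash the temporary config whether or not verification succeeded.
  rm -rf "$GNUPGHOME"
  unset GNUPGHOME
  return "$status"
}
```

Capturing the exit status before cleaning up means the temporary GnuPG directory is removed even when verification fails, while the non-zero return still stops the build.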
I’d love to see this, somehow, as core functionality within Habitat. A pkg_signing_key= option should not be too hard to implement; portability is certainly not an issue given GnuPG’s availability on Windows and Mac as well as Linux.
I’m really not sure how elegant, or not, this solution is. I’m happy enough that it works.
A big shout-out to the maintainers of the CrateDB Dockerfile from whom I lifted the core ideas for this approach.