StrongBox: Simple, Safe Data Encryption for Rust

Posted: Wed, 27 August 2025

Some time ago, I wanted to encrypt a bunch of data in an application I was writing in Rust, mostly to be stored in a database, but also session cookies and sensitive configuration variables. Since Rust is widely known as a secure-yet-high-performance programming language, I was expecting that there would be a widely-used crate that gave me a secure, high-level interface to strong, safe cryptography. Imagine my surprise when I discovered that just… didn’t seem to exist.

Don’t get me wrong: Rust is replete with fast, secure, battle-tested cryptographic primitives. The RustCrypto group provides all manner of robust, widely-used crates for all manner of cryptography-related purposes. They’re the essential building blocks for practical cryptosystems, but using them directly in an application is somewhat akin to building a car from individual atoms of iron and carbon.

So I wrote my own high-level data encryption library, called it StrongBox, and have been happily encrypting and decrypting data ever since.

Cryptography So Simple Even I Can’t Get It Wrong

The core of StrongBox is the StrongBox trait, which has only two methods: encrypt and decrypt, each of which takes just two arguments. The first argument is the plaintext (for encrypt) or the ciphertext (for decrypt) to work on. The second argument is the encryption context, for use as Additional Authenticated Data, an important part of many uses of encryption.
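
To show what an encryption context buys you, here is a minimal sketch using AES-256-GCM from Ruby's standard openssl library (Ruby, to match the other code on this page). It is not StrongBox's implementation or API: the encrypt and decrypt helpers, the nonce-then-tag-then-ciphertext layout, and the context strings are all assumptions made for illustration.

```ruby
require "openssl"

# Hypothetical helper: encrypt with AES-256-GCM, binding the ciphertext to
# `context` as additional authenticated data (AAD).
# Output layout (an assumption for this sketch): 12-byte nonce, 16-byte tag, ciphertext.
def encrypt(key, plaintext, context)
  cipher = OpenSSL::Cipher.new("aes-256-gcm").encrypt
  cipher.key = key
  nonce = cipher.random_iv
  cipher.auth_data = context
  ciphertext = cipher.update(plaintext) + cipher.final
  nonce + cipher.auth_tag + ciphertext
end

# Hypothetical helper: decrypt, which only succeeds when the same context is supplied.
def decrypt(key, blob, context)
  cipher = OpenSSL::Cipher.new("aes-256-gcm").decrypt
  cipher.key = key
  cipher.iv = blob.byteslice(0, 12)
  cipher.auth_tag = blob.byteslice(12, 16)
  cipher.auth_data = context
  cipher.update(blob.byteslice(28, blob.bytesize - 28)) + cipher.final
end

key = OpenSSL::Random.random_bytes(32)
blob = encrypt(key, "hunter2", "users.password_hint")

# Same context: decryption yields the original plaintext bytes.
decrypt(key, blob, "users.password_hint")

# Different context: authentication fails, so nothing is returned.
begin
  decrypt(key, blob, "users.email")
rescue OpenSSL::Cipher::CipherError
  warn "wrong context: decryption refused"
end
```

The point of the context is that a ciphertext copied from one place (say, one database column) into another will refuse to decrypt there, even though the key is the same.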

There’s essentially no configuration or parameters to get wrong. You can’t choose the encryption algorithm, or block cipher mode, and you don’t have to worry about generating a secure nonce. You create a StrongBox with a key, and then you call encrypt and decrypt. That’s it.

Practical Cryptographic Affordances

Ok, ok… that’s not quite it. Because StrongBox is even easier to use than what I’ve described, thanks to the companion crate, StructBox.

When I started using StrongBox “in the wild”, it quickly became clear that what I almost always wanted to encrypt in my application wasn’t some ethereal “plaintext”. I wanted to encrypt things, specifically structs (and enums). So, through the magic of Rust derive macros, I built StructBox, which provides encrypt and decrypt operations on any Serde-able type. Given that Serde encoders can be a bit fiddly to use, it’s practically easier to get an encrypted, serialized struct than it is to get a plaintext serialized struct.

Key Problems in Cryptography

The thing about cryptography is that it largely turns all data security problems into key management problems. All the fancy cryptographic wonkery is for naught if you don’t manage the encryption keys well.

So, most of the fancy business in StrongBox isn’t the encryption and decryption, but instead solving problems around key management.

Different Keys for Different Purposes

Using the same key for all of your cryptographic needs is generally considered a really bad idea. It opens up all manner of risks that are easily avoided if you use different keys for different things. However, having to maintain a big pile of different keys is a nightmare, so nobody’s going to do that.

Enter: key derivation. Create one safe, secure “root” key, and then use a key derivation function to spawn as many other keys as you need. Different keys for each database column, another one to encrypt cookies, and so on.
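
To make the idea concrete, here is a minimal sketch of key derivation using HKDF from Ruby's standard openssl library. It is not StrongBox's implementation or API; the derive_key helper, the salt constant, and the purpose labels are all hypothetical.

```ruby
require "openssl"
require "securerandom"

# Assumed application-wide salt for this sketch; HKDF salts need not be secret.
HKDF_SALT = "example-app-key-derivation-v1"

# Hypothetical helper: derive a purpose-specific 32-byte key from one root key
# with HKDF-SHA256. The purpose label is fed in as HKDF's "info" input, so
# each label yields an independent key.
def derive_key(root_key, purpose)
  OpenSSL::KDF.hkdf(root_key, salt: HKDF_SALT, info: purpose, length: 32, hash: "SHA256")
end

# The one secret that actually has to be stored and protected.
root_key = SecureRandom.random_bytes(32)

cookie_key = derive_key(root_key, "cookies")
email_key  = derive_key(root_key, "users.email")
```

Derivation is deterministic, so only the root key needs to be managed; every other key can be recomputed from it and its purpose label on demand.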

StrongBox supports this through the StemStrongBox type. You’ll typically start off by creating a StemStrongBox with the “root” key, and then derive whatever other StrongBoxes you need, for encrypting and decrypting different kinds of data.

You Spin Me Round…

Sometimes, keys need to be rotated. Whether that’s because you actually know (or even have reason to suspect) someone has gotten the key, or just because you’re being appropriately paranoid, sometimes key rotation has to happen.

As someone who has had to rotate keys in situations where such an eventuality was not planned for, I can say with some degree of authority: it absolutely sucks to have to do an emergency key rotation in a system that isn’t built to make that easy. That’s why StrongBox natively supports key rotation. Every StrongBox takes one encryption key, and an arbitrary number of decryption keys, and will automatically use the correct key to decrypt ciphertexts.

Will You Still Decrypt Me, Tomorrow?

In addition to “manual” key rotation, StrongBox also supports time-based key rotation with the RotatingStrongBox type. This comes in handy when you’re encrypting a lot of “ephemeral” data, like cookies (or server-side session data). It provides a way to automatically “expire” old data, and prevents attacks that become practical when large amounts of data are encrypted using a single key.

Invasion of the Invisible Salamanders!

I mostly mention this just because I love the name, but there is a kind of attack possible in common AEAD modes called the invisible salamanders attack. StrongBox implements mitigations against this, by committing to the key being used so that an attacker can’t forge a ciphertext that decrypts validly to different plaintexts when using different keys. This is why I love cryptography: everything sounds like absolute goddamn magic.

Call Me Crazy, Support Me Maybe?

If you’re coding in Rust (which you probably should be), encrypting your stored data (which you definitely should be), and StrongBox makes your life easier (which it really will), you can show your appreciation for my work by contributing to my open source code-fund. Simply by shouting me a refreshing beverage, you’ll be helping me out, and helping to grow the software commons. Alternately, if you’re looking for someone to Speak Rust to Computers on a professional basis, I’m available for contracts or full-time remote positions.

Progress on my open source funding experiment

Posted: Thu, 21 August 2025

When I recently announced that I was starting an open source crowd-funding experiment, I wasn’t sure what would happen. Perhaps there’d be radio silence, or a huge out-pouring of interest from people who wanted to see more open source code in the world. What’s happened so far has been… interesting.

I chose to focus on action-validator because it’s got a number of open feature requests, and it solves a common problem that people have. The thing is, I’ve developed and released a lot of open source over the multiple decades I’ve been noodling around with computers. Much of that has been of use to many people, the overwhelming majority of whom I will never, ever meet, hear from, or even know that I’ve helped them out.

One person, however, I do know about – a generous soul named Andy, who (as far as I know) doesn’t use action-validator, but who does use another tool I wrote some years ago: lvmsync. It’s somewhat niche, essentially “rsync for LVM-backed block devices”, so I’m slightly surprised that it’s my most-starred repository, at nearly 400(!) stars. Andy is one of the people who finds it useful, and he was kind enough to reach out and offer a contribution in thanks for lvmsync existing.

In the spirit of my open source code-fund, I applied Andy’s contribution to the “general” pool, and as a result have just released action-validator v0.8.0, which supports a new --rootdir command-line option, fixing action-validator issue #54. Everyone who uses --rootdir in their action-validator runs has Andy to thank, and I thank him too.

This is, of course, still early days in my experiment. You can be like Andy, and make the open source world a better place, by contributing to my code-fund, and you can get your name up in lights, too. Whether you’re an action-validator user, have gotten utility from any of the other things I’ve written, or just want to see more open source code in the world, your contribution is greatly appreciated.

I'm trying an open source funding experiment

Posted: Wed, 6 August 2025

As I’m currently somewhat underemployed, and could do with some extra income, I’m starting an open source crowd-funding experiment. My hypothesis is that the open source community, and perhaps a community-minded company or two, really wants more open source code in the world, and is willing to put a few dollars my way to make that happen.

To begin with, I’m asking for contributions to implement a bunch of feature requests on action-validator, a Rust CLI tool I wrote to validate the syntax of GitHub actions and workflows. The premise is quite simple: for every AU$150 (about US$100) I receive in donations, I’ll implement one of the nominated feature requests. If people want a particular feature implemented, they can nominate a feature in their donation message, otherwise when “general” donations get to AU$150, I’ll just pick a feature that looks interesting. More details are on my code fund page.

In the same spirit of simplicity, donations can be made through my Ko-fi page, and I’ll keep track of the various totals in a hand-written HTML table.

So, in short, if you want more open source code to exist, now would be a good time to visit my Ko-fi page and chip in a few dollars. If you’re curious to know more, my code fund page has a list of Foreseeably Anticipated Questions that might address your curiosity. Otherwise, ask your questions in the comments or email me.

Object deserialization attacks using Ruby's Oj JSON parser

Posted: Sat, 26 July 2025

tl;dr: there is an attack in the wild which is triggering dangerous-but-seemingly-intended behaviour in the Oj JSON parser when used in the default and recommended manner, which can lead to everyone’s favourite kind of security problem: object deserialization bugs! If you have the oj gem anywhere in your Gemfile.lock, the quickest mitigation is to make sure you have Oj.default_options = { mode: :strict } somewhere, and that no library is overwriting that setting to something else.

Prologue

As a sensible sysadmin, all the sites I run send me a notification if any unhandled exception gets raised. Mostly, what I get sent is error-handling corner cases I missed, but now and then… things get more interesting.

In this case, it was a PG::UndefinedColumn exception, which looked something like this:

PG::UndefinedColumn: ERROR:  column "xyzzydeadbeef" does not exist

This is weird on two fronts: firstly, this application has been running for a while, and if there was a schema problem, I’d expect it to have made itself apparent long before now. And secondly, while I don’t profess to perfection in my programming, I’m usually better at naming my database columns than that.

Something is definitely hinky here, so let’s jump into the mystery mobile!

The column name is coming from outside the building!

The exception notifications I get sent include a whole lot of information about the request that caused the exception, including the request body. In this case, the request body was JSON, and looked like this:

{"name":":xyzzydeadbeef", ...}

The leading colon looks an awful lot like the syntax for a Ruby symbol, but it’s in a JSON string. Surely there’s no way a JSON parser would be turning that into a symbol, right? Right?!?

Immediately, I suspected that was exactly what was happening, because I use Sequel for my SQL database access needs, and Sequel treats symbols as database column names. It seemed like too much of a coincidence that a vaguely symbol-shaped string was being sent in, and the exact same name was showing up as a column name.

But how the flying fudgepickles was a JSON string being turned into a Ruby symbol, anyway? Enter… Oj.

Oj? I barely know… aj

A long, long time ago, the “standard” Ruby JSON library had a reputation for being slow. Thus did many competitors flourish, claiming more features and better performance. Strong amongst the contenders was oj (for “Optimized JSON”), touted as “The fastest JSON parser and object serializer”. Given the history, it’s not surprising that people who wanted the best possible performance turned to Oj, leading to it being found in a great many projects, often as a sub-dependency of a dependency of a dependency (which is how it ended up in my project).

You might have noticed in Oj’s description that, in addition to claiming “fastest”, it also describes itself as an “object serializer”. Anyone who has kept an eye on the security bug landscape will recall that “object deserialization” is a rich vein of vulnerabilities to mine. Libraries that do object deserialization, especially ones with a history that goes back to before the vulnerability class was well-understood, are likely to be trouble magnets.

And thus, it turns out to be with Oj.

By default, Oj will happily turn any string that starts with a colon into a symbol:


>> require "oj"
>> Oj.load('{"name":":xyzzydeadbeef","username":"bob","answer":42}')
=> {"name"=>:xyzzydeadbeef, "username"=>"bob", "answer"=>42}

How that gets exploited is only limited by the creativity of an attacker. Which I’ll talk about more shortly – but first, a word from my rant cortex.

Insecure By Default is a Cancer

While the object of my ire today is Oj and its fast-and-loose approach to deserialization, it is just one example of a pervasive problem in software: insecurity by default. Whether it’s a database listening on 0.0.0.0 with no password as soon as it’s installed, or a library whose default behaviour is to permit arbitrary code execution, it all contributes to a software ecosystem that is an appalling security nightmare.

When a user (in this case, a developer who wants to parse JSON) comes across a new piece of software, they have – by definition – no idea what they’re doing with that software. They’re going to use the defaults, and follow the most easily-available documentation, to achieve their goal. It is unrealistic to assume that a new user of a piece of software is going to do things “the right way”, unless that right way is the only way, or at least the by-far-the-easiest way.

Conversely, the developer(s) of the software is/are the domain experts. They have knowledge of the problem domain, through their exploration while building the software, and unrivalled expertise in the codebase.

Given this disparity in knowledge, it is tantamount to malpractice for the experts – the developer(s) – to off-load the responsibility for the safe and secure use of the software to the party that has the least knowledge of how to do that (the new user).

To apply this general principle to the specific case, take the “Using” section of the Oj README. The example code there calls Oj.load, with no indication that this code will, in fact, parse specially-crafted JSON documents into Ruby objects. The brand-new user of the library, no doubt being under pressure to Get Things Done, is almost certainly going to look at this “Using” example, get the apparent result they were after (a parsed JSON document), and call it a day.

It is unlikely that a brand-new user will, for instance, scroll down to the “Further Reading” section, find the second last (of ten) listed documents, “Security.md”, and carefully peruse it. If they do, they’ll find an oblique suggestion that parsing untrusted input is “never a good idea”. While that’s true, it’s also rather unhelpful, because I’d wager that by far the majority of JSON parsed in the world is “untrusted”, in one way or another, given the predominance of JSON as a format for serializing data passing over the Internet. This guidance is roughly akin to putting a label on a car’s airbags that “driving at speed can be hazardous to your health”: true, but unhelpful under the circumstances.

The solution is for default behaviours to be secure, and any deviation from that default that has the potential to degrade security must, at the very least, be clearly labelled as such. For example, the Oj.load function should be named Oj.unsafe_load, and the Oj.load function should behave as the Oj.safe_load function does presently. By naming the unsafe function as explicitly unsafe, developers (and reviewers) have at least a fighting chance of recognising they’re doing something risky. We put warning labels on just about everything in the real world; the same should be true of dangerous function calls.

OK, rant over. Back to the story.

But how is this exploitable?

So far, I’ve hopefully made it clear that Oj does some Weird Stuff with parsing certain JSON strings. It caused an unhandled exception in a web application I run, which isn’t cool, but apart from bombing me with exception notifications, what’s the harm?

For starters, let’s look at our original example: when presented with a symbol, Sequel will interpret that as a column name, rather than a string value. Thus, if our “save an update to the user” code looked like this:


# request_body has the JSON representation of the form being submitted
body = Oj.load(request_body)
DB[:users].where(id: user_id).update(name: body["name"])

In normal operation, this will issue an SQL query along the lines of UPDATE users SET name='Jaime' WHERE id=42. If the name given is “Jaime O’Dowd”, all is still good, because Sequel quotes string values, etc etc. All’s well so far.

But, imagine there is a column in the users table that normally users cannot read, perhaps admin_notes. Or perhaps an attacker has gotten temporary access to an account, and wants to dump the user’s password hash for offline cracking. So, they send an update claiming that their name is :admin_notes (or :password_hash).

In JSON, that’ll look like {"name":":admin_notes"}, and Oj.load will happily turn that into a Ruby object of {"name"=>:admin_notes}. When run through the above “update the user” code fragment, it’ll produce the SQL UPDATE users SET name=admin_notes WHERE id=42. In other words, it’ll copy the contents of the admin_notes column into the name column – which the attacker can then read out just by refreshing their profile page.
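
Whatever parser is in play, one defensive habit that would stop this particular trick is to check the type of a value before it gets near the query builder. The following is a minimal sketch of that idea, not something Oj or Sequel provides; the require_string helper is hypothetical.

```ruby
# Hypothetical helper (not part of Oj or Sequel): insist that a parsed JSON
# field is a plain String before it goes anywhere near a query builder.
def require_string(params, key)
  value = params[key]
  raise ArgumentError, "#{key} must be a string, got #{value.class}" unless value.is_a?(String)
  value
end

# What a parser that leaves strings alone yields for {"name":"Jaime"}:
# the helper passes the value straight through.
require_string({ "name" => "Jaime" }, "name")

# What the Oj.load call shown earlier yields for {"name":":admin_notes"}:
# a Symbol, which the helper rejects instead of handing it to Sequel.
begin
  require_string({ "name" => :admin_notes }, "name")
rescue ArgumentError => e
  warn e.message
end
```

This only covers the column-name trick described above; it does nothing about the object-instantiation problem, which has to be dealt with at the parser.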

But Wait, There’s More!

That an attacker can read other fields in the same table isn’t great, but that’s barely scratching the surface.

Remember before I said that Oj does “object serialization”? That means that, in general, you can create arbitrary Ruby objects from JSON. Since objects contain code, it’s entirely possible to trigger arbitrary code execution by instantiating an appropriate Ruby object. I’m not going to go into details about how to do this, because it’s not really my area of expertise, and many others have covered it in detail. But rest assured, if an attacker can feed input of their choosing into a default call to Oj.load, they’ve been handed remote code execution on a platter.

Mitigations

As Oj’s object deserialization is intended and documented behaviour, don’t expect a future release to make any of this any safer. Instead, we need to mitigate the risks. Here are my recommended steps:

Look in your Gemfile.lock (or SBOM, if that’s your thing) to see if the oj gem is anywhere in your codebase. Remember that even if you don’t use it directly, it’s popular enough that it is used in a lot of places. If you find it in your transitive dependency tree anywhere, there’s a chance you’re vulnerable, limited only by the ingenuity of attackers to feed crafted JSON into a deeply-hidden Oj.load call.
If you depend on oj directly and use it in your project, consider not doing that. The json gem is acceptably fast, and JSON.parse won’t create arbitrary Ruby objects.
If you really, really need to squeeze the last erg of performance out of your JSON parsing, and decide to use oj to do so, find all calls to Oj.load in your code and switch them to call Oj.safe_load.
It is a really, really bad idea to ever use Oj to deserialize JSON into objects, as it lacks the safety features needed to mitigate the worst of the risks of doing so (for example, restricting which classes can be instantiated, as is provided by the permitted_classes argument to Psych.load). I’d make it a priority to move away from using Oj for that, and switch to something somewhat safer (such as the aforementioned Psych). At the very least, audit and comment heavily to minimise the risk of user-provided input sneaking into those calls somehow, and pass mode: :object as the second argument to Oj.load, to make it explicit that you are opting in to this far more dangerous behaviour only when it’s absolutely necessary.
To secure any unsafe uses of Oj.load in your dependencies, consider setting the default Oj parsing mode to :strict, by putting Oj.default_options = { mode: :strict } somewhere in your initialization code (and make sure no dependencies are setting it to something else later!). There is a small chance that this change of default might break something, if a dependency is using Oj to deliberately create Ruby objects from JSON, but the overwhelming likelihood is that Oj’s just being used to parse “ordinary” JSON, and these calls are just RCE vulnerabilities waiting to give you a bad time.
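
The suggestion to fall back to the json gem is easy to check for yourself. This short sketch uses only Ruby's standard library and feeds JSON.parse the same document as the Oj example earlier.

```ruby
require "json"

# The same document that Oj.load turned into a hash containing a Symbol.
payload = '{"name":":xyzzydeadbeef","username":"bob","answer":42}'
parsed = JSON.parse(payload)

parsed["name"]        # => ":xyzzydeadbeef"
parsed["name"].class  # => String
```

The colon-prefixed value stays an ordinary string, so Sequel would treat it as a value to be quoted rather than as a column name.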

Is Your Bacon Saved?

If I’ve helped you identify and fix potential RCE vulnerabilities in your software, or even just opened your eyes to the risks of object deserialization, please help me out by buying me a refreshing beverage. I would really appreciate any support you can give. Alternately, if you’d like my help in fixing these (and many other) sorts of problems, I’m looking for work, so email me.

Your Release Process Sucks

Posted: Sat, 23 November 2024

For the past decade-plus, every piece of software I write has had one of two release processes.

Software that gets deployed directly onto servers (websites, mostly, but also the infrastructure that runs Pwnedkeys, for example) is deployed with nothing more than git push prod main. I’ll talk more about that some other day.

Today is about the release process for everything else I maintain – Rust / Ruby libraries, standalone programs, and so forth. To release those, I use the following, extremely intricate process:

Create an annotated git tag, where the name of the tag is the software version I’m releasing, and the annotation is the release notes for that version.
Run git release in the repository.
There is no step 3.

Yes, it absolutely is that simple. And if your release process is any more complicated than that, then you are suffering unnecessarily.

But don’t worry. I’m from the Internet, and I’m here to help.

The annotated tag is one of git’s best-kept secrets. They’ve been available in git for practically forever (I’ve been using them since at least 2014, which is “practically forever” in software development), yet almost everyone I mention them to has never heard of them.

A “tag”, in git parlance, is a repository-unique named label that points to a single commit (as identified by the commit’s SHA1 hash). Annotating a tag is simply associating a block of free-form text with that tag.

Creating an annotated tag is simple-sauce: git tag -a tagname will open up an editor window where you can enter your annotation, and git tag -a -m "some annotation" tagname will create the tag with the annotation “some annotation”. Retrieving the annotation for a tag is straightforward, too: git show tagname will display the annotation along with all the other tag-related information.

Now that we know all about annotated tags, let’s talk about how to use them to make software releases freaking awesome.

Step 1: Create the Annotated Git Tag

As I just mentioned, creating an annotated git tag is pretty simple: just add a -a (or --annotate, if you enjoy typing) to your git tag command, and WHAM! annotation achieved.

Releases, though, typically have unique and ever-increasing version numbers, which we want to encode in the tag name. Rather than having to look at the existing tags and figure out the next version number ourselves, we can have software do the hard work for us.

Enter: git-version-bump. This straightforward program takes one mandatory argument: major, minor, or patch, and bumps the corresponding version number component in line with Semantic Versioning principles. If you pass it -n, it opens an editor for you to enter the release notes, and when you save out, the tag is automagically created with the appropriate name.

Because the program is called git-version-bump, you can call it as a git command: git version-bump. Also, because version-bump is long and unwieldy, I have it aliased to vb, with the following entry in my ~/.gitconfig:

[alias]
    vb = version-bump -n

Of course, you don’t have to use git-version-bump if you don’t want to (although why wouldn’t you?). The important thing is that the only step you take to go from “here is our current codebase in main” to “everything as of this commit is version X.Y.Z of this software”, is the creation of an annotated tag that records the version number being released, and the metadata that goes along with that release.

Step 2: Run `git release`

As I said earlier, I’ve been using this release process for over a decade now. So long, in fact, that when I started, GitHub Actions didn’t exist, and so a lot of the things you’d delegate to a CI runner these days had to be done locally, or in a more ad-hoc manner on a server somewhere.

This is why step 2 in the release process is “run git release”. It’s because historically, you couldn’t do everything in a CI run. Nowadays, most of my repositories have this in their .git/config:

[alias]
    release = push --tags

Older repositories which, for one reason or another, haven’t been updated to the new hawtness, have various other aliases defined, which run more specialised scripts (usually just rake release, for Ruby libraries), but they’re slowly dying out.

The reason why I still have this alias, though, is that it standardises the release process. Whether it’s a Ruby gem, a Rust crate, a bunch of protobuf definitions, or whatever else, I run the same command to trigger a release going out. It means I don’t have to think about how I do it for this project, because every project does it exactly the same way.

The Wiring Behind the Button

It wasn’t the button that was the problem. It was the miles of wiring, the hundreds of miles of cables, the circuits, the relays, the machinery. The engine was a massive, sprawling, complex, mind-bending nightmare of levers and dials and buttons and switches. You couldn’t just slap a button on the wall and expect it to work. But there should be a button. A big, fat button that you could press and everything would be fine again. Just press it, and everything would be back to normal.

Red Dwarf: Better Than Life

Once you’ve accepted that your release process should be as simple as creating an annotated tag and running one command, you do need to consider what happens afterwards. These days, with the near-universal availability of CI runners that can do anything you need in an isolated, reproducible environment, the work required to go from “annotated tag” to “release artifacts” can be scripted up and left to do its thing.

What that looks like, of course, will probably vary greatly depending on what you’re releasing. I can’t really give universally-applicable guidance, since I don’t know your situation. All I can do is provide some of my open source work as inspirational examples.

For starters, let’s look at a simple Rust crate I’ve written, called strong-box. It’s a straightforward crate that provides ergonomic and secure cryptographic functionality inspired by the likes of NaCl. As it’s just a crate, its release script is very straightforward. Most of the complexity is working around Cargo’s inelegant mandate that crate version numbers are specified in a TOML file. Apart from that, it’s just a matter of building and uploading the crate. Easy!

Slightly more complicated is action-validator. This is a Rust CLI tool which validates GitHub Actions and Workflows (how very meta) against a published JSON schema, to make sure you haven’t got any syntax or structural errors. As not everyone has a Rust toolchain on their local box, the release process helpfully builds binaries for several common OSes and CPU architectures that people can download if they choose. The release process in this case is somewhat larger, but not particularly complicated. Almost half of it is actually scaffolding to build an experimental WASM/NPM build of the code, because someone seemed rather keen on that.

Moving away from Rust, and stepping up the meta another notch, we can take a look at the release process for git-version-bump itself, my Ruby library and associated CLI tool which started me down the “Just Tag It Already” rabbit hole many years ago. In this case, since gemspecs are very amenable to programmatic definition, the release process is practically trivial. Remove the boilerplate and workarounds for GitHub Actions bugs, and you’re left with about three lines of actual commands.

These approaches can certainly scale to larger, more complicated processes. I’ve recently implemented annotated-tag-based releases in a proprietary software product that produces Debian/Ubuntu, RedHat, and Windows packages, as well as Docker images, and it takes all of the information it needs from the annotated tag. I’m confident that this approach will successfully serve them as they expand out to build AMIs, GCP machine images, and whatever else they need in their release processes in the future.

Objection, Your Honour!

I can hear the howl of the “but, actuallys” coming over the horizon even as I type. People have a lot of Big Feelings about why this release process won’t work for them. Rather than overload this article with them, I’ve created a companion article that enumerates the objections I’ve come across, and answers them. I’m also available for consulting if you’d like a personalised, professional opinion on your specific circumstances.

DVD Bonus Feature: Pre-releases

Unless you’re addicted to surprises, it’s good to get early feedback about new features and bugfixes before they make it into an official, general-purpose release. For this, you can’t go past the pre-release.

The major blocker to widespread use of pre-releases is that cutting a release is usually a pain in the behind. If you’ve got to edit changelogs, and modify version numbers in a dozen places, then you’re entirely justified in thinking that cutting a pre-release for a customer to test that bugfix that only occurs in their environment is too much of a hassle.

The thing is, once you’ve got releases building from annotated tags, making pre-releases on every push to main becomes practically trivial. This is mostly due to another fantastic and underused Git command: git describe.

How git describe works is, basically, that it finds the most recent commit that has an associated annotated tag, and then generates a string that contains that tag’s name, plus the number of commits between that tag and the current commit, with the current commit’s hash included, as a bonus. That is, imagine that three commits ago, you created an annotated release tag named v4.2.0. If you run git describe now, it will print out v4.2.0-3-g04f5a6f (assuming that the current commit’s SHA starts with 04f5a6f).

You might be starting to see where this is going. With a bit of light massaging (essentially, removing the leading v and replacing the -s with .s), that string can be converted into a version number which, in most sane environments, is considered “newer” than the official 4.2.0 release, but will be superseded by the next actual release (say, 4.2.1 or 4.3.0). If you’re already injecting version numbers into the release build process, injecting a slightly different version number is no work at all.
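
The “light massaging” can be sketched in a couple of lines of Ruby. The describe_to_version helper is hypothetical (it is not taken from any of the release scripts mentioned in this article), and it assumes tags follow the vX.Y.Z naming used in the example above.

```ruby
# Hypothetical helper: turn `git describe` output into a pre-release version
# number by dropping the leading "v" and replacing each "-" with ".".
def describe_to_version(describe)
  describe.sub(/\Av/, "").tr("-", ".")
end

describe_to_version("v4.2.0-3-g04f5a6f")  # => "4.2.0.3.g04f5a6f"

# When the current commit is the tagged one, git describe prints just the tag
# name, and the same helper produces the plain release version.
describe_to_version("v4.2.0")             # => "4.2.0"
```

Because the same transformation works on both tagged and untagged commits, the release pipeline can use one code path for official releases and pre-releases alike.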

Then, you can easily build release artifacts for every commit to main, and make them available somewhere they won’t get in the way of the “official” releases. For example, in the proprietary product I mentioned previously, this involves uploading the Debian packages to a separate component (prerelease instead of main), so that users that want to opt-in to the prerelease channel simply modify their sources.list to change main to prerelease. Management have been extremely pleased with the easy availability of pre-release packages; they’ve been gleefully installing them willy-nilly for testing purposes since I rolled them out.

In fact, even while I’ve been writing this article, I was asked to add some debug logging to help track down a particularly pernicious bug. I added the few lines of code, committed, pushed, and went back to writing. A few minutes later (next week’s job is to cut that in-process time by at least half), the person who asked for the extra logging ran apt update; apt upgrade, which installed the newly-built package, and was able to progress in their debugging adventure.

Continuous Delivery: It’s Not Just For Hipsters.

“+1, Informative”

Hopefully, this has spurred you to commit your immortal soul to the Church of the Annotated Tag. You may tithe by buying me a refreshing beverage. Alternately, if you’re really keen to adopt more streamlined release management processes, I’m available for consulting engagements.

Invalid Excuses for Why Your Release Process Sucks

Posted: Sat, 23 November 2024 | permalink | 1 Comment

In my companion article, I made the bold claim that your release process should consist of no more than two steps:

Create an annotated Git tag;
Run a single command to trigger the release pipeline.

As I have been on the Internet for more than five minutes, I’m aware that a great many people will have a great many objections to this simple and straightforward idea. In the interests of saving them a lot of wear and tear on their keyboards, I present this list of common reasons why these objections are invalid.

If you have an objection I don’t cover here, the comment box is down the bottom of the article. If you think you’ve got a real stumper, I’m available for consulting engagements, and if you turn out to have a release process which cannot feasibly be reduced to the above two steps for legitimate technical reasons, I’ll waive my fees.

“But I automatically generate my release notes from commit messages!”

This one is really easy to solve: have the release note generation tool feed directly into the annotation. Boom! Headshot.
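
As a rough sketch of what “feed directly into the annotation” might look like: the formatting function below is a stand-in for whatever release note generator you already use (its name and bullet style are my assumptions), and the result is piped to git tag on standard input.

```python
import subprocess


def format_release_notes(subjects: list[str]) -> str:
    """Stand-in generator: render commit subject lines as a bulleted list."""
    return "\n".join(f"* {subject}" for subject in subjects) + "\n"


def tag_command(tag: str) -> list[str]:
    """Build the git invocation; `-F -` reads the tag message from stdin."""
    return ["git", "tag", "-a", tag, "-F", "-"]


def create_release_tag(tag: str, subjects: list[str]) -> None:
    """Create an annotated tag whose annotation is the generated notes."""
    subprocess.run(
        tag_command(tag),
        input=format_release_notes(subjects),
        text=True,
        check=True,
    )
```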

“But all these files need to be edited to make a release!”

No, they absolutely don’t. But I can see why you might think you do, given how inflexible some packaging environments can seem, and since “that’s how we’ve always done it”.

Language Packages

Most languages require you to encode the version of the library or binary in a file that you want to revision control. This is teh suck, but I’m yet to encounter a situation that can’t be worked around some way or another.

In Ruby, for instance, gemspec files are actually executable Ruby code, so I call code (that’s part of git-version-bump, as an aside) to calculate the version number from the git tags. The Rust build tool, Cargo, uses a TOML file, which isn’t as easy, but a small amount of release automation is used to take care of that.

Distribution Packages

If you’re building Linux distribution packages, you can easily apply similar automation faffery. For example, Debian packages take their metadata from the debian/changelog file in the build directory. Don’t keep that file in revision control, though: build it at release time. Everything you need to construct a Debian (or RPM) changelog is in the tag – version numbers, dates, times, authors, release notes. Use it for much good.
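
To make that concrete, here is a minimal sketch that renders a single debian/changelog entry from values you would pull out of the tag. The function name and the defaults for distribution and urgency are my assumptions, not anything Debian mandates for your package.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def debian_changelog_entry(package, version, notes, author, email, when,
                           distribution="unstable", urgency="medium"):
    """Render one changelog stanza; `when` must be a timezone-aware datetime."""
    lines = [f"{package} ({version}) {distribution}; urgency={urgency}", ""]
    # One bullet per release note line, indented by two spaces.
    lines += [f"  * {note}" for note in notes]
    # The trailer has one leading space and two spaces before the date.
    lines += ["", f" -- {author} <{email}>  {format_datetime(when)}", ""]
    return "\n".join(lines)
```

The author, date, and notes can be read from the tag with something like git for-each-ref and its %(taggername), %(taggeremail), %(taggerdate), and %(contents) format fields.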

The Dreaded Changelog

Finally, there’s the CHANGELOG file. If it’s maintained during the development process, it typically has an archive of all the release notes, under version numbers, with an “Unreleased” heading at the top. It’s one more place to remember to have to edit when making that “preparing release X.Y.Z” commit, and it is a gift to the Demon of Spurious Merge Conflicts if you follow the policy of “every commit must add a changelog entry”.

My solution: just burn it to the ground. Add a line to the top with a link to wherever the contents of annotated tags get published (such as GitHub Releases, if that’s your bag) and never open it ever again.

“But I need to know other things about my release, too!”

For some reason, you might think you need some other metadata about your releases. You’re probably wrong – it’s amazing how much information you can obtain or derive from the humble tag – so think creatively about your situation before you start making unnecessary complexity for yourself.

But, on the off chance you’re in a situation that legitimately needs some extra release-related information, here’s the secret: structured annotation. The annotation on a tag can be literally any sequence of octets you like. How that data is interpreted is up to you.

So, require that annotations on release tags use some sort of structured data format (say YAML or TOML – or even XML if you hate your release manager), and mandate that it contain whatever information you need. You can make sure that the annotation has a valid structure and contains all the information you need with an update hook, which can reject the tag push if it doesn’t meet the requirements, and you’re sorted.

“But I have multiple packages in my repo, with different release cadences and versions!”

This one is common enough that I just refer to it as “the monorepo drama”. Personally, I’m not a huge fan of monorepos, but you do you, boo. Annotated tags can still handle it just fine.

The trick is to include the package name being released in the tag name. So rather than a release tag being named vX.Y.Z, you use foo/vX.Y.Z, bar/vX.Y.Z, and baz/vX.Y.Z. The release automation for each package just triggers on tags that match the pattern for that particular package, and limits itself to those tags when figuring out what the version number is.
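
A small sketch of the “limits itself to those tags” part, assuming tags of exactly the form package/vX.Y.Z. The function name is invented for illustration.

```python
import re
from typing import Optional


def latest_release(package: str, tags: list[str]) -> Optional[str]:
    """Return the highest-versioned release tag belonging to `package`."""
    pattern = re.compile(rf"^{re.escape(package)}/v(\d+)\.(\d+)\.(\d+)$")
    candidates = []
    for tag in tags:
        match = pattern.match(tag)
        if match:
            # Compare versions numerically, so 1.10.0 beats 1.2.0.
            version = tuple(int(number) for number in match.groups())
            candidates.append((version, tag))
    return max(candidates)[1] if candidates else None
```

If you’re deriving versions with git describe, its --match option (for example, --match 'foo/v*') does the same filtering for you.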

“But we don’t semver our releases!”

Oh, that’s easy. The tag pattern that marks a release doesn’t have to be vX.Y.Z. It can be anything you want.

Relatedly, there is a (rare, but existent) need for packages that don’t really have a conception of “releases” in the traditional sense. The example I’ve hit most often is automatically generated “bindings” packages, such as protobuf definitions. The source of truth for these is a bunch of .proto files, but to be useful, they need to be packaged into code for the various language(s) you’re using. But those packages need versions, and while someone could manually make releases, the best option is to build new per-language packages automatically every time any of those definitions change.

The versions of those packages, then, can be datestamps (I like something like YYYY.MM.DD.N, where N starts at 0 each day and increments if there are multiple releases in a single day).
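
As a sketch, picking the next version under that scheme looks something like this. The function name is hypothetical, and it assumes every existing version string is well-formed.

```python
from datetime import date


def next_datestamp_version(existing: list[str], today: date) -> str:
    """Return the next YYYY.MM.DD.N version, with N restarting at 0 each day."""
    prefix = today.strftime("%Y.%m.%d.")
    # Collect the N values already used today.
    todays = [int(v[len(prefix):]) for v in existing if v.startswith(prefix)]
    next_n = max(todays) + 1 if todays else 0
    return f"{prefix}{next_n}"
```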

This process allows all the code that needs the definitions to declare the minimum version of the definitions that it relies on, and everything is kept in sync and tracked almost like magic.

Th-th-th-th-that’s all, folks!

I hope you’ve enjoyed this bit of mild debunking. Show your gratitude by buying me a refreshing beverage, or purchase my professional expertise and I’ll answer all of your questions and write all your CI jobs.

Health Industry Company Sues to Prevent Certificate Revocation

Posted: Wed, 31 July 2024 | permalink | 2 Comments

It’s not often that a company is willing to make a sworn statement to a court about how its IT practices are incompatible with the needs of the Internet, but when they do… it’s popcorn time.

The Combatants

In the red corner, weighing in at… nah, I’m not going to do that schtick.

The plaintiff in the case is Alegeus Technologies, LLC, a Delaware Corporation that, according to their filings, “is a leading provider of a business-to-business, white-label funding and payment platform for healthcare carriers and third-party administrators to administer consumer-directed employee benefit programs”. Not being subject to the US’ bonkers health care system, I have only a passing familiarity with the sorts of things they do, but presumably it involves moving a lot of money around, which is sometimes important.

The defendant is DigiCert, a CA which, based on analysis I’ve done previously, is the second-largest issuer of WebPKI certificates by volume.

The History

According to a recently opened Mozilla CA bug, DigiCert found an issue in their “domain control validation” workflow, that meant it may have been possible for a miscreant to have certificates issued to them that they weren’t legitimately entitled to. Given that validating domain names is basically the “YOU HAD ONE JOB!” of a CA, this is a big deal.

The CA/Browser Forum Baseline Requirements (BRs) (which all CAs are required to adhere to, by virtue of their being included in various browser and OS trust stores), say that revocation is required within 24 hours when “[t]he CA obtains evidence that the validation of domain authorization or control for any Fully‐Qualified Domain Name or IP address in the Certificate should not be relied upon” (section 4.9.1.1, point 5).

DigiCert appears to have at least tried to do the right thing, by opening the above Mozilla bug giving some details of the problem, and notifying their customers that their certificates were going to be revoked. One may quibble about how fast they’re doing it, but they’re giving it a decent shot, at least.

A complicating factor in all this is that, only a touch over a month ago, Google Chrome announced the removal of another CA, Entrust, from its own trust store program, citing “a pattern of compliance failures, unmet improvement commitments, and the absence of tangible, measurable progress in response to publicly disclosed incident reports”. Many of these compliance failures were failures to revoke certificates in a timely manner. One imagines that DigiCert would not like to gain a reputation for tardy revocation, particularly at the moment.

The Legal Action

Now we come to Alegeus Technologies. They’ve opened a civil case whose first action is to request the issuance of a Temporary Restraining Order (TRO) that prevents DigiCert from revoking certificates issued to Alegeus (which the court has issued). This is a big deal, because TROs are legal instruments that, if not obeyed, constitute contempt of court (or something similar) – and courts do not like people who disregard their instructions. That means that, in the short term, those certificates aren’t getting revoked, despite the requirement imposed by root stores on DigiCert that the certificates must be revoked. DigiCert is in a real “rock / hard place” situation here: revoke and get punished by the courts, or don’t revoke and potentially (though almost certainly not, in the circumstances) face removal from trust stores (which would kill, or at least massively hurt, their business).

The reason that Alegeus gives for requesting the restraining order is that “[t]o Reissue and Reinstall the Security Certificates, Alegeus must work with and coordinate with its Clients, who are required to take steps to rectify the certificates. Alegeus has hundreds of such Clients. Alegeus is generally required by contract to give its clients much longer than 24 hours’ notice before executing such a change regarding certification.”

In the filing, Alegeus does acknowledge that “DigiCert is a voluntary member of the Certification Authority Browser Forum (CABF), which has bylaws stating that certificates with an issue in their domain validation must be revoked within 24 hours.” This is a misstatement of the facts, though. It is the BRs, not the CABF bylaws, that require revocation, and the BRs apply to all CAs that wish to be included in browser and OS trust stores, not just those that are members of the CABF. In any event, given that Alegeus was aware that DigiCert is required to revoke certificates within 24 hours, one wonders why Alegeus went ahead and signed agreements with their customers that required a lengthy notice period before changing certificates.

What complicates the situation is that there is apparently a Master Services Agreement (MSA) that states that it “constitutes the entire agreement between the parties” – and that MSA doesn’t mention certificate revocation anywhere relevant. That means that it’s not quite so cut-and-dried that DigiCert does, in fact, have the right to revoke those certificates. I’d expect a lot of “update to your Master Services Agreement” emails to be going out from DigiCert (and other CAs) in the near future to clarify this point.

Not being a lawyer, I can’t imagine which way this case might go, but there’s one thing we can be sure of: some lawyers are going to be able to afford that trip to a tropical paradise this year.

The Security Issues

The requirement for revocation within 24 hours is an important security control in the WebPKI ecosystem. If a certificate is misissued to a malicious party, or is otherwise compromised, it needs to be marked as untrustworthy as soon as possible. While revocation is far from perfect, it is the best tool we have.

In this court filing, Alegeus has claimed that they are unable to switch certificates with less than 24 hours’ notice (due to “contractual SLAs”). This is a pretty big problem, because there are lots of reasons why a certificate might need to be switched out Very Quickly. As a practical example, someone with access to the private key for your SSL certificate might decide to use it in a blog post. Letting that sort of problem linger for an extended period of time might end up being a Pretty Big Problem of its own. An organisation that cannot respond within hours to a compromised certificate is playing chicken with their security.

The Takeaways

Contractual obligations that require you to notify anyone else of a certificate (or private key) changing are bonkers, and completely antithetical to the needs of the WebPKI. If you have to have them, you’re going to want to start transitioning to a private PKI, wherein you can do whatever you darn well please with revocation (or not). As these sorts of problems keep happening, trust stores (and hence CAs) are going to crack down on this sort of thing, so you may as well move sooner rather than later.

If you are an organisation that uses WebPKI certificates, you’ve got to be able to deal with any kind of certificate revocation event within hours, not days. This basically boils down to automated issuance and lifecycle management, because having someone manually request and install certificates is terrible on many levels. There isn’t currently a completed standard for notifying subscribers if their certificates need premature renewal (say, due to needing to be revoked), but the ACME Renewal Information Extension is currently being developed to fill that need. Ask your CA if they’re tracking this standards development, and when they intend to have the extension available for use. (Pro-tip: if they say “we’ll start doing development when the RFC is published”, run for the hills; that’s not how responsible organisations work on the Internet).

The Givings

If you’ve found this helpful, consider shouting me a refreshing beverage. Reading through legal filings is thirsty work!

Checking for Compromised Private Keys has Never Been Easier

Posted: Fri, 28 June 2024 | permalink | No comments

As regular readers would know, since I never stop banging on about it, I run Pwnedkeys, a service which finds and collates private keys which have been disclosed or are otherwise compromised. Until now, the only way to check if a key is compromised has been to use the Pwnedkeys API, which is not necessarily trivial for everyone.

Starting today, that’s changing.

The next phase of Pwnedkeys is to start offering more user-friendly tools for checking whether keys being used are compromised. These will typically be web-based or command-line tools intended to answer the question “is the key in this (certificate, CSR, authorized_keys file, TLS connection, email, etc) known to Pwnedkeys to have been compromised?”.

Opening the Toolbox

Available right now are the first web-based key checking tools in this arsenal. These tools allow you to:

Check the key in a PEM-format X509 data structure (such as a CSR or certificate);
Check the keys in an authorized_keys file you upload; and
Check the SSH keys used by a user at any one of a number of widely-used code-hosting sites.

Further planned tools include “live” checking of the certificates presented in TLS connections (for HTTPS, etc), SSH host keys, command-line utilities for checking local authorized_keys files, and many other goodies.

If You Are Intrigued By My Ideas…

… and wish to subscribe to my newsletter, now you can!

I’m not going to be blogging every little update to Pwnedkeys, because that would probably get a bit tedious for readers who aren’t as intrigued by compromised keys as I am. Instead, I’ll be posting every little update in the Pwnedkeys newsletter. So, if you want to keep up-to-date with the latest and greatest news and information, subscribe to the newsletter.

Supporting Pwnedkeys

All this work I’m doing on my own time, and I’m paying for the infrastructure from my own pocket. If you’ve got a few dollars to spare, I’d really appreciate it if you bought me a refreshing beverage. It helps keep the lights on here at Pwnedkeys Global HQ.

Information Security: "We Can Do It, We Just Choose Not To"

Posted: Fri, 14 June 2024 | permalink | 2 Comments

Whenever a large corporation disgorges the personal information of millions of people onto the Internet, there is a standard playbook that is followed.

“Security is our top priority”.

“Passwords were hashed”.

“No credit card numbers were disclosed”.

record scratch

Let’s talk about that last one a bit.

A Case Study

This post could have been written any time in the past… well, decade or so, really. But the trigger for my sitting down and writing this post is the recent breach of wallet-finding and criminal-harassment-enablement platform Tile. As reported by Engadget, a statement attributed to Life360 CEO Chris Hulls says

The potentially impacted data consists of information such as names, addresses, email addresses, phone numbers, and Tile device identification numbers.

But don’t worry though; even though your home address is now public information

It does not include more sensitive information, such as credit card numbers

Aaaaaand here is where I get salty.

Why Credit Card Numbers Don’t Matter

Describing credit card numbers as “more sensitive information” is somewhere between disingenuous and a flat-out lie. It was probably included in the statement because it’s part of the standard playbook. Why is it part of the playbook, though?

Not being a disaster comms specialist, I can’t say for sure, but my hunch is that the post-breach playbook includes this line because (a) credit cards are less commonly breached these days (more on that later), and (b) it’s a way to insinuate that “all your financial data is safe, no need to worry” without having to say that (because that statement would absolutely be a lie).

The thing that not nearly enough people realise about credit card numbers is:

The credit card holder is not usually liable for most fraud done via credit card numbers; and
In terms of actual, long-term damage to individuals, credit card fraud barely rates a mention. Identity fraud, Business Email Compromise, extortion, and all manner of other unpleasantness are far more damaging to individuals.

Why Credit Card Numbers Do Matter

Losing credit card numbers in a data breach is a huge deal – but not for the users of the breached platform. Instead, it’s a problem for the company that got breached.

See, going back some years now, there was a wave of huge credit card data breaches. If you’ve been around a while, names like Target and Heartland will bring back some memories.

Because these breaches cost issuing banks and card brands a lot of money, the Payment Card Industry Security Standards Council (PCI-SSC) and the rest of the ecosystem went full goblin mode. Now, if you lose credit card numbers in bulk, it will cost you big. Massive fines for breaches (typically levied by the card brands via the acquiring bank), increased transaction fees, and even the Credit Card Death Penalty (being banned from charging credit cards), are all very big sticks.

Now Comes the Finding Out

In news that should not be surprising, when there are actual consequences for failing to do something, companies take the problem seriously. Which is why “no credit card numbers were disclosed” is such an interesting statement.

Consider why no credit card numbers were disclosed. It’s not that credit card numbers aren’t valuable to criminals – because they are. Instead, it’s because the company took steps to properly secure the credit card data.

Next, you’ll start to wonder: if the credit card numbers were secured, why wasn’t the personal information that did get disclosed similarly secured? That information is far more damaging to the individuals to whom it relates than credit card numbers are.

The only logical answer is that it wasn’t deemed financially beneficial to the company to secure that data. The consequences of disclosing that information aren’t felt by the company which was breached. Instead, they’re felt by the individuals who have to spend weeks of their life cleaning up from identity fraud committed against them. They’re felt by the victim of intimate partner violence whose new address is found in a data dump, letting their ex find them again.

Until there are real, actual consequences for the companies which hemorrhage our personal data (preferably ones that have “percentage of global revenue” at the end), data breaches will continue to happen. Not because they’re inevitable – because as credit card numbers show, data can be secured – but because there’s no incentive for companies to prevent our personal data from being handed over to whoever comes along.

Support my Salt

My salty takes are powered by refreshing beverages. If you’d like to see more of the same, buy me one.

GitHub's Missing Tab

Posted: Thu, 30 May 2024 | permalink | 7 Comments

Visit any GitHub project page, and the first thing you see is something that looks like this:

screenshot of the GitHub repository page, showing the Code, Issues, and Pull Requests tabs

“Code”, that’s fairly innocuous, and it’s what we came here for. The “Issues” and “Pull Requests” tabs, with their count of open issues, might give us some sense of “how active” the project is, or perhaps “how maintained”. Useful information for the casual visitor, undoubtedly.

However, there’s another user community that visits this page on the regular, and these same tabs mean something very different to them.

I’m talking about the maintainers (or, more commonly, maintainer, singular). When they see those tabs, all they see is work. The “Code” tab is irrelevant to them – they already have the code, and know it possibly better than they know their significant other(s) (if any). “Issues” and “Pull Requests” are just things that have to be done.

I know for myself, at least, that it is demoralising to look at a repository page and see nothing but work. I’d be surprised if it didn’t contribute in some small way to maintainers just noping the fudge out.

A Modest Proposal

So, here’s my thought. What if instead of the repo tabs looking like the above, they instead looked like this:

modified screenshot of the GitHub repository page, showing a new Kudos tab, with a smiley face icon, between the Code and Issues tabs

My conception of this is that it would, essentially, be a kind of “yearbook”, that people who used and liked the software could scribble their thoughts on. With some fairly straightforward affordances elsewhere to encourage its use, it could be a powerful way to show maintainers that they are, in fact, valued and appreciated.

There are a number of software packages I’ve used recently, that I’d really like to say a general “thanks, this is awesome!” to. However, I’m not about to make the Issues tab look even scarier by creating an “issue” to say thanks, and digging up an email address is often surprisingly difficult, and wouldn’t be a public show of my gratitude, which I believe is a valuable part of the interaction.

You Can’t Pay Your Rent With Kudos

Absolutely you cannot. A means of expressing appreciation in no way replaces the pressing need to figure out a way to allow open source developers to pay their rent. Conversely, however, the need to pay open source developers doesn’t remove the need to also show those people that their work is appreciated and valued by many people around the world.

Anyway, who knows a senior exec at GitHub? I’ve got an idea I’d like to run past them…

Brane Dump