Your "Infrastructure as Code" is still code!
Posted: Thu, 13 August 2015
Whether you’re a TDD zealot, or you just occasionally write a quick script to reproduce some bug, it’s a rare coder who doesn’t see value in some sort of automated testing. Yet, somehow, in all of the new-age “Infrastructure as Code” mania, we appear to have forgotten this, and the tools that are commonly used for implementing “Infrastructure as Code” have absolutely woeful support for developing your Infrastructure Code. I believe this has to change.
At present, the state of the art in testing system automation code appears to be, “spin up a test system, run the manifest/state/whatever, and then use something like serverspec or testinfra to SSH in and make sure everything looks OK”. It’s automated, at least, but it isn’t exactly a quick process. Many people don’t even apply that degree of rigour to their configuration management code, and rely on manual testing, or even just “doing it live!”, to shake out the bugs they’ve introduced.
Speed in testing is essential. As the edit-build-test-debug cycle gets longer, frustration grows exponentially. If it takes two minutes to get a “something went wrong” report out of my tests, I’m not going to run them very often. If I’m not running my tests very often, then I’m not likely to write tests much, and suddenly… oops. Everything’s on fire. In “traditional” software development, the unit tests are the backbone of the “fast feedback” cycle. You’re running thousands of tests per second, ideally, and an entire test run might take 5-10 seconds. That’s the sweet spot, where you can code and test in a rapid cycle of ever-increasing awesomeness.
Interestingly, when I ask the users of most infrastructure management systems about unit testing, they either get a blank look on their face, or, at best, point me in the direction of the likes of Test Kitchen, which is described quite clearly as an integration platform, not a unit testing platform.
Puppet has rspec-puppet, which is a pretty solid unit testing framework for Puppet manifests – although it isn’t widely used. Others, though… nobody seems to have any ideas. The “blank look” is near-universal.
If “infrastructure developers” want to be taken seriously, we need to learn a little about what’s involved in the “development” part of the title we’ve bestowed upon ourselves. This means knowing what the various types of testing are, and having tools which enable and encourage that testing. It also means things like release management, documentation, modularity and reusability, and considering backwards compatibility.
All of this needs to apply to everything that is within the remit of the infrastructure developer. You don’t get to hand-wave away any of this just because “it’s just configuration!”. This goes double when your “just configuration!” is a hundred lines of YAML interspersed with templating directives (SaltStack, I’m looking at you).
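To make "unit testing your configuration" concrete, here is a minimal sketch in Python, using only the standard library. Everything in it is hypothetical: `render_vhost` and its parameters stand in for whatever templated config your tool of choice renders, and the output format is only loosely nginx-like. The point is that the conditional logic can be exercised in milliseconds, without spinning up a test system.

```python
import unittest


def render_vhost(server_name, enable_tls=False, cert_path=None):
    """Render a tiny web server vhost config.

    Hypothetical helper: it stands in for a templated config file
    that has a conditional in it.
    """
    # Reject an inconsistent combination of inputs up front.
    if enable_tls and not cert_path:
        raise ValueError("cert_path is required when enable_tls is set")

    lines = [f"server_name {server_name};"]
    if enable_tls:
        lines.append("listen 443 ssl;")
        lines.append(f"ssl_certificate {cert_path};")
    else:
        lines.append("listen 80;")
    return "\n".join(lines) + "\n"


class RenderVhostTest(unittest.TestCase):
    def test_plain_http_by_default(self):
        out = render_vhost("example.com")
        self.assertIn("listen 80;", out)
        self.assertNotIn("ssl", out)

    def test_tls_adds_certificate(self):
        out = render_vhost(
            "example.com", enable_tls=True, cert_path="/etc/ssl/example.pem"
        )
        self.assertIn("listen 443 ssl;", out)
        self.assertIn("ssl_certificate /etc/ssl/example.pem;", out)

    def test_tls_without_cert_is_rejected(self):
        with self.assertRaises(ValueError):
            render_vhost("example.com", enable_tls=True)
```

Saved as a module, this runs with `python -m unittest`. Each test pins down one branch of the conditional, so a later change that accidentally drops the certificate line, or flips the default, fails immediately rather than on a live system.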
4 Comments
From: Christian
2015-08-13 11:11
For configuration management systems, I would expect unit testing to occur in the framework, but not necessarily on the configuration files. So in the Salt case, I would hope they test the Python code that performs the pkg.install steps, but I would not necessarily expect to unit test the YAML conf files. That sounds a little bit like unit testing Apache conf files.
It does appear that there are unit tests for the SaltStack code: https://salt.readthedocs.org/en/v2014.1.1/topics/tests/unit.html but I’d guess they could always improve coverage.
From: alex
2015-08-13 17:56
The big problem is that infra as code results in unnecessarily complex code which sucks to test and make modular. Puppet manifests should be simple: install a package, put config files in place.
Instead, we install unpackaged software and we suffer.
We are solving the problem with the wrong tool…
From: Matt Palmer
2015-08-14 08:23
Hi Christian, Alex, thanks for your comments.
Christian, the code you write needs unit testing just as much as the code that other people write. The fact that it’s written in templated YAML rather than Python doesn’t change that. I’m yet to see a config management system that can get by without conditionals in it somewhere; that means that it needs tests to ensure that the conditional logic doesn’t adversely change in the future.
Your analogy to Apache config files is appropriate, but perhaps not for the reasons you imagine. We should be unit testing those, too. I’ve seen no shortage of Apache (and nginx) config files which were complicated enough that nobody wanted to go near them if they could at all avoid it, because the damn things seemed to break as soon as you loaded them in your editor.
Alex, if you want to put all your logic into package maintainer scripts, feel free to do so. It’s still code, though, and it still needs unit testing. You don’t remove the complexity, you are simply putting it in a different place within the system. The config files you install will be templated to account for different use cases, and thus you will need to unit test that logic, too.
From: Jamie
2015-08-13 21:18
Bonus points if your infrastructure code can be debugged locally, and has refactoring tools.