Testing the Testing of Your Tests
Posted: Wed, 24 May 2006
I'm writing an application in Ruby that makes heavy use of XMLRPC (the application of XMLRPC-over-Unix-socket fame), and so I'm using Mock Objects to mock out the XMLRPC client proxy and provide an interface that doesn't involve having a whole other infrastructure running on my laptop so I can run my test suite.
For those who don't know, mock objects are objects which look and superficially act like the real objects you might use to communicate with an external service, but don't actually do the communication -- instead, they return canned responses to your calls.
Why is this useful? Well, imagine that the real object talks to a SOAP service on the 'net (such as the Google search API). It means that every time you run your test suite, you need access to the Internet. The external service may also cost something every time you talk to it -- or it may be impossible to test against, since there's statistics generation or something going on that means that hammering it for test purposes is just impractical.
In addition to returning values to your application, mock objects will usually record what has been called, so that after your test run is complete, you can also verify that all of the methods you wanted called have actually been called. This means that you can have a good idea that what you wanted to happen to the service actually would have been done. Handy.
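To make that concrete, here's a minimal hand-rolled sketch of the idea: canned responses plus a record of what was called. The class name MockSearchService and its methods are made up for illustration; they aren't the API of any real mock library or search service.

```ruby
# Hypothetical hand-rolled mock; not the API of any real mock library.
class MockSearchService
  attr_reader :calls

  def initialize(canned_results)
    @canned_results = canned_results
    @calls = []
  end

  # Stands in for a network round trip: records the call and
  # returns canned data (an empty list for unknown queries).
  def search(query)
    @calls << [:search, query]
    @canned_results.fetch(query, [])
  end

  # True if the named method was invoked at least once.
  def called?(name)
    @calls.any? { |method_name, _| method_name == name }
  end
end

mock = MockSearchService.new("ruby" => ["ruby-lang.org"])
results = mock.search("ruby")   # no Internet access needed
```

After the test has run, mock.called?(:search) tells you whether the code under test actually talked to the "service", and mock.calls holds the full record.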
Some people have a (justified) distrust of Mocks, because it's easy to make a mock object that doesn't perform like the real one (and hence invalidate your tests), but they're a necessary evil in cases where you just can't use the real object when running your test suite.
The XMLRPC proxy I'm using has a bit of a quirk. Everything runs through one method on that object -- XMLRPC::Client#call(). This seems really limiting, especially in a wonderfully dynamic language like Ruby. In reality, it's not too bad -- it makes the object interface easier to mock, and you couldn't emulate the whole thing with method_missing anyway, because you can have various Ruby-illegal characters in XMLRPC method names.
The downside to this all-in-one-method thing, though, is that simply verifying that #call() has been invoked a dozen times doesn't really tell me much about whether the mock was used the way I wanted it to be used. My chosen mock object suite, Test::Unit::MockObject, whilst otherwise excellent, is (or rather, was) missing the ability to verify the arguments that are passed to your calls -- you can say "I need methods x(), y(), and z() called", but you couldn't say "I need the call y('foo', 'bar', 42) made".
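Here's roughly what argument verification looks like for a proxy with a single #call entry point. This is a sketch under my own assumptions, not the Test::Unit::MockObject API: MockProxy, expect_call, verify and UnexpectedCall are hypothetical names, and the method names passed in are only examples.

```ruby
# Hypothetical sketch: a mock proxy with a single #call entry point that
# checks each invocation against a queue of expected argument lists.
class MockProxy
  class UnexpectedCall < StandardError; end

  def initialize
    @expected = []
  end

  # Queue an expectation: these exact arguments, answered with this value.
  def expect_call(*args, returns: nil)
    @expected << [args, returns]
    self
  end

  # Every remote method goes through here; the first argument is the
  # remote method name, which may contain characters Ruby wouldn't allow.
  def call(*args)
    raise UnexpectedCall, "unexpected call: #{args.inspect}" if @expected.empty?
    expected_args, value = @expected.shift
    unless args == expected_args
      raise UnexpectedCall, "expected #{expected_args.inspect}, got #{args.inspect}"
    end
    value
  end

  # Raise if any expected call was never made.
  def verify
    return true if @expected.empty?
    raise UnexpectedCall, "#{@expected.size} expected call(s) never made"
  end
end

proxy = MockProxy.new
proxy.expect_call("system.listMethods", returns: ["y"])
proxy.expect_call("y", "foo", "bar", 42, returns: true)

methods = proxy.call("system.listMethods")
answer  = proxy.call("y", "foo", "bar", 42)
proxy.verify
```

Note that this version insists on the calls arriving in the order they were queued. That's a design choice for the sketch; an order-independent mock would need to search the expectation list instead of shifting from the front.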
Of course, adding argument verification to the mock object class is a non-trivial change, and it needs to be tested itself (it'd be pretty lame if a class intended to facilitate automated testing lacked a test suite of its own). I added a bunch of tests to the test suite, and got some good results. I had a couple of false starts, where I wasn't checking most of the corner cases (where bugs lurk in the shadows), but eventually I got a test suite that was fairly complete, which drove the new features I needed to a good place.
So I carted my newly enhanced mock object class across to my production project, and off I went. Except that I was getting some downright weird failure messages when my code wasn't invoking the mock object the right way. Why? Because I hadn't tested that my failure messages were appropriate. Why was that? Because the assert_raise assertion in Test::Unit doesn't have a facility to make sure that you're passing back the right exception message, only that you're passing back the right exception class.
Whisky, Tango, Foxtrot? Over.
(This space intentionally left blank for me to insert the correction to this claim that the lazyweb is sure to provide)
So I had to write a new assertion (assert_exception_message) which checked that both the right class and the right message came back. This code, of course, needs to be tested too, because otherwise you can't be sure that your test code is performing right.
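The shape of such an assertion is something like the following. It's a sketch rather than the real code: it's written as plain Ruby so it runs on its own, the AssertionFailed class is a stand-in I've invented for the test framework's failure exception, and the exact signature is an assumption.

```ruby
# Hypothetical sketch of an assert_exception_message helper. Plain Ruby,
# with AssertionFailed standing in for the framework's failure exception.
class AssertionFailed < StandardError; end

def assert_exception_message(expected_class, expected_message)
  raised = nil
  begin
    yield
  rescue StandardError => e
    raised = e
  end

  if raised.nil?
    raise AssertionFailed, "expected #{expected_class} but nothing was raised"
  end
  unless raised.instance_of?(expected_class)
    raise AssertionFailed, "expected #{expected_class}, got #{raised.class}"
  end
  unless raised.message == expected_message
    raise AssertionFailed,
          "expected message #{expected_message.inspect}, got #{raised.message.inspect}"
  end
  true
end

# Passes: both the class and the message match.
assert_exception_message(ArgumentError, "bad input") do
  raise ArgumentError, "bad input"
end
```

Testing the assertion itself means checking all three failure paths (nothing raised, wrong class, wrong message) as well as the happy path, which is exactly where the tests-of-tests start to pile up.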
At this point, I was writing test code to test a new assertion I'd written so I could test the mock object class I had modified so I could test my application.
Things were getting a bit "meta", and it became difficult to separate out the tests from the tests' tests. But I fought through the brain melt, and shortly thereafter came out the other end with a properly working piece of actual application, and all was well in the world.
By now, a certain Mr. Tendys is probably rubbing his hands with glee, as he gets ready to expound at length on how this whole post just proves how utterly useless test-first programming (or even testing in general) is for producing economically-viable software.
Let me analyse what I "spent" on getting all this functionality, though -- about 3 train trips home -- maybe 4 hours of time total, including writing the extra tests in my application to use the wonderful new features I'd made. That's not much, in the grand scheme of things. I've wasted that much time in a day skiving off work in the past because I was frustrated with what I was trying (and failing) to program, usually because I was swamped with bugs.
What did those four hours of my time buy me? Heaps. I can now better verify my software works as anticipated. I have confidence that my software is a lot more robust than it was before. My tests identified several pieces of code that passed my eyeball check, but blew up when I tried to actually use them in a particular way. With the help of my tests, I identified the problems quickly and went on my way. Since I was working in a localised area of the application, I was immediately familiar with the code, and could stomp each bug almost immediately. If I'd only done end-to-end testing at the end of development (or worse, released it with the bugs still in there), the debugging would have been much harder because I'd have forgotten little details about the code. I reckon that benefit, alone, has probably saved me 4 hours already, since I've spent whole days hunting down little bugs in an ad-hoc fashion in the past. I'm fairly sure most any programmer can empathise with me there.