Double D problems

I’ve been trying to come up with a cool name for problems where the devil is in the detail; the best I’ve come up with so far is “double D”, but that doesn’t really do it justice. Perhaps D^2 would be better. Anyway, you know the kind of thing I’m talking about: the problem seems simple, but it just gets harder and harder the more work you do. Instead of the solution covering more of the problem, it gets bigger and bigger without ever reaching the 100% coverage you actually need. First you find the edge cases where things don’t work; then, if you are lucky, you find the corner cases where two edge cases meet. If you aren’t lucky, you find them only when someone tries to do something and it all crashes horribly: data lost, reputation ruined.

My current example, the one I’m bashing my head against, is a testing framework. And right there you have the problem: it’s a framework. I didn’t want to create a framework, but I thought it would be OK as it was only one interface. So I wrote a set of tools to generate test cases from the interface. The test cases would live in an XML file so more tests could be added by hand (or by Excel sheet, in the manner of FitNesse). Fine. Good. Except that it wasn’t. The simple cases got done very fast, just a day to knock something up. Then came the slightly annoying cases (parameters that were arrays, etc.). Then the really awkward ones: System.Int32&, out parameters that were DataTables, and callback objects. Tricky. So I fiddled around for 3-4 days trying to get it all automated and then finally realised that the test-generation code I was sweating over covered only 1% of the interface! Clearly the automation was costing more time than it was saving. So I generated the test data once, excluded by hand the cases that didn’t work, and checked the file into source control. Done.
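For the sake of illustration, here is a minimal sketch of the kind of generator described above: walk an interface by reflection, emit an XML stub for each “simple” method, and loudly skip the awkward ones (ref/out parameters and delegate-typed callbacks). The interface name and the XML layout are my own placeholders, not the real framework.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Xml.Linq;

public static class TestCaseGenerator
{
    // Generate one <testCase> element per "simple" method on the interface,
    // skipping anything the reflective runner can't call in one shot.
    public static XDocument Generate(Type interfaceType)
    {
        var cases = new XElement("testCases");

        foreach (var method in interfaceType.GetMethods())
        {
            bool awkward = method.GetParameters().Any(p =>
                p.ParameterType.IsByRef ||                            // e.g. System.Int32&, out DataTable
                p.IsOut ||
                typeof(Delegate).IsAssignableFrom(p.ParameterType));  // callback delegates

            if (awkward)
            {
                // Exclude it, but shout about it so the gap isn't forgotten.
                Console.WriteLine($"SKIPPED (needs a hand-written test): {method.Name}");
                continue;
            }

            cases.Add(new XElement("testCase",
                new XAttribute("method", method.Name),
                method.GetParameters().Select(p =>
                    new XElement("param",
                        new XAttribute("name", p.Name ?? "arg"),
                        new XAttribute("type", p.ParameterType.Name)))));
        }

        return new XDocument(cases);
    }
}
```

Running something like TestCaseGenerator.Generate(typeof(IMyService)).Save("testcases.xml") once, hand-editing the result and checking it into source control, is the “generate once, then freeze” approach described above.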

In cases like this, the Pareto principle is your friend. Whilst I’m a big fan of automation in general, and it is very useful to automate the generation or consumption of documents like test cases or diagrams of database schemas, there is a point where you need to stop. So what are the options when the automation time-saving curve begins to bend?

Well, my favourite options are:
1) Exclude it: notify loudly and early whenever you hit a case that isn’t automated (as in the generator sketch above), just so you remember it.
2) Hardcode it: be honest, how often is that thing going to change? If changing the hardcoded value is quicker than rewriting the code to handle all the cases, then hardcode it. Of course, remember that other classic of time management, the rule of three: if you do the same thing three times, find another way.
3) Special-case sister system: if you really have to automate the special cases, then create a separate system designed entirely around them. In my case, my “one Excel row = one call to a method by reflection” approach wasn’t going to work for methods that took callback objects to subscribe to events, but it was easy to create a test client that only handled callbacks (see the sketch after this list). Then you can get back to your high-productivity 20%-effort-for-80%-results programming.
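To make option 3 concrete, here is a minimal sketch of what a “sister system” for callbacks might look like. IPriceFeed and PriceCallback are hypothetical stand-ins for whatever the real interface exposes; the point is that this client knows nothing about Excel rows or reflection, it only knows how to subscribe and record what comes back.

```csharp
using System.Collections.Generic;

// Hypothetical callback-style API, standing in for the real interface.
public delegate void PriceCallback(string symbol, decimal price);

public interface IPriceFeed
{
    void Subscribe(string symbol, PriceCallback callback);
}

// A tiny, dedicated test client built only for the callback case.
public class CallbackTestClient
{
    private readonly List<(string Symbol, decimal Price)> _received =
        new List<(string Symbol, decimal Price)>();

    public void Run(IPriceFeed feed, string symbol)
    {
        // Subscribe with a callback that simply records what arrives;
        // the assertions about the recorded values live in the test itself.
        feed.Subscribe(symbol, (s, p) => _received.Add((s, p)));
    }

    public IReadOnlyList<(string Symbol, decimal Price)> Received => _received;
}
```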
