Recently, I was asked whether I had any pet peeves. I thought about it for a couple of seconds, and since I couldn’t immediately think of anything, I just ranted a little bit about some of the minor annoyances around code quality I had experienced the same day or something like that.
But when I couldn’t sleep early this morning, I remembered one of my favorite pet peeves: Code that doesn’t model stuff right. Let me explain that with a couple of real-life scenarios…
I was working on a system with my team, and our product owner(s) – a team of really really smart real-time regulation experts – came to us with some requirements regarding modeling some physical processes of some kind. They explained these things to us, and they showed us some graphic models[1. – and you’d probably think that this would trigger a few light bulbs…] of how they thought of this thing that we were supposed to start working on.
When we later got some more specific requirements, they didn’t resemble those graphic models that were originally presented to us. I mean, some of the concepts were brought over, but there was no clear mapping between the graphic models and the model proposed in the requirements.
Somehow, we ended up implementing things like they were specified to us the second time, although I did hesitate and express some concerns several times along the way.
Now, one-and-a-half years later, we’re faced with our implementation’s limited capability of expressing the domain to a high enough degree. We’re constantly forced to handle special cases in various parts of the system, whereas in other parts we have to spray logic all over the place to implement even simple features. And if you’re used to unit testing your stuff, you can probably imagine that our huge-ball-of-mud-model-with-limited-expressive-power has so many different combinations of flags and settings that they’re practically impossible to cover with tests.
In the same system, we had a pretty complex specification of rules that some part of the system should act in accordance with. After some iterations during the specification, clarification, and breakdown of the feature, we ended up with a specification that pretty much consisted of a 5-6 levels deep decision tree, including some floating point values that needed to be considered as well.
At this point, some team members went on an implemented the thing – with a 5-6 levels deep nested if structure, that was implemented 100% in accordance with the specification, including code comments with cross references to nodes in the decision tree diagram. At first glance, this seemed O.K. – I mean, it was fairly easy to verify that the tree was implemented correct, due to the 1-to-1 mapping that could be made from the if structure to the decision tree diagram, and vice versa. So this is much better than scenario 1, right?
Well, if one if statement results in two possible passes through a piece of code, and thus two test cases that need to be written, then it follows that 5-6 levels of if statements yields 25 – 26 possible combinations. Combine this with 5 to 10 floating point values that need to be combined and limited as well, depending on their value and the path followed through the decision tree, and then we have yet another completely untestable mess!
What to learn from this?
This should not come as a surprise to you, especially if you still – in spite of all the functional commotion that has been going on the last 5 years – believe in object-oriented programming… because that’s what the objects are for: Modeling stuff!
But somehow, I still often see developers – even seasoned ones – go and implement models that 1) model only a subset of what’s actually required, 2) somehow flatten the model and don’t respect the inherent structure of hierarchy and cardinality, 3) apply some other non-invertible transformation to what they’re shown, and thus end up truncating the domain, which in turn leads to having to resort to hacks like flags and an ever-increasing level of if statements and special cases. The end result is that it is hard to understand the relation between the model and the stuff it is supposed to model.
In hindsight, it is pretty easy to see that the first scenario should have been modeled with classes representing each “thing” in the original diagrams presented to us. The reason we were shown those drawings was that they were a pretty good model of what’s supposed to go on. Duh!!
And the second scenario was a pretty good candidate for some kind of simple boolean rules engine, where a representation of the tree could have been built in memory with instances of some class for each node, and then the tree could just have been “invoked” – just like that.
In both cases, we would have had the ability to test each class in isolation, and then do a few integration tests to verify that they were capable of interacting like expected. And then, lastly, we could have written a verification of our code’s ability to build the model in the form that would end up being executed.
To sum it up as a one-liner: If you can draw it like that, the just f*n code it like that!.