Rewrite things

There is a piece of advice in the software development world, which is pretty clearly expressed in this blog post by Joel Spolsky. It basically says that you should never rewrite your code from scratch.

I agree with this advice as a general rule: it guards against the impulse to throw out whatever you are working on, just because you feel like starting over fresh.

It could be when new technologies come around, making your legacy system look old-fashioned. It could be when you are sitting there at your job, frustrated and tired of shoveling code turds, trying to understand a legacy system that has had all sorts of bad design decisions implemented in it.

There can be many reasons why one might feel like starting over, because THIS TIME better decisions will be made, and everything will be better.

The reason why this is generally not such a great idea, is that you tend to forget how much work and hard-earned knowledge is actually baked into the systems we maintain. Sometimes, little bug fixes and unit tests of esoteric corner cases are the result of weeks of real-world testing and subsequent reproduction by developers, which is not so easily redone.

While the advice to avoid rewrites makes good sense most of the time, all rules come with exceptions!

At the other end of the scale we find something that is slowly becoming one of my favorite pet peeves: Software developers who go to great lengths to preserve code that they have already written, to avoid having to write it again.

Symptoms include having code repositories, directories of text files, GitHub Gists, maybe even OneNote projects, containing code snippets, which are then consulted or copy/pasted from whenever it is time to build something “new”. Another symptom is when “new” projects start out by bringing in loads of code files of utilities, helpers, primitives, etc.

This type of developer might claim that it helps them “hit the ground running” – but my experience is that it is more likely to make them HIT THE GROUND WITH A BACKPACK FULL OF BRICKS.

The thing is that this “new” project will not be built with new code or new techniques learned since the last time the developer worked on something greenfield – it will be built with all the OLD knowledge, the OLD techniques, and the OLD technologies.

When you avoid rewriting things by reusing your old code snippets, you are robbing yourself of a great self-improvement opportunity, because you don’t let yourself re-do something that you have already done before with the knowledge that you have accumulated since the last time you did it.

If, on the other hand, you take on rewriting things often, if not every time you build something new, even trivial CRUDdy coding tasks and other mundane endeavors can end up working like a form of code kata, where each repetition can bring a new dimension or new refinements into the solution.

I should probably end this post by saying that I don’t think you should reinvent the wheel – or “invent the soup plate” as we say in Danish 😉 – every single time you need a wheel or a soup plate. Just be observant that you are not missing out on a great, new, improved soup plate, just because you thought that the one you got from your parents when you moved out at 19 was good enough.

PS: After having written this, I saw this post by Dennis Forbes, who seems to share my pet peeve, as he touches on that type of company that treats all of its homegrown code and libraries as an asset, when it is in fact a brick in the backpack. Read his post – it’s much better than mine 🙂

PPS: Inventing wheels and soup plates are terrible analogies for writing software. I will bet you almost anything that you can look at any piece of code you wrote a year ago and come up with several things you would do in a better way if you were to rewrite it. And if that is not the case, I think you have another problem 🙂

I am not usually that judgmental

but I did write a blog post called “I judge developers by how many toolbars are showing in their IDE”, which could also have been called “if you reach for the mouse, you are n00b scum to me”.

I wrote the original blog post on how to strip some of the disturbing elements off Visual Studio and get started using the keyboard for navigation, because I was frustrated when I saw colleagues helplessly wiggle the mouse around, trying to click small pathetic buttons among hundreds of other buttons in huge menu bars in integrated development environments…

So, today I got a tip from Brendan J. Baker who suggested I install the “Hide Main Menu” VSIX and get an auto-collapsing menu bar! Pretty cool, actually – it shows up again when you hit Alt, just like you expect it to after having used the keyboard for navigating Sublime all day.

The menu bar might not take up a lot of space, but it’s still 30 px in the expensive direction, spanning the entire width of the screen, holding a BUNCH OF YELLED WORDS – but most of all, it’s clutter!

Thanks, Brendan – you made my Visual Studio even more clean and good looking 🙂


I judge developers by how many toolbars are showing in their IDE

Usually, I try not to be judgmental about stuff. I like to keep my mind open and to accept people as the beautifully unique snowflakes they are… or something… 🙂

There’s one thing that irritates me though, and that’s C# developers who constantly reach for the mouse to click the tiny crappy toolbar buttons that for some reason seem to have survived in Microsoft IDEs since VB4 back in 1995. Yeah, I’m looking at you! You’re crap!

There is nothing more annoying than pair programming with someone who cannot even go to another file without having to scroll up and down in Solution Explorer, looking for that file to double-click. And then comes the time to re-run the current unit test… Sigh!!!

Now, if you have any ambition as a C# developer, I recommend you start out every new installation of Visual Studio by

  • Hiding all toolbars (which, unfortunately, cannot easily be done at once – new ones pop up every time you open a new kind of file for the first time).
  • Making all tool windows auto-hide (i.e. click the little pin on e.g. Solution Explorer, making it collapse – usually to the right side of the screen).

That will make your work environment resemble the picture on the right (especially if you have a 1337 dark color scheme like mine) – see: no clutter! No stinking buttons to disturb your vision while you’re swinging the code hammer! And, it will serve as an incentive to start using the keyboard some more.

Now, in order to be able to actually work like this, it’s essential that you know how to navigate using the keyboard only. Therefore, here are a few very basic shortcuts to get you started (assuming, of course, that you’re using Visual Studio with standard keyboard settings and R# with the Visual Studio keyboard scheme):

  • Navigate to any open window in the environment: Ctrl + Tab + arrows while holding Ctrl.
  • Jump to file currently being edited in the Solution Explorer: Shift + Alt + L.
  • Jump to the R# test runner: Ctrl + Alt + T.
  • Pop open the context menu: Shift + F10.

Now, with these in place, I think it should be possible to start doing all navigation with the keyboard only. And then, when you get tired of pressing Shift + F10 and choosing stuff in the menus, you can start learning the real shortcuts to everything.

Using the keyboard for the majority of all tasks has several advantages – in addition to relieving the strain on the right wrist, arm, and shoulder, you also get the advantage that your navigation and execution of common workflows is sped up, allowing your work pace to better match the pace of your train of thought.

Also, I won’t judge you 🙂

If you can draw it like that, then just f*n code it like that!

Recently, I was asked whether I had any pet peeves. I thought about it for a couple of seconds, and since I couldn’t immediately think of anything, I just ranted a little bit about some of the minor annoyances around code quality I had experienced the same day or something like that.

But when I couldn’t sleep early this morning, I remembered one of my favorite pet peeves: Code that doesn’t model stuff right. Let me explain that with a couple of real-life scenarios…

Scenario 1

I was working on a system with my team, and our product owner(s) – a team of really, really smart real-time regulation experts – came to us with some requirements regarding modeling some kind of physical process. They explained these things to us, and they showed us some graphic models (and you’d probably think that this would trigger a few light bulbs…) of how they thought of this thing that we were supposed to start working on.

When we later got some more specific requirements, they didn’t resemble those graphic models that were originally presented to us. I mean, some of the concepts were brought over, but there was no clear mapping between the graphic models and the model proposed in the requirements.

Somehow, we ended up implementing things like they were specified to us the second time, although I did hesitate and express some concerns several times along the way.

Now, one-and-a-half years later, we’re faced with our implementation’s limited capability of expressing the domain to a high enough degree. We’re constantly forced to handle special cases in various parts of the system, whereas in other parts we have to spray logic all over the place to implement even simple features. And if you’re used to unit testing your stuff, you can probably imagine that our huge-ball-of-mud-model-with-limited-expressive-power has so many different combinations of flags and settings that they’re practically impossible to cover with tests.

WTF?!

Scenario 2

In the same system, we had a pretty complex specification of rules that some part of the system should act in accordance with. After some iterations during the specification, clarification, and breakdown of the feature, we ended up with a specification that pretty much consisted of a 5-6 levels deep decision tree, including some floating point values that needed to be considered as well.

At this point, some team members went on and implemented the thing – with a 5-6 levels deep nested if structure that was implemented 100% in accordance with the specification, including code comments with cross references to nodes in the decision tree diagram. At first glance, this seemed O.K. – I mean, it was fairly easy to verify that the tree was implemented correctly, due to the 1-to-1 mapping that could be made from the if structure to the decision tree diagram, and vice versa. So this is much better than scenario 1, right?

Well, if one if statement results in two possible passes through a piece of code, and thus two test cases that need to be written, then it follows that 5-6 levels of if statements yield 2⁵ – 2⁶ (i.e. 32 – 64) possible combinations. Combine this with 5 to 10 floating point values that need to be combined and limited as well, depending on their value and the path followed through the decision tree, and then we have yet another completely untestable mess!

WTF?!?!?!

What to learn from this?

This should not come as a surprise to you, especially if you still – in spite of all the functional commotion that has been going on for the last 5 years – believe in object-oriented programming… because that’s what the objects are for: modeling stuff!

But somehow, I still often see developers – even seasoned ones – go and implement models that 1) model only a subset of what’s actually required, 2) somehow flatten the model and don’t respect the inherent structure of hierarchy and cardinality, or 3) apply some other non-invertible transformation to what they’re shown, thus truncating the domain. This in turn leads to hacks like flags and an ever-increasing number of if statements and special cases, and the end result is that it is hard to understand the relation between the model and the stuff it is supposed to model.

In hindsight, it is pretty easy to see that the first scenario should have been modeled with classes representing each “thing” in the original diagrams presented to us. The reason we were shown those drawings was that they were a pretty good model of what’s supposed to go on. Duh!!

And the second scenario was a pretty good candidate for some kind of simple boolean rules engine, where a representation of the tree could have been built in memory with instances of some class for each node, and then the tree could just have been “invoked” – just like that.
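Something along these lines, perhaps (a minimal sketch – the node types, the Context, and the outcomes are all invented for illustration):

    using System;

    // a decision tree as data: each node is either a decision (a predicate
    // picking a branch) or a leaf (an outcome) - and every node can be
    // instantiated and tested in isolation
    abstract class Node
    {
        public abstract decimal Evaluate(Context context);
    }

    class Decision : Node
    {
        readonly Func<Context, bool> predicate;
        readonly Node whenTrue;
        readonly Node whenFalse;

        public Decision(Func<Context, bool> predicate, Node whenTrue, Node whenFalse)
        {
            this.predicate = predicate;
            this.whenTrue = whenTrue;
            this.whenFalse = whenFalse;
        }

        public override decimal Evaluate(Context context)
        {
            return predicate(context)
                ? whenTrue.Evaluate(context)
                : whenFalse.Evaluate(context);
        }
    }

    class Leaf : Node
    {
        readonly decimal outcome;

        public Leaf(decimal outcome)
        {
            this.outcome = outcome;
        }

        public override decimal Evaluate(Context context)
        {
            return outcome;
        }
    }

    class Context
    {
        public decimal Amount;
        public bool IsOverdue;
    }

    class Demo
    {
        static void Main()
        {
            // the 5-6 levels deep tree from the spec becomes data that is
            // built in memory and then simply "invoked":
            var tree = new Decision(c => c.IsOverdue,
                new Leaf(100m),
                new Decision(c => c.Amount > 1000m, new Leaf(50m), new Leaf(0m)));

            Console.WriteLine(tree.Evaluate(new Context { Amount = 1500m, IsOverdue = false })); // 50
        }
    }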

In both cases, we would have had the ability to test each class in isolation, and then do a few integration tests to verify that they were capable of interacting like expected. And then, lastly, we could have written a verification of our code’s ability to build the model in the form that would end up being executed.

To sum it up as a one-liner: If you can draw it like that, then just f*n code it like that!

How to loop good

…or “Short rant on why C-style for-loops are almost always unnecessary”…

If you’ve been with C# for the last couple of years, your coding style has probably evolved in a more functional direction since the introduction of C# 3/.NET 3.5 with all its LINQ and lambda goodness.

In my own code, I have almost completely dismissed the old C-style for-loop in favor of .Select, .Where etc., making the code way more readable and maintainable.

The readability and maintainability are improved because long sequences of for/foreach loops and collection-juggling can now be replaced by one-liners of selection, mapping, and projection.

One place, however, where I often see the for-loop used, is when people need to build sequences of repeated objects, or where some variable is incremented for each iteration.

Now, here is a message to those developers: Please realize that what you actually want to do is to repeat an object or map a sequence of numbers to a sequence of objects.

Having realized that, please just implement it like that:
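Something like this, I imagine (a minimal sketch – repeating a plain string here, but any immutable object will do):

    using System.Linq;

    // a sequence of ten identical elements: a repetition, not a loop
    // (note that Repeat yields the same reference each time, so keep the element immutable)
    var tenBeers = Enumerable.Repeat("beer", 10).ToList();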

Same thing goes for sequences of elements where something is incremented:
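For example (a sketch – mapping the numbers 0..6 to a week of dates):

    using System;
    using System.Linq;

    // "increment something for each iteration" is really a mapping
    // from a sequence of numbers to a sequence of objects
    var nextWeek = Enumerable.Range(0, 7)
        .Select(days => DateTime.Today.AddDays(days))
        .ToList();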

There’s just too many for-loops in the world!

Not the same thing, but still somehow related to that, is when people need to collect stuff from multiple properties and ultimately subject each value to the same treatment – please just implement it like that:
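For example (a sketch – the person object and its properties are made up):

    using System.Linq;

    var person = new { Line1 = " Main St 1 ", Line2 = "", Line3 = " Floor 2 " };

    // throw the property values into a sequence and subject them to ONE treatment
    var addressLines = new[] { person.Line1, person.Line2, person.Line3 }
        .Select(line => line.Trim())
        .Where(line => line != "")
        .ToList();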

</rant>

It’s just annoying that Joel Spolsky is such a great writer…

…because I almost never agree with him!

For instance, in this post – inspired by the book, Coders At Work – he describes the character he calls “the duct tape programmer”, who is the archetype of that annoyingly great programmer, who just seems to be able to solve any kind of problem with only a few tools.

One of the attributes of “the duct tape programmer” is that he seldom wastes time writing tests, because “(…) the customer isn’t going to complain about that”. Another is that he does not succumb to hype and trends in software development; he is confident that what he knows is enough.

And so Joel goes on and on about this fantastic programmer, who’s capable of delivering what the customer wants without all the fancy, hyped, brand-spanking-new technologies and methodologies that all the other programmers seem to waste their time with.

The problems, as I see them, are: a) very few programmers are like that (you know, that Linus Torvalds/Bjarne Stroustrup kind of guy), and b) I will NEVER be the guy who maintains his code!

The thing is – to continue Joel’s analogy – that “the duct tape programmer” might be able to win a go-cart race with a go-cart made of duct tape and WD-40, but I would NOT trust him to make a car for me to drive on the highways! – because I would not expect that car to be safe at high speed, and I would not expect it to last for years, and I would not expect it to withstand rain, snow, sand, etc.

In my opinion, automated unit tests and integration tests serve many purposes. One (obvious) one is to continually verify the behavior of a system, given that implementations may change and bugs be fixed and so on. Another is to DOCUMENT the behavior of the system, like in “we agreed with the customer that it should work like this”. Yet another is to nudge the programmer towards separating concerns while writing his code, because doing that just causes less pain when writing tests. Joel should stop bashing TDD.

And when it comes to new technologies and methodologies, I think it is SO IMPORTANT to keep an eye open for new AND old better ways to do stuff. My opinion is that I get smarter all the time by keeping an open eye on what other people are thinking and doing, and I get smarter all the time by putting some of it to use. Joel is just annoying here, because his statements lack nuance – keeping up with new stuff is NOT the same as bringing in new stuff unconsciously, but Joel does not seem to be capable of drawing this distinction.

In my opinion, there is no place for “the duct tape programmer” on a team. At least not on my team.

So if you care about your teammates, about the quality AS WELL AS THE MAINTAINABILITY AND EXPLAINABILITY of your code, please DO write tests!

And please DO separate concerns and take the time to factor out stuff into easy-to-understand narrow and focused classes!

And please, please DO use new as well as old technologies and practices where it makes sense – don’t brush stuff off just because it’s new and you are used to hand-rolling your own linked lists and string structs in C, and the C++ STL is just another fancy new hype to you.

Orthogonality (again)

There’s one thing that almost always makes me want to assume the foetal position and cry: developers who are ignorant of the fact that there is a difference between application logic and application framework.

I must admit that I have only recently started being (this) conscious about this difference, so I have written a buttload of code the last few years that violates almost everything that I stand for now, which really makes me sad inside. But it’s never too late to improve.

Once I realized this and started to try to adhere to it to keep the two things separated, I started seeing things so very clearly, and then other people’s ignorance of this fact just started to gnaw and irritate me. Hence this post – I need to get this off my chest – I need to write another post in the “rant” category…

An incredibly insightful (as always) post was made by Ayende a couple of months ago: Application structure: Concepts & features. Ayende’s post explains it so well, but basically he distinguishes between concepts in a system, which are the things that require design, and features, which merely build on and/or augment existing concepts. I just want to add a personal experience to the rant in the context of Ayende’s post.

Here’s the setting

I am currently on a team that develops and maintains a mortgage deed trading and administration system. Part of this system is an extensive suite of automated nightly jobs and automatic reports.

Naturally, some of the reports are run every night, e.g. summing up the numbers recorded the previous day for automated export to accounting systems, other reports are run every week/month/year, some on bank days, some relative to bank days, etc.

Some jobs make changes to the state of the system (e.g. updating the particulars of people for whom we receive updates from the Central Office of Civil Registration, remembering that a particular batch of transactions has been exported to the accounting system, or remembering that information on interest fees for the previous year has been reported to the National Bank, etc.), and some are just (idempotent) reports and exports.

Most jobs run automatically in the night. Some jobs can also be initiated by the user of our Windows Forms frontend through a web service call. And all reports should also be accessible through the built-in reporting frontend in the Winforms app.

Here’s our current solution

All reports are run through a web service call by instantiating a ReportCommand, which is capable of getting the names of all the reports accessible to the current user. And then, given a report name, the command can get all the parameters for that report. And then, given a report name and a set of parameters, it can run the report and output a file in SpreadSheetML format. This allows our frontend to dynamically build a GUI for all reports. No GUI work is needed whenever we need to create a new report, which is great.
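In code, the surface of that command looks roughly like this (a sketch from memory – the actual ReportCommand is a class, and its names and signatures differ):

    using System.Collections.Generic;
    using System.IO;

    public interface IReportCommand
    {
        // the names of the reports the current user is allowed to run
        IEnumerable<string> GetReportNames();

        // the parameters a given report needs - enough for a GUI to be built dynamically
        IEnumerable<ReportParameter> GetParameters(string reportName);

        // runs the report and returns the result as a SpreadsheetML document
        Stream Run(string reportName, IDictionary<string, string> parameters);
    }

    public class ReportParameter
    {
        public string Name { get; set; }
        public string Description { get; set; }
    }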

The majority of our nightjobs are initiated by the Windows Scheduler, which stems from an old pragmatic solution to the very first automated job we needed almost three years ago. This has not changed, so jobs are still scheduled manually through the Windows Scheduler. Our job runner is an ordinary .NET Windows .exe, which gets executed with one or more arguments. Exporting transactions to accounting could look like this:
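Something like this (the runner’s name is made up here):

    JobRunner.exe exportTransactions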

– which would use the string to look up a class that implements ITask, e.g. ExportTransactionsTask, which exports all non-exported transactions in a .csv file to some preconfigured location.

To be able to schedule reports to run, we have made a job named “report”, which is capable of invoking the ReportCommand directly, setting the parameters of the report from the given command line arguments. Running a report could look like this:
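Something like this (argument syntax approximated):

    JobRunner.exe report name=transactions caseNo=10000:20000 recordDate=today-1:today mail=[email protected]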

– which runs the ReportTask, which is fed a dictionary containing {{"name", "transactions"}, {"caseNo", "10000:20000"}, {"recordDate", "today-1:today"}, {"mail", "[email protected]"}}, which in turn runs the ReportCommand with the given report name and parameters, extrapolating macros like “today” into today’s date + some more simple stuff.

If this sounds complicated to you, then yes! It is actually pretty complicated! Not because it should be, but because it’s pretty legacy and implemented in a pretty messy way. And now the need has come for the users to be able to schedule jobs and reports from the Winforms frontend. Damn! This is a good time to reflect on how I wish we had done it (and how I plan to do stuff like this in the future).

How I wish we had done it

The problem above, as I see it, is that features and concepts are mixed together in a gray mass that no one can get an overview of. I wish we had thought more about separating the concepts (command, nightjob, report) from the features (the implementations of the different jobs and reports). I wish implementing a new report was as easy as this sketch (names invented for illustration; #region…#endregion added as explanation):
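    // a sketch of the wish - the Report base class and its members are invented
    public class TransactionsReport : Report
    {
        #region concept: how the report presents itself to the system
        public override string Name
        {
            get { return "transactions"; }
        }

        public override IEnumerable<ReportParameter> Parameters
        {
            get
            {
                yield return new ReportParameter("caseNo", "Case number interval");
                yield return new ReportParameter("recordDate", "Record date interval");
            }
        }
        #endregion

        #region feature: the actual report logic - the only part that requires thought
        public override ReportResult Execute(IDictionary<string, string> values)
        {
            // query, aggregate, and return the rows of the report here
            throw new NotImplementedException();
        }
        #endregion
    }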

and then all possible implementations of the Report class would be picked up by the various concepts in the system, like e.g. our report command, the report scheduler, and an ad hoc reporting task runner… and then, in a similar fashion, I want to subclass an IdempotentTask class for all tasks, that do not change anything in the system, and a TaskWithSideEffects class for all tasks that change the state of the world.

This way, implementing the logic inside of reports and tasks will be orthogonal to implementing the capabilities of the reports and tasks and their scheduling.

An opinion on “integrated solutions” like TFS and VSTS

As a response to Ben Scheirman’s post, Benjamin Day kindly apologized and summed up why he likes Visual Studio Team System and Team Foundation Server.

I am not going into the debate on whether it was right or wrong to delete that comment, because a lot of people already did that, and I agree with those who think that deleting the comment was kind of wrong. Calling it “unethical behaviour”, however, seems to be a little too harsh. Moderating news channels discussing politics in China is unethical – deleting a comment because the blog author disagrees is just weird and a little bit annoying.

Instead, I just wanted to chime in with my 2 cents on why I think Visual Studio Team System and Team Foundation Server are inferior, compared to ALL of the free alternatives that I know of – it’s because I believe in one of the finest principles of software engineering, which was coined by Edsger Dijkstra: Separation Of Concerns.

Separation Of Concerns can be low level, as in Uncle Bob’s single responsibility principle, or higher level as in service-orientation, or even higher level as in there’s NO WAY I’m gonna buy an oven, which insists on also being my washing machine and a pair of roller blades. No way!

This principle is so inherent in all the good disciplines of software engineering, heck in LIFE even, that I simply had to reply!

So I like to use CruiseControl, TeamCity, Subversion, Git, ReSharper, TestDriven, NUnit, xUnit.net, Jira, Redmine, Basecamp, MSBuild, Rake, NAnt etc. etc. because they let me switch any one of them out for any of the others whenever I feel like it. And, more importantly, whenever it fits the task better.

The fact that some of the tools are FREE and have their source code available for me to look at, is just an added plus. But the primary reason to use those tools is simply that they do one thing, and they are usually capable of doing that one thing better.

Respect your test code #3: Make your tests orthogonal

When two things are orthogonal, it means that the angle between them is 90 degrees – at least in spaces with 3 dimensions or less. So when two vectors are orthogonal, they satisfy the property that there is no way to use the first one to express even the tiniest bit of what the other one expresses.

That is also how we should write our application code: methods and classes should be orthogonal to one another – i.e. no class should try to express what another class already expresses either in part or whole – therefore each class and each method should have only one responsibility, and thus one reason to change.

And test code is real code.

The corollary is that our tests should have only one single responsibility as well.

That is why I hate tests that look like this:
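It looked something like this (a reconstructed sketch – the details are invented, the smell is real):

    [Test]
    public void CanFindDueTermsAndRecordDebits()
    {
        // arrange: a repository with a mortgage deed that has a term due
        var repository = new MortgageDeedRepository(SessionProvider);
        repository.Save(CreateMortgageDeedWithTermDueOn(new DateTime(2010, 1, 1)));
        var finder = new DueTermsFinder(repository);

        // act
        var dueTerms = finder.FindTermsDueOn(new DateTime(2010, 1, 1));

        // assert
        Assert.AreEqual(1, dueTerms.Count());

        // ...and then it acts AGAIN, on a whole other class...
        var recorder = new TermDebitRecorder(repository);
        recorder.RecordDebits(dueTerms);

        // ...and asserts AGAIN - two responsibilities in one test
        Assert.IsTrue(dueTerms.All(term => term.DebitRecorded));
    }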

Notice how this test is actually fairly decently structured – at least that’s what it initially looks like… but it actually tests a lot of things: it checks that the output of the DueTermsFinder is what it expects, testing the MortgageDeedRepository indirectly as well – and then it goes on to test the TermDebitRecorder… sigh!

If (when!) one of these classes changes at some point, because the requirements have changed or whatever, the test will break for no good reason. The test should break because you have introduced a bug, not because you made a change in some related functionality.

That is why I usually follow the pattern of AAA: Arrange, Act, Assert. Each test should be divided into discrete steps corresponding to 1) Arranging some data, 2) Triggering a computation or some state change, 3) Asserting that the outcome was what we expected. And if I am feeling idealistic that day, I also follow the principle of putting only one assertion at the end of each test.

I try to never do AAAAA (Arrange, Act, Assert, Act, Assert) or AAAAAA, or AAAAAAA, which is even worse.

Every test should have only one reason to break.

Respect your test code #2: Create base classes for your test fixtures

When writing code, I often end up introducing a layer supertype – i.e. a base class with functionality shared by all implementations in that particular layer in my application.

This also holds for my test code – and why shouldn’t it? Test code is as real as real code, so the same rules apply and it should benefit from the same pain killers as we implement in our application code.

For example when testing repositories and services that need to query the database, I can save myself a lot of writing by stuffing all the boring NHibernate push-ups in a DbTestFixture supertype – this includes building a configuration that connects to a test database, building a session factory, storing that session factory somewhere, re-creating the entire database schema in the test fixture setup, and running each test in a transaction that is automatically rolled back at the end of each test + a few convenience methods that allow me to flush the current session etc.

The DbTestFixture might look something like this (note that all my repositories take an instance of ISessionProvider in their ctor – that’s how they obtain the currently ongoing session, which is why I have a TestSessionProvider to inject into repositories under test):
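(A sketch – configuration details are approximated, and my actual version had a few more conveniences:)

    using NHibernate;
    using NHibernate.Cfg;
    using NHibernate.Tool.hbm2ddl;
    using NUnit.Framework;

    // ISessionProvider is how my repositories obtain the ongoing session
    public interface ISessionProvider
    {
        ISession CurrentSession { get; }
    }

    public class TestSessionProvider : ISessionProvider
    {
        public TestSessionProvider(ISession session) { CurrentSession = session; }
        public ISession CurrentSession { get; private set; }
    }

    public abstract class DbTestFixture
    {
        static Configuration configuration;
        static ISessionFactory sessionFactory;

        protected TestSessionProvider SessionProvider;
        ITransaction transaction;

        [TestFixtureSetUp]
        public void InitializeDatabase()
        {
            // connect to the test database (config file name invented)
            // and re-create the entire schema
            configuration = new Configuration().Configure("hibernate.test.cfg.xml");
            sessionFactory = configuration.BuildSessionFactory();
            new SchemaExport(configuration).Create(false, true);
        }

        [SetUp]
        public void OpenSessionAndBeginTransaction()
        {
            var session = sessionFactory.OpenSession();
            SessionProvider = new TestSessionProvider(session);

            // each test runs in a transaction...
            transaction = session.BeginTransaction();
        }

        [TearDown]
        public void RollBackAndCloseSession()
        {
            // ...which is rolled back automatically, leaving the db untouched
            transaction.Rollback();
            SessionProvider.CurrentSession.Dispose();
        }

        // convenience: push pending changes to the db without committing
        protected void Flush()
        {
            SessionProvider.CurrentSession.Flush();
        }
    }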

Then a fictional repository test might look as simple as this:
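Something like this (domain and method names invented):

    public class MortgageDeedRepositoryTest : DbTestFixture
    {
        [Test]
        public void CanSaveAndLoadMortgageDeed()
        {
            var repository = new MortgageDeedRepository(SessionProvider);

            repository.Save(new MortgageDeed { CaseNumber = 12345 });

            // no session juggling needed here - the base fixture takes care of it
            var deed = repository.GetByCaseNumber(12345);
            Assert.IsNotNull(deed);
        }
    }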

Note how DbTestFixture flushes in all the right places so I don’t need to worry about that.

This test fixture supertype can be used for all my database access tests, as well as integration testing. But what about unit tests? I am using Rhino Mocks, so my unit test fixture base looks like this:
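(A sketch, using the old record/replay API of Rhino Mocks:)

    using NUnit.Framework;
    using Rhino.Mocks;

    public abstract class MockingTestFixture
    {
        protected MockRepository Mocks;

        [SetUp]
        public void SetUpMocks()
        {
            Mocks = new MockRepository();
            DoSetUp();
        }

        // hook for derived fixtures
        protected virtual void DoSetUp() { }

        // shortcut to the mocks I care for
        protected T Mock<T>() where T : class
        {
            return Mocks.DynamicMock<T>();
        }

        protected void ReplayAll() { Mocks.ReplayAll(); }
        protected void VerifyAll() { Mocks.VerifyAll(); }
    }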

Real simple – it just stores my MockRepository and gives me a few shortcuts to the mocks I care for. Then I inherit this further to ease testing e.g. my ASP.NET MVC controllers like this:
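Roughly like so (a sketch):

    using System.Web.Mvc;

    // the fixture "fits" the controller: derived classes say how to create it,
    // and the base class makes sure it happens exactly once per test
    public abstract class ControllerTestFixture<TController> : MockingTestFixture
        where TController : Controller
    {
        protected TController Controller { get; private set; }

        protected override void DoSetUp()
        {
            Controller = CreateController();
        }

        protected abstract TController CreateController();
    }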

As you can see, I make it a real “fixture” – the controllers I am about to test will fit into this fixture like a glove, and I will certainly never forget to instantiate my controller only once, because I start out by implementing that part in the implementation of the CreateController method.

A controller test might look like this:
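Something like this (the controller and the service are invented for the example):

    public interface IUserService
    {
        string GetCurrentUserName();
    }

    // the (made-up) controller under test
    public class HomeController : Controller
    {
        readonly IUserService userService;

        public HomeController(IUserService userService)
        {
            this.userService = userService;
        }

        public ActionResult Index()
        {
            ViewData["UserName"] = userService.GetCurrentUserName();
            return View();
        }
    }

    public class HomeControllerTest : ControllerTestFixture<HomeController>
    {
        IUserService userService;

        protected override HomeController CreateController()
        {
            userService = Mock<IUserService>();
            return new HomeController(userService);
        }

        [Test]
        public void IndexPutsCurrentUserNameInViewData()
        {
            // arrange
            Expect.Call(userService.GetCurrentUserName()).Return("joe");
            ReplayAll();

            // act
            var result = (ViewResult)Controller.Index();

            // assert
            Assert.AreEqual("joe", result.ViewData["UserName"]);
            VerifyAll();
        }
    }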