Unit testing sagas with Rebus

Ok, so now we have taken a look at how we could unit test a simple Rebus message handler, and that looked pretty simple. When unit testing sagas, however, there’s a bit more to it because of the correlation “magic” going on behind the scenes. Therefore, to help unit testing sagas, Rebus comes with a SagaFixture<TSagaData> which you can use to wrap your saga handler while testing.

I guess an example is in order ๐Ÿ™‚

Consider this simple saga that is meant to be put in front of a call to an external web service where the actual web service call is being made somewhere else in a service that allows for using simple request/reply with any correlation ID:

Let’s say I want to punish this saga in a unit test. A way to do this is to attack the problem like when unit testing an ordinary message handler – but then a good portion of the logic under test is not tested, namely all of the correlation logic going on behind the scenes. Which is why Rebus has the SagaFixture<TSagaData> thing… check this out:

As you can see, the saga fixture can be used to “host” your saga handler during testing, and it will take care of properly dispatching messages to it, respecting the saga’s correlation setup. Therefore, after handing over the saga to the fixture, you should not touch the saga handler again in that test.

If you want to start out in your test with some premade home-baked saga data, you can pass it to the saga fixture as a second constructor argument like so:

and as you can see, you can access all “live” saga data via fixture.AvailableSagaData, and you can access “dead” saga data as well via fixture.DeletedSagaData.

If you’re interested in logging stuff, failing in case of uncorrelated messages, etc., there’s a couple of events on the saga fixture that you can use: CorrelatedWithExistingSagaData, CreatedNewSagaData, CouldNotCorrelate – so in order to fail in all cases where a message would have been ignored because it could not be correlated, and the message is not allowed to initiate a new saga data instance, you could do something like this:

Nifty, huh?

Unit testing handlers with Rebus

…is pretty easy, because message handlers are ordinary classes free of dependencies – i.e. you implement the appropriate Handle methods, but you don’t have to derive your class off of some base class, and you can have your own dependencies injected, so you’re free to mock everything if you feel like it.

There’s one thing, though: The IBus interface, which you’ll most likely need some time, is kind of clunky to mock with mocking libraries like Moq, Rhino Mocks, NSubstitute, FakeItEasy [insert today’s hip mocking library here] – especially if you’re testing the AAA way of writing your unit tests.

Let’s take a look at a simple handler, whose responsibility is to subscribe to the PurchaseRecorded event and, for each debtor involved in the purchased mortgage deed, ensure that a process is kicked off that subscribes that debtor to an SSN-based address update service provided by the Danish Central Person Registry:

Right, so now I want my test to capture the fact that the incoming event gives rise to multiple messages that the service sends to itself, one for each debtor. My test might look somewhat like this (shown with the classic Rhino Mocks syntax):

It doesn’t require that much imagination to see how the asserts can become completely unreadable if the sent messages contain more than a few fields… which is why Rebus has a FakeBus in the Rebus.Testing namespace!

Check this out:

The example above is quite simple, so it might not be that apparent – but the real force of FakeBus is that just stores all messages that are sent, sent to self, published, replied, etc., also it also stores any headers that may have been added. This way, during testing, you can easily get access to the actually sent messages and inspect whether their information is as expected.

Contract testing with NUnit

When you’re testing your code, you’re most likely doing different types of tests – e.g. you might be doing real unit testing, integration testing, etc. In Rebus, we had the need to do contract tests[1. I’m not aware if there’s a more established word for this kind of black box test.], which is a test that ideally runs on multiple implementations of some interface, verifying that each implementation – given various preconditions in the form of differing setups etc – is cabable of satisfying a contract in the form of an interface.

In other words, for each implementation of a given interface, answer this question: Can the implementation behave in the same way in regards to the behavior we care about.

I’m aware of Grensesnitt, which Greg Young authored some time ago – and that is exactly what I’m after, only without the depending on an additional library, without the automatic stuff, etc. I just wanted to do it manually.

So, at first I wrote a lot of tedious code that would execute multiple tests for each test case, which was really clunky and tedious and would yield bad error messages when tests would fail, etc. Luckily, Asger Hallas brought my attention to the fact that a much more appropriate mechanism was already present in NUnit that would enable the thing that I wanted.

It turned out that NUnit’s [TestFixture] attribute accepts a Type as an argument, i.e. you can do this: [TestFixture(typeof(SomeType))], which will cause the test fixture type, which should be generic, to be closed with the given type. E.g. like so:

Now, if we add a generic constraint to the fixture’s type parameter, we can require that the type is a factory and can be newed up:

See where this is going? Now one single test fixture can be closed with multiple factory types, where each factory type is capable of creating an instance of one particular implementation of something, that should be tested. E.g. in Rebus, we can test that multiple implementations of the ISendMessages and IReceiveMessages work as expected by creating a factory interface like this:

where an implementation might look like this:

and one particular test fixture might look like this:

The advantage is that it’s much easier to see if new implementations of Rebus’ abstractions can satisfy the required contracts – and with the way it’s implemented now, various implementations can have different setup and teardown code, and their fixtures can belong to a suitable NUnit category, allowing Rebus’ build servers to run only tests where the required infrastructure is in place.

Using Windsor as an auto-mocking container

One of my former colleagues blogged about using AutoFixture as an auto-mocking container the other day, which got me thinking about auto-mocking with Windsor.

Although I’ve never used auto-mocking myself, a few months ago, I answered a question on StackOverflow, hinting at how auto-mocking could be accomplished with Castle Windsor. I didn’t provide any example code though, so this post will show a more complete solution on how auto-mocking can be implemented with Windsor and Rhino Mocks[1. I realize that all the cool kids are using other mocking libraries these days. I’m still using Rhino though, but it should be a fairly trivial task to “port” this solution to another mocking library.].

A lazy component loader to generate the mocks

First, I create a simple ILazyComponentLoader that lazily registers a Rhino Mocks instance when a particular service is requested – like so:

– real simple. Windsor’s default lifestyle is singleton, which ensures that subsequent calls for the same service will give me the same mock instance.

Test fixture base class

Then I create a test fixture base class, which is supposed to act as exactly that: a fixture for the SUT:

As you can see, my SetUp method creates a new WindsorContainer, registering nothing but my AutoMockingLazyComponentLoader and TAppService – the SUT type. It then uses the container to instantiate the SUT, storing it away in a protected instance variable for test cases to work on.

I included DoSetUp and DoTearDown methods for my test fixtures to override in case they need to – but in most cases, they should not be used because of the coupling they introduce between test cases.

Lastly, I have two methods: Dep, which allows me to access an injected service type (short for “dependency”), and Mock which allows me to generate a new mock object in case I need to mock something other than my SUT’s dependencies.

How to write a new test fixture

As a result of this, a test fixture for something called HandleUpdateDisturbanceForecast is reduced to something like this:

– which I think looks pretty slick. And yes, this is an actual test case from something we’re building – doesn’t matter what it’s doing, just wanted to show an actual example.

It’s not like I’m saving a huge amount of coding here – I usually only instantiate my SUT in the SetUp method of my test fixtures, but I like how the “fixed style” of AutoMockingFixtureFor encourages application services to follow a pattern that makes for easy IoC and testing. And the fixture is relieved of almost all clutter that does not directly relate to the thing being tested.

Tell, Don’t Ask

or “How big is an interface?”… Uuuh, what?

Well, consider these two interfaces:

– which is bigger?

If you think that ISomeService1 is the bigger interface, then there’s a good chance you’re wrong! Why is that?

It’s because an interface is not just the signatures of the methods and properties it exposes, it also consists of the types that go in and out of its methods and properties, and the assumptions that go along with them!

This is a fact that I see people often forget, and this is one of the reasons why everything gets easier if you adhere to the Tell, Don’t Ask principle. And by “everything”, I almost literally mean “everything”, but one thing stands out in particular: Unit testing!

Consider this imaginary API:

Now, in our system, on various occasions we need to change the values for a particular name. We start out by doing it like this:

Obviously, if GetValues returns a reference to a mutable object, and this object is the one kept inside the value store, this will work, and the system will chug along nicely.

The problem is that this has exposed much more than needed from our interface, including the assumption that the obtained Values reference is to the cached object, and an assumption that the Value1 and Value2 properties can set set, etc.

Now, imagine how tedious it will be to test that the Run method works, because we need to stub an instance of Values as a return value from GetValues. And when multiple test cases need to exercise the Run method to test different aspects of it, we need to make sure that GetValues returns a dummy object every time – otherwise we will get a NullReferenceException.

Now, let’s improve the API and change it into this:

adhering to the “Tell, Don’t Ask” principle, allowing a potential usage scenario like this:

As you can see, I have changed the API from a combined query/command thing to a pure command thing which appears to be much cleaner to the eye – it actually reveals exactly what is going on.

And testing has suddenly become a breeze, because our mocked ISomeKindOfValueStore will just record that the call happened, allowing us to assert it in the test cases where that is relevant, ignoring it in all the other test cases.

Another benefit is that this coding style lends itself better to stand the test of time, as it is more tolerant to changes – the implementation of ISomeKindOfValueStore may change an in-memory object, update something in a db, send a message to an external system, etc. A command API is just easier to change.

Therefore: Tell, Don’t Ask.

Assuring that those IWantToRunAtStartup can actually run at startup

When building NServiceBus services based on the generic host, you may need to do some stuff whenever your service starts up and shuts down. The way to do that is to create classes that implement IWantToRunAtStartup, which will be picked up by the host and registered in the container as an implementation of that interface.

When the time comes to run whoever wants to run at startup, the host does a

to get all the relevant instances (or something similar if you aren’t using Windsor…).

If, however, one or more instances cannot be instantiated due to missing dependencies, you will get no kind of warning or error whatsoever! [1. At least this is the case when using Castle Windsor – I don’t know if this is also the behavior of other IoC containers… Maybe someone can clarify this…?] This means that the service will silently ignore the fact that one or more IWantToRunAtStartups could not be instantiated and run.

In order to avoid this error, I have written a test that looks somewhat like this:

I admit that the code is kind of clunky even though I distilled the interesting parts from some of the plumbing in our real test… moreover, our real test is iterating through all the possible configurations our container can have – one for each environment – so you can probably imagine that it’s not pretty ๐Ÿ™‚

But who cares??? The test has proven almost infinitely useful already! Whenever something that wants to run at startup cannot run at startup, TeamCity gives us error messages like this:

Shouldly – better assertions

Today I came across Shouldly, as I followed a link in a tweet by Rob Conery. I have a thing for nifty mini-projects, so I immediately git clone http://github.com/shouldly/shouldly.git’d it, and pulled it into a small thing I am building.

What is Shouldly? Basically, it’s just a niftier way of writing assert statements in tests. It fits right in between NUnit and Rhino Mocks, so you will get the most out of it if you are using those two for your unit tests. Check out this repository test – first, arrange and act:

and then, usually the assert would look something like this:

– yielding error messages like this:

which is probably OK and acceptable, because how would NUnit know any better than that?

Check out what Shouldly can do:

– yielding error messages like this:

which IMO is just too nifty to ignore!

Shouldly takes advantage of the fact the the current StackTrace has all the information we’re after when the assert is an extension method, which is extremely cool and well thought out.

Moreover, it integrates with Rhino Mocks, yielding better messages when doing AssertWasCalled stuff: Instead of just telling that the test did not pass, it tells you exactly which calls were recorded and which one was expected – a thing that Rhino Mocks has always been missing.

Conclusion: YourNextProject.ShouldContain("Shouldly")

Respect your test code #3: Make your tests orthogonal

When two things are orthogonal, it means that the angle between them is 90 degrees – at least in spaces with 3 dimensions or less. So when two vectors are orthogonal, they satisfy the property that there is no way to use the first one to express even the tiniest bit of what the other one expresses.

That is also how we should write our application code: methods and classes should be orthogonal to one another – i.e. no class should try to express what another class already expresses either in part or whole – therefore each class and each method should have only one responsibility, and thus one reason to change.

And test code is real code.

The corollary is that our tests should have only one single reponsibility as well.

That is why I hate tests that look like this:

Notice how this test is actually fairly decently structured – at least that’s what it initiallt looks like… but it actually tests a lot of things: it checks that the output of the DueTermsFinder is what it expects, testing the MortgageDeedRepository indirectly as well – and then it goes on to test the TermDebitRecorder … sigh!

If (when!) one of these classes changes at some point, because the requirements have changed or whatever, the test will break for no good reason. The test should break because you have introduced a bug, not because you made a change in some related functionality.

That is why I usually follow the pattern of AAA: Arrange, Act, Assert. Each test should be divided into discrete steps corresponding to 1) Arranging some data, 2) Triggering a computation or some state change, 3) Asserting that the outcome was what we expected. And if I am feeling idealistic that day, I also follow the principle of putting only one assertion at the end of each test.

I try to never do AAAAA (Arrance, Act, Assert, Act Assert) or AAAAAA, or AAAAAAA which is even worse.

Every test should have only one reason to break.

Respect your test code #2: Create base classes for your test fixtures

When writing code, I often end up introducing a layer supertype – i.e. a base class with functionality shared by all implementations in that particular layer in my application.

This also holds for my test code – and why shouldn’t it? Test code is as real as real code, so the same rules apply and it should benefit from the same pain killers as we implement in our application code.

For example when testing repositories and services that need to query the database, I can save myself a lot of writing by stuffing all the boring NHibernate push-ups in a DbTestFixture supertype – this includes building a configuration that connects to a test database, building a session factory, storing that session factory somewhere, re-creating the entire database schema in the test fixture setup, and running each test in a transaction that is automatically rolled back at the end of each test + a few convenience methods that allow me to flush the current session etc.

The DbTestFixture might look something like this (note that all my repositories take an instance of ISessionProvider in their ctor – that’s how they obtain the currently ongoing session, which is why I have a TestSessionProvider to inject into repositories under test):

Then a fictional repository test might look as simple as this:

Note how DbTestFixture flushes in all the right places so I don’t need to worry about that.

This test fixture supertype can be used for all my database access tests, as well as integration testing. But what about unit tests? I am using Rhino Mocks, so my unit test fixture base looks like this:

Real simple – it just stores my MockRepository and gives me a few shortcuts to the mocks I care for. Then I inherit this further to ease testing e.g. my ASP.NET MVC controllers like this:

As you can see, I make it a real “fixture” – the controllers I am about to test will fit into this fixture like a glove, and I will certainly never forget to instantiate my controller only once, because I start out by implemeting that part in the implementation of the CreateController method.

A controller test might look like this:

Respect your test code #1: Hide object instantiation behind methods with meaningful names

One thing, that I find really annoying, is that somehow it seems to be accepted that test code creates instances of this and that like crazy! In a system I am currently maintaining with about 1MLOC where ~40% is test code, I often come across test fixtures with something like 20 individual tests, and each and every one of them creates instances of maybe 5 different entities, builds an object graph, runs some code to be tested, and does some assertions on the result.

This is a super example on how to write brittle and rigid tests, because what happens if the signature of one of the ctors changes? Or the semantics of the object graph? Or [insert way too many ways for the test to break for the wrong reasons here…].

When I come across these tests, I usually factor out the creation of all but the most simple objects behind methods with meaningful names. This has two advantages: 1) It’s more DRY because they can be re-used, and 2) It serves as brilliant documentation. Consider this rather simple assertion that requires a few objects to be in place:

compared to this:

MUCH more clear! The factory method acts as brilliant documentation on which aspects of the test are relevant to the outcome – it is clear, that a mortgage deed which has not yet begun its amortization must report its first term date as the actual principal date.

Go on to test another property of the mortgage deed before amortization:

– and we have already saved us from writing 50 lines of brittle rigid test code. Keep factoring out common stuff, so that the ctor of the mortage deed is still only called in one place… e.g. create methods like this:

which allows me to write cute easy-to-understand tests like this:

This is also a good example on how to avoid writing // bla bla comments all over the place – it’s just not necessary when the methods have sufficiently well-describing names.