NHibernate is very flexible

…but it does impose limitations on your domain model.

Most of these limitations, however, like the need for public/internal/protected members to be virtual, and the requirement for a default constructor to exist with at least protected accessibility, are not that hard to adhere to and usually don’t interfere with what you would do if there were no rules at all.

One of the limitations, however, can be pretty significant – Ayende describes the problem here, using the term “ghost objects”.

But, as I am about to show, this significance only arises if you follow a certain style of coding, which you should usually avoid!

Short explanation of the problem

When NHibernate lazy-loads an entity from the db (i.e. when you call session.Load<TEntity>(id) or when an entity in your session references something through a lazy-loaded association), it does so by providing an instance of a runtime-generated type, which acts as a proxy.

The first time you access something on the proxy, it gets “hydrated”, which is just a fancy way of saying that the data will be loaded from the database.

This would be fine and dandy, if it weren’t for the fact that the proxy is a runtime-generated subclass of your entity, which – in cases where inheritance is involved – will be a sibling to the other derived classes. Consider the simple inheritance hierarchy on the sketch to the right which in code could be something like so:

– and then NHibernate will generate something along the lines of this (fake :)) class signature:

See the problem? Here’s the problem:

This means that this kind of runtime type checking will fail in those circumstances where the entity is a lazy-loaded reference of the supertype, and the following will FAIL:
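A minimal sketch of the situation, assuming two hypothetical subclasses, Person and Company (only LegalEntity is named in the text). The proxy here is a hand-written stand-in that wraps an already loaded entity; the real one is generated by NHibernate at runtime and loads its target lazily.

```csharp
using System;

// Hypothetical hierarchy: Person and Company are assumed names for illustration.
public abstract class LegalEntity
{
    public virtual int Id { get; set; }
}

public class Person : LegalEntity
{
    public virtual string FirstName { get; set; }
    public virtual string LastName { get; set; }
}

public class Company : LegalEntity
{
    public virtual string CompanyName { get; set; }
}

// Stand-in for the runtime-generated proxy. It derives from LegalEntity,
// which makes it a sibling of Person and Company, not a subtype of either.
public class LegalEntityProxy : LegalEntity
{
    private readonly LegalEntity _target; // the real entity behind the proxy

    public LegalEntityProxy(LegalEntity target) { _target = target; }

    public override int Id
    {
        get { return _target.Id; }
        set { _target.Id = value; }
    }
}

public static class NameFormatter
{
    // The brittle style: business logic that branches on the runtime type.
    public static string GetName(LegalEntity entity)
    {
        if (entity is Person)
        {
            var person = (Person)entity;
            return person.FirstName + " " + person.LastName;
        }
        if (entity is Company)
        {
            return ((Company)entity).CompanyName;
        }
        // A proxy is neither a Person nor a Company, so it ends up here.
        throw new ArgumentException("Unknown legal entity type: " + entity.GetType().Name);
    }
}
```

Passing a LegalEntityProxy that wraps a Person to GetName throws, even though the underlying entity is a Person.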

One possible solution

The other day, Ayende blogged about a recent addition to NHibernate, namely lazy-loaded properties. This allows an entity to be partially hydrated, intercepting calls to certain properties to lazy-load the relevant fields on demand.

This feature is great when storing LOBs alongside the other fields on an entity, but it also laid the ground for his most recent addition, which is the ability to lazy-load an association by setting lazy="no-proxy" on it.

This way, NHibernate will not build a proxy, but instead it will intercept the property getter and load the entity at that point in time, thus being able to return the exact (sub)type of the loaded entity.

Now this seems to solve our problems, but let’s zoom out a bit … why did we have a problem in the first place? Our problem was actually that we failed to write object-oriented code, but instead we wrote a brittle piece of code that would fail at runtime whenever someone added a new subtype, thus violating the Liskov substitution principle. Moreover it just feels wrong to implement business logic that reflects on types!

What to do then?

Well, how about making your code polymorphic? The logic above could be easily rewritten as:
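A minimal sketch of such a rewrite, assuming hypothetical Person and Company subclasses of LegalEntity and a Name property that each of them overrides:

```csharp
// Hypothetical hierarchy: Person and Company are assumed names for illustration.
public abstract class LegalEntity
{
    public virtual int Id { get; set; }

    // Every specialization must say how its name is produced.
    public abstract string Name { get; }
}

public class Person : LegalEntity
{
    public virtual string FirstName { get; set; }
    public virtual string LastName { get; set; }

    public override string Name
    {
        get { return FirstName + " " + LastName; }
    }
}

public class Company : LegalEntity
{
    public virtual string CompanyName { get; set; }

    public override string Name
    {
        get { return CompanyName; }
    }
}
```

Because Name is virtual, a call made through a proxy is forwarded to the real instance, so the caller never needs to know the concrete type.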

– which moves the logic of yielding a name, as a one-liner, into the class hierarchy, allowing us to always get a name from a LegalEntity.

What if I really really need a concrete instance?

Then you should use the nifty visitor pattern to extract what you need. In the example above, I would need to make the following additions:
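A minimal sketch of those additions, assuming hypothetical Person and Company subclasses and a hypothetical ILegalEntityVisitor interface (none of these names come from NHibernate):

```csharp
// One Visit overload per specialization (assumed names).
public interface ILegalEntityVisitor
{
    void Visit(Person person);
    void Visit(Company company);
}

public abstract class LegalEntity
{
    public virtual int Id { get; set; }

    // Each subclass passes itself, with its concrete type, to the visitor.
    public abstract void Accept(ILegalEntityVisitor visitor);
}

public class Person : LegalEntity
{
    public virtual string FirstName { get; set; }
    public virtual string LastName { get; set; }

    public override void Accept(ILegalEntityVisitor visitor) { visitor.Visit(this); }
}

public class Company : LegalEntity
{
    public virtual string CompanyName { get; set; }

    public override void Accept(ILegalEntityVisitor visitor) { visitor.Visit(this); }
}

// Example visitor: logic that lives outside the entities but still
// gets hold of the concrete instance.
public class DescribingVisitor : ILegalEntityVisitor
{
    public string Description { get; private set; }

    public void Visit(Person person)
    {
        Description = "Person: " + person.FirstName + " " + person.LastName;
    }

    public void Visit(Company company)
    {
        Description = "Company: " + company.CompanyName;
    }
}
```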

This way, we’re taking advantage of the fact that each subclass knows its own concrete instance, thus allowing it to pass itself to the visitor we passed in.

This is the preferred solution when the logic you’re writing doesn’t belong inside the actual entity class, e.g. when you want to convert the entity to an editable view object. Your code will then break at compile time if someone adds a new specialization, thus requiring each piece of logic to handle that specialization as well.

Oh, and if you’re a Java guy, you might be missing the ability to create an inline anonymous visitor within the scope of the current method, but that can be easily emulated by a generic visitor, like so:

– which would allow you to write inline typesafe code like this:

Only thing missing now is the ability to return a value in one line depending on the subclass. Well, the generic visitor can be used for that as well by adding the following:

– allowing you to write code like this:

– and still have the benefit of compile-time safety that all specializations have been handled.
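The generic visitor described above could be sketched as follows. ActionVisitor, FuncVisitor and the Match extension method are hypothetical names, as are the Person and Company subclasses; the idea is simply one delegate per specialization.

```csharp
using System;

public interface ILegalEntityVisitor
{
    void Visit(Person person);
    void Visit(Company company);
}

public abstract class LegalEntity
{
    public abstract void Accept(ILegalEntityVisitor visitor);
}

public class Person : LegalEntity
{
    public virtual string FirstName { get; set; }
    public override void Accept(ILegalEntityVisitor visitor) { visitor.Visit(this); }
}

public class Company : LegalEntity
{
    public virtual string CompanyName { get; set; }
    public override void Accept(ILegalEntityVisitor visitor) { visitor.Visit(this); }
}

// Generic visitor for side effects: one delegate per specialization.
public class ActionVisitor : ILegalEntityVisitor
{
    private readonly Action<Person> _onPerson;
    private readonly Action<Company> _onCompany;

    public ActionVisitor(Action<Person> onPerson, Action<Company> onCompany)
    {
        _onPerson = onPerson;
        _onCompany = onCompany;
    }

    public void Visit(Person person) { _onPerson(person); }
    public void Visit(Company company) { _onCompany(company); }
}

// Generic visitor that yields a value depending on the specialization.
public class FuncVisitor<TResult> : ILegalEntityVisitor
{
    private readonly Func<Person, TResult> _onPerson;
    private readonly Func<Company, TResult> _onCompany;

    public FuncVisitor(Func<Person, TResult> onPerson, Func<Company, TResult> onCompany)
    {
        _onPerson = onPerson;
        _onCompany = onCompany;
    }

    public TResult Result { get; private set; }

    public void Visit(Person person) { Result = _onPerson(person); }
    public void Visit(Company company) { Result = _onCompany(company); }
}

public static class LegalEntityExtensions
{
    // Lets callers return a value in one line:
    //   string label = entity.Match(p => p.FirstName, c => c.CompanyName);
    public static TResult Match<TResult>(
        this LegalEntity entity,
        Func<Person, TResult> onPerson,
        Func<Company, TResult> onCompany)
    {
        var visitor = new FuncVisitor<TResult>(onPerson, onCompany);
        entity.Accept(visitor);
        return visitor.Result;
    }
}
```

Adding a new specialization means adding a Visit overload to the interface, which in turn forces a new delegate parameter on both visitors and on Match, so every call site stops compiling until it handles the new case.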

When to reflect on types?

IMO you should only reflect on types in business logic when it’s a shortcut that doesn’t break the semantics of your code. What do I mean by that? Well, e.g. the implementation of the extension method System.Linq.Enumerable.Count<T>() looks something like this:
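A simplified sketch of that idea, written as a hypothetical CountItems extension method (this is not the actual framework source):

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class CountSketch
{
    public static int CountItems<T>(this IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException("source");

        // Shortcut: collections already know how many items they hold.
        var genericCollection = source as ICollection<T>;
        if (genericCollection != null) return genericCollection.Count;

        var collection = source as ICollection;
        if (collection != null) return collection.Count;

        // Fallback: walk the whole sequence and count manually.
        int count = 0;
        using (var enumerator = source.GetEnumerator())
        {
            while (enumerator.MoveNext())
            {
                checked { count++; }
            }
        }
        return count;
    }
}
```

The type checks only pick a faster path; the result is the same whichever branch runs, which is what keeps the semantics intact.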

This way, providing the number of items is accelerated for certain implementations of IEnumerable<T> because the information is already there, and for other types there’s no way to avoid manually counting.

Conclusion

I don’t think I will be using the new lazy="no-proxy" feature, because if I need it, I think it is a sign that my design has a bad smell to it, and I should either go for polymorphism or use a visitor.

C# vs. Clojure vs. Ruby & Scala

Short preface: at a job interview, Zach Cox was told to aggregate words and word counts from a bunch of files into two files, sorted alphabetically and by word count respectively, which he did in Ruby and Scala. This led Lau Bjørn Jensen to do the same thing in Clojure, which apparently sparked other people to do it in Java, Python etc.

Inspired by the aforementioned problem, and an extended train ride home (thank you, Danish National Railways!!), I decided to see what a C# (v. 3) version could look like:
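A rough sketch of the approach using LINQ. The tokenization rule (lower-cased runs of word characters), the descending sort by count, the tab-separated output format and all names here are assumptions, so the line count and timing quoted below do not apply to this version.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordCounter
{
    // Counts occurrences of each word, ignoring case.
    public static Dictionary<string, int> CountWords(IEnumerable<string> lines)
    {
        return lines
            .SelectMany(line => Regex.Matches(line.ToLowerInvariant(), @"\w+").Cast<Match>())
            .GroupBy(match => match.Value)
            .ToDictionary(group => group.Key, group => group.Count());
    }

    // Reads every file below inputDirectory and writes the two sorted result files.
    public static void Run(string inputDirectory, string byWordPath, string byCountPath)
    {
        var lines = Directory
            .GetFiles(inputDirectory, "*", SearchOption.AllDirectories)
            .SelectMany(path => File.ReadAllLines(path));

        var counts = CountWords(lines);

        File.WriteAllLines(byWordPath, counts
            .OrderBy(pair => pair.Key, StringComparer.Ordinal)
            .Select(pair => pair.Key + "\t" + pair.Value)
            .ToArray());

        File.WriteAllLines(byCountPath, counts
            .OrderByDescending(pair => pair.Value)
            .ThenBy(pair => pair.Key, StringComparer.Ordinal)
            .Select(pair => pair.Key + "\t" + pair.Value)
            .ToArray());
    }
}
```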

Weighing in at 36 lines and executing in 10.2 seconds (on my Intel Core 2 laptop with 4 GB RAM), I think this is a pretty clear and performant alternative to the other languages mentioned.

Tailoring a custom matcher for NMock

You’ll often hear proponents of test-driven development claim that unit testing is hard and that it forces them to open up their classes’ hidden logic for them to be testable. That might be true to some degree, but more often in my opinion you’ll find that designing your system to be testable also has the nifty side-effect of separating your logic into .. umm logical chunks… and what I really mean here is that your chunks have a tendency to become orthogonal, which is by far one of the best quality attributes of a system.

BUT that was not what I was going to say, actually – I just wanted to comment on a nifty thing I recently found out: implementing my own Matcher to use with NMock.

An NMock Matcher is an abstract class, which requires you to implement the following members:

A matcher can be used in the call to With(...) when stubbing or expecting… an example could be setting an expectation that a search function on our user repository will return an empty result… like so:

In the example above, Is is a class containing a static method StringContaining, which returns a matcher of a certain type. Now, when the test runs, and NMock needs to decide if an intercepted function call matches the expectation above, it will iterate through the given matchers, and call their Matches function, passing to it the actual argument as object o.

The matcher returned by StringContaining probably contains an implementation of Matches which looks something like this:

where _substring was probably set in the ctor of the matcher when it was constructed by the StringContaining function.
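To make the shape of a matcher concrete, here is a minimal sketch. The abstract Matcher below is a stand-in declared locally so the example compiles on its own; it is assumed to mirror the two members NMock asks for (Matches and DescribeTo). StringContainsMatcher is a hypothetical name, not NMock's own class.

```csharp
using System.IO;

// Stand-in for NMock's abstract Matcher (assumed shape).
public abstract class Matcher
{
    // Decides whether the actual argument satisfies the expectation.
    public abstract bool Matches(object o);

    // Describes the matcher, for use in failure messages.
    public abstract void DescribeTo(TextWriter writer);
}

public class StringContainsMatcher : Matcher
{
    private readonly string _substring;

    public StringContainsMatcher(string substring)
    {
        _substring = substring;
    }

    public override bool Matches(object o)
    {
        // Anything that is not a string (including null) cannot match.
        var actual = o as string;
        return actual != null && actual.Contains(_substring);
    }

    public override void DescribeTo(TextWriter writer)
    {
        writer.Write("string containing \"" + _substring + "\"");
    }
}
```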

Now that leads me to my recent problem: I needed to pull out the actual argument given when the expected function call was performed. In my case the argument was a delegate, which I wanted to pull out, and then invoke it a few times with different arguments.

What I did was this:

This allows me to set up an expectation like this:

Now, after the code has been run, I have access to the delegate passed to the file service through snatcher.Object.
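A minimal sketch of such a snatching matcher. Snatcher<T> is a hypothetical name, and the abstract Matcher is again a local stand-in assumed to mirror NMock's; with the real library, an instance would be passed to With(...) when setting up the expectation.

```csharp
using System;
using System.IO;

// Stand-in for NMock's abstract Matcher (assumed shape).
public abstract class Matcher
{
    public abstract bool Matches(object o);
    public abstract void DescribeTo(TextWriter writer);
}

// Matches any argument of type T and keeps a reference to it,
// so the test can get at it after the code under test has run.
public class Snatcher<T> : Matcher
{
    public T Object { get; private set; }

    public override bool Matches(object o)
    {
        if (!(o is T)) return false;

        Object = (T)o;
        return true;
    }

    public override void DescribeTo(TextWriter writer)
    {
        writer.Write("any " + typeof(T).Name + " (snatched for later use)");
    }
}
```

After the mocked call has been matched, a snatched delegate can be invoked as often as needed, e.g. snatcher.Object(21) for a Snatcher<Func<int, int>>.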

If someone knows a cooler way to do this, please do post a comment below. Until then, I will continue to think that it was actually pretty nifty.

Using compression at field level when persisting

One day at work, someone needed to store a lot of text in a field of a domain object – and since we are using NHibernate for persistence, that field would get stored to the database, taking up a huge amount of space.

To circumvent this, we just made the actual type of the member variable description be a byte[], and then we let the accessor property Description zip and unzip when accessing the value.

Like so:
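A minimal sketch of the idea, assuming a hypothetical Product entity whose description field is what gets mapped and persisted, and UTF-8 as the text encoding:

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

public class Product
{
    // The compressed bytes; this is the member that gets persisted (assumption).
    private byte[] description;

    public virtual string Description
    {
        get { return description == null ? null : Unzip(description); }
        set { description = value == null ? null : Zip(value); }
    }

    private static byte[] Zip(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (var compressed = new MemoryStream())
        {
            using (var gzip = new GZipStream(compressed, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            } // disposing the GZipStream writes the remaining data and the footer

            // ToArray still works after the MemoryStream has been closed.
            return compressed.ToArray();
        }
    }

    private static string Unzip(byte[] data)
    {
        using (var compressed = new MemoryStream(data))
        using (var gzip = new GZipStream(compressed, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```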

It should be noted that it is crucial that the ToArray() method of the compressed MemoryStream be called after the GZipStream stream has been disposed! That’s because the GZipStream will not write the gzip stream footer bytes until the stream is disposed. So if ToArray() is called before disposing, you will get an incomplete stream of bytes.

Moreover it should be noted that zipping strings in the database is not always cool because:

  1. Compression only pays off for strings of around 200-300 characters or more. For shorter strings, the compressed data is larger than the original string.
  2. Querying is impossible/hard/weird. 🙂

But it is a nifty little trick to put in one’s backpack, and I had all sorts of trouble figuring out exactly how to make the GZipStream behave, so I figured it would be nice to put a working example on the net.