This post will touch a little bit on the mechanism used for references, and then offer a few thoughts on how document-orientation relates to OO.

Now, if you, like me, are into OO and normalized object models, the weirdness begins… or maybe not?! (Actually, I am not sure yet :))

In an OO world (and in a normalized RDB world as well), you reference stuff, thus reducing the amount of redundant information as much as possible. E.g. the names of countries should not be put in a column in your address table; instead, each country should have a row in the countries table and be referenced by a countryId in the address table.

In a document-oriented world, you generally embed objects instead of referencing them. This is done for performance reasons, and because there’s no way to join stuff: a stored ID/foreign key is just a value, and it is up to the client to use it manually in additional queries.

When you do need to actually reference another document, use DBRef to create a reference, supplying the collection and the ID as arguments. E.g. like so:
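A minimal sketch of what such a reference looks like and how a client follows it. By convention a DBRef is stored as a small sub-document with `$ref` (the collection name) and `$id` fields; the `dbRef` and `follow` helpers, the in-memory `db` map, and all collection names and IDs below are assumptions made up for illustration, not any driver's API.

```javascript
// Hypothetical helper: builds a DBRef-style reference as a plain object.
// The $ref/$id field names follow the DBRef convention; everything else
// (collection names, IDs, fields) is invented for this sketch.
function dbRef(collection, id) {
  return { $ref: collection, $id: id };
}

// An address document that points at a country instead of embedding it.
const address = {
  _id: "address-1",
  street: "Some Street 1",
  country: dbRef("countries", "dk"),
};

// A Map of Maps stands in for the database: collection name -> (_id -> document).
const db = new Map([
  ["countries", new Map([["dk", { _id: "dk", name: "Denmark" }]])],
]);

// No join happens: following the reference is a second, separate lookup.
function follow(ref) {
  return db.get(ref.$ref).get(ref.$id);
}

console.log(follow(address.country).name); // prints "Denmark"
```

The point of the fixed shape is that any client can recognise the sub-document as a reference, even though resolving it still costs an extra round trip.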

This way, the reference is represented in a consistent manner which may – or may not – be picked up by the driver you are using. The C# driver can create DBRefs and follow them, but you don’t get to join stuff – you still need an extra query.

Embedding objects may seem a little clunky at first, but actually this plays nicely with some common OO concepts – take aggregation, for example: a blog post has an array of comments, each of which makes no sense without an aggregating post – i.e. comments live and die with their post. That’s an obvious sign that comments should be embedded in the post. So, instead of:

– you should do this:
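The two shapes could look roughly like this, first the referenced layout and then the embedded one. This is only a sketch: all field names and values are hypothetical, and plain objects stand in for stored documents.

```javascript
// Referenced layout (the one to avoid here): comments are separate documents
// that point back at their post. All field names are hypothetical.
const referencedPost = { _id: "post-1", title: "Hello MongoDB" };
const referencedComments = [
  { _id: "comment-1", postId: "post-1", author: "alice", text: "Nice post" },
  { _id: "comment-2", postId: "post-1", author: "bob", text: "Thanks" },
];

// Embedded layout: the comments are an array inside the post document,
// so a single read returns the post together with everything it owns.
const embeddedPost = {
  _id: "post-1",
  title: "Hello MongoDB",
  comments: [
    { author: "alice", text: "Nice post" },
    { author: "bob", text: "Thanks" },
  ],
};

// With the referenced layout, a second pass is needed to gather the comments.
const gathered = referencedComments.filter((c) => c.postId === referencedPost._id);

console.log(gathered.length, embeddedPost.comments.length); // prints "2 2"
```

Note that the embedded comments carry no `_id` of their own; only the post does.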

Actually this makes me think about the concept of an aggregate root in DDD: an aggregate root “owns” the data beneath it, and is responsible for maintaining its own integrity. If you were to delete an aggregate root, all the data beneath it would disappear.
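To make the ownership point concrete, here is a small in-memory sketch (Maps standing in for collections, all names hypothetical) of what happens to comments when their post is deleted under each layout.

```javascript
// In-memory stand-ins for collections; all names and data are hypothetical.
const posts = new Map();
const comments = new Map(); // only used by the referenced layout

// Embedded layout: the post is the aggregate root and owns its comments.
posts.set("embedded-post", {
  title: "Embedded",
  comments: [{ author: "alice", text: "Nice post" }],
});

// Referenced layout: the comment is a separate document pointing at its post.
posts.set("referenced-post", { title: "Referenced" });
comments.set("comment-1", { postId: "referenced-post", text: "Thanks" });

// Delete both roots. The embedded comments vanish with their post, while the
// referenced comment is left behind, pointing at a post that no longer exists.
posts.delete("embedded-post");
posts.delete("referenced-post");

const orphans = [...comments.values()].filter((c) => !posts.has(c.postId));
console.log(posts.size, orphans.length); // prints "0 1"
```

With the referenced layout, cleaning up the orphan is the client's job; with the embedded layout there is nothing left to clean up.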

This also fits kind of nicely with the fact that there are no database transactions in MongoDB (at the time of writing), i.e. there’s no way to issue multiple statements and have them rolled back in case of an error. There are only documents, and either a document gets inserted/updated/deleted, or it doesn’t. So the document is the unit of atomicity, which fits (sort of nicely) with the aggregate root and its responsibility of keeping itself internally consistent.

Conclusion

The observations stated here pretty much make a document an aggregate root in the DDD sense – especially since only documents get an _id. There’s no obvious way to reference a particular comment inside the second post shown above.

If MongoDB’s performance is up to the task, data should probably be aggregated as much as possible into large documents. MongoDB’s limit is 4 MB per document (at the time of writing), but I am unsure how large documents can get before you should consider splitting them.

Maybe I am thinking too much about these things? Maybe I should just try and build something and see where my document modeling goes? Suggestions and comments are welcome 🙂
