Having experienced a lot of pain using RDBMSs as a default choice of persistence [1], having read a couple of blog posts about MongoDB, and being generally interested in widening my horizons, I decided to check out MongoDB.
[1] Usually because of abusing RDBMSs, actually. Storing an object model in an RDBMS is not painful as long as the tooling is right, e.g. by leveraging the amazing NHibernate. The pain comes when developers suddenly start implementing overly complex queries and doing reporting on top of a pretty entity model, modeling stuff OO style… ouch!
This post is a write-as-I-go summary of the information I have gathered from the following places:
More posts may follow…. 🙂
Getting MongoDB
Piece of cake! Download MongoDB from the download center and shove the binaries away somewhere on your machine. By default, MongoDB stores its data in /data/db, which translates to c:\data\db if you are using Windows, so go ahead and create this directory. The MongoDB daemon can be started by running mongod.exe, which will accept connections on localhost:27017.
It will probably look something like the screenshot shown below.
An alternative data path can be specified on the command line, e.g. like so: mongod --dbpath c:\somewhere\else.
Accessing it with JavaScript
Run mongo.exe to start the Mongo Shell. It will probably look something like this:
In the MongoDB prompt, you can use JavaScript to access the db. Here’s a sample session of some commands I have found useful:
    // lists the dbs in this mongo
    > show dbs
    admin
    local
    >
    > // change database context to some db ("myblog" - will automagically be created)
    > use myblog
    switched to db myblog
    >
    > show collections
    system.indexes
    >
    > // save a couple of documents in a collection named "posts"
    > // (collection will be automagically created as well...)
    > db.posts.save({ headline: 'Notes to self about MongoDB', slug: 'notes-to-self-about-mongodb', tags: ['mongodb', 'nosql', 'nifty', 'c#'] })
    > db.posts.save({ headline: 'Someday I want to check out CouchDB as well', slug: 'someday-i-want-to-check-out-couchdb-as-well', tags: ['couchdb', 'nosql', 'nifty', 'c#'] })
    >
    > // show documents in a collection (returns a cursor, which will be iterated for the first 10 or 20 results - next pages can be retrieved with the 'it' command)
    > db.posts.find()
    { "_id" : ObjectId("4b8e4281781b000000005cfc"), "headline" : "Notes to self about MongoDB", "slug" : "notes-to-self-about-mongodb", "tags" : [ "mongodb", "nosql", "nifty", "c#" ] }
    { "_id" : ObjectId("4b8e42cc781b000000005cfd"), "headline" : "Someday I want to check out CouchDB as well", "slug" : "someday-i-want-to-check-out-couchdb-as-well", "tags" : [ "couchdb", "nosql", "nifty", "c#" ] }
Now, I have successfully added two documents representing blog posts in a collection named posts. As you can see, MongoDB assigns some funky IDs to the documents.
    > // lets get the first post ('find' and 'findOne' accept a query document as their first parameter)
    > db.posts.findOne({'_id': ObjectId('4b8e4281781b000000005cfc')})
    {
        "_id" : ObjectId("4b8e4281781b000000005cfc"),
        "headline" : "Notes to self about MongoDB",
        "slug" : "notes-to-self-about-mongodb",
        "tags" : [
            "mongodb",
            "nosql",
            "nifty",
            "c#"
        ]
    }
    >
    > // now let's find all IDs ('find' and 'findOne' accept as their second parameter a document
    > // specifying which fields to return)
    > db.posts.find({}, {'_id': true})
    { "_id" : ObjectId("4b8e4281781b000000005cfc") }
    { "_id" : ObjectId("4b8e42cc781b000000005cfd") }
    { "_id" : ObjectId("4b8e4595781b000000005cfe") }
    >
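To make the two parameters concrete, here is a rough in-memory sketch in plain JavaScript (runnable in Node.js, outside the Mongo Shell) of what a query document and a field selector do. The helpers `matches`, `project`, `find` and `findOne` are hypothetical stand-ins that only cover top-level equality and inclusion-style field selection; this is an illustration of the idea, not how MongoDB implements queries.

```javascript
// Hypothetical in-memory stand-ins; NOT MongoDB's implementation.
// Sample data mirrors the two posts saved above, with IDs kept as plain strings.
const posts = [
  {
    _id: '4b8e4281781b000000005cfc',
    headline: 'Notes to self about MongoDB',
    slug: 'notes-to-self-about-mongodb',
    tags: ['mongodb', 'nosql', 'nifty', 'c#'],
  },
  {
    _id: '4b8e42cc781b000000005cfd',
    headline: 'Someday I want to check out CouchDB as well',
    slug: 'someday-i-want-to-check-out-couchdb-as-well',
    tags: ['couchdb', 'nosql', 'nifty', 'c#'],
  },
];

// A document matches when every field named in the query equals the document's field.
function matches(doc, query) {
  return Object.keys(query).every((key) => doc[key] === query[key]);
}

// Keep only the fields flagged as true in the selector; no selector means "everything".
function project(doc, fields) {
  if (!fields) return doc;
  const result = {};
  for (const key of Object.keys(fields)) {
    if (fields[key] && key in doc) result[key] = doc[key];
  }
  return result;
}

// First parameter: query document. Second parameter: fields to return.
function find(collection, query = {}, fields) {
  return collection
    .filter((doc) => matches(doc, query))
    .map((doc) => project(doc, fields));
}

// Like find, but returns only the first match (or null when nothing matches).
function findOne(collection, query = {}, fields) {
  const [first] = find(collection, query, fields);
  return first === undefined ? null : first;
}

console.log(find(posts, {}, { _id: true }));
console.log(findOne(posts, { slug: 'notes-to-self-about-mongodb' }).headline);
```

The empty query document `{}` matches every post, which is why the shell call with `{'_id': true}` as the second argument lists all IDs.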
That was a brief demonstration of the JavaScript API in the Mongo Shell. Now, let’s do this from C#.
Getting started with mongodb-csharp
Now, go to the mongodb-csharp download section at GitHub and get a debug build of the driver. Create a C# project and reference the MongoDB.Driver assembly.
On my machine, punching in the following actually works:
    [Test]
    public void CanAddPost()
    {
        using(var mongo = new Mongo())
        {
            mongo.Connect();

            var db = mongo["myblog"];
            var posts = db["posts"];

            posts.Insert(new Document
                             {
                                 {"headline", "Post added from C#"},
                                 {"slug", "post-added-from-csharp"},
                                 {
                                     "tags", new[]
                                                 {
                                                     "c#",
                                                     "nifty"
                                                 }
                                     }
                             });
        }
    }
Now, I can verify that the document is actually in there by going back to the console and doing this:
    > db.posts.find()
    { "_id" : ObjectId("4b8e4281781b000000005cfc"), "headline" : "Notes to self about MongoDB", "slug" : "notes-to-self-about-mongodb", "tags" : [ "mongodb", "nosql", "nifty", "c#" ] }
    { "_id" : ObjectId("4b8e42cc781b000000005cfd"), "headline" : "Someday I want to check out CouchDB as well", "slug" : "someday-i-want-to-check-out-couchdb-as-well", "tags" : [ "couchdb", "nosql", "nifty", "c#" ] }
    { "_id" : ObjectId("4b8e4b72091abb14e4000001"), "headline" : "Post added from C#", "slug" : "post-added-from-csharp", "tags" : [ "c#", "nifty" ] }
    >
Nifty! Now let’s show the posts from C#. On my machine the following snippet displays the headlines of all posts:
    [Test]
    public void CanShowPosts()
    {
        using(var mongo = new Mongo())
        {
            mongo.Connect();

            var db = mongo["myblog"];
            var posts = db["posts"];

            Console.WriteLine("Posts");
            foreach(var post in posts.FindAll().Documents)
            {
                Console.WriteLine("    {0}", post["headline"]);
            }
        }
    }
– which is documented in the following screenshot:
Random nuggets of information
Document IDs
All MongoDB documents must have an ID in the _id field, either assigned by you (any object can be used), or automatically by MongoDB. IDs generated by MongoDB are virtually globally unique, as they consist of the following: 4 bytes of timestamp, 3 bytes of machine identification, 2 bytes of process identification, 3 bytes of something that gets incremented.
As a nifty consequence, the time of creation can be extracted from auto-generated IDs.
The ID type used by MongoDB can be created with ObjectId('00112233445566778899aabb') (where the input must be a string representing 12 bytes in HEX).
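As a small illustration of the layout described above, the following plain JavaScript sketch (runnable in Node.js) splits a 24-character hex ID into its four parts and recovers the creation time. `parseObjectId` is a hypothetical helper written for this post, not part of the Mongo Shell or any driver, and it assumes the byte layout listed above with the timestamp stored as big-endian seconds since the Unix epoch.

```javascript
// Hypothetical helper, not a MongoDB API: splits a hex ObjectId string
// into the parts described above (4 + 3 + 2 + 3 bytes).
function parseObjectId(hex) {
  // 12 bytes in HEX means exactly 24 hex characters.
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error('expected 24 hex characters (12 bytes)');
  }
  // First 4 bytes: creation time in seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return {
    createdAt: new Date(seconds * 1000),
    machine: hex.slice(8, 14),                // 3 bytes of machine identification (kept as hex)
    process: hex.slice(14, 18),               // 2 bytes of process identification (kept as hex)
    counter: parseInt(hex.slice(18, 24), 16), // 3 bytes that get incremented
  };
}

// The ID of the first post from the shell session above.
const firstPost = parseObjectId('4b8e4281781b000000005cfc');
console.log(firstPost.createdAt.toISOString()); // 2010-03-03T11:05:37.000Z
console.log(firstPost.counter);                 // 23804

// The second post was saved by the same process, so only the time and counter differ.
const secondPost = parseObjectId('4b8e42cc781b000000005cfd');
console.log(secondPost.counter);                // 23805
```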
How are documents stored?
If you have not yet figured it out, documents are serialized to JSON, with the minor modification that it’s a BINARY version of JSON, hence it’s called BSON.
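To make the “binary JSON” idea a bit more concrete, here is a small sketch of how a flat document with string values is laid out according to the public BSON specification: a little-endian int32 with the total size, then one element per field (type byte 0x02 for a string, the field name as a null-terminated string, an int32 byte length, and the null-terminated UTF-8 value), and a closing 0x00 byte. The function name `encodeStringDocument` is made up for illustration, the code assumes Node.js for `Buffer`, and a real driver handles many more types than this.

```javascript
// Illustrative sketch only: encodes a flat document whose values are all strings.
// encodeStringDocument is a hypothetical name, not a driver function.
function encodeStringDocument(doc) {
  const parts = [];
  for (const [key, value] of Object.entries(doc)) {
    if (typeof value !== 'string') {
      throw new TypeError('this sketch only handles string values');
    }
    const name = Buffer.from(key + '\0', 'utf8');   // field name, null-terminated
    const text = Buffer.from(value + '\0', 'utf8'); // UTF-8 value, null-terminated
    const length = Buffer.alloc(4);
    length.writeInt32LE(text.length, 0);            // byte length including the terminator
    parts.push(Buffer.from([0x02]), name, length, text); // 0x02 marks a string element
  }
  const body = Buffer.concat(parts);
  const size = Buffer.alloc(4);
  size.writeInt32LE(4 + body.length + 1, 0);        // size prefix + elements + closing 0x00
  return Buffer.concat([size, body, Buffer.from([0x00])]);
}

const encoded = encodeStringDocument({ hello: 'world' });
console.log(encoded.length);          // 22
console.log(encoded.toString('hex')); // 160000000268656c6c6f0006000000776f726c640000
```

The length prefixes are what make the format quick to skip through compared to textual JSON, at the cost of not being human-readable.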
String encoding
UTF-8. No worries.
What about references?
I will research this and do a separate post on the subject. As MongoDB is non-relational, a “join” is – in principle – an unknown concept. There’s a mechanism, however, that allows for consistent representation of foreign keys that may/may not give you some extra functionality (depending on the driver you are using).
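The mechanism in question appears to be the DBRef convention, where a reference is stored as a small sub-document holding the target collection name in `$ref` and the target ID in `$id`. Since there is no join, following the reference is simply a second lookup. The sketch below shows that idea with plain in-memory JavaScript; the `comments` collection, the sample IDs and the `resolve` helper are all made up for illustration and do not come from any driver.

```javascript
// In-memory illustration; the data and the resolve() helper are hypothetical.
const database = {
  posts: [
    { _id: 'post-1', headline: 'Notes to self about MongoDB' },
  ],
  comments: [
    // The "foreign key" is a sub-document naming the collection and the ID.
    { _id: 'comment-1', text: 'Nifty!', post: { $ref: 'posts', $id: 'post-1' } },
  ],
};

// Follow a { $ref, $id } reference by hand: look in the named collection
// for the document with the matching _id, or return null if there is none.
function resolve(db, ref) {
  const collection = db[ref.$ref] || [];
  return collection.find((doc) => doc._id === ref.$id) || null;
}

const comment = database.comments[0];
console.log(resolve(database, comment.post).headline); // Notes to self about MongoDB
```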
What about querying?
I will research this as well, posting as I go.
OR/M? (or OD/M?)
It is not yet clear to me how to handle Object-Document Mapping. Will require some research as well. As an OO dude, I am especially interested in finding out what a schema-less persistence mechanism will do to my design.
What else?
More topics include applying indices, deleting/updating, atomicity, and more. Implies additional blog posts.
Conclusion
My first impression of MongoDB is really good. It’s extremely easy to get going, and the few error messages I have received were easy to understand.
I am especially in awe of how little friction I encountered, mostly because of the schema-less nature, but also because everything just worked right away.
Have you looked at MongoDB.Emitter?
http://groups.google.com/group/mongodb-user/browse_thread/thread/d85b91a68145bee3?pli=1
No I haven’t. I am actually unsure of how I would like to access my documents from C#.
At one point, I thought that wrapping documents behind something dynamic in C# 4 would be cool (sort of like using an ExpandoObject), but that would just be syntactic sugar on top of the document.
I am thinking about doing something along the lines of: var personOrNull = document.As<IPerson>(), which would attempt to deserialize the fields required for the document to qualify as a complete person, yielding null if that could not be achieved – and var person = document.Is<IPerson>() to work pretty much like MongoDB.Emitter, allowing you to set fields that might not already be there.
I am still pretty unsure of this though.
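A rough JavaScript analogue of the As/Is idea above might look like the following sketch. Everything in it is hypothetical: `personFields` stands in for the IPerson interface, and `asType` / `isType` are made-up names, not part of mongodb-csharp or MongoDB.Emitter.

```javascript
// Hypothetical analogue of document.As<IPerson>() and document.Is<IPerson>().
// personFields plays the role of the IPerson interface.
const personFields = ['firstName', 'lastName'];

// "As": return a copy holding only the required fields,
// or null if the document does not qualify as a complete person.
function asType(document, requiredFields) {
  const result = {};
  for (const field of requiredFields) {
    if (!(field in document)) return null;
    result[field] = document[field];
  }
  return result;
}

// "Is": return a view exposing the listed fields whether or not they exist yet;
// reads and writes go straight through to the underlying document.
function isType(document, fields) {
  const view = {};
  for (const field of fields) {
    Object.defineProperty(view, field, {
      get: () => document[field],
      set: (value) => { document[field] = value; },
      enumerable: true,
    });
  }
  return view;
}

const incomplete = { firstName: 'Ada' };
console.log(asType(incomplete, personFields)); // null

const person = isType(incomplete, personFields);
person.lastName = 'Lovelace';                   // sets a field that was not there before
console.log(asType(incomplete, personFields)); // { firstName: 'Ada', lastName: 'Lovelace' }
```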
There has been a lot of work going into the next version of MongoDB-CSharp, which is soon to be released. A massive overhaul of the serializer now lets us support typed collections (db.GetCollection()). We’ll handle all the mapping. As a side benefit, the Linq provider was re-written and now supports projections, complex where conditions, and automatic map-reduce when groupby and/or aggregates are used.
You can check out the pre-release branch at http://github.com/craiggwilson/mongodb-csharp and the accompanying wiki for some pre-release documentation.
Have fun…