mookid on code

Checking out MongoDB

March 3rd, 2010 by mookid

Having experienced a lot of pain using RDBMSs (1) as a default choice of persistence, having read a couple of blog posts about MongoDB, and being generally interested in widening my horizon, I decided to check out MongoDB.

This post is a write-as-I-go summary of the information I have gathered from the following places:

More posts may follow… :)

Getting MongoDB

Piece of cake! Download MongoDB from the download center and stash the binaries somewhere on your machine. By default, MongoDB stores its data in /data/db, which translates to c:\data\db on Windows, so go ahead and create this directory. Start the MongoDB daemon by running mongod.exe; it will accept connections on localhost:27017.

It will probably look something like the screenshot shown below.

An alternative data path can be specified on the command line, e.g. like so: mongod --dbpath c:\somewhere\else.

Accessing it with JavaScript

Run mongo.exe to start the Mongo Shell. It will probably look something like this:

In the MongoDB prompt, you can use JavaScript to access the db. Here’s a sample session of some commands I have found useful:

> // lists the dbs in this mongo
> show dbs
admin
local
>
> // change database context to some db ("myblog" - will automagically be created)
> use myblog   
switched to db myblog
>
> show collections
system.indexes
>
> // save a couple of documents in a collection named "posts"
> // (collection will be automagically created as well...)
> db.posts.save({
    headline: 'Notes to self about MongoDB', 
    slug: 'notes-to-self-about-mongodb', 
    tags: ['mongodb', 'nosql', 'nifty', 'c#']
 })
> db.posts.save({
    headline: 'Someday I want to check out CouchDB as well', 
    slug: 'someday-i-want-to-check-out-couchdb-as-well', 
    tags: ['couchdb', 'nosql', 'nifty', 'c#']
 })
>
> // show documents in a collection (returns a cursor, which will be iterated for the first 10 or 20 results - next pages can be retrieved with the 'it' command)
> db.posts.find()
{ "_id" : ObjectId("4b8e4281781b000000005cfc"), "headline" : "Notes to self about MongoDB", "slug" : "notes-to-self-about-mongodb", "tags" : [ "mongodb", "nosql", "nifty", "c#" ] }
{ "_id" : ObjectId("4b8e42cc781b000000005cfd"), "headline" : "Someday I want to check out CouchDB as well", "slug" : "someday-i-want-to-check-out-couchdb-as-well", "tags" : [ "couchdb", "nosql", "nifty", "c#" ] }

Now, I have successfully added two documents representing blog posts in a collection named posts. As you can see, MongoDB assigns some funky IDs to the documents.

> // let's get the first post ('find' and 'findOne' accept a query document as their first parameter)
> db.posts.findOne({'_id': ObjectId('4b8e4281781b000000005cfc')})
{
        "_id" : ObjectId("4b8e4281781b000000005cfc"),
        "headline" : "Notes to self about MongoDB",
        "slug" : "notes-to-self-about-mongodb",
        "tags" : [
                "mongodb",
                "nosql",
                "nifty",
                "c#"
        ]
}
>
> // now let's find all IDs ('find' and 'findOne' accept as their second parameter a document
> // specifying which fields to return)
> db.posts.find({}, {'_id': true})
{ "_id" : ObjectId("4b8e4281781b000000005cfc") }
{ "_id" : ObjectId("4b8e42cc781b000000005cfd") }
{ "_id" : ObjectId("4b8e4595781b000000005cfe") }
>
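
The two parameters of find can be made concrete with a plain JavaScript sketch that mimics them on an in-memory array. This is only an illustration of the semantics under simplifying assumptions, not how MongoDB implements queries: the helpers matches, project and find are hypothetical, the IDs are plain strings, and only top-level equality and inclusion-style field lists are handled.

```javascript
// Plain JavaScript illustration of the two arguments to find().
// It runs against an in-memory array; it is NOT MongoDB code.

// Sample data shaped like the posts saved in the session (IDs kept as strings).
const posts = [
  { _id: "4b8e4281781b000000005cfc", headline: "Notes to self about MongoDB", slug: "notes-to-self-about-mongodb" },
  { _id: "4b8e42cc781b000000005cfd", headline: "Someday I want to check out CouchDB as well", slug: "someday-i-want-to-check-out-couchdb-as-well" },
];

// A document matches when every field in the query equals the document's field.
// An empty query {} therefore matches every document.
function matches(doc, query) {
  return Object.keys(query).every((key) => doc[key] === query[key]);
}

// Keep only the fields flagged as truthy in `fields`.
// Like MongoDB, _id is kept unless it is explicitly excluded with false.
function project(doc, fields) {
  if (!fields) return { ...doc };
  const result = {};
  if (fields._id !== false && "_id" in doc) result._id = doc._id;
  for (const key of Object.keys(fields)) {
    if (fields[key] && key in doc) result[key] = doc[key];
  }
  return result;
}

// Hypothetical stand-in for collection.find(query, fields).
function find(collection, query = {}, fields) {
  return collection
    .filter((doc) => matches(doc, query))
    .map((doc) => project(doc, fields));
}

console.log(find(posts, { slug: "notes-to-self-about-mongodb" }));
console.log(find(posts, {}, { _id: true }));
```

The first call returns the single matching post in full; the second returns every post reduced to its _id field, mirroring the shell output in the session.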

That was a brief demonstration of the JavaScript API in the Mongo Shell. Now, let’s do this from C#.

Getting started with mongodb-csharp

Now, go to the mongodb-csharp download section on GitHub and get a debug build of the driver. Create a C# project and reference the MongoDB.Driver assembly.

On my machine, punching in the following actually works:

[Test]
public void CanAddPost()
{
  using(var mongo = new Mongo())
  {
    mongo.Connect();
 
    var db = mongo["myblog"];
    var posts = db["posts"];
 
    posts.Insert(new Document
                   {
                     {"headline", "Post added from C#"},
                     {"slug", "post-added-from-csharp"},
                     {"tags", new[] {"c#", "nifty"}}
                   });
  }
}

Now, I can verify that the document is actually in there by going back to the console and doing this:

> db.posts.find()
{ "_id" : ObjectId("4b8e4281781b000000005cfc"), "headline" : "Notes to self about MongoDB", "slug" : "notes-to-self-about-mongodb", "tags" : [ "mongodb", "nosql", "nifty", "c#" ] }
{ "_id" : ObjectId("4b8e42cc781b000000005cfd"), "headline" : "Someday I want to check out CouchDB as well", "slug" : "someday-i-want-to-check-out-couchdb-as-well", "tags" : [ "couchdb", "nosql", "nifty", "c#" ] }
{ "_id" : ObjectId("4b8e4b72091abb14e4000001"), "headline" : "Post added from C#", "slug" : "post-added-from-csharp", "tags" : [ "c#", "nifty" ] }
>

Nifty! Now let's show the posts from C#. On my machine the following snippet displays the headlines of all posts:

[Test]
public void CanShowPosts()
{
  using(var mongo = new Mongo())
  {
    mongo.Connect();
 
    var db = mongo["myblog"];
    var posts = db["posts"];
 
    Console.WriteLine("Posts");
    foreach(var post in posts.FindAll().Documents)
    {
      Console.WriteLine("    {0}", post["headline"]);
    }
  }
}

- which is documented in the following screenshot:

Random nuggets of information

Document IDs

All MongoDB documents must have an ID in the _id field, either assigned by you (almost any value can be used) or automatically by MongoDB. IDs generated by MongoDB are virtually globally unique, as they consist of 12 bytes: a 4-byte timestamp, a 3-byte machine identifier, a 2-byte process identifier, and a 3-byte counter that gets incremented.

As a nifty consequence, the time of creation can be extracted from auto-generated IDs.

The ID type used by MongoDB can be created with ObjectId('00112233445566778899aabb') (where the input must be a string of 24 hex characters, representing 12 bytes).
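
As a rough illustration of that layout, here is a plain JavaScript sketch that splits an ID string into its parts and recovers the creation time. It assumes exactly the byte layout described in this section and treats the ID as a plain hex string; parseObjectId is a hypothetical helper, not part of the shell or any driver.

```javascript
// Splits a 24-character hex ObjectId string into the parts described above.
// Assumed layout (from this post): 4 bytes timestamp, 3 bytes machine,
// 2 bytes process, 3 bytes counter.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("An ObjectId must be 24 hex characters (12 bytes)");
  }
  // The first 4 bytes are seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return {
    createdAt: new Date(seconds * 1000),
    machine: hex.slice(8, 14),
    process: hex.slice(14, 18),
    counter: parseInt(hex.slice(18, 24), 16),
  };
}

// The ID of the first post saved in the shell session.
const parsed = parseObjectId("4b8e4281781b000000005cfc");
console.log(parsed.createdAt.toISOString()); // 2010-03-03T11:05:37.000Z
console.log(parsed.counter);                 // 23804
```

The recovered timestamp falls on the day this post was written, which is a handy sanity check. Newer versions of the shell also expose a getTimestamp() method on ObjectId values that does the same extraction.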

How are documents stored?

If you have not yet figured it out, documents are serialized as JSON, with the minor modification that it's a binary version of JSON, hence the name BSON.
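
To get a feel for what "binary JSON" means, here is a minimal sketch of an encoder for flat documents whose values are all strings. It follows the published BSON layout for the document wrapper and the string element only; encodeStringDocument is a hypothetical helper written for Node.js, and a real driver handles many more types.

```javascript
// Minimal BSON encoder for flat documents whose values are all strings.
// Only the string element (type byte 0x02) is supported in this sketch.
function encodeStringDocument(doc) {
  const parts = [];
  for (const [name, value] of Object.entries(doc)) {
    if (typeof value !== "string") {
      throw new TypeError("only string values are supported in this sketch");
    }
    const nameBytes = Buffer.from(name + "\0", "utf8");   // field name as a NUL-terminated string
    const valueBytes = Buffer.from(value + "\0", "utf8"); // UTF-8 bytes plus trailing NUL
    const length = Buffer.alloc(4);
    length.writeInt32LE(valueBytes.length, 0);            // byte length, including the NUL
    parts.push(Buffer.from([0x02]), nameBytes, length, valueBytes);
  }
  const body = Buffer.concat(parts);
  const total = Buffer.alloc(4);
  // The size prefix counts itself (4 bytes) and the terminating 0x00.
  total.writeInt32LE(4 + body.length + 1, 0);
  return Buffer.concat([total, body, Buffer.from([0x00])]);
}

const bytes = encodeStringDocument({ hello: "world" });
console.log(bytes.toString("hex"));
// 160000000268656c6c6f0006000000776f726c640000
```

The 22-byte result starts with its own length (0x16), followed by the type byte, the field name, the string's byte length, the string itself, and a closing zero byte. Note that string lengths are counted in UTF-8 bytes, not characters.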

String encoding

UTF-8. No worries.

What about references?

I will research this and do a separate post on the subject. As MongoDB is non-relational, a “join” is – in principle – an unknown concept. There’s a mechanism, however, that allows for consistent representation of foreign keys that may/may not give you some extra functionality (depending on the driver you are using).
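
The mechanism hinted at here is most likely the DBRef convention, where a reference is a small sub-document holding the target collection name in $ref and the target ID in $id. The sketch below shows the idea on in-memory data; it assumes string IDs, the comment document is made-up sample data, and resolve is a hypothetical helper standing in for whatever a given driver may or may not do for you.

```javascript
// In-memory stand-in for a database: collection name -> array of documents.
const db = {
  posts: [
    { _id: "4b8e4281781b000000005cfc", headline: "Notes to self about MongoDB" },
  ],
};

// Made-up comment document pointing at its post via a DBRef-style sub-document.
const comment = {
  text: "Nice write-up!",
  post: { $ref: "posts", $id: "4b8e4281781b000000005cfc" },
};

// Hypothetical helper: following a reference is just one extra lookup.
// There is no join; the caller decides when to resolve.
function resolve(database, ref) {
  const collection = database[ref.$ref] || [];
  return collection.find((doc) => doc._id === ref.$id) || null;
}

console.log(resolve(db, comment.post).headline);
```

A dangling reference simply resolves to null here, which reflects the fact that nothing enforces referential integrity for you.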

What about querying?

I will research this as well, posting as I go.

OR/M? (or OD/M?)

It is not yet clear to me how to handle Object-Document Mapping. That will require some research as well. As an OO dude, I am especially interested in finding out what a schema-less persistence mechanism will do to my design.

What else?

More topics include applying indices, deleting/updating, atomicity, and more. That implies additional blog posts.

Conclusion

My first impression of MongoDB is really good. It’s extremely easy to get going, and the few error messages I have received were easy to understand.

I am especially in awe of how little friction I encountered – mostly because of the schema-less nature, but also because everything just worked right away.

  1. Usually because of abusing RDBMSs, actually. Storing an object model in an RDBMS is not painful as long as the tooling is right – e.g. by leveraging the amazing NHibernate. The pain comes when developers suddenly start implementing overly complex queries and doing reporting on top of a pretty entity model, modeling stuff OO style… ouch!

4 Responses

  1. Asger Says:

    Have you looked at MongoDB.Emitter?

    http://groups.google.com/group/mongodb-user/browse_thread/thread/d85b91a68145bee3?pli=1

  2. mookid Says:

    No I haven’t. I am actually unsure of how I would like to access my documents from C#.

    At one point, I thought that wrapping documents behind something dynamic in C# 4 would be cool – sort of like using an ExpandoObject – but that would just be syntactic sugar on top of the document.

    I am thinking about doing something along the lines of: var personOrNull = document.As<IPerson>(), which would attempt to deserialize the fields required for the document to qualify as a complete person, yielding null if that could not be achieved – and var person = document.Is<IPerson>() to work pretty much like MongoDB.Emitter, allowing you to set fields that might not already be there.

    I am still pretty unsure of this though.

  3. More checking out MongoDB: Querying « mookid on code Says:

    [...] my first post about MongoDB, I touched querying very lightly. Querying is of course pretty important to most [...]

  4. Craig Wilson Says:

    There has been a lot of work going into the next version of MongoDB-CSharp which is soon to be released. A massive overhaul of the serializer now lets us support typed collections (db.GetCollection() ). We’ll handle all the mapping. As a side benefit, the Linq provider was re-written and now supports projections, complex where conditions, and automatic map-reduce when groupby and/or aggregates are used.

    You can check out the pre-release branch at http://github.com/craiggwilson/mongodb-csharp and the accompanying wiki for some pre-release documentation.

    Have fun…