In my first post about MongoDB, I touched querying very lightly. Querying is of course pretty important to most systems, so it’s fair to dedicate a separate post to the subject.

Querying in MongoDB works by sending a document to the server, e.g. in the following snippet I create a document with a post ID

var somePost = db.posts.findOne({"_id": ObjectId("00112233445566778899aabb")})

- which can actually be even shorter, as the find and findOne functions accept an ObjectId directly as their argument, like so:

var somePost = db.posts.findOne(ObjectId("00112233445566778899aabb"))

But how can I find a post with a specific slug? Easy, like so:

var somePost = db.posts.find({slug: "this-slug-probably-comes-from-a-url"})

But how does this perform? It’s easy to examine how queries are executed with the explain() function, like so:

> db.posts.find({slug: "this-slug-probably-comes-from-a-url"}).explain()
{
        "cursor" : "BasicCursor",
        "startKey" : {
 
        },
        "endKey" : {
 
        },
        "nscanned" : 10000,
        "n" : 1,
        "millis" : 11,
        "allPlans" : [
                {
                        "cursor" : "BasicCursor",
                        "startKey" : {
 
                        },
                        "endKey" : {
 
                        }
                }
        ]
}

- yielding some info about the execution of the query. I don’t know exactly how to interpret all this, but I think I get that "nscanned": 10000 means 10000 documents were scanned – and in a collection with 10000 documents, that’s not entirely optimal as it implies a full table scan. Now, let’s make sure that our query will execute as fast as possible by creating an index on the field (_id is always automatically indexed):

db.posts.ensureIndex({slug:1})

Now lets explain() again:

> db.posts.find({slug: "post-no-454"}).explain()
{
        "cursor" : "BtreeCursor slug_1",
        "startKey" : {
                "slug" : "post-no-454"
        },
        "endKey" : {
                "slug" : "post-no-454"
        },
        "nscanned" : 1,
        "n" : 1,
        "millis" : 0,
        "allPlans" : [
                {
                        "cursor" : "BtreeCursor slug_1",
                        "startKey" : {
                                "slug" : "post-no-454"
                        },
                        "endKey" : {
                                "slug" : "post-no-454"
                        }
                }
        ]
}

Wow! That’s what I call an improvement!

What about posts with a specific tag? First I tried the following snippet, because I learned that the special $where field could be put in a query document to evaluate a predicate server-side:

var niftyPosts = db.posts.find({$where: function() { return this.tags != null && this.tags.indexOf("nifty") != -1; }})

- and this actually works. This syntax is sort of clunky though. Luckily, MongoDB provides a nifty mechanism for arrays that automagically checks if something is contained in it. So my query can be rewritten to this:

var niftyPosts = db.posts.find({tags: 'nifty'})

Nifty!

Now, to take advantage of indexes, the special query document fields should be used. I showed $where above, but there are more – to name a few: $gt (greater than), $gte (greater than or equal), $lt (less than), $lte (less than or equal), $ne (not equal), and many more. For example, to count the number of non-nifty posts in February 2010:

> db.posts.find({
    date: {$gte: 20100201, $lt: 20100301},
    tags: {$ne: 'nifty'}
}).count()
0

Check out the manual for some great documentation on available operators.

Conclusion

Querying with MongoDB actually seems pretty cool and flexible. I like the idea that it’s possible to execute ad-hoc queries, and for most usage I think the supplied operators are adequate. The ability to supply a predicate function via $where seems really cool, but it should probably only be used in conjunction with one or more of the other operators to avoid a full table scan.

  • Twitter
  • DotNetKicks
  • Technorati
  • Google Bookmarks
  • Reddit
  • Digg
  • del.icio.us

Leave a Reply

(required)

(required)

© 2010 mookid on code Suffusion WordPress theme by Sayontan Sinha