More checking out MongoDB: Querying

In my first post about MongoDB, I touched querying very lightly. Querying is of course pretty important to most systems, so it’s fair to dedicate a separate post to the subject.

Querying in MongoDB works by sending a document to the server, e.g. in the following snippet I create a document with a post ID

– which can actually be even shorter, as the find and findOne functions accept an ObjectId directly as their argument, like so:

But how can I find a post with a specific slug? Easy, like so:

But how does this perform? It’s easy to examine how queries are executed with the explain() function, like so:

– yielding some info about the execution of the query. I don’t know exactly how to interpret all this, but I think I get that "nscanned": 10000 means 10000 documents were scanned – and in a collection with 10000 documents, that’s not entirely optimal as it implies a full table scan. Now, let’s make sure that our query will execute as fast as possible by creating an index on the field ( _id is always automatically indexed):

Now lets explain() again:

Wow! That’s what I call an improvement!

What about posts with a specific tag? First I tried the following snippet, because I learned that the special $where field could be put in a query document to evaluate a predicate server-side:

– and this actually works. This syntax is sort of clunky though. Luckily, MongoDB provides a nifty mechanism for arrays that automagically checks if something is contained in it. So my query can be rewritten to this:

Nifty!

Now, to take advantage of indexes, the special query document fields should be used. I showed $where above, but there are more – to name a few: $gt (greater than), $gte (greater than or equal), $lt (less than), $lte (less than or equal), $ne (not equal), and many more. For example, to count the number of non-nifty posts in February 2010:

Check out the manual for some great documentation on available operators.

Conclusion

Querying with MongoDB actually seems pretty cool and flexible. I like the idea that it’s possible to execute ad-hoc queries, and for most usage I think the supplied operators are adequate. The ability to supply a predicate function via $where seems really cool, but it should probably only be used in conjunction with one or more of the other operators to avoid a full table scan.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.