More checking out MongoDB: Querying

In my first post about MongoDB, I touched querying very lightly. Querying is of course pretty important to most systems, so it’s fair to dedicate a separate post to the subject.

Querying in MongoDB works by sending a document to the server, e.g. in the following snippet I create a document with a post ID

var somePost = db.posts.findOne({"_id": ObjectId("00112233445566778899aabb")})

1	var somePost = db.posts.findOne({"_id": ObjectId("00112233445566778899aabb")})

– which can actually be even shorter, as the find and findOne functions accept an ObjectId directly as their argument, like so:

var somePost = db.posts.findOne(ObjectId("00112233445566778899aabb"))

1	var somePost = db.posts.findOne(ObjectId("00112233445566778899aabb"))

But how can I find a post with a specific slug? Easy, like so:

var somePost = db.posts.find({slug: "this-slug-probably-comes-from-a-url"})

1	var somePost = db.posts.find({slug: "this-slug-probably-comes-from-a-url"})

But how does this perform? It’s easy to examine how queries are executed with the explain() function, like so:

> db.posts.find({slug: "this-slug-probably-comes-from-a-url"}).explain()
{
        "cursor" : "BasicCursor",
        "startKey" : {

        },
        "endKey" : {

        },
        "nscanned" : 10000,
        "n" : 1,
        "millis" : 11,
        "allPlans" : [
                {
                        "cursor" : "BasicCursor",
                        "startKey" : {

                        },
                        "endKey" : {

                        }
                }
        ]
}

> db.posts.find({slug: "this-slug-probably-comes-from-a-url"}).explain()

{

"cursor" : "BasicCursor",

"startKey" : {

"endKey" : {

"nscanned" : 10000,

"n" : 1,

"millis" : 11,

"allPlans" : [

{

"cursor" : "BasicCursor",

"startKey" : {

"endKey" : {

}

]

}

– yielding some info about the execution of the query. I don’t know exactly how to interpret all this, but I think I get that "nscanned": 10000 means 10000 documents were scanned – and in a collection with 10000 documents, that’s not entirely optimal as it implies a full table scan. Now, let’s make sure that our query will execute as fast as possible by creating an index on the field ( _id is always automatically indexed):

db.posts.ensureIndex({slug:1})

1	db.posts.ensureIndex({slug:1})

Now lets explain() again:

> db.posts.find({slug: "post-no-454"}).explain()
{
        "cursor" : "BtreeCursor slug_1",
        "startKey" : {
                "slug" : "post-no-454"
        },
        "endKey" : {
                "slug" : "post-no-454"
        },
        "nscanned" : 1,
        "n" : 1,
        "millis" : 0,
        "allPlans" : [
                {
                        "cursor" : "BtreeCursor slug_1",
                        "startKey" : {
                                "slug" : "post-no-454"
                        },
                        "endKey" : {
                                "slug" : "post-no-454"
                        }
                }
        ]
}

> db.posts.find({slug: "post-no-454"}).explain()

{

"cursor" : "BtreeCursor slug_1",

"startKey" : {

"slug" : "post-no-454"

"endKey" : {

"slug" : "post-no-454"

"nscanned" : 1,

"n" : 1,

"millis" : 0,

"allPlans" : [

{

"cursor" : "BtreeCursor slug_1",

"startKey" : {

"slug" : "post-no-454"

"endKey" : {

"slug" : "post-no-454"

}

]

}

Wow! That’s what I call an improvement!

What about posts with a specific tag? First I tried the following snippet, because I learned that the special $where field could be put in a query document to evaluate a predicate server-side:

var niftyPosts = db.posts.find({$where: function() { return this.tags != null && this.tags.indexOf("nifty") != -1; }})

1	var niftyPosts = db.posts.find({$where: function() { return this.tags != null && this.tags.indexOf("nifty") != -1; }})

– and this actually works. This syntax is sort of clunky though. Luckily, MongoDB provides a nifty mechanism for arrays that automagically checks if something is contained in it. So my query can be rewritten to this:

var niftyPosts = db.posts.find({tags: 'nifty'})

1	var niftyPosts = db.posts.find({tags: 'nifty'})

Nifty!

Now, to take advantage of indexes, the special query document fields should be used. I showed $where above, but there are more – to name a few: $gt (greater than), $gte (greater than or equal), $lt (less than), $lte (less than or equal), $ne (not equal), and many more. For example, to count the number of non-nifty posts in February 2010:

> db.posts.find({
    date: {$gte: 20100201, $lt: 20100301},
    tags: {$ne: 'nifty'}
}).count()
0

> db.posts.find({

date: {$gte: 20100201, $lt: 20100301},

tags: {$ne: 'nifty'}

}).count()

Check out the manual for some great documentation on available operators.

Conclusion

Querying with MongoDB actually seems pretty cool and flexible. I like the idea that it’s possible to execute ad-hoc queries, and for most usage I think the supplied operators are adequate. The ability to supply a predicate function via $where seems really cool, but it should probably only be used in conjunction with one or more of the other operators to avoid a full table scan.

More checking out MongoDB: Querying

Conclusion

Like this:

Related

Leave a Reply Cancel reply

Conclusion

Share this:

Like this:

Related

Leave a Reply Cancel reply