In this blog I am going to share my experience and learnings with MongoDb indexes.
MongoDB stores data contiguously located on disk. When we search for data, MongoDB will check all data from beginning of disk. Consider the worst case, when we try to find last document in collection that contains millions of documents. It will take few seconds to complete execution.
Indexes are important factor to improve database performance. Indexes in MongoDB are similar to other database. Indexes are sorted list of keys, which has pointers connected to data on disk. Using index, we can traverse instantly to our required document.
Indexes speed up reads, but writes with an index will add a bit of overhead. It also takes some additional space on disk. Indexes, will improve querying and sorting performance. Indexes do not work well with operators like, $gt, $lt, $not, and in some cases of $regex.
By default MongoDB creates indexes on “_id” while creating collections. We can also create our custom index for our needs.
Single Field Indexes:
When we create index on one field of a collection, it is called single field index.
Consider we have a collection Student with fields student_id, class and name with millions of documents in it. Usually we use student_id to query . We can create an index on student_id using following Mongo shell command.
[source language=”javascript”]
db.students.ensureIndex({ “student_id” : 1 });
[/source]
If there is a large amount of data in collection, it takes some time to complete this operation. But after creating the index query performance is improved a lot.
The value 1 signifies that the index is created in ascending order, we can also create the index in descending order by passing value of -1. For read operations it does not matter if the index is in ascending (or) descending order. But it helps in sorting.
Compound Indexes:
When we create an index on more than one field, it is called compound index. In some case we want to find students in a particular class with a particular name. In that case, student_id index will not be applicable.
We can create compound index on class and name using following Mongo shell command.
[source language=”javascript”]
db.students.ensureIndex({ “class” : -1, “name” : 1});
[/source]
We can create compound indexes on any number of fields, but ordering of fields has an impact on performance.
The simple rule while creating compound indexes is that “the key that has large number of possibilities must come first”. Consider log collection, we can create index for time-stamp and operation. It must be in order (time-stamp, operation) since time-stamp has a large set of possible values compared to the operations.
Multi-key Indexes:
When we create an index on fields which are arrays, it is called a multi-key index (Since this key can store more than one value).
[source language=”javascript”]
db.students.ensureIndex({ “phone” : 1});
[/source]
Creating a multi-key index is similar to creating other indexes. In the above example, we create an index on phone which is an array. Compound index is also possible as Multi-key index, but it takes up more space.
Listing Indexes:
Whenever we create a new index, its information will be stored in the system.indexes collection. It contains “index”, “key”, “collection” fields. We can find all indexes associated with the current database using command,
[source language=”javascript”]
db.system.indexes.find()
[/source]
We can also find all indexes associated with a particular collection using command,
[source language=”javascript”]
db.students.getIndexes()
[/source]
Deleting Indexes:
We can delete indexes in the same way as creating them. We need to pass the key to the dropIndex method like,
[source language=”javascript”]
db.students.dropIndex( { “student_id” : 1 } );
db.students.dropIndex({ “class” : -1, “name” : 1});
[/source]
Unique index:
MongoDB provide us an option for creating unique indexes, so that we can prevent duplication of values in index. For creating a unique index, we need to pass additional an parameter while creating the index which is {unique: true}.
[source language=”javascript”]
db.students.ensureIndex({ “student_id” : 1} , {unique : true } )
[/source]
Removing Duplicates:
Consider if our collection has 1000s of records, for which we want to create a unique index. There is a possibility of having a number of duplicates on the key for which we want to the create index.
To deal with this we can pass the dropDups parameter while creating the index and it will remove duplicates and create a unique index without throwing errors.
[source language=”javascript”]
db.students.ensureIndex({ “student_id” : 1}, { unique : true, dropDups : true})
[/source]
Sparse indexes:
Another problem while creating unique indexes is, when we want to create a unique index on a field which can contain null (or) on a key that may not be present in many documents. In those cases, we can use sparse indexes and it sets index on documents which contain the key set and the other documents will not be indexed.
[source language=”javascript”]
db.students.ensureIndex( { “student_id” : 1}, {unique : true, sparse : true})
[/source]
It generally does not affect finds, but it affect sorting based on the key which we indexed. In that case it sorts only the documents that have been indexed.
Explain:
Explain method in Mongo is similar to the explain function in MySQL or Postgres. It return details about how the query will get executed in the database. It gives us information regarding whether indexes will be used or not, time taken to run the query, number of scanned objects, etc.
[source language=”javascript”]
db.students.find().explain()
[/source]
The Cursor field in the output describes whether indexes are used or not. If its value is basic-cursor then no index used and if it is b-tree-cursor then index is used. If we run a query that can be answered directly from an index and there is no need to interact with the collection then the index-only field is set to true.
Index size:
We can find the total space occupied by indexes using the method,
[source language=”javascript”]
db.students.totalIndexSize()
[/source]
Size of the index grows along with the size of the data. If size of data is almost 3 GB, size of each index will be around 292 MB. So it is better to create index only when we need them and for frequent queries, otherwise it will waste of memory.
Hint:
Hint method instructs MongoDB to use what index for executing a certain query. Hint method should be appended with our query like so:
[source language=”javascript”]
db.students.find().hint({a:1})
[/source]
This is helpful while analyzing the performance of queries using different indexes.
Geo-spatial indexes:
Geo-spatial indexes are used to index data on their location attributes – laditude and longitude. We can create a geo-spatial index on a key with the special parameter ‘2d’.
[source language=”javascript”]
db.store.ensureIndex( { location : ‘2d’ } );
[/source]
Then for each document we need to have the location key with coordinates like [X, Y].
[source language=”javascript”]
db.store.insert( { name : “RS radio”, type : “electronics”, location : [40, 70] });
[/source]
Geo-spatial indexe gives us the privilege to use additional query operators like $near, $geoWithin, etc.
[source language=”javascript”]
db.store.find({location : { $near : [50, 50] }}).limit(1)
[/source]
These are the various types of indexes in the MongoDB. Use them wisely and achieve huge performance boosts.
Some useful links on this topic:
* https://university.mongodb.com/courses/catalog
* http://docs.mongodb.org/manual/indexes/