Request a Call Back
X
Fields marked with an * are required
MongoDB Query Performance - Optimization Techniques

Performance always matters even in the database queries. While working with big data in modern Web Application Development, it comes out that achieving high-performance is the biggest challenge. 

For instance, we prefer SSDs based laptops over HDDs based laptops only because they render better performance than the latter one. A similar case is with OnePlus smartphones. There are hundreds of different brands in the market for smartphones but OnePlus is leading, all because of its high-performance. 

MongoDB Query Optimization Techniques

We are a team of professionals, who had faced several performance challenges while working with MongoDB and figured out some points that helped us to optimize MongoDB queries’ performance. Here we have described some of the query optimization techniques, with the use of which the same query will return data more efficiently and will greatly improve the query performance in MongoDB-

Proper Indexing

Rather than scanning the entire collection, we should scan an index because it is much faster. A set of one or more fields can be used to make an index, called Query Indexing, to improve the performance of a query.
For example- We have a collection of employees of a company. If our most queries use “employee_id” as a search parameter then we should make the “employee_id” as an index by command:

If most of the queries use “employee_name” and “employee_email” as search parameters then we should make a compound index by command: 

Improve Sorting Based Query

At some stage, we definitely need to sort the collection. For example, if you want to sort out the list of employees in ascending alphabetical order of their names. We can use the command: 

Performance of the above query can be improved by using a proper index such as: 

If we have a sort query such as: 

then we can make a compound index, as given below, to improve the performance of the query:

Note – Sort run on memory and have a maximum limit of 32MB.

Equality, Sort, Range rule 

Designing indexes is cumbersome. The equality, sort and range rules might come up for rescue. Let’s say we have a query: 

We can create a compound index like:

But this result will be in-memory sort, “address.zipcode”: {$gt: 50000}, is a range query (not equality) and this way more index keys will be examined.

If we create an index like:

It would still do in-memory sorting but keys to be examined will be way less.

But if we create an index like:

We can save in-memory sort while examining fewer keys.

Sparse Index 

In the sparse index, we index only those documents where the indexed field exists:

We can also achieve the same with partial indexes and they are much more expressive, such as:

We can also check for the existence of other fields in partial indexes, this is the reason that the partial indexes are recommended over sparse indexes.

Use $hint to select a particular index

In most of the cases,  the query optimizer selects the optimal index for a specific operation; however, you can force MongoDB to use a specific index using the hint() method. We can use hint() to support performance testing or on some queries where you must select a field or fields included in several indexes.

Use of ‘lean()’ Function

As we all know, queries without a ‘lean’ option return a Mongoose Document that contains methods such as save, getters/setters, etc. We can use this type of query in case we want to find an employee and update his/her information. But we can’t use this type of query if we want to display a list of particular employees as it will return a document containing all the methods of Mongoose for every employee, which is not required.
To improve the performance of the query in the latter case, we should use a lean() Function in the query that will return a plain javascript object without any methods and will improve the query performance.

Example:

Note: Using lean() will not return default values defined in mongoose schema.

Use Projections to Return the Necessary Data

When you need only a subset of fields from documents, you can achieve better performance by returning only the fields you need:

Example:

Aggregate Pipeline

The order of query is very important in pipelining to improve the query performance. Example: In our employee’s collection, we have a total of 20000 employees. Out of them, 

  • 10000 employees have a salary higher than INR 100000.
  • 500 employees have a rank A.
  • 400 employees with both cases.

Let’s suppose, we need to make a query for the list of employees whose salary is higher than INR 100000 and also have a rank A. We can have two pipelines for this query:

  • 1st Case: One pipeline will filter the collection by rank 
  • 2nd Case: Another pipeline will filter the collection by salary. 

Now let’s see how the order in the pipeline will improve the query performance-

  • In the first case, pipeline(filter by salary) is applied first then we will get 10000 employees for the next pipeline.
  • In the second case, pipeline(filter by rank) is applied first then we will only get 500 employees for the next pipeline.

From the above example, it is quite clear that the order in a pipeline is very important.

Disk Usage

If we know that a particular query will return a very large object then we can use the concept of Disk Usage. It is a similar concept to Virtual Memory as it uses the secondary storage to store the data from that particular query so that our RAM doesn’t get overloaded.
However, this theory is not recommendable for smaller queries.

Example: 

Regex Performance 

Regex operations are slow as each operation goes through each document and tests it against given regex to find the results. Indexing regex fields can improve regex performance a bit, in which case, it will go through index keys for applying regex instead of each document.

Conclusion:

If you are looking forward to implementing the above-mentioned techniques in your database queries, you need to choose them wisely. For instance, using a number of indexes with the database having a small size will not be a very wise decision. 

There are many other techniques to Optimize the MongoDB Queries and we are eager to assist you with them. Just drop your issues in the comment box provided below and we will revert back with the most appropriate answer. Also, your feedback will be appreciated considerably. 

Thanks!

Written by: Surendra Kumar, Balgopal Sharma and Sankalp Kataria