I'm not sure whether mapReduce runs single-threaded internally, but in a production environment it's best not to run mapReduce on every access. Depending on the data size it can take a noticeable amount of time: our collection has tens of millions of documents, and each mapReduce run takes about 5-6 seconds. Fortunately our application doesn't need real-time results, so we basically cache the data for 2 hours and then re-run mapReduce to get the latest results.
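For illustration, a minimal sketch of that caching idea (the collection and function names are assumptions, not actual production code): materialize the mapReduce result into a collection and refresh it on a schedule instead of computing it on every read.

```js
// Sketch only: placeholder map/reduce over an assumed "items" collection.
function refreshStatsCache() {
    db.items.mapReduce(
        function () { emit(this.accountId, 1); },               // placeholder map
        function (key, values) { return Array.sum(values); },   // placeholder reduce
        { out: { replace: "stats_cache" } }                     // materialized result
    );
}

// Reads hit the cached collection; an external scheduler (e.g. cron)
// calls refreshStatsCache() every 2 hours to refresh it.
db.stats_cache.find().limit(10);
```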
I think this post explains the performance issues of MongoDB's mapReduce:
http://stackoverflow.com/questions/39...
I've done something similar with mapReduce before. Because it was too time-consuming, I later switched to an aggregation query for the statistics. Here is a concrete example:
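For context, a document shape along these lines would fit the description (the collection name "items" and any fields besides accountId and tags are assumptions):

```js
// Assumed sample document; only accountId and tags are taken from the post.
db.items.insertOne({
    accountId: "u001",
    tags: ["mongodb", "mapreduce"],
    createdAt: new Date()
});

// Index on the two fields used for the statistics (could also be separate indexes).
db.items.createIndex({ accountId: 1, tags: 1 });
```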
The basic document model is as above; I indexed it on accountId and tags.
The requirement is to count the tags for each user. The mapReduce is designed as follows:
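Roughly, a mapReduce for this could look like the following sketch (the collection name, the per-tag emit key, and the example filter are assumptions):

```js
// Emit one (accountId, tag) pair per tag, then sum the 1s in reduce.
var mapFn = function () {
    var acc = this.accountId;
    this.tags.forEach(function (tag) {
        emit({ accountId: acc, tag: tag }, 1);
    });
};

var reduceFn = function (key, values) {
    return Array.sum(values);
};

db.items.mapReduce(mapFn, reduceFn, {
    query: { accountId: "u001" },   // hypothetical filter for one user
    out: { inline: 1 }              // return results inline for a quick test
});
```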
Result:
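With `out: { inline: 1 }`, the shell returns documents of the form below; the values shown are placeholders to illustrate the shape, not the actual output:

```js
{
    "results" : [
        { "_id" : { "accountId" : "u001", "tag" : "mongodb" },   "value" : 2 },
        { "_id" : { "accountId" : "u001", "tag" : "mapreduce" }, "value" : 1 }
    ],
    "ok" : 1
}
```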
It seems to achieve the effect we want. I ran the test above with only a small data set of 100,000 (10W) documents, and during execution it printed stats like the following:
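The stats that mapReduce reports include a counts section of roughly this shape (the numbers below are placeholders, not the original output); the emit count being about double the input count is what the next point refers to:

```js
"counts" : {
    "input"  : 100000,   // documents scanned by the map phase (placeholder)
    "emit"   : 200000,   // emit() calls, roughly one per tag, ~2x input (placeholder)
    "reduce" : 2000,     // reduce() invocations (placeholder)
    "output" : 1000      // distinct keys in the result (placeholder)
}
```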
Because my mock data is fairly simple and regular, you can see that the number of calculations is almost twice the number of scanned documents. When I later tested with random data the results were even worse, so I gave up on the mapReduce implementation and switched to another approach.
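For comparison, the aggregation-based version mentioned above can be written roughly like this (the pipeline details are a sketch, using the same assumed collection name):

```js
// Same per-user tag counts using the aggregation framework.
db.items.aggregate([
    { $match: { accountId: "u001" } },   // hypothetical filter for one user
    { $unwind: "$tags" },                // one document per tag value
    { $group: {
        _id: { accountId: "$accountId", tag: "$tags" },
        count: { $sum: 1 }
    } }
]);
```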