I ran into a MongoDB problem in my project and still have not been able to solve it after many days. I hope to get some advice from experts.
The specific problem: when the database has not been accessed for a long time, the first query takes a long time, but subsequent queries are very fast.
Details:
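① The whole database is about 1.9 TB.
② The collection I query holds roughly 7 million documents.
③ A single query returns about 230,000 documents.
④ The server has 120 GB of memory.
⑤ An index matching the query conditions has been created; the index is about 600 MB.
⑥ The first query takes about 20 s; subsequent queries finish within 1 s.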
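Suspected cause:
MongoDB does not manage memory itself, so when the database has not been accessed for a long time, the data in memory becomes cold and the operating system's memory manager releases it. The next query then has to reload the data into memory, which is slow. At present I cannot tell whether loading the index or loading the data takes the time. MongoDB does provide the touch command (which loads the index data or the user data of a given collection into memory), but I am using the WiredTiger storage engine, and that command does not support it.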
Help needed:
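① Is the cause described above really what produces this problem?
② If it is, how can I determine whether loading the index or loading the data is the slow part?
③ Is there a good solution?
Note: this collection will grow to about 25 GB at most, and the database contains many other collections, so keeping all of this collection's data in memory is not an option. If loading the index turns out to be the slow part, I could consider loading the index into memory periodically, but the WiredTiger storage engine has no supported way to do that, which is another problem.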
The problem you describe is related to the working set.
1. What is the working set?
The working set is an important concept in MongoDB's memory management: it is the frequently accessed data together with its related indexes, and the goal is to keep as much of it in memory as possible.
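As a rough illustration with the figures from the question, the sketch below estimates how the index and the collection compare with the WiredTiger cache. It assumes the default cache sizing rule of MongoDB 3.4 and later (the larger of 50% of (RAM minus 1 GB) and 256 MB) and no explicit cacheSizeGB setting; the function name is hypothetical.

```javascript
// Default WiredTiger internal cache size in GB, assuming the MongoDB 3.4+
// default rule; an explicit cacheSizeGB setting would override this value.
function defaultWiredTigerCacheGB(ramGB) {
  return Math.max(0.5 * (ramGB - 1), 0.25);
}

const cacheGB = defaultWiredTigerCacheGB(120); // 59.5 for the 120 GB server
const indexGB = 0.6;      // index size from the question (about 600 MB)
const collectionGB = 25;  // expected maximum collection size from the question

// True when this one collection and its index would fit in the cache alone.
const fitsAlone = indexGB + collectionGB <= cacheGB;
```

Even when this one collection would fit, the cache is shared with every other collection in the 1.9 TB database, so only the frequently accessed part can be expected to stay in memory.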
2. How do you place the working set in memory?
What you are describing is that the working set needs to be warmed up, that is, loaded into memory in advance (preloading or preheating). How is this done in practice? You mentioned touch (MMAPv1 engine), so how can it be done in later versions (WiredTiger engine)?
In a relational database, the usual method is select *. When doing performance testing, a batch of SELECT statements is often run in advance to warm up the memory so that the results look good.
In MongoDB, consider:
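1) If the business queries can use a covered index directly, or when the index itself needs to be warmed up (explain() must run in "executionStats" mode here, because its default mode only plans the query without executing it):
db.collection.find({}, {"_id" : 0, "field_a" : 1, "field_b" : 1}).hint({"field_a" : 1, "field_b" : 1}).explain("executionStats")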
2) When you need to warm up the working set, the premise is that you know which data in the collection is accessed frequently, usually the data from a certain time period. Then use the same method as above to warm up the collection for that time period.
The difference is that warming up the index loads the whole index, so the query condition is {}, while warming up the working set loads only part of the data in the collection, so the query condition would typically be a time-range condition.
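The two warm-up steps above can be sketched as small shell helpers that also time each phase, which gives a rough answer to whether the index or the documents are slow to load. This is only a sketch intended for the mongo shell or mongosh: all function names are hypothetical, and it iterates the cursor with itcount() instead of calling explain().

```javascript
// Run fn and report its result together with the elapsed wall-clock time.
function timed(fn) {
  const start = Date.now();
  const result = fn();
  return { ms: Date.now() - start, result: result };
}

// Phase 1: scan the whole index with a covered query. The projection keeps
// only the indexed fields and drops _id, so no documents need to be fetched.
function warmIndex(coll, indexSpec) {
  const projection = { _id: 0 };
  Object.keys(indexSpec).forEach(function (key) { projection[key] = 1; });
  return timed(function () {
    return coll.find({}, projection).hint(indexSpec).itcount();
  });
}

// Build a filter selecting documents from the last `days` days.
function recentFilter(timeField, days, now) {
  const since = new Date(now.getTime() - days * 24 * 60 * 60 * 1000);
  const filter = {};
  filter[timeField] = { $gte: since };
  return filter;
}

// Phase 2: iterate the matching documents so they are read from storage.
function warmDocuments(coll, filter) {
  return timed(function () {
    return coll.find(filter).itcount();
  });
}
```

In the shell this might be called as warmIndex(db.collection, {field_a: 1, field_b: 1}) followed by warmDocuments(db.collection, recentFilter("created_at", 7, new Date())), where created_at is an assumed field name. Comparing the two ms values after a long idle period indicates which phase accounts for most of the delay.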
For reference.
Love MongoDB! Have fun!