mongodb - Efficient sorting with mongo in Python
PHP中文网 2017-04-18 09:36:48

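1. How can I use Python to analyze data spread across multiple MongoDB collections and then sort the results?

2. The concrete scenario is this: suppose there are two models, a user table and a user purchase record table.

Suppose each purchase record stores the amount the user spent on that purchase. The question is: how do I sort users in descending order by the cumulative amount they have spent (assume there are 5 statistical fields of this kind)?

3. I made up this scenario to illustrate the problem; in reality there are many statistical fields. Suppose the user table has 1,000,000 records and the purchase record table has 1,000,000 records, on a server with 4 cores and 8 threads. Can the wait for each page of 20 results be kept under 3 s?

4. If I compute all the statistics for every user and then sort them with sorted, will that really be very inefficient?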


All replies (3)
PHPzhong

Create MongoDB indexes on the fields you need to filter or sort by (MongoDB supports multiple indexes on one collection). An index keeps its keys in sorted order (MongoDB's default indexes are B-trees, not hash tables), so sorting on an indexed field should be much faster, and you can do the sorting with MongoDB's own API. I haven't dealt with 1,000,000 records, but at a scale of 10,000 to 100,000 I recall it taking about 500 ms. By comparison, without an index the query is painfully slow.
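A minimal sketch of that approach with PyMongo follows. It assumes the cumulative total is already stored on each document in a field called `total_spent` (a hypothetical name, as is `user_id`), and that `collection` is a PyMongo collection; the direction `-1` is the same value as `pymongo.DESCENDING`.

```python
def ensure_sort_index(collection, field="total_spent"):
    """Create a descending index on the field used for sorting.

    Calling create_index again for an index that already exists is harmless.
    """
    return collection.create_index([(field, -1)])


def fetch_page(collection, page, page_size=20, field="total_spent"):
    """Return one page of documents, sorted by `field` from highest to lowest."""
    cursor = (
        collection.find({}, {"_id": 0, "user_id": 1, field: 1})
        .sort(field, -1)          # let the server sort, ideally via the index
        .skip(page * page_size)   # page 0 is the first page
        .limit(page_size)
    )
    return list(cursor)


# Usage (assumes a reachable server and a hypothetical "shop.users" collection):
#   from pymongo import MongoClient
#   users = MongoClient()["shop"]["users"]
#   ensure_sort_index(users)
#   first_page = fetch_page(users, page=0)
```

Note that `skip` still walks past the skipped entries, so very deep pages get slower; for the first few pages of 20 this is usually not a concern.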

In addition, if the statistics matter a lot and are queried frequently, I recommend creating a separate collection for them and refreshing it periodically through a queued job, trading space for time. This collection could have fields such as: user ID, total purchases in the past 3 hours, the past 12 hours, the past 24 hours, the past month, and all time. The drawbacks are that it uses extra space and cannot reflect the data in real time. The advantage is obvious: to find out how much a user has spent, you run a simple query and get a millisecond-level response.
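One way such a periodic refresh could look is an aggregation pipeline that sums purchases per user over a time window and writes the result into the summary collection. This is only a sketch: the field names `user_id`, `amount` and `created_at`, and the collection name `user_purchase_stats`, are assumptions, and the `$merge` stage requires MongoDB 4.2 or later.

```python
from datetime import datetime, timedelta, timezone


def build_summary_pipeline(window_hours, now=None, target="user_purchase_stats"):
    """Build a pipeline that totals each user's spending in the last N hours
    and merges the totals into a separate summary collection."""
    now = now or datetime.now(timezone.utc)
    since = now - timedelta(hours=window_hours)
    field = f"spent_last_{window_hours}h"
    return [
        # Keep only purchases inside the time window.
        {"$match": {"created_at": {"$gte": since}}},
        # One output document per user, with the summed amount.
        {"$group": {"_id": "$user_id", field: {"$sum": "$amount"}}},
        # Upsert into the summary collection, keeping its other fields.
        {"$merge": {
            "into": target,
            "on": "_id",
            "whenMatched": "merge",
            "whenNotMatched": "insert",
        }},
    ]


# Usage from a scheduled job (assumes a PyMongo database object `db`):
#   db.purchases.aggregate(build_summary_pipeline(3))
#   top20 = db.user_purchase_stats.find().sort("spent_last_3h", -1).limit(20)
```

A user with no purchases in the window is not touched by this pipeline and keeps the previous value, so a real job would also need to reset stale totals.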

This is just my personal opinion and is for reference only.

巴扎黑

You can load all collection data into memory and then process it.
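As a rough illustration of the in-memory approach, the sketch below accumulates a total per user from an iterable of purchase records and then picks the top 20. The record shape (`user_id`, `amount`) is an assumption. When only one page of results is needed, `heapq.nlargest` avoids sorting the whole set; a full `sorted` call is O(n log n) and is often not the bottleneck compared with pulling the data out of the database.

```python
import heapq
from collections import defaultdict


def total_spent_per_user(purchases):
    """Sum the amount of every purchase record, grouped by user."""
    totals = defaultdict(float)
    for record in purchases:
        totals[record["user_id"]] += record["amount"]
    return dict(totals)


def top_spenders(totals, n=20):
    """Return the n (user_id, total) pairs with the largest totals, highest first."""
    return heapq.nlargest(n, totals.items(), key=lambda item: item[1])


# Usage with a PyMongo cursor (hypothetical collection name):
#   totals = total_spent_per_user(db.purchases.find({}, {"user_id": 1, "amount": 1}))
#   first_page = top_spenders(totals, n=20)
```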

左手右手慢动作

MongoDB is not good at processing data across multiple collections, so it is best to keep data that is queried together in the same collection when designing the schema.

When querying a single collection, create indexes for the query. The order of preference for query methods is: basic query -> aggregation -> mapreduce. From left to right the methods become more flexible but less efficient.

Queries across multiple collections have to be implemented yourself: query each collection separately, then combine the results in application code.
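Combining the separate query results might look roughly like the sketch below, which joins user documents with per-user totals on `_id` and sorts the merged rows. The document shapes and the `name` and `total_spent` fields are assumptions for illustration; the inputs can be any iterables of dicts, such as the results of two PyMongo queries.

```python
def join_users_with_totals(users, totals, sort_field="total_spent"):
    """Join user documents with per-user statistics and sort descending.

    Users without a statistics row get a total of 0.
    """
    # Index the statistics by user id so each lookup is O(1).
    totals_by_user = {row["_id"]: row for row in totals}
    merged = []
    for user in users:
        stats = totals_by_user.get(user["_id"], {})
        merged.append({
            "user_id": user["_id"],
            "name": user.get("name"),
            sort_field: stats.get(sort_field, 0),
        })
    merged.sort(key=lambda row: row[sort_field], reverse=True)
    return merged
```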

Where latency requirements are especially strict, add an intermediate cache layer and design an update strategy for it.
