java - mysql single table pipeline 500W data multi-dimensional statistics solution
黄舟
黄舟 2017-05-19 10:07:03
0
4
910

Now there is a flow table structure in the project as follows
id sdkVersion jarVersion countryCode imei createTime

The previous requirement was to find the total number by grouping sdkVersion, jarVersion, countryCode and the total number after imei sorting. The approximate sql is as follows:

select sdkVersion,jarVersion,countryCode,count(*),count(distinct imei) from xxx 
where createTime = 'xxxx-xx-xx'
group by sdkVersion,jarVersion,countryCode

Find out all the data of the previous day and summarize it into a table. The structure is roughly as follows

id sdkVersion jarVersion countryCode count(*) count(distinct imei) createTime

The current requirement is to query the combination of any latitude, that is,
group by sdkVersion
group by jarVersion
group by countryCode
group by sdkVersion, countryCode
and so on. Combination, if we follow the previous daily summary plan, we will have to create many tables for different latitude combinations. Is there any good solution to solve this problem? Or can it be solved using a specialized statistical framework?

黄舟
黄舟

人生最曼妙的风景,竟是内心的淡定与从容!

reply all(4)
仅有的幸福

You can check out the PipelineDB streaming database

黄舟

apache kylin, sub-second olap

曾经蜡笔没有小新

For daily summary, real-time requirements are not high, and 500W records are still within the processing range. View + scheduled plan can meet the requirements, and there is no need to build multiple tables.
It is best for the questioner to explain what bottlenecks or pain points there are. After all, mysql is a mature product, and there are certain risks in switching to cutting-edge technology.

黄舟

Write the stored procedure and run it regularly every day

Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template