How should reports be designed for a large distributed system?

Question

I'm really speechless. My second submission was rejected again, this time because "the content belongs to a technical discussion. It is recommended to briefly talk about your thoughts on this issue so as to better have a technical exchange with others." The first rejection was because of the layout. Why is it so difficult to post a Q&A post...

过去多啦不再A梦 · Answer

The requirement you describe is essentially building a data warehouse. The basic idea is:

1. The data warehouse database is kept separate from the business system's database. Data warehouse modeling generally calls for a layered design, not simply building one large table.
It is usually divided into a buffer layer, a base layer, an aggregation layer, a report layer, and so on, and each layer has a different focus. The base layer still follows a normalized (normal-form) model, the aggregation layer generally stores data redundantly, and the report layer is generally a wide-table design with many columns.

2. Data synchronization. When the data volume is large, there must be an incremental mechanism. If the source system does not provide one, you need to request a modification to that system.

3. There are several options for the synchronization method:

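a. Connect the databases with dblink and hand-write stored procedures.
b. Use an ETL tool such as Informatica PowerCenter or Kettle.
c. Use dedicated database-level synchronization software, such as Oracle's OGG (GoldenGate).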
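
Points 1 and 2 above can be sketched together in a few lines. This is only a rough illustration, not a recommended implementation: it uses SQLite from Python's standard library as a stand-in for both the business database and the warehouse, and every table and column name (src_order, buf_order, agg_daily_sales, rpt_region_summary, updated_at) is a made-up assumption. It also assumes the source table carries an updated_at timestamp that can serve as the incremental watermark. The base layer is left out to keep the example short.

```python
import sqlite3

# Hypothetical warehouse schema; all names here are illustrative assumptions.
WAREHOUSE_DDL = """
CREATE TABLE buf_order (            -- buffer layer: raw copy of source rows
    order_id INTEGER PRIMARY KEY, region TEXT, order_date TEXT,
    amount REAL, updated_at TEXT);
CREATE TABLE agg_daily_sales (      -- aggregation layer: pre-summed per day and region
    order_date TEXT, region TEXT, order_count INTEGER, total_amount REAL);
CREATE TABLE rpt_region_summary (   -- report layer: one wide row per region
    region TEXT, order_count INTEGER, total_amount REAL,
    first_order_date TEXT, last_order_date TEXT);
"""


def sync_incremental(source, warehouse, last_watermark):
    """Copy source rows changed after last_watermark into the buffer layer.

    Returns the new watermark (largest updated_at seen so far).
    """
    rows = source.execute(
        "SELECT order_id, region, order_date, amount, updated_at "
        "FROM src_order WHERE updated_at > ? ORDER BY updated_at",
        (last_watermark,),
    ).fetchall()
    # INSERT OR REPLACE keys on order_id, so re-sent or updated rows overwrite.
    warehouse.executemany(
        "INSERT OR REPLACE INTO buf_order VALUES (?, ?, ?, ?, ?)", rows
    )
    warehouse.commit()
    return rows[-1][4] if rows else last_watermark


def rebuild_upper_layers(warehouse):
    """Recompute the aggregation and report layers from the buffer layer."""
    warehouse.executescript("""
        DELETE FROM agg_daily_sales;
        INSERT INTO agg_daily_sales
            SELECT order_date, region, COUNT(*), SUM(amount)
            FROM buf_order GROUP BY order_date, region;
        DELETE FROM rpt_region_summary;
        INSERT INTO rpt_region_summary
            SELECT region, SUM(order_count), SUM(total_amount),
                   MIN(order_date), MAX(order_date)
            FROM agg_daily_sales GROUP BY region;
    """)
    warehouse.commit()


if __name__ == "__main__":
    src = sqlite3.connect(":memory:")
    src.execute(
        "CREATE TABLE src_order (order_id INTEGER PRIMARY KEY, region TEXT, "
        "order_date TEXT, amount REAL, updated_at TEXT)"
    )
    src.execute(
        "INSERT INTO src_order VALUES "
        "(1, 'North', '2024-01-01', 100.0, '2024-01-01T10:00:00')"
    )
    wh = sqlite3.connect(":memory:")
    wh.executescript(WAREHOUSE_DDL)
    watermark = sync_incremental(src, wh, "")  # empty watermark = full first load
    rebuild_upper_layers(wh)
    print(watermark, wh.execute("SELECT * FROM rpt_region_summary").fetchall())
```

In this sketch only the buffer layer is loaded incrementally, and the upper layers are fully recomputed for simplicity. With a strict `updated_at > ?` comparison, rows that share the watermark's exact timestamp but are written later could be missed, so a real job would need to handle that case (and deletes) explicitly.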

System Background

Platform Introduction

Existing requirements

Technical Difficulties

Personal design ideas