java - What are good approaches to aggregating cross-service queries in a microservice architecture?

Question

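In a microservice architecture, each service has its own independent database. There is now a requirement to generate a real-time report that contains data from two services, say service A and service B. B holds only A's primary key id as the link between them. The report's search conditions include fields from service A's entity and also...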

迷茫 · Answer

Thanks for the invite.

If this is online business data (OLTP), option 1 is the standard practice for microservices. But if such joined queries have to run frequently online, the two services (and their two databases) are very tightly coupled, so why split them in the first place?

If it is an analytical report, it falls under OLAP, and option 2 is indeed a reasonable choice. If a search engine feels like overkill, you could run the report queries on a replica instead. For example, replicate the online A and B databases in real time into a single read-only database, and do the JOIN there in one query.

天蓬老师 · Answer

One design principle of microservices is to split unrelated business functions into separate services. Your two business areas overlap.

PHP中文网 · Answer

In fact, this kind of problem is very common in microservices. For example, you may need to query orders by some attribute of a product, while orders and products belong to two different microservices. Besides your own two options, there are also:

Move the aggregation into a data warehouse: aggregate the data of A and B in real time into another database (not necessarily MySQL; HBase also works) and have the report read from the warehouse.
Add some redundant fields when designing the tables: as you said, some fields of A can be duplicated on B where the need is predictable.
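The redundant-field idea can be sketched in Java as follows. This is only an in-memory illustration under assumed names: `RedundantFieldStore`, `BRow`, the `aName` field, and the `onANameChanged` hook are all hypothetical, and a real system would keep the copy in B's table and refresh it from whatever change notification service A provides.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RedundantFieldStore {
    // Hypothetical row of service B; aName is a redundant copy of a field owned by A.
    static final class BRow {
        final long id;
        final long aId;
        String aName;

        BRow(long id, long aId, String aName) {
            this.id = id;
            this.aId = aId;
            this.aName = aName;
        }
    }

    private final Map<Long, BRow> rowsById = new HashMap<>();

    void save(BRow row) {
        rowsById.put(row.id, row);
    }

    // Assumed hook: called when A reports that the copied field changed.
    // Returns how many B rows were refreshed.
    int onANameChanged(long aId, String newName) {
        int updated = 0;
        for (BRow row : rowsById.values()) {
            if (row.aId == aId) {
                row.aName = newName;
                updated++;
            }
        }
        return updated;
    }

    // The report filter on A's field now runs entirely inside B.
    List<BRow> findByAName(String aName) {
        List<BRow> hits = new ArrayList<>();
        for (BRow row : rowsById.values()) {
            if (row.aName.equals(aName)) {
                hits.add(row);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        RedundantFieldStore store = new RedundantFieldStore();
        store.save(new BRow(10L, 1L, "alice"));
        store.save(new BRow(11L, 1L, "alice"));
        store.save(new BRow(12L, 2L, "bob"));
        store.onANameChanged(1L, "alicia");
        System.out.println(store.findByAName("alicia").size()); // prints 2
    }
}
```

The trade-off is that every change to the copied field in A has to be propagated to B, so this fits best for fields that rarely change.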

Method 1 has a fatal weakness: once paging is involved, it is no longer feasible. Which option to adopt depends on the size of your data. If the data volume is not very large, you can use method 1; if it is slow, use a few more threads to fetch the data in batches (when there are too many IDs, pulling them in batches with batch queries is an effective way to reduce timeouts and latency). If the data volume is large, a data warehouse is recommended. Its main benefit is that it puts no pressure on the primary database, because the aggregate tables can be built from the Binlog. A report is still essentially offline data; if it really has to be as real-time and efficient as an order query, reflect the current state of the tables, and support many search conditions, then a search engine is a good choice.
So you can use method 1 or method 3, depending on your actual situation.
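The "pull the IDs in batches with a few more threads" advice might look roughly like the sketch below. It assumes the other service exposes some batch lookup by id; here that call is passed in as the `fetchBatch` function, and `BatchFetcher`, `partition`, and `fetchAll` are illustrative names rather than any real API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

class BatchFetcher {

    // Split ids into consecutive chunks of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += batchSize) {
            batches.add(ids.subList(i, Math.min(i + batchSize, ids.size())));
        }
        return batches;
    }

    // Fetch every batch on a fixed-size pool and merge the per-batch maps.
    // fetchBatch stands in for the remote batch query (an assumption).
    static <K, V> Map<K, V> fetchAll(List<K> ids, int batchSize, int threads,
                                     Function<List<K>, Map<K, V>> fetchBatch) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<K, V> merged = new ConcurrentHashMap<>();
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (List<K> batch : partition(ids, batchSize)) {
                futures.add(CompletableFuture
                        .supplyAsync(() -> fetchBatch.apply(batch), pool)
                        .thenAccept(merged::putAll));
            }
            // Wait for all batches; a failed batch surfaces here as an exception.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
            return merged;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Long> aIds = List.of(1L, 2L, 3L, 4L, 5L);
        // Fake batch lookup used only for the demo.
        Map<Long, String> rows = fetchAll(aIds, 2, 3, batch -> {
            Map<Long, String> m = new HashMap<>();
            for (Long id : batch) {
                m.put(id, "b-row-for-" + id);
            }
            return m;
        });
        System.out.println(rows.size()); // prints 5
    }
}
```

Batch size and thread count are tuning knobs: smaller batches lower the risk of a single slow request timing out, while the pool size caps the load placed on the other service.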

黄舟 · Answer

Requirements such as report generation should not be served from the business database systems. You can set up an aggregation database on the backend, fed by otter, that synchronizes data from multiple services in real time. Then you can query that aggregation database however you like.
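To make the aggregation-database idea concrete, here is a minimal in-memory sketch in Java. It does not use otter or any real replication tool: the `applyA`/`applyB` methods simply stand in for whatever mechanism delivers row changes from the two services, and all class and field names (`AggregationStore`, `ARow`, `BRow`, `amount`, and so on) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AggregationStore {
    // Hypothetical row shapes replicated from services A and B.
    static final class ARow {
        final long id;
        final String name;
        ARow(long id, String name) { this.id = id; this.name = name; }
    }

    static final class BRow {
        final long id;
        final long aId;
        final int amount;
        BRow(long id, long aId, int amount) { this.id = id; this.aId = aId; this.amount = amount; }
    }

    static final class ReportLine {
        final String aName;
        final long bId;
        final int amount;
        ReportLine(String aName, long bId, int amount) { this.aName = aName; this.bId = bId; this.amount = amount; }
    }

    private final Map<Long, ARow> aRows = new HashMap<>();
    private final Map<Long, BRow> bRows = new HashMap<>();

    // Apply an insert or update replicated from each service's database.
    void applyA(ARow row) { aRows.put(row.id, row); }
    void applyB(BRow row) { bRows.put(row.id, row); }
    void removeB(long bId) { bRows.remove(bId); }

    // Both data sets are local, so filtering on A's field is a plain join.
    List<ReportLine> report(String aName) {
        List<ReportLine> lines = new ArrayList<>();
        for (BRow b : bRows.values()) {
            ARow a = aRows.get(b.aId);
            if (a != null && a.name.equals(aName)) {
                lines.add(new ReportLine(a.name, b.id, b.amount));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        AggregationStore store = new AggregationStore();
        store.applyA(new ARow(1L, "alice"));
        store.applyA(new ARow(2L, "bob"));
        store.applyB(new BRow(100L, 1L, 30));
        store.applyB(new BRow(101L, 1L, 12));
        store.applyB(new BRow(102L, 2L, 7));
        int total = 0;
        for (ReportLine line : store.report("alice")) {
            total += line.amount;
        }
        System.out.println(total); // prints 42
    }
}
```

In a real setup the two maps would be tables in the aggregation database and `report` would be a SQL JOIN, but the shape is the same: changes flow in from both services, and the report never touches the business databases.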