
Hardcore stuff: a journey of reconstruction of more than 30,000 lines of code in a core system

Release: 2023-07-26 15:48:11
The classic book "Refactoring" contains this passage:
In the beginning, the refactorings I did were all about minutiae. As the code became more concise, I found that I could see design-level things that I couldn't understand before. Without refactoring, I would not have been able to reach this level.
Refactoring is really an exciting thing for programmers.
At the beginning of this year, our team completed the refactoring of a complex project: the core engine of our advertising system, with about 300 files and more than 30,000 lines of code.
It took only about a month from technical design to full launch, and there were no incidents.
This is probably the largest and most successful refactoring project of my 8-year programming career: it was fast, the plan was thorough, and the quality passed muster.

01 Let’s first talk about the historical baggage of this system

Before this refactoring, our advertising engine had been iterated on for about a year and a half. The early iterations focused on the search scenario, with a single line of business and a clear process.

Starting in 2019, the company's advertising business expanded rapidly, with revenue growing almost exponentially. During this period, our advertising engine faced two challenges:

1. Business scenarios became more complex. In addition to search advertising, the engine had to support feed recommendation and similar-item recommendation scenarios.

2. Advertising traffic grew rapidly. Beyond meeting functional requirements, we also had to take performance into account.

After sorting things out, we found that most of the engine's logic could be shared, so we defined a main framework and abstracted the parts that needed to be extensible. Each scenario could then implement a set of common interfaces according to its own business needs. In addition, for the sake of performance, we sacrificed some code readability and parallelized part of the logic.
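The shape of such a framework can be sketched roughly as below. This is only an illustration, not the engine's actual code: every class and method name (`AdEngine`, `recall`, `rank`, `loadProfile`) is hypothetical, and the choice of which steps run in parallel is an assumption, since the article does not say which logic was parallelized.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical names; the article does not show the real engine's classes.
abstract class AdEngine {
    // Shared main flow (template method): identical for every scenario.
    public final List<String> serve(String query) {
        // Assumed example of parallelization: recall and profile lookup
        // do not depend on each other, so they run concurrently.
        CompletableFuture<List<String>> recalled =
                CompletableFuture.supplyAsync(() -> recall(query));
        CompletableFuture<String> profile =
                CompletableFuture.supplyAsync(() -> loadProfile(query));
        return rank(recalled.join(), profile.join());
    }

    // Extension points that each scenario must implement.
    protected abstract List<String> recall(String query);

    protected abstract List<String> rank(List<String> candidates, String profile);

    // Shared default that a scenario may override.
    protected String loadProfile(String query) {
        return "default-profile";
    }
}

// Search scenario: ranks candidates alphabetically (placeholder logic).
class SearchAdEngine extends AdEngine {
    @Override
    protected List<String> recall(String query) {
        return List.of(query + "-ad-b", query + "-ad-a");
    }

    @Override
    protected List<String> rank(List<String> candidates, String profile) {
        List<String> sorted = new ArrayList<>(candidates);
        sorted.sort(String::compareTo);
        return sorted;
    }
}

// Feed scenario: keeps the recall order (placeholder logic).
class FeedAdEngine extends AdEngine {
    @Override
    protected List<String> recall(String query) {
        return List.of("feed-ad-1", "feed-ad-2");
    }

    @Override
    protected List<String> rank(List<String> candidates, String profile) {
        return candidates;
    }
}

class AdEngineDemo {
    public static void main(String[] args) {
        System.out.println(new SearchAdEngine().serve("shoes"));
        System.out.println(new FeedAdEngine().serve("home"));
    }
}
```

The trade-off is visible even at this size: because `serve` is fixed, any change to the main flow affects every subclass at once.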

As the business grew, the search scenario entered a period of rapid iteration, with more and more new strategies added, and our main framework gradually became inflexible.

Changing the main framework would have meant reworking every scenario other than search. During a period of rapid business growth the schedule simply did not allow that, so we could only patch the existing framework. This caused two obvious problems:

1. To stay compatible with the search-specific logic, we had to add various if checks in the other scenarios to bypass it.

2. The number of advertising strategies kept growing, to dozens in total. Once the framework lost its clear structure, some strategies were implemented in ad hoc ways, without layering or a pluggable abstraction.

Against this background, as changes accumulated, the code drifted away from the original design intent and the technical debt grew heavier and heavier. Yet we could never find the right time to refactor.

The turning point came at the end of 2019. Because of the nature of the advertising business, traffic naturally began to decline. In addition, the product and operations teams were focused on planning for the following year, which gave us a very good window to start this refactoring.

We set the schedule at one month, and in the end went live only one day later than planned. Two problems did surface online, but both were caught and fixed during the grayscale rollout, and neither caused an online incident.

Overall, this was a very difficult and relatively successful refactoring project. Below I go through the valuable lessons I learned from it.

02 What preparations did we make before refactoring?

The amount of code in this refactoring was very large, more than 30,000 lines, and it was the core engine of the advertising system. Before starting, we could anticipate the following difficulties:

1. Resistance on the business side: Advertising is extremely business-driven. Although this refactoring would improve long-term R&D efficiency, it would not directly increase revenue, and the development cycle would not be short. How could we get support from our business colleagues?

2. Concerns on the technical side: The company has a penalty system for online incidents, so a refactoring that caused one would have consequences. How could we let everyone work without that burden? At the same time, if heavy business iterations were interleaved with the refactoring, no one could guarantee the delivery date, and quality would be hard to control.


To address the concerns of both sides, I think the following pieces of work played a key role.

▍Let everyone see the pain points

As mentioned earlier, with continued business iteration the main framework of our advertising engine had become blurred, and dozens of advertising strategies were scattered across different business scenarios with messy configuration.

To address these two pain points, we started sorting out the existing business one month in advance, reading the old code and going through earlier requirements documents. In the end, we organized the core processes of the different scenarios and the categorized advertising strategies into one clear table.

It was this table that let engineering and product clearly see the full picture of our engine for the first time, and understand the complexity of the business and the current technical bottlenecks.

▍Clarify the goals and value of refactoring

Once everyone had felt the pain points, we set two core goals for this refactoring:

1. Rebuild the main framework: modularize the main process and redefine the protocols between upper and lower layers so that interfaces are clear; each layer must also be abstracted and extensible.

2. Flexible, configurable strategies: advertising strategies are classified and abstracted by business intent, their execution conditions are dynamically configurable, and strategies can be plugged in and removed at will.
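One way to picture the second goal is the minimal sketch below. It is an illustration under stated assumptions rather than the team's design: `AdStrategy`, `StrategyPipeline`, `AdContext` and the two sample strategies are hypothetical names, and the condition map is built in code here, whereas a real system would presumably load it from configuration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// All names are hypothetical; they only illustrate "pluggable strategy + configurable condition".
final class AdContext {
    final String scenario;
    final List<String> ads;

    AdContext(String scenario, List<String> ads) {
        this.scenario = scenario;
        this.ads = new ArrayList<>(ads); // mutable copy that strategies may edit
    }
}

interface AdStrategy {
    String name();

    void apply(AdContext ctx);
}

final class StrategyPipeline {
    private final List<AdStrategy> strategies = new ArrayList<>();
    // Execution condition per strategy name; a strategy with no entry never runs.
    private final Map<String, Predicate<AdContext>> conditions;

    StrategyPipeline(Map<String, Predicate<AdContext>> conditions) {
        this.conditions = conditions;
    }

    StrategyPipeline register(AdStrategy strategy) {
        strategies.add(strategy);
        return this;
    }

    void run(AdContext ctx) {
        for (AdStrategy strategy : strategies) {
            Predicate<AdContext> condition =
                    conditions.getOrDefault(strategy.name(), c -> false);
            if (condition.test(ctx)) {
                strategy.apply(ctx);
            }
        }
    }
}

// Removes duplicate ads while keeping the first occurrence of each.
final class DedupStrategy implements AdStrategy {
    public String name() { return "dedup"; }

    public void apply(AdContext ctx) {
        List<String> unique = new ArrayList<>(new LinkedHashSet<>(ctx.ads));
        ctx.ads.clear();
        ctx.ads.addAll(unique);
    }
}

// Puts a placeholder boosted ad in front of the list.
final class BoostStrategy implements AdStrategy {
    public String name() { return "boost"; }

    public void apply(AdContext ctx) { ctx.ads.add(0, "boosted-ad"); }
}

class StrategyDemo {
    public static void main(String[] args) {
        Map<String, Predicate<AdContext>> conditions = new HashMap<>();
        conditions.put("dedup", c -> true);                       // every scenario
        conditions.put("boost", c -> c.scenario.equals("search")); // search only

        StrategyPipeline pipeline = new StrategyPipeline(conditions)
                .register(new DedupStrategy())
                .register(new BoostStrategy());

        AdContext feed = new AdContext("feed", List.of("a", "a", "b"));
        pipeline.run(feed);
        System.out.println(feed.ads);
    }
}
```

With this arrangement, the scenario-specific if checks live in the condition table rather than inside each strategy, so enabling a strategy for a new scenario is a configuration change.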

In addition, we spelled out the expected benefits of achieving these two core goals:

1. Technical benefits: The code structure becomes clearer and easier to understand and maintain; extensibility improves, and engine development efficiency increases further.

2. Business benefits: Strategies can be configured and extended at a finer granularity, which supports the business better; improved R&D efficiency further speeds up business iteration.

After we shared the value of the refactoring with everyone, people were more excited about it and more motivated to take part.

▍Control the overall pace

Controlling the overall pace is also very important, because it gives everyone a clear expectation of the timeline.

First, we set the schedule at one month. On the one hand, this was the longest cycle the business side could accept, and technically we also wanted to finish quickly. On the other hand, the Spring Festival was approaching, so we had to go live before the company's release freeze and still reserve a buffer of 1-2 weeks for unexpected situations.

In addition, we reached an agreement with the business side: during the refactoring, non-urgent requirements for the engine would not be accepted. This minimized parallel development and code conflicts and let the team stay focused.

03 What lessons from the implementation are worth sharing?

This refactoring went very smoothly. I have 4 valuable lessons to share.

1. High-quality technical design plan

This builds on our everyday practice: we write a technical design for any project with a development cycle of more than 3 days, and this refactoring was of course no exception.

The overall architecture of the framework, the protocol design between modules, and the extensibility design of the strategies were the focus of this technical design. The team discussed it no fewer than three times.

After the overall plan was finalized, the team further refined the shared parts, such as the database, interface fields, cache structure, and log instrumentation points. Because several people were developing in parallel, the team agreed to use documents as the communication interface and to keep the documents in sync with the code at all times.

Under these requirements, the team produced a technical design document of more than 5,000 words and 36 pages, which laid a good foundation for overall quality.

2. Build the framework code skeleton first

This PR was critical: it was the most important step in turning our technical design into code. In it we laid out the new package structure, the module division, the API definitions between layers, and the abstractions for the different advertising strategies, leaving implementation details aside for the moment.

With this in place, the main body of the code had taken shape and clearly depicted our ideal framework. We then held several focused code reviews and finally reached a unified view.

This step helps avoid getting caught up in implementation details too early, which leads to too little attention on the main framework, unstable code, and later rework that drags down efficiency.

3. Frequent communication and paired code review mechanism

Once we entered detailed implementation, one thing became very important: understanding the existing logic. The engine code had been iterated on for a year and a half and had been worked on by many people over time, but only three engineers took part in this refactoring.

Throughout the process, whenever we ran into unclear code logic, we communicated and verified repeatedly instead of guessing. This caution is really important.

In addition, for code review, we assigned each module to an engineer familiar with that part of the business. Reviewers worked in pairs, and the mechanism was flexible.

4. Effective test plan

Test first, before you refactor. This principle is emphasized in the book "Refactoring" and was also a focus of our discussions on the technical design. I single it out here to cover it in detail.

First, we agreed early on not to touch any of the old code, and to build the refactored code in a completely new package. This makes it easy to compare results before and after the refactoring, and to run online grayscale experiments at the same time.
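Keeping both implementations side by side means traffic can be switched between them by percentage. The sketch below shows the idea only; the team used an existing ABTest platform for this, which the article does not describe, so `GrayscaleRouter`, its methods, and the hash-based bucketing are all assumptions made for illustration.

```java
import java.util.function.Function;

// Hypothetical stand-in for an A/B platform: routes a share of traffic to the new code path.
final class GrayscaleRouter<Q, R> {
    private final Function<Q, R> oldImpl;
    private final Function<Q, R> newImpl;
    private volatile int newTrafficPercent; // 0..100

    GrayscaleRouter(Function<Q, R> oldImpl, Function<Q, R> newImpl, int newTrafficPercent) {
        this.oldImpl = oldImpl;
        this.newImpl = newImpl;
        setNewTrafficPercent(newTrafficPercent);
    }

    // Raise this step by step, for example 5, 10, 30, 100.
    void setNewTrafficPercent(int percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be in [0, 100]: " + percent);
        }
        this.newTrafficPercent = percent;
    }

    // Stable bucket in [0, 100): the same key always maps to the same bucket.
    static int bucketOf(String trafficKey) {
        return Math.floorMod(trafficKey.hashCode(), 100);
    }

    R handle(String trafficKey, Q request) {
        boolean useNew = bucketOf(trafficKey) < newTrafficPercent;
        return useNew ? newImpl.apply(request) : oldImpl.apply(request);
    }
}

class GrayscaleDemo {
    public static void main(String[] args) {
        GrayscaleRouter<String, String> router = new GrayscaleRouter<String, String>(
                q -> "old:" + q, q -> "new:" + q, 5);
        System.out.println(router.handle("user-42", "shoes"));
        router.setNewTrafficPercent(100); // full rollout
        System.out.println(router.handle("user-42", "shoes"));
    }
}
```

Because buckets are stable, a key that reaches the new code at 5% keeps reaching it at 10% and 30%, which makes it easier to trace a problem back to specific traffic in the logs.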

Regarding the test plan, the following 4 points are worth borrowing:

1. End-to-end testing: This refactoring involves no functional changes, so the behavior of the external API does not change, which makes end-to-end testing the most effective approach. It was the most important means of testing for both R&D and QA.

2. Smoke testing: QA provides the smoke cases and R&D runs them. All smoke cases must pass before R&D hands the build over for testing. This is not common at most Internet companies, but it is very effective for large projects.

3. Dual-run verification in a sandbox environment: As mentioned above, both the old code and the refactored code are retained, so we can capture request parameters from the online environment with scripts, use them as test cases, and then automatically compare the fields returned by the API one by one.

4. Grayscale experiments in the online environment: Grayscale rollout is very important for a refactoring. We used our existing ABTest platform to open up grayscale traffic gradually, from 5% to 10% to 30% and finally to 100%. We set a very cautious ramp-up pace and verified each step through logs and business metric monitoring.
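The field-by-field comparison in point 3 can be sketched as follows. This is a simplified illustration, not the team's script: responses are modeled as flat maps, the two engines are passed in as plain functions, and all names (`DualRunComparator`, `diffFields`, `replay`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeSet;
import java.util.function.Function;

// Hypothetical helper: replays captured requests against both code paths and reports differences.
final class DualRunComparator {
    // Returns one entry per field whose value differs (including fields present on one side only).
    static List<String> diffFields(Map<String, Object> oldResp, Map<String, Object> newResp) {
        List<String> diffs = new ArrayList<>();
        TreeSet<String> keys = new TreeSet<>(oldResp.keySet()); // sorted for stable output
        keys.addAll(newResp.keySet());
        for (String key : keys) {
            Object before = oldResp.get(key);
            Object after = newResp.get(key);
            if (!Objects.equals(before, after)) {
                diffs.add(key + ": old=" + before + ", new=" + after);
            }
        }
        return diffs;
    }

    static List<String> replay(List<String> capturedRequests,
                               Function<String, Map<String, Object>> oldEngine,
                               Function<String, Map<String, Object>> newEngine) {
        List<String> report = new ArrayList<>();
        for (String request : capturedRequests) {
            for (String diff : diffFields(oldEngine.apply(request), newEngine.apply(request))) {
                report.add(request + " -> " + diff);
            }
        }
        return report;
    }
}

class DualRunDemo {
    // Stand-in for the pre-refactoring code path.
    static Map<String, Object> oldEngine(String request) {
        Map<String, Object> resp = new HashMap<>();
        resp.put("adId", request + "-1");
        resp.put("price", 10);
        return resp;
    }

    // Stand-in for the refactored code path, with a simulated regression on "q2".
    static Map<String, Object> newEngine(String request) {
        Map<String, Object> resp = oldEngine(request);
        if (request.equals("q2")) {
            resp.put("price", 12);
        }
        return resp;
    }

    public static void main(String[] args) {
        List<String> report = DualRunComparator.replay(
                List.of("q1", "q2"), DualRunDemo::oldEngine, DualRunDemo::newEngine);
        report.forEach(System.out::println);
    }
}
```

An empty report over a large sample of captured requests is the signal that the refactored path is behaviorally equivalent; real responses are usually nested, so a production version would need to flatten or recurse into them.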

Final words


Looking back at the whole refactoring, I would summarize it in the following 7 key points:

1. Seize the right moment to refactor
2. Sort things out early and find the pain points first
3. Clarify the goals and value to get everyone excited
4. Avoid a long campaign, and avoid running in parallel with business work
5. A high-quality technical design is required
6. Test first, before you refactor
7. Verify carefully and take responsibility for every line of code

Of course, the most critical factor is people. Refactoring a large project is a severe test of a team's ability to collaborate. If everyone is reliable, the refactoring is already half successful.


source:Java学习指南