Is ChatGPT a good helper for programmers, or is it going to put programmers out of work?
I think the answer depends not only on its ability to generate code but, more importantly, on whether it has strong design capabilities.
Design capabilities come at two levels. One is high-level design, such as architecture design and system design.
The other is relatively low-level design, mainly designing specific classes and interfaces.
Today we take a look at how it performs at both levels.
Since ChatGPT’s answers are particularly long-winded, I will delete some details and keep only the important parts.
Due to company confidentiality, I cannot use real projects and can only test it with cases published in books.
The case I use here is a customer service work order (ticket) system, taken from the book "From Programmer to Architect". The book contains many practical cases, and I recommend it to everyone.
The approximate requirements of the customer service work order system are as follows:
There is a customer service system that uses a MySQL database. More than 30 million work orders are generated every year, and each work order goes through 5 to 6 operations. These operations must also be recorded, so more than 150 million operation records are added every year.
When customer service staff open the system to view work orders, it is very slow: a page takes 10 seconds to open. The SQL statements, indexes, and so on have already been optimized as far as they can go.
System performance is now seriously affecting the efficiency of customer service. The customer requires that the problem be solved within a week, which is a very short time.
After analyzing the business process, we found that once a work order has been processed, it is given the status "closed". Such a work order is rarely opened again, at most a few times a year.
The team had never used database partitioning before and was not sure it could be done within a week, so the partitioning plan was dropped.
For this problem, the solution in the book is "hot and cold separation": day-to-day operations run against the "hot database", and "closed" work orders are moved into a new "cold database". During the separation of hot and cold data, however, data consistency issues need to be dealt with.
Let’s hand this requirement to GPT-4 and see how it answers.
GPT-4: In this case, I recommend the following two methods to solve the performance problem:
1. Database archiving: Since the majority of closed work orders no longer need to be accessed, their data can be moved from the main database to an archive database. This greatly reduces the size of the main database and thereby improves query speed. The archive database can be updated regularly (such as monthly or quarterly) by moving closed work orders into it...
The specific operation steps are as follows:
a. Create a new MySQL database to store archived data.
b. Write a script to regularly move closed work orders and their related operation records from the main database to the archive database and ensure data consistency.
c. Modify the customer service system so that it only queries unarchived data in the main database by default.
d. Provide a separate interface or tool for customer service personnel to query the data in the archive database when needed.
Yes, it got the point right away: it proposed a database archiving scheme along with specific operation steps.
However, there are certain hints in the requirements, and this solution is easy to think of.
Liu Xin: In your operation steps, how do you ensure data consistency? You need to operate on two databases here.
GPT-4: It is very important to ensure data consistency. When operating on two databases, you can use the following methods to ensure data consistency:
Liu Xin: When you use transactions, how can you start a transaction that spans two databases?
GPT-4: Sorry, I did make a mistake in my previous answer. In MySQL we cannot directly use a single transaction between two different databases. In this case, consider using other strategies to ensure data consistency. One possible approach is to use Two-Phase Commit (2PC). ......omitted......
Liu Xin: The two-phase commit protocol is very fragile. Are there other safe ways to move data from the main database to the archive database?
GPT-4: Sorry, the previous solution did not fully meet your needs. Here I provide a new solution that uses message queues (such as RabbitMQ, Kafka or Amazon SQS) to ensure the consistency of the data migration process. This approach breaks the data migration process into two separate steps, ensuring easy rollback if something goes wrong. ......omitted......
Liu Xin: If you don’t use MQ and write the code yourself, how can you ensure consistency during the data migration process?
GPT-4: If you do not want to use Message Queuing, you can write your own code to ensure consistency during data migration using the following method:
a. Add a Boolean field named "is_migrated" to the work order table and operation record table respectively to mark whether the records have been migrated.
b. Write a script or program to query the master database for records that meet migration conditions (for example, closed work orders and "is_migrated" is false).
......omitted......
Well, at this point it finally found a solution somewhat similar to the one in the book, which is to use a field to mark "whether the record has been migrated".
However, no matter how I prompted it, it would not suggest how to make it idempotent, so I gave up.
Liu Xin: The amount of data in the archive database is very large and queries are very slow, but the queries are basically "find the work orders by customer ID, then find all processing records of a single work order by work order ID". Is there any way to solve this?
GPT-4: For this query mode, the following strategies can be adopted to optimize the query performance of the archive database:
None of these solutions solves the problem effectively. The right solution is to use a NoSQL database such as HBase, which it did not think of.
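The text names HBase but does not describe a schema, so the following is only a sketch of why a store that keeps rows sorted by row key fits this access pattern. HBase keeps rows in lexicographic row-key order, which makes a scan over a key prefix cheap. The `SortedKeyTable` class below is a toy in-memory stand-in, not the HBase API, and the composite key layouts in `ticket_key` and `operation_key` are assumptions made for illustration.

```python
from bisect import bisect_left, insort


class SortedKeyTable:
    """Toy stand-in for a table whose rows are kept sorted by row key."""

    def __init__(self):
        self._keys = []   # row keys in lexicographic order
        self._rows = {}   # row key -> value

    def put(self, key, value):
        if key not in self._rows:
            insort(self._keys, key)
        self._rows[key] = value

    def scan_prefix(self, prefix):
        """Return the values of all rows whose key starts with `prefix`, in key order."""
        start = bisect_left(self._keys, prefix)
        result = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break  # keys are sorted, so no later key can match
            result.append(self._rows[key])
        return result


def ticket_key(customer_id, ticket_id):
    # Hypothetical layout: zero-padded so that string order matches numeric order.
    return f"{customer_id:010d}#{ticket_id:012d}"


def operation_key(ticket_id, seq):
    # Hypothetical layout: all operations of one ticket share a key prefix.
    return f"{ticket_id:012d}#{seq:04d}"


def customer_prefix(customer_id):
    return f"{customer_id:010d}#"


def ticket_prefix(ticket_id):
    return f"{ticket_id:012d}#"
```

With these layouts, "find the work orders of customer 7" becomes `tickets.scan_prefix(customer_prefix(7))`, and "find all processing records of work order 100" becomes `operations.scan_prefix(ticket_prefix(100))`. Both are reads over one contiguous key range rather than index lookups across a very large relational table.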
The case I use is again salary payment, taken from the classic book Agile Software Development: Principles, Patterns, and Practices.
The requirements of this case are as follows:
The system consists of a company database and employee-related data. The system must pay employees on time according to the rules.
There are three types of employees:
1. Hourly employees have a fixed hourly rate. Every day they submit a time card that records the date and the hours worked. Hours beyond 8 in a day are paid at 1.5 times the normal rate. They are paid every Friday.
2. Salaried employees have a fixed monthly salary and are paid on the last working day of each month.
3. Sales staff receive a fixed salary plus a commission based on their sales. They submit sales receipts that record the date and quantity of each sale. They are paid every other Friday.
Employees can choose the payment method: have the check mailed to a postal address they designate, have it held at the finance department to be picked up at any time, or have the pay deposited directly into a bank account they designate.
After reading this requirement, the general design is like this:
Employee serves as the base class from which different types of employee classes inherit.
But this requirement will change. The customer requires that an employee’s type can be changed, for example from hourly to salaried, and then the above design no longer works.
At this point you should introduce an abstraction: use a class PaymentClassification to express the payment classification, and let the Employee class hold an instance of it. Simply put, use composition instead of inheritance.
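The idea of letting Employee hold a PaymentClassification can be sketched in Python roughly as follows. Only `Employee` and `PaymentClassification` come from the text. The subclass names, the `calculate_pay` method, and the constructor parameters are illustrative assumptions, and sales staff are left out for brevity.

```python
from abc import ABC, abstractmethod


class PaymentClassification(ABC):
    """How an employee's pay is computed; swappable at runtime."""

    @abstractmethod
    def calculate_pay(self) -> float:
        """Return the pay for the current period."""


class HourlyClassification(PaymentClassification):
    def __init__(self, hourly_rate, daily_hours):
        self.hourly_rate = hourly_rate
        self.daily_hours = daily_hours  # hours from time cards, one entry per day

    def calculate_pay(self) -> float:
        total = 0.0
        for hours in self.daily_hours:
            regular = min(hours, 8)
            overtime = max(hours - 8, 0)
            # Hours beyond 8 in a day are paid at 1.5 times the normal rate.
            total += regular * self.hourly_rate + overtime * self.hourly_rate * 1.5
        return total


class SalariedClassification(PaymentClassification):
    def __init__(self, monthly_salary):
        self.monthly_salary = monthly_salary

    def calculate_pay(self) -> float:
        return self.monthly_salary


class Employee:
    def __init__(self, name, classification):
        self.name = name
        # Composition: the employee holds a classification instead of being one.
        self.classification = classification

    def calculate_pay(self) -> float:
        return self.classification.calculate_pay()
```

Changing an employee from hourly to salaried is then a single assignment, `employee.classification = SalariedClassification(3000.0)`, and the `Employee` object itself stays the same. With the inheritance design, the same change would require replacing the object with an instance of a different subclass.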
This is a trap, a hole dug for ChatGPT to see how it handles it.
Now for the real start: I told it the requirements and let it do the design. Unexpectedly, it began writing Python code right away. There were too many details, so I stopped the output immediately.
Liu Xin: No need to show the code, just describe the classes.
GPT-4: OK, according to your needs, we can design the following classes and related properties and methods:
GPT-4 is really good: it "understands" the requirements and extracts the nouns to design classes. The Employee, HourlyEmployee, SalariedEmployee, and SalesEmployee classes it designed are almost the same as the first UML diagram.
It has reached the entry level of object-oriented design.
Let’s start digging the hole:
Liu Xin: The above classes are well designed, but if a person starts out as an hourly worker and later becomes a salaried employee, the above design can no longer support this change. What should I do?
GPT-4: In order to solve the problem you mentioned, we can separate the employee's salary calculation strategy from the Employee class and design it using the strategy pattern. In this way, we can dynamically change the salary calculation strategy for the Employee class without modifying it.
Under this design, we need to make some adjustments to the original class structure:
New method: set_salary_strategy, used to set the employee's salary calculation strategy
HourlySalary class, inherited from SalaryStrategy class
SalariedSalary class, inherits from SalaryStrategy class
SalesSalary class, inherits from SalaryStrategy class
With the above adjustments, you can now dynamically change the salary calculation strategy of an employee without modifying the Employee class. When an employee changes from hourly to monthly salary, just set a new salary calculation strategy for them.
I have to say that GPT-4 is very powerful!
It uses the strategy pattern to abstract the salary calculation strategy SalaryStrategy.
Compared with the second UML diagram, except for the different name, it is basically the same.
It can be said that if you can do this, you can surpass many programmers.
I noticed that it had not handled the payment cycle at all, so I kept asking:
Liu Xin: The original demand also included payment time. For example, hourly workers are paid every Friday. How should this be designed?
GPT-4: In order to handle the payment time of different types of employees, we can introduce an abstract method should_pay_today in the SalaryStrategy base class, which determines whether the salary should be paid based on the current date. Next, implement this method in the specific strategy class.
......omitted......
Here it made a mistake: it put the payment cycle into the salary calculation strategy, and even claimed this was "easy to maintain and extend".
Liu Xin: Payment time and salary calculation strategy are two separate concepts that should not be put together. How can we achieve maximum flexibility?
GPT-4: You are right, payment time and salary calculation strategy are two different concepts. To make the design more flexible, we can separate the payment time from the salary calculation strategy and create a new abstract class PaymentSchedule to handle the logic of payment time. In this way, we can set the salary calculation strategy and payment time strategy separately for each employee.
The following is the adjusted class structure:
Retain the base class Employee:
Method: set_payment_method, used to set the employee’s payment method
Method: set_salary_strategy, used to set the employee’s salary calculation strategy
New method: set_payment_schedule, used to set the employee’s payment time strategy
WeeklyPayment class...omitted...
MonthlyPayment class...omitted...
BiweeklyPayment class...omitted...
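The adjusted structure can be sketched in Python roughly as follows. The class names and the `set_salary_strategy`, `set_payment_schedule`, and `should_pay_today` methods are those from GPT-4’s answers above. The method bodies, the `calculate_salary` and `is_pay_day` names, the `first_payday` parameter, and the decision to ignore public holidays are assumptions, since the original answer is omitted here.

```python
from abc import ABC, abstractmethod
from datetime import date, timedelta


class SalaryStrategy(ABC):
    @abstractmethod
    def calculate_salary(self) -> float:
        """Return the amount owed for the current pay period."""


class SalariedSalary(SalaryStrategy):
    def __init__(self, monthly_salary: float):
        self.monthly_salary = monthly_salary

    def calculate_salary(self) -> float:
        return self.monthly_salary


class PaymentSchedule(ABC):
    @abstractmethod
    def should_pay_today(self, today: date) -> bool:
        """Return True if `today` is a payday under this schedule."""


class WeeklyPayment(PaymentSchedule):
    def should_pay_today(self, today: date) -> bool:
        return today.weekday() == 4  # Monday is 0, so 4 is Friday


class BiweeklyPayment(PaymentSchedule):
    def __init__(self, first_payday: date):
        # Assumed reference Friday from which every 14th day is a payday.
        self.first_payday = first_payday

    def should_pay_today(self, today: date) -> bool:
        days = (today - self.first_payday).days
        return days >= 0 and days % 14 == 0


class MonthlyPayment(PaymentSchedule):
    def should_pay_today(self, today: date) -> bool:
        # Last working day of the month, treating Monday to Friday as working
        # days and ignoring public holidays (a simplifying assumption).
        if today.weekday() >= 5:
            return False
        next_day = today + timedelta(days=1)
        while next_day.month == today.month:
            if next_day.weekday() < 5:
                return False  # a later working day exists in this month
            next_day += timedelta(days=1)
        return True


class Employee:
    def __init__(self, name: str):
        self.name = name
        self.salary_strategy = None
        self.payment_schedule = None

    def set_salary_strategy(self, strategy: SalaryStrategy) -> None:
        self.salary_strategy = strategy

    def set_payment_schedule(self, schedule: PaymentSchedule) -> None:
        self.payment_schedule = schedule

    def is_pay_day(self, today: date) -> bool:
        return self.payment_schedule.should_pay_today(today)
```

Because the two concerns live in separate objects, how much an employee is paid and when they are paid can be changed independently, for example by calling `set_payment_schedule(MonthlyPayment())` without touching the salary strategy.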
Amazing! It seems to know what I am thinking: PaymentSchedule and SalaryStrategy are now separate!
So far, it has achieved the final and probably best design in the book:
## 3. Summary
At the detailed design level, GPT-4 performs far better than it does at architectural design.
Long-time readers know that I tested GPT-3.5 with the salary payment case last year. At that time, it was reciting answers from memory and even gave designs that went beyond the requirements.
But GPT-4 is different. Its design capabilities have gone a step further: with prompts from the programmer, it has learned to separate concerns.
In terms of architectural design, I feel that GPT-4 has mastered a lot of knowledge but cannot apply it effectively to complex problems. The solutions it provides are rather muddled and far from what we want.
The more critical question is how to tell whether its answer is correct. I have the standard answer on hand and can evaluate it, but in actual projects we face the unknown. If we have no experience, how do we know whether the design given by GPT-4 is effective and whether it can solve the problem?