Assume that a website (discuz) has a small number of visits from the beginning to tens of millions of daily PVs. Let's speculate on the evolution of its mysql server architecture.
The first stage
The daily pv level of website visits is below 10,000. A single machine runs web and db, and there is no need to do architectural layer tuning (for example, there is no need to increase the memcached cache). At this time, data is often cold backed up daily, but sometimes if data security is considered, a mysql master-slave will be set up.
Second stage
The number of daily visits to the website has reached tens of thousands. At this time, the single machine is already somewhat loaded. We need to separate web and db, and we need to build a memcached service as a cache. In other words, at this stage, we can also use a single machine to run mysql to undertake the data storage and query of the entire website. If you do MySQL master-slave, the purpose is also for data security.
The third stage
The number of daily visits to the website has reached hundreds of thousands. Although a single machine can also support it, the machine configuration required is much better than the previous machine. If funds allow, you can purchase a machine with high configuration to run the MySQL service. However, this does not mean that doubling the configuration will double the performance. At a certain stage, increasing the configuration will no longer bring about an increase in performance. Therefore, at this stage, we will think of clustering mysql services, which means that we can use multiple machines to run MySQL. However, MySQL clusters are different from web clusters. We need to consider data consistency, so we cannot simply apply the methods of web clustering (lvs, nginx proxy). The possible architecture is mysql master-slave, one master and multiple slaves. In order to ensure the robustness of the architecture and data integrity, there can only be one master and multiple slaves.
There is another question that we need to think about, that is, in the front-end web layer, our program specifies the IP of the MySQL machine. So when there are multiple mysql machines, what happens in the program? To configure? discuz actually has a function that supports MySQL read and write separation. That is, we can use multiple machines to run MySQL, one of which is for writing, and the others are for reading. We only need to configure the reading and writing IPs into the program, and the program will automatically distinguish the machines. Of course, if we do not use the configuration that comes with discuz, we can also refer to a software called mysql-proxy and use it to achieve read-write separation. It supports one master and multiple slaves mode.
The fourth stage
The number of website visits reaches several million per day. The previous one-master-multiple-slave model has encountered bottlenecks, because when the number of website visits increases, the amount of database reading will also increase. We need to add more slaves, but when the number of slaves increases to dozens, due to the main All bin-logs need to be distributed to all slaves, so this process itself is a very cumbersome matter. Coupled with frequent reading, it will inevitably cause a large delay in the data synchronized from the slaves. Therefore, we can make an optimization,change the original one master and multiple slaves of mysql to one master and one slave, and then the slave will serve as the master of other slaves. The former master is only responsible for writing website business, and the subsequent slaves It is not responsible for any business of the website, only responsible for synchronizing the bin-log to other slaves. In this way, you can continue to stack several more slave libraries.
The fifth stage
When the daily pv of website visits reached 10 million, we found that the number of writes to the website It is very large. There was only one master in our previous architecture, and the master here has become a bottleneck. Therefore, further adjustments are needed. For example, we can divide the business into modules, separate user-related ones, separate permissions, points, etc., and run a separate library, and then use it as a master-slave, which is the so-called sub-library. Of course, you can also change the latitude, separate tables with large access or write volume, and run them on a server. You can also divide a table into multiple small tables. This step of operation involves some procedural changes, so it is necessary to communicate and design with development colleagues in advance. In short, what we need to do in this step is sub-database and sub-table.
Write at the back
Going forward, just continue to divide the large table into small tables. The domestic Alibaba Taobao website has a huge amount of data. All their databases are MySQL. Their MySQL architecture follows the principle of sub-database and sub-table. However, their division rules will have many latitudes, such as division by region. , can be divided according to buyers and sellers, can be divided according to time, etc.
The above is the detailed content of the evolution process of MySQL architecture from small to large. For more related content, please pay attention to the PHP Chinese website (www.php.cn)!