Table of Contents
The volume of user requests is too large
The single database is too large
A single table is too large
Vertical splitting
Vertical sub-library
Horizontal split table
Horizontal database and table partitioning
Horizontal database sharding and table sharding rules
Transaction support
Multiple database result set merging (group by, order by)
Cross-database join
Home Database Mysql Tutorial Today I finally figured out the MySQL sub-database and sub-tables, so I can brag about it in the interview!

Today I finally figured out the MySQL sub-database and sub-tables, so I can brag about it in the interview!

Jul 26, 2023 pm 03:02 PM
mysql

Preface

The company has recently been engaged in service separation and data segmentation, because the amount of data in a single package table is really too large, and It is still growing at 60W per day.

I have learned about database sub-databases and sub-tables before, and have read a few blog posts, but I only know a vague concept, and now that I think about it, everything is vague.

I spent the whole afternoon reading database sub-tables and reading a lot of articles. Now I will make a summary:

Part 1: Problems faced in the actual website development process.

Part 2: What are the different ways of segmentation, the differences and applicable aspects between vertical and horizontal.

Part 3: Some open source products and technologies currently on the market, and what are their advantages and disadvantages.

Part 4: Perhaps the most important thing, why is it not recommended to split databases and tables horizontally! ? This allows you to treat it carefully in the early stages of planning and avoid problems caused by segmentation.

Explanation of terms

Library: database; table: table; sub-database and sub-table: sharding

The database architecture has just evolved At first, it was enough for us to use a single-machine database. Later, when faced with more and more requests, we separated the write operations and read operations of the database, using multiple slave database copies (Slaver Replication) to be responsible for reading, and using the master database (Master ) is responsible for writing, and the slave library updates data synchronously from the main library to keep the data consistent. Architecturally, it is database master-slave synchronization. The slave library can be scaled horizontally, so more read requests are not a problem.

But when the number of users increases and there are more and more write requests, what should we do? Adding a Master cannot solve the problem, because the data needs to be consistent and the write operation requires synchronization between the two masters, which is equivalent to duplication and is more complicated.

At this time, you need to use sharding to split the write operations.

Problems before sharding databases and tables

Any problem is too big or too small. The amount of data we face here Too big a problem.

The volume of user requests is too large

Because single server TPS, memory, and IO are limited.

Solution: Distribute requests to multiple servers; in fact, user requests and executing a SQL query are essentially the same, both requesting a resource, but user requests will also go through gateways, routing, http servers, etc. .

The single database is too large

The processing capacity of a single database is limited;

Insufficient disk space on the server where the single database is located;

Operation on a single database IO bottleneck

Solution: Split into more smaller libraries

A single table is too large

CRUD is a problem;

Index expansion, query timeout

Solution: Split into multiple tables with smaller data sets.

Methods for sharding databases and tables

Generally, vertical slicing and horizontal slicing are used, which is a result set description The way of segmentation is physical space segmentation.

We start from the problems we face and solve them.

Explanation:
First of all, the number of user requests is too large, so we pile up machines to handle it (this is not the focus of this article)
Then the single library is too large. At this time, we need to see why Too many tables lead to too much data, or because there is too much data in a single table.
If there are many tables and a lot of data, use vertical segmentation to divide it into different libraries according to the business.
If the amount of data in a single table is too large, horizontal segmentation must be used, that is, the data in the table is divided into multiple tables according to certain rules, or even multiple tables on multiple databases.

The order of database and table partitioning should be vertical partitioning first, and then horizontal partitioning. Because vertical division is simpler and more consistent with the way we deal with real-world problems.

Vertical split

Vertical splitting

That is, "split a large table into a small table", which is based on column fields. Generally, there are many fields in the table, and those that are not commonly used, have large data, and are long in length (such as text type fields) are split into "extended tables". It is generally aimed at large tables with hundreds of columns, and also avoids the "cross-page" problem caused by too much data when querying.

Vertical sub-library

Vertical sub-library is aimed at splitting different businesses in a system, such as a database for users, a database for products, and a database for orders. After splitting, it should be placed on multiple servers instead of one server. Why? Let's imagine that a shopping website provides services to the outside world and has CRUD for users, products, orders, etc. Before the split, everything fell into a single database, which would make the database's single database processing capability become a bottleneck. After dividing the database vertically, if it is still placed on a database server, as the number of users increases, the processing power of a single database will become a bottleneck, and the disk space, memory, TPS, etc. of a single server will be very tight. . Therefore, we need to split it into multiple servers, so that the above problems are solved and we will not face single-machine resource problems in the future.

The splitting of the database business level is similar to the

governance and degradation mechanism of the service. It can also manage, maintain and monitor the data of different businesses separately. Extensions etc. The database is often the most likely to become the bottleneck of the application system, and the database itself is stateful. Compared with the Web and application servers, it is more difficult to achieve horizontal expansion. Database connection resources are precious and single-machine processing capabilities are limited. In high-concurrency scenarios, vertical sub-databases can break through the bottlenecks of IO, number of connections, and single-machine hardware resources to a certain extent.

Horizontal split

Horizontal split table

For a single table with a huge amount of data (such as an order table), according to a certain Rules (RANGE, HASH modulus , etc.) are divided into multiple tables. However, these tables are still in the same library, so database operations at the library level still have IO bottlenecks. Not recommended.

Horizontal database and table partitioning

Split the data of a single table into multiple servers. Each server has a corresponding library and table, but the data collection in the table is different. Horizontal sub-database and sub-table can effectively alleviate the performance bottlenecks and pressures of single machines and single databases, and break through the bottlenecks of IO, number of connections, hardware resources, etc.

Horizontal database sharding and table sharding rules

  • RANGE

    One table from 0 to 10000, one table from 10001 to 20000;

  • HASH model

    A shopping mall system generally uses users and orders as the main table, and then uses the related tables as supplementary tables , this will not cause problems such as cross-database transactions. Get the user ID, then take the modulus of hash and distribute it to different databases.

  • Geographic Region

    For example, if we divide our business according to East China, South China, and North China, Qiniu Cloud should be like this.

  • Time

    Split by time, that is, cut out the data 6 months ago or even a year ago and put it in another Tables, because as time goes by, the probability of data in these tables being queried becomes smaller, so there is no need to put them together with "hot data". This is also "separation of hot and cold data".

Problems faced after sharding databases and tables

Transaction support

Sub-databases and tables After that, it became distributed transaction.

If you rely on the distributed transaction management function of the database itself to execute transactions, you will pay a high performance price; If the application assists in controlling it, forming a program logic transaction, it will also cause programming problems burden.

Multiple database result set merging (group by, order by)

Similar to group by, order bySuch grouping and sorting statements cannot be used

Cross-database join

After the database is divided into tables, the association operations between the tables will be restricted. We cannot join tables located in different databases, nor can we join tables with different granularity. The result is originally The business that can be completed with one query may require multiple queries to complete. Rough solution: global table: basic data, all libraries have a copy. Field redundancy: In this way, some fields do not need to be queried by join. System layer assembly: Query everything separately and then assemble it, which is more complicated.

Sub-database and sub-table solution products

#There are relatively many sub-database and sub-table middleware on the market, among which those based on the proxy method MySQL Proxy and Amoeba, based on Hibernate framework is Hibernate Shards, based on jdbc is Dangdangsharding-jdbc, based on mybatis similar maven plug-in There are Mogujie’s MogujieTSharding, and Cobar Client by rewriting spring’s ibatis template class.

There are also open source products from some big companies:

Today I finally figured out the MySQL sub-database and sub-tables, so I can brag about it in the interview!


I am programmer Qingge, a person who loves life and A post-90s programmer who loves to share.


This issue’s introduction and solutions about Mysql sub-database and sub-table are introduced here. I hope it can help everyone. Please continue to pay attention to the public account Java learning for more Java interview articles in the future. guide.

The above is the detailed content of Today I finally figured out the MySQL sub-database and sub-tables, so I can brag about it in the interview!. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
1 months ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Best Graphic Settings
1 months ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. How to Fix Audio if You Can't Hear Anyone
1 months ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Chat Commands and How to Use Them
1 months ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

MySQL: Simple Concepts for Easy Learning MySQL: Simple Concepts for Easy Learning Apr 10, 2025 am 09:29 AM

MySQL is an open source relational database management system. 1) Create database and tables: Use the CREATEDATABASE and CREATETABLE commands. 2) Basic operations: INSERT, UPDATE, DELETE and SELECT. 3) Advanced operations: JOIN, subquery and transaction processing. 4) Debugging skills: Check syntax, data type and permissions. 5) Optimization suggestions: Use indexes, avoid SELECT* and use transactions.

How to open phpmyadmin How to open phpmyadmin Apr 10, 2025 pm 10:51 PM

You can open phpMyAdmin through the following steps: 1. Log in to the website control panel; 2. Find and click the phpMyAdmin icon; 3. Enter MySQL credentials; 4. Click "Login".

How to create navicat premium How to create navicat premium Apr 09, 2025 am 07:09 AM

Create a database using Navicat Premium: Connect to the database server and enter the connection parameters. Right-click on the server and select Create Database. Enter the name of the new database and the specified character set and collation. Connect to the new database and create the table in the Object Browser. Right-click on the table and select Insert Data to insert the data.

MySQL: An Introduction to the World's Most Popular Database MySQL: An Introduction to the World's Most Popular Database Apr 12, 2025 am 12:18 AM

MySQL is an open source relational database management system, mainly used to store and retrieve data quickly and reliably. Its working principle includes client requests, query resolution, execution of queries and return results. Examples of usage include creating tables, inserting and querying data, and advanced features such as JOIN operations. Common errors involve SQL syntax, data types, and permissions, and optimization suggestions include the use of indexes, optimized queries, and partitioning of tables.

How to create a new connection to mysql in navicat How to create a new connection to mysql in navicat Apr 09, 2025 am 07:21 AM

You can create a new MySQL connection in Navicat by following the steps: Open the application and select New Connection (Ctrl N). Select "MySQL" as the connection type. Enter the hostname/IP address, port, username, and password. (Optional) Configure advanced options. Save the connection and enter the connection name.

Why Use MySQL? Benefits and Advantages Why Use MySQL? Benefits and Advantages Apr 12, 2025 am 12:17 AM

MySQL is chosen for its performance, reliability, ease of use, and community support. 1.MySQL provides efficient data storage and retrieval functions, supporting multiple data types and advanced query operations. 2. Adopt client-server architecture and multiple storage engines to support transaction and query optimization. 3. Easy to use, supports a variety of operating systems and programming languages. 4. Have strong community support and provide rich resources and solutions.

How to use single threaded redis How to use single threaded redis Apr 10, 2025 pm 07:12 PM

Redis uses a single threaded architecture to provide high performance, simplicity, and consistency. It utilizes I/O multiplexing, event loops, non-blocking I/O, and shared memory to improve concurrency, but with limitations of concurrency limitations, single point of failure, and unsuitable for write-intensive workloads.

MySQL and SQL: Essential Skills for Developers MySQL and SQL: Essential Skills for Developers Apr 10, 2025 am 09:30 AM

MySQL and SQL are essential skills for developers. 1.MySQL is an open source relational database management system, and SQL is the standard language used to manage and operate databases. 2.MySQL supports multiple storage engines through efficient data storage and retrieval functions, and SQL completes complex data operations through simple statements. 3. Examples of usage include basic queries and advanced queries, such as filtering and sorting by condition. 4. Common errors include syntax errors and performance issues, which can be optimized by checking SQL statements and using EXPLAIN commands. 5. Performance optimization techniques include using indexes, avoiding full table scanning, optimizing JOIN operations and improving code readability.

See all articles