Home > Database > Mysql Tutorial > body text

MySQL interview Q&A collection (summary sharing)

WBOY
Release: 2022-03-22 17:45:07
forward
2173 people have browsed it

This article brings you relevant knowledge about mysql. It mainly compiles some frequently asked questions in interviews, including database architecture, indexing and SQL optimization, etc. I hope everyone has to help.

MySQL interview Q&A collection (summary sharing)

Recommended learning: mysql tutorial

1. Database architecture

1.1. Talk about the basic architecture of MySQL Picture

Tell the interviewer about the logical architecture of MySQL. If you have a whiteboard, you can draw the following picture. The picture comes from the Internet.

MySQL interview Q&A collection (summary sharing)

Mysql logical architecture diagram is mainly divided into three layers:

(1) The first layer is responsible for connection processing, authorization authentication, security, etc.

(2) The second layer is responsible for compiling and optimizing SQL

(3) The third layer is the storage engine.

1.2. How is a SQL query statement executed in MySQL?

  • First check whether the statement has permission. If there is no permission, an error message will be returned directly. If there is permission, the cache will be queried first (before MySQL8.0 version).

  • If there is no cache, the analyzer performs lexical analysis, extracts key elements such as select in the sql statement, and then determines whether the sql statement has grammatical errors, such as whether the keyword Correct and so on.

  • Finally, the optimizer determines the execution plan and performs permission verification. If there is no permission, it will directly return an error message. If there is permission, it will call the database engine interface and return to execution. result.

2. SQL optimization

2.1. How do you optimize SQL in daily work?

You can answer this question from these dimensions:

2.1.1, Optimize the table structure

(1) Try to use numeric fields

If fields containing only numeric information should not be designed as character type, this will reduce the performance of queries and connections, and increase storage overhead. This is because the engine will compare each character in the string one by one when processing queries and connections, and only one comparison is enough for numeric types.

(2) Use varchar instead of char as much as possible

Variable length fields have small storage space and can save storage space.

(3) When there is a large amount of duplicate data in the index column, the index can be deleted

For example, if there is a column for gender, almost only male, female, and unknown, such an index is invalid.

2.1.2, Optimize query

  • Try to avoid using != or operator in the where clause

  • Try to avoid using or in the where clause to connect conditions

  • Do not appear select in any query *

  • Avoid null value judgment on fields in where clause

2.1.3, Index optimization

  • Create indexes for fields that serve as query conditions and order by

  • Avoid creating too many indexes and use combined indexes

2.2. How to read the execution plan (explain) and understand the meaning of each field in it?

Add the explain keyword before the select statement to return execution plan information.

MySQL interview Q&A collection (summary sharing)

(1) id column: It is the serial number of the select statement. MySQL divides select queries into simple queries and complex queries.

(2) select_type column: Indicates whether the corresponding row is a simple or complex query.

(3) table column: Indicates which table a row of explain is accessing.

(4) type column: one of the most important columns. Represents the type of association or access type that MySQL determines how to find rows in the table. From best to worst: system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL

( 5) possible_keys column: Shows which indexes the query may use to find.

(6) key column: This column shows which index mysql actually uses to optimize access to the table.

(7) key_len column: shows the number of bytes used by mysql in the index. This value can be used to calculate which columns in the index are used.

(8) ref column: This column shows the columns or constants used by the table to look up values ​​in the index of the key column record. Common ones are: const (constant), func, NULL, and field names.

(9) rows column: This column is the number of rows that mysql estimates to read and detect. Note that this is not the number of rows in the result set.

(10) Extra column: Display additional information. For example, there are Using index, Using where, Using temporary, etc.

2.3. Have you ever cared about the time-consuming SQL in the business system? Is the query statistics too slow? How have you optimized slow queries?

When we usually write Sql, we must develop the habit of using explain analysis. Statistics of slow queries, operation and maintenance will give us regular statistics

Optimization ideas for slow queries:

  • Analyze statements to see if unnecessary fields/data are loaded

  • Analyze the SQL execution sentence, whether the index is hit, etc.

  • If the SQL is very complex, optimize the SQL structure

  • If the amount of table data is too large, consider splitting the table

3, index

3.1, the difference between clustered index and non-clustered index

You can press Answers from the following four dimensions:

(1) A table can only have one clustered index, while a table can have multiple non-clustered indexes.

(2) Clustered index, the logical order of the key values ​​in the index determines the physical order of the corresponding rows in the table; non-clustered index, the logical order of the index in the index is different from the physical storage order of the rows on the disk.

(3) The index is described through the data structure of a binary tree. We can understand the clustered index this way: the leaf nodes of the index are the data nodes. The leaf nodes of non-clustered indexes are still index nodes, but have a pointer pointing to the corresponding data block.

(4) Clustered index: physical storage is sorted according to the index; non-clustered index: physical storage is not sorted according to the index;

3.2. Why use B-tree and why not ordinary binary tree?

We can look at this problem from several dimensions, whether the query is fast enough, whether the efficiency is stable, how much data is stored, and the number of disk searches. Why is it not an ordinary binary tree, why is it not a balanced binary tree, why is it not a B-tree, and What about the B-tree?

3.2.1. Why not an ordinary binary tree?

If the binary tree is specialized into a linked list, it is equivalent to a full table scan. Compared with binary search trees, balanced binary trees have more stable search efficiency and faster overall search speed.

3.2.2. Why not a balanced binary tree?

We know that the query efficiency of data in memory is much faster than that in disk. If a data structure like a tree is used as an index, then every time we search for data, we need to read a node from the disk, which is what we call a disk block, but a balanced binary tree only stores one key value and data per node. If it is a B-tree, more node data can be stored, and the height of the tree will also be reduced, so the number of disk reads will be reduced, and the query efficiency will be faster.

3.2.3. Why not B-tree but B-tree?

B tree does not store data on non-leaf nodes, only key values. However, B tree nodes not only store key values, but also store data. The default size of a page in innodb is 16KB. If no data is stored, more key values ​​will be stored, the order of the corresponding tree (the child node tree of the node) will be larger, and the tree will be shorter and fatter. In this way, the number of disk IO times we need to search for data will be reduced again, and the efficiency of data query will be faster.

B All data in the tree index are stored in leaf nodes, and the data is arranged in order and linked to the linked list. Then B-tree makes range search, sort search, group search and deduplication search extremely simple.

3.3. What is the difference between Hash index and B-tree index? How did you choose to design the index?

  • B tree can perform range query, Hash index cannot.

  • B-tree supports the leftmost principle of joint index, but Hash index does not support it.

  • B-tree supports order by sorting, but Hash index does not support it.

  • Hash index is more efficient than B-tree for equivalent queries.

  • When B-tree uses like for fuzzy query, the words after like (such as starting with %) can play an optimization role, and the Hash index cannot perform fuzzy query at all.

3.4. What is the leftmost prefix principle? What is the leftmost matching principle?

The leftmost prefix principle is leftmost priority. When creating a multi-column index, according to business needs, the most frequently used column in the where clause is placed on the leftmost side.

When we create a combined index, such as (a1, a2, a3), it is equivalent to creating three indexes (a1), (a1, a2) and (a1, a2, a3). This This is the leftmost matching principle.

3.5. Which scenarios are not suitable for indexing?

  • It is not suitable to add index if the amount of data is small

  • It is updated more frequently Not suitable for indexing = Fields with low discrimination are not suitable for indexing (such as gender)

3.6. What are the advantages and disadvantages of indexing?

(1) Advantages:

  • The unique index can ensure the uniqueness of each row of data in the database table

  • Index It can speed up data query and reduce query time

(2) Disadvantages:

  • It takes time to create and maintain indexes

  • Indexes need to occupy physical space. In addition to the data space occupied by the data table, each index also occupies a certain amount of physical space.

  • Based on the data in the table When adding, deleting, or modifying, the index must also be maintained dynamically.

4. Lock

4.1. Has MySQL ever encountered a deadlock problem? How did you solve it?

Encountered. My general steps for troubleshooting deadlocks are as follows:

(1) Check the deadlock log show engine innodb status;

(2) Find out the deadlock Sql

(3) Analyze the sql lock situation

( 4) Simulate deadlock cases

(5) Analyze deadlock log

(6) Analyze deadlock results

4.2. Talk about optimistic locking and pessimistic locking of database What are they and their differences?

(1) Pessimistic lock:

The pessimistic lock is single-minded and insecure. Her heart only belongs to the current affairs, and she is always worried that her beloved data may be stolen. Modified by other transactions, so after a transaction owns (obtains) a pessimistic lock, no other transaction can modify the data and can only wait for the lock to be released before executing it.

(2) Optimistic lock:

The "optimism" of optimistic lock is reflected in the fact that it believes that the data will not change too frequently. Therefore, it allows multiple transactions to make changes to the data simultaneously.

Implementation method: Optimistic locking is generally implemented using the version number mechanism or CAS algorithm.

4.3. Are you familiar with MVCC and know its underlying principles?

MVCC (Multiversion Concurrency Control), that is, multi-version concurrency control technology.

The implementation of MVCC in MySQL InnoDB is mainly to improve database concurrency performance and use a better way to handle read-write conflicts, so that even if there are read-write conflicts, no locking can be achieved. Non-blocking concurrent reads.

5. Transactions

5.1. The four major characteristics and implementation principles of MySQL transactions

  • Atomicity: The transaction is executed as a whole and is included in Either all operations on the database are executed, or none are executed.

  • Consistency: means that the data will not be destroyed before the transaction starts and after the transaction ends. If account A transfers 10 yuan to account B, regardless of success or failure, A and B The total amount remains unchanged.

  • Isolation: When multiple transactions access concurrently, the transactions are isolated from each other, that is, one transaction does not affect the running effects of other transactions. In short, it means that there is no conflict between affairs.

  • Persistence: Indicates that after the transaction is completed, the operational changes made by the transaction to the database will be permanently saved in the database.

5.2. What are the isolation levels of transactions? What is MySQL's default isolation level?

  • Read Uncommitted

  • Read Committed

  • Repeatable Read

  • Serializable

The default transaction isolation level of Mysql is Repeatable Read )

5.3. What are phantom reads, dirty reads, and non-repeatable reads?

Transactions A and B are executed alternately. Transaction A is interfered by transaction B because transaction A reads uncommitted data of transaction B. This is a dirty read.

Within the scope of a transaction, two identical queries read the same record but return different data. This is a non-repeatable read.

Transaction A queries the result set of a range, and another concurrent transaction B inserts/delete data into this range and commits it silently. Then transaction A queries the same range again, and reads twice. The result set is different, which is phantom reading.

6. Practical combat

6.1. How to deal with the MySQL database CPU surge?

Troubleshooting process:

(1) Use the top command to observe and determine whether it is caused by mysqld or other reasons.

(2) If it is caused by mysqld, show processlist, check the session status, and determine whether there is any resource-consuming SQL running.

(3) Find out the SQL with high consumption and see whether the execution plan is accurate, whether the index is missing, and whether the amount of data is too large.

Processing:

(1) Kill these threads (and observe whether the CPU usage decreases)

(2) Make corresponding adjustments (such as adding indexes, changing sql, change memory parameters)

(3) Rerun these SQLs.

Other situations:

It is also possible that each SQL statement does not consume much resources, but suddenly, a large number of sessions are connected, causing the CPU to surge. In this case, you need to contact the application. Let’s analyze why the number of connections surges, and then make corresponding adjustments, such as limiting the number of connections, etc.

6.2. How do you solve the master-slave delay in MYSQL?

Master-slave replication is divided into five steps: (pictures from the Internet)

MySQL interview Q&A collection (summary sharing)

  • Steps One: The update events (update, insert, delete) of the main library are written to the binlog

  • Step 2: Initiate a connection from the slave library and connect to the main library.

  • Step 3: At this time, the main library creates a binlog dump thread and sends the contents of the binlog to the slave library.

  • Step 4: After starting from the slave library, create an I/O thread, read the binlog content passed from the main library and write it to the relay log

  • Step 5: A SQL thread will also be created to read the content from the relay log, execute the read update event starting from the Exec_Master_Log_Pos position, and write the updated content to the slave db

Cause of master-slave synchronization delay

A server opens N links for clients to connect, so there will be large concurrent update operations, but there is only one thread to read the binlog from the server. When a certain SQL is executed on the slave server If it takes a long time or because a certain SQL needs to lock the table, there will be a large backlog of SQL on the master server and it will not be synchronized to the slave server. This leads to master-slave inconsistency, that is, master-slave delay.

Solution to master-slave synchronization delay

  • The master server is responsible for the update operation and has higher security requirements than the slave server, so Some setting parameters can be modified, such as sync_binlog=1, innodb_flush_log_at_trx_commit = 1 and other settings.

  • Choose a better hardware device as a slave.

  • If a slave server is used as a backup without providing queries, its load will be reduced, and the efficiency of executing the SQL in the relay log will naturally be higher.

  • The purpose of adding slave servers is to disperse the reading pressure and thereby reduce the server load.

6.3. If you were asked to design sub-databases and sub-tables, briefly tell me what you would do?

Sub-database and table scheme:

  • Horizontal sub-database: Based on fields and according to certain strategies (hash, range, etc.), Data in one library is split into multiple libraries.

  • Horizontal table splitting: Split the data in one table into multiple tables based on fields and certain strategies (hash, range, etc.).

  • Vertical database splitting: Based on tables, different tables are split into different databases according to different business ownerships.

  • Vertical table splitting: Based on the fields and according to the activity of the fields, the fields in the table are split into different tables (main table and extended table).

Commonly used sharding middleware:

  • ##sharding-jdbc

  • Mycat

Problems that may be encountered in sub-databases and sub-tables

  • Transaction problems: Distribution is required Type transaction

  • The problem of cross-node Join: To solve this problem, you can query it in two times to achieve

  • Cross-node count, order by , group by and aggregation function issues: the results are obtained on each node and then merged on the application side.

  • Data migration, capacity planning, expansion and other issues

  • ID problem: After the database is split, it can no longer rely on the primary key generation of the database itself. Mechanism, the simplest one can consider UUID

  • Cross-shard sorting and paging problem

Recommended learning:

mysql learning tutorial

The above is the detailed content of MySQL interview Q&A collection (summary sharing). For more information, please follow other related articles on the PHP Chinese website!

Related labels:
source:csdn.net
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Popular Tutorials
More>
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template
About us Disclaimer Sitemap
php.cn:Public welfare online PHP training,Help PHP learners grow quickly!