Foreword:
This chapter outlines the MySQL server architecture, the main differences between the storage engines, and why those differences matter.
It also reviews MySQL's historical background and benchmarks, and uses simplified details and demonstration cases to discuss the principles behind MySQL.
Text:
The MySQL architecture suits many different scenarios: it can be embedded in applications, and it supports data warehouses, content indexing and delivery software, highly available redundant systems, online transaction processing (OLTP) systems, and so on;
The most important feature of MySQL is its storage engine architecture, which separates query processing and other system tasks from data storage and retrieval;
Lock strategy: find a balance between lock overhead and data safety; each storage engine can implement its own lock strategy and lock granularity
Table lock: the most basic strategy with the lowest overhead; it locks the entire table
Row-level lock: supports the greatest concurrency but also carries the greatest lock overhead; implemented in the storage engine layer (each engine in its own way), not in the server layer
Transaction: an independent unit of work, a group of SQL queries that is executed atomically
Isolation levels: there are four, and each specifies which modifications made in a transaction are visible within and between transactions. Lower isolation levels generally allow higher concurrency with lower overhead
READ UNCOMMITTED: uncommitted read
Modifications made in a transaction are visible to other transactions even before they are committed; reading uncommitted data is called a dirty read; rarely used in practice
READ COMMITTED: committed read
The default isolation level of most database systems, but not of MySQL; from its start until it commits, a transaction sees only modifications made by committed transactions, and its own modifications are invisible to other transactions; also called nonrepeatable read: executing the same query twice may give different results (because other transactions committed changes in between)
REPEATABLE READ: repeatable read
MySQL's default; it solves dirty reads and guarantees that the same transaction gets the same result when it reads the same records several times; it does not by itself solve phantom reads: while a transaction is reading the records in a range, another transaction inserts new records into that range, and when the first transaction reads the range again it sees phantom rows
SERIALIZABLE: serializable
The highest level; forces transactions to execute serially, which avoids the phantom read problem; locks every row it reads (can cause many timeouts and heavy lock contention); rarely used
Deadlock:
1. Two or more transactions hold locks on the same resources and each requests a lock on a resource held by the other;
2. Multiple transactions try to lock resources in different orders, which may cause a deadlock;
3. Multiple transactions lock the same resource at the same time;
Lock behavior and lock order depend on the storage engine. When statements are executed in the same order, some storage engines produce a deadlock and some do not;
Two causes of deadlocks: real data conflicts (hard to avoid), and the way the storage engine is implemented;
Once a deadlock occurs, it can only be broken by rolling back one of the transactions, partially or completely: InnoDB rolls back the transaction that holds the fewest row-level exclusive locks;
1.3.4 Transactions in MySQL: implemented by the storage engines
MySQL has two transactional storage engines: InnoDB and NDB Cluster
AUTOCOMMIT: MySQL uses autocommit mode by default. Unless you explicitly start a transaction, each query is treated as its own transaction and committed automatically. The AUTOCOMMIT variable enables it (=1 or ON) or disables it (=0 or OFF). When it is disabled, all queries run inside a transaction until an explicit COMMIT or ROLLBACK ends that transaction and starts a new one at the same time. Changing this variable has no effect on non-transactional tables;
MySQL sets the isolation level with SET TRANSACTION ISOLATION LEVEL; the new level takes effect when the next transaction starts. The level for the whole server can be set in the configuration file, or it can be changed for the current session only:
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
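Returning to the deadlock conditions listed above, the case of two transactions locking resources in opposite orders can be sketched as a cycle in a "wait-for" graph. This is only a toy model, not InnoDB's actual detector: the names `find_deadlock` and `pick_victim` are made up for illustration, each transaction is assumed to wait for at most one other, and the victim rule simply follows the "fewest row-level exclusive locks" description in these notes.

```python
def find_deadlock(waits_for):
    """Return the transactions that form a wait cycle, or None if there is none.

    waits_for maps a transaction name to the transaction it is waiting on.
    """
    for start in waits_for:
        path = [start]
        current = waits_for.get(start)
        # Follow the chain of waits until it ends or revisits a transaction.
        while current is not None and current not in path:
            path.append(current)
            current = waits_for.get(current)
        if current is not None:
            # The chain came back to a transaction already on the path.
            return path[path.index(current):]
    return None


def pick_victim(cycle, exclusive_row_locks):
    """Pick the transaction in the cycle that holds the fewest exclusive row locks."""
    return min(cycle, key=lambda txn: exclusive_row_locks[txn])


# T1 locked row A and now wants row B; T2 locked row B and now wants row A.
waits_for = {"T1": "T2", "T2": "T1"}
locks_held = {"T1": 5, "T2": 1}

cycle = find_deadlock(waits_for)
victim = pick_victim(cycle, locks_held)
print("deadlock between:", cycle)
print("roll back:", victim)
```

Rolling back the transaction with the least lock work is cheap to undo, which is the intuition behind the rule quoted above.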
MVCC (multiversion concurrency control): each transaction works on a snapshot of the database as of a certain moment, and its writes are not visible to other transactions until they are committed;
On update, the old data is marked obsolete and a new version is added elsewhere (several versions of the data exist, only one of which is the latest), so earlier versions can still be read
Features:
1. Each row has a version number, which is updated every time the row is updated.
2. A transaction modifies a copy of the current version freely, without interfering with other transactions.
3. On save, the version numbers are compared: on success the transaction commits and overwrites the original record; on failure it gives up and rolls back.
4. Works only under the REPEATABLE READ and READ COMMITTED isolation levels.
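The version-comparison steps above can be sketched in a few lines. This models only the optimistic scheme as these notes describe it (copy, modify, compare versions on save); it is not how InnoDB stores row versions internally, and the class names `VersionedRow` and `Transaction` are hypothetical.

```python
class VersionedRow:
    """A row whose version number increases on every committed update."""

    def __init__(self, value):
        self.value = value
        self.version = 1


class Transaction:
    """Works on a private copy of the row and checks the version on commit."""

    def __init__(self, row):
        self.row = row
        self.read_version = row.version  # version seen when the copy was taken
        self.working_copy = row.value    # private copy, invisible to others

    def modify(self, new_value):
        self.working_copy = new_value

    def commit(self):
        """Overwrite the row only if no newer version was committed meanwhile."""
        if self.row.version != self.read_version:
            return False                 # conflict: give up, i.e. roll back
        self.row.value = self.working_copy
        self.row.version += 1
        return True


row = VersionedRow("balance=100")
t1 = Transaction(row)
t2 = Transaction(row)
t1.modify("balance=90")
t2.modify("balance=80")

t1_ok = t1.commit()   # first writer finds version 1 unchanged
t2_ok = t2.commit()   # second writer finds version 2, not the 1 it read
print(t1_ok, t2_ok, row.value, row.version)
```

Until `commit()` succeeds, other transactions keep reading the old `row.value`, which mirrors the "writes are invisible before commit" property.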
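MySQL stores each database as a subdirectory of the data directory. When a table is created, MySQL creates a .frm file with the same name as the table in that subdirectory to hold the table definition. Storage engines differ in how they store data and indexes, but the table definition is handled uniformly by the MySQL server layer;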
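InnoDB: designed to handle large numbers of short-lived transactions; because of its performance and automatic crash recovery, it is also popular for non-transactional storage needs
Data is stored in a tablespace managed by InnoDB, which is made up of a series of data files;
Uses MVCC to support high concurrency and implements all four standard isolation levels. The default is REPEATABLE READ, and a next-key locking strategy prevents phantom reads: InnoDB locks not only the rows a query touches but also the gaps in the index, so phantom rows cannot be inserted;
When a range condition is used and a lock is requested, InnoDB locks the index entries of the existing records that match the condition, and also locks key values that fall within the range but do not exist (the "gaps"). This is a gap lock:
For example, the emp table has 101 records whose empid values are 1, 2, ..., 100, 101:
SELECT * FROM emp WHERE empid > 100 FOR UPDATE;
InnoDB locks the matching record whose empid is 101, and also locks the "gap" of empid values greater than 101 (records that do not exist);
1. In the example above, without the gap lock another transaction could insert a record with empid greater than 100, and running the query again in this transaction would produce a phantom read. Gap locks cause lock waits, however, so when there are many concurrent inserts, optimize the business logic to access and update data with equality conditions and avoid range conditions where possible;
2. A gap lock is also used when an equality condition requests a lock on a record that does not exist. When a record is deleted by a parameter whose value does not exist in the database, the server scans the index, finds no match, and the DELETE statement takes a gap lock: it scans left to the first value smaller than the parameter and right to the first value larger than it, builds an interval from the two, and locks the whole interval;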
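The left/right scan described in point 2 can be illustrated with a sorted list standing in for the index. This is a simplified sketch, not InnoDB code: `gap_for_missing_key` is a hypothetical helper, the index is assumed to hold unique integer keys, and `None` marks a side on which the gap has no neighbour.

```python
from bisect import bisect_left


def gap_for_missing_key(sorted_keys, key):
    """Return the open interval (low, high) that surrounds a key not in the index.

    Returns None when the key exists, because then the record itself is
    locked rather than a gap.
    """
    pos = bisect_left(sorted_keys, key)
    if pos < len(sorted_keys) and sorted_keys[pos] == key:
        return None
    low = sorted_keys[pos - 1] if pos > 0 else None             # first smaller value
    high = sorted_keys[pos] if pos < len(sorted_keys) else None  # first larger value
    return (low, high)


index = [1, 3, 5, 8, 11]
print(gap_for_missing_key(index, 6))    # key between 5 and 8
print(gap_for_missing_key(index, 20))   # key beyond the last entry
print(gap_for_missing_key(index, 5))    # key exists, so no gap is involved
```

In this model an insert of 6 or 7 by another transaction would fall inside the interval returned for key 6 and would have to wait, which is the lock wait the notes warn about.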
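MyISAM: provides full-text indexes, compression, and spatial functions, but does not support transactions or row-level locks and cannot recover safely after a crash
Storage:
A table is stored in two files: a data file (.MYD) and an index file (.MYI)
A table can contain dynamic or static (fixed-length) rows; MySQL decides which row format to use from the table definition
For a table with variable-length rows, the default configuration can handle only 256TB of data (the pointer to a record is 6 bytes long). To change the pointer length, modify the table's MAX_ROWS and AVG_ROW_LENGTH options; their product is the maximum size the table can reach. Changing them rebuilds the whole table and all of its indexes;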
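The 256TB figure follows directly from the 6-byte pointer, and the MAX_ROWS times AVG_ROW_LENGTH product is plain multiplication. The short calculation below checks both; the function names and the sample row counts are illustrative assumptions, not MySQL settings.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def addressable_bytes(pointer_bytes):
    """Number of bytes a row pointer of the given width can address."""
    return 2 ** (8 * pointer_bytes)


def requested_table_bytes(max_rows, avg_row_length):
    """Maximum table size implied by MAX_ROWS and AVG_ROW_LENGTH."""
    return max_rows * avg_row_length


# A 6-byte pointer has 48 bits, so it can address 2**48 bytes.
default_limit_tib = addressable_bytes(6) // TIB
print(default_limit_tib, "TiB")

# Hypothetical table: one billion rows averaging 32 bytes each.
wanted = requested_table_bytes(1_000_000_000, 32)
print(wanted, "bytes requested")
print("fits under the default limit:", wanted <= addressable_bytes(6))
```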
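Features:
1. Locking: the entire table is locked; reads take a shared lock and writes take an exclusive lock; new records can still be inserted into the table while it is being read (concurrent insert)
2. Repair: check and repair operations can be run manually or automatically. CHECK TABLE mytable checks the table for errors and REPAIR TABLE mytable repairs them; a repair may lose some data. If the server is shut down, the myisamchk command-line tool can perform the check and repair operations;
3. Index features: supports full-text indexes, which are built on word segmentation and support complex queries
4. Delayed key write: if the DELAY_KEY_WRITE option is specified, modified index data is not written to disk immediately after each change but to the in-memory key buffer; the corresponding index blocks are written to disk when the buffer is flushed or the table is closed. This improves write performance, but a server or host crash corrupts the indexes, which then have to be repaired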
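The trade-off of delayed key writes in point 4 can be seen in a minimal write-back buffer model. This is a sketch under simplifying assumptions: `KeyBuffer` is a hypothetical class, index blocks are plain dictionary entries, and a crash is modelled as losing everything that was never flushed.

```python
class KeyBuffer:
    """Toy write-back buffer for index blocks."""

    def __init__(self):
        self.disk = {}    # index blocks that reached the disk
        self.dirty = {}   # modified blocks that exist only in memory

    def write(self, block_id, data):
        """Record a change in memory only; no disk I/O happens here."""
        self.dirty[block_id] = data

    def flush(self):
        """Write all dirty blocks to disk, as on buffer cleanup or table close."""
        self.disk.update(self.dirty)
        self.dirty.clear()

    def crash(self):
        """Lose the in-memory blocks and report which ones never reached disk."""
        lost = sorted(self.dirty)
        self.dirty.clear()
        return lost


buf = KeyBuffer()
buf.write(1, "keys a-f")
buf.flush()                 # block 1 is now safe on disk
buf.write(2, "keys g-m")    # block 2 is still only in memory
lost_blocks = buf.crash()
print("on disk:", buf.disk)
print("lost in crash:", lost_blocks)
```

The writes are fast because `write()` never touches the disk, and the lost block in the crash case is the index damage that a later repair has to fix.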
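Compressed tables:
Suited to tables that are no longer modified once they have been created and loaded with data. Use myisampack to compress ("pack") a MyISAM table. A compressed table cannot be modified (unless it is decompressed, modified, and compressed again). Compression reduces disk space usage and disk I/O and so improves query performance; compressed tables also support indexes, but the indexes are read-only;
With today's hardware, the cost of decompressing data while reading a compressed table is small, and the benefit of the reduced I/O is much larger. Rows are compressed individually, so reading a single row does not require decompressing the whole table
Performance:
Simple design and a compact storage format; the typical performance problem is table locking: if queries stay in the Locked state for long periods, look for table locks first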
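Archive: suited to logging and data-acquisition applications; optimized for high-speed inserts and compression; supports row-level locks and a dedicated buffer; writes are buffered and inserted rows are compressed with zlib; every SELECT scans the whole table;
Blackhole: used in replication setups and for audit logging; it stores no data, but the server still logs statements against Blackhole tables, so they can be replicated to a replica or simply kept in the log;
CSV: a data exchange mechanism; treats CSV files as MySQL tables; does not support indexes;
Federated: a proxy for accessing other MySQL servers; it opens a client connection to the remote MySQL server, sends the query there for execution, and then retrieves or sends the required data; disabled by default;
Memory: for fast access to data that will not be modified; the data is kept in memory with no disk I/O; the table structure survives a restart but the data is lost
Uses: 1. lookup or mapping tables; 2. caching periodically aggregated data; 3. holding intermediate data produced during data analysis
Supports hash indexes; uses table-level locks; lookups are fast but concurrent write performance is low; does not support BLOB/TEXT columns; every row has a fixed length, which wastes memory
Merge: a MyISAM variant; a virtual table that combines several MyISAM tables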
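NDB Cluster engine:
OLTP engines:
XtraDB: an improved engine based on InnoDB; the improvements are in performance, measurability, and operational flexibility
PBXT: ACID/MVCC; engine-level replication and foreign key constraints; a more complex architecture with some support for solid-state storage (SSD); optimized for large value types such as BLOB
TokuDB: for big data; high compression ratio; can create many indexes on large data volumes
RethinkDB: designed for solid-state storage
Column-oriented engines:
Each column is stored separately, which makes compression efficient
Infobright: designed for large data volumes, data analysis, and data warehouse applications; highly compressed; data is sorted by block, and each block has a set of metadata; the block structure acts as a quasi-index, and regular indexes are not supported (with this much data they would be of no use anyway); if a query cannot be executed in column-oriented mode in the storage layer, it has to be converted to row-by-row processing in the server layer
Community storage engines: ***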
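Unless you need a feature that InnoDB lacks and there is no alternative, prefer the InnoDB engine
Do not mix several storage engines; if different storage engines are needed, consider these factors:
1. Transactions: if transaction support is needed, use InnoDB or XtraDB; if it is not and the workload is mainly SELECT and INSERT, use MyISAM
2. Backups: if the server can be shut down regularly to run backups, this factor can be ignored; if online hot backups are needed, use InnoDB
3. Crash recovery: with large data volumes, MyISAM is far more likely than InnoDB to be corrupted after a crash, and it recovers more slowly
4. Special features: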
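ALTER TABLE: the simplest method
ALTER TABLE mytable ENGINE=InnoDB;
This can take a long time: MySQL copies the data row by row from the original table into a new table. The copy may consume all of the system's I/O capacity, and the original table is read-locked for the duration; all features specific to the original engine are lost
Export and import:
Use the mysqldump tool to export the data to a file, change the storage engine option of the CREATE TABLE statement in the file, and change the table name as well (two tables in the same database cannot share a name). Note that by default mysqldump adds a DROP TABLE statement before each CREATE TABLE statement
Create and select: CREATE and SELECT
Combines the two methods above: first create a table with the new storage engine, then load the data with the INSERT ... SELECT syntax
CREATE TABLE innodb_table LIKE myisam_table;
ALTER TABLE innodb_table ENGINE=InnoDB;
INSERT INTO innodb_table SELECT * FROM myisam_table;
For large data volumes, process the rows in batches (each batch inside a transaction)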
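One way to do the batching is to generate one INSERT ... SELECT per key range and wrap each in its own transaction. The generator below only builds the SQL text and runs nothing against a server. It assumes the table has an integer primary key column named `id` (an assumption, since the notes do not name the key), and `batch_statements` is a hypothetical helper.

```python
def batch_statements(min_id, max_id, batch_size,
                     source="myisam_table", target="innodb_table", key="id"):
    """Yield SQL statements that copy rows range by range, one transaction per batch."""
    start = min_id
    while start <= max_id:
        end = min(start + batch_size - 1, max_id)
        yield "START TRANSACTION;"
        yield (f"INSERT INTO {target} SELECT * FROM {source} "
               f"WHERE {key} BETWEEN {start} AND {end};")
        yield "COMMIT;"
        start = end + 1


# Copy ids 1..2500 in batches of 1000 rows.
statements = list(batch_statements(1, 2500, 1000))
for statement in statements:
    print(statement)
```

Committing after each range keeps every transaction small, so a failure part of the way through loses at most one batch of work.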
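Early MySQL was a disruptive innovation: it had many limitations, and many of its features were second-rate at best, but its feature set and low cost of use made it popular. The early 5.x releases introduced views, stored procedures, and more in the hope of becoming an "enterprise" database, without much success; 5.5 improved things significantly
MySQL is released under the GPL open-source license; all of the source code is open to the community, and some plugins are paid;
MySQL has a layered architecture: the upper layer is the server layer's access services and query execution engine, and the lower layer is the storage engines (the most important part)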