Detailed explanation of the difference between utf8 and utf8mb4

coldplay.xixi

Jun 13, 2020 pm 05:56 PM

The difference between utf8 and utf8mb4

1. Introduction

MySQL added the utf8mb4 character set in version 5.5.3. The "mb4" suffix means that a character may occupy up to four bytes, which makes utf8mb4 compatible with four-byte Unicode characters. utf8mb4 is a superset of utf8, so existing utf8 data needs no conversion other than changing the character set to utf8mb4. To save space, utf8 is usually sufficient when four-byte characters are not needed.
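As a rough illustration of "changing the encoding to utf8mb4", the sketch below only builds the SQL text for converting one table; it does not connect to a database. The function name, the table name `comments`, and the choice of the `utf8mb4_unicode_ci` collation are assumptions made for this example, not something the article prescribes.

```python
def build_convert_statement(table: str, collation: str = "utf8mb4_unicode_ci") -> str:
    """Return an ALTER TABLE statement that converts a table to utf8mb4.

    `table` and `collation` are illustrative parameters (hypothetical names);
    the collation is inserted as-is, so pass only trusted values.
    """
    # Backtick-quote the identifier, doubling any embedded backticks.
    quoted = "`" + table.replace("`", "``") + "`"
    return (
        f"ALTER TABLE {quoted} "
        f"CONVERT TO CHARACTER SET utf8mb4 COLLATE {collation};"
    )


# "comments" is a hypothetical table name.
print(build_convert_statement("comments"))
```

The client connection usually has to use utf8mb4 as well, for example via `SET NAMES utf8mb4`; otherwise four-byte characters may still be rejected or altered before they reach the table.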

2. Content description

As mentioned above, utf8 can already store most Chinese characters, so why use utf8mb4? The reason is that MySQL's utf8 character set supports at most 3 bytes per character. If a 4-byte character is inserted, the statement fails with an error (in strict SQL mode) or the value is truncated with a warning. The largest Unicode code point that three-byte UTF-8 can encode is 0xFFFF, which is the upper bound of the Basic Multilingual Plane (BMP). In other words, no Unicode character outside the BMP can be stored in MySQL's utf8 character set. This includes emoji (Unicode characters common on iOS and Android phones, most of which lie outside the BMP), many rare Chinese characters, and any newly added Unicode characters in the supplementary planes. This is the main drawback of utf8.
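The 0xFFFF boundary described above can be checked before data ever reaches MySQL. The following is a minimal sketch, assuming the input is an ordinary Python string; the function names are hypothetical and chosen only for this example.

```python
def fits_mysql_utf8(text: str) -> bool:
    """True if every character is in the BMP (code point <= 0xFFFF),
    i.e. needs at most 3 bytes in UTF-8."""
    return all(ord(ch) <= 0xFFFF for ch in text)


def four_byte_chars(text: str) -> list:
    """Return the characters that would need 4 bytes in UTF-8."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


samples = [
    "hello",           # ASCII letters
    "中文",             # common Chinese characters, inside the BMP
    "hi \U0001F600",   # grinning-face emoji, outside the BMP
    "\U00020000",      # a rare Chinese character from CJK Extension B
]

for s in samples:
    byte_len = len(s.encode("utf-8"))
    print(repr(s), "bytes:", byte_len, "fits utf8:", fits_mysql_utf8(s))
```

Strings for which `fits_mysql_utf8` returns False are the ones that need a utf8mb4 column.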

Computers generally allocate storage space for a character according to the character itself and the encoding in use. The common encodings compare as follows:

①In ASCII encoding, one English letter (upper or lower case) occupies one byte of space. ASCII itself cannot represent Chinese characters; in ASCII-compatible legacy encodings such as GB2312 and GBK, one Chinese character occupies two bytes. A byte is 8 binary digits, so as a decimal number its minimum value is 0 and its maximum value is 255.

②In UTF-8 encoding, one English character occupies one byte of storage space, and one common Chinese character (simplified or traditional) occupies three bytes. Characters outside the BMP, including rare Chinese characters and emoji, occupy four bytes.

③In the fixed-width encoding often loosely called "Unicode" (UCS-2), an English character occupies two bytes of storage space, and a Chinese character (simplified or traditional) also occupies two bytes.

④In UTF-16 encoding, an English letter or a common Chinese character requires 2 bytes of storage space (some Chinese characters in the Unicode extension areas, like other characters outside the BMP, require 4 bytes).

⑤In UTF-32 encoding, every character requires 4 bytes of storage space.
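The sizes listed above can be reproduced with Python's built-in codecs. This is a small sketch rather than a definitive comparison: the function name is hypothetical, GBK is included as an assumption to stand in for the two-byte legacy Chinese encodings, and the big-endian codec variants are used so that no byte order mark is counted.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes `ch` occupies in several encodings.

    A value of None means the character cannot be represented
    in that encoding at all.
    """
    codecs_by_label = {
        "ascii": "ascii",
        "gbk": "gbk",
        "utf-8": "utf-8",
        "utf-16": "utf-16-be",  # big-endian, no byte order mark
        "utf-32": "utf-32-be",  # big-endian, no byte order mark
    }
    sizes = {}
    for label, codec in codecs_by_label.items():
        try:
            sizes[label] = len(ch.encode(codec))
        except UnicodeEncodeError:
            sizes[label] = None
    return sizes


# An English letter, a common Chinese character, and an emoji.
for ch in ["A", "中", "\U0001F600"]:
    print(repr(ch), encoded_sizes(ch))
```

The emoji is the interesting case: it takes 4 bytes in UTF-8, UTF-16, and UTF-32 alike, which is exactly the kind of character a three-byte limit cannot hold.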

Since utf8 already covers most characters, why did MySQL extend it with utf8mb4?

As the Internet developed, many new kinds of characters appeared, such as emoji, the little yellow faces we send when chatting. These characters are no longer in the Basic Multilingual Plane, so they could not be stored with utf8 in MySQL. MySQL therefore extended utf8 and added the utf8mb4 character set.

So, if your database design needs to let users enter such special symbols, it is best to store them with utf8mb4, which gives the database better compatibility. The trade-off is that this design can consume more storage space, since each character may take up to four bytes.
