Use PHP to decode POP3 emails (2)
Introduction to MIME encoding methods (Author: Chen Junqing, October 24, 2000 15:09)

Subject: =?gb2312?B?xOO6w6Oh?=

This is the subject line of the email. Because it is encoded, we cannot read its content directly; the original text is "Hello!" (你好！ in Chinese).

Let's first look at the two MIME encoding methods. Email is encoded because many gateways on the Internet cannot correctly transmit 8-bit characters, such as Chinese characters. Encoding converts the 8-bit content into a 7-bit form that can be transmitted correctly, and the receiver restores the 8-bit content after receiving it.

MIME is the abbreviation of "Multipurpose Internet Mail Extensions". Before MIME, mail was encoded with methods such as UUENCODE. Because the MIME algorithms are simple and easy to extend, MIME has become the mainstream mail encoding method. It is used not only to transmit 8-bit characters but also to transmit binary files, such as images and audio in email attachments, and many applications have been built on top of it.

MIME defines two encoding methods, Base64 and QP (Quoted-Printable):

Base64 is a general-purpose method with a very simple principle: every 3 bytes of data are represented by 4 bytes. Each of these 4 bytes carries only 6 bits of the original data and is written as a printable ASCII character, so the restriction to 7-bit characters is not a problem. Base64 is usually abbreviated as "B". The Subject in this email uses Base64 encoding.

The other method is QP (Quoted-Printable), usually abbreviated as "Q". It represents each 8-bit character as "=" followed by two hexadecimal digits. QP-encoded text therefore usually looks like this: =B3=C2=BF=A1=C7=E5=A3=AC=C4=FA=BA=C3=A3=A1.
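To make the two encodings concrete, here is a minimal sketch that encodes the same six bytes both ways using PHP's built-in functions. The byte values in $raw are an assumption on my part: they are what the example Subject decodes to in GB2312, written as escapes so the script does not depend on the file's own encoding.

```php
<?php
// Assumed input: the GB2312 bytes behind the example Subject ("Hello!" in Chinese).
$raw = "\xC4\xE3\xBA\xC3\xA3\xA1";

// Base64: 6 bytes of input become 8 ASCII characters (3 bytes -> 4 characters).
$b64 = base64_encode($raw);
echo $b64, "\n";   // xOO6w6Oh

// Quoted-Printable: every 8-bit byte becomes "=" plus two hexadecimal digits.
$qp = quoted_printable_encode($raw);
echo $qp, "\n";    // =C4=E3=BA=C3=A3=A1

// Decoding either form restores the original 8-bit bytes.
$fromB64 = base64_decode($b64);
$fromQp  = quoted_printable_decode($qp);
```

Note that for text made up mostly of 8-bit characters, as here, QP triples the size while Base64 adds only about one third, which is one reason such headers are often sent in the "B" form.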
PHP provides two built-in functions that make decoding easy: base64_decode() and quoted_printable_decode(). The former decodes Base64 and the latter decodes QP.

Now let's look again at the content of the Subject: =?gb2312?B?xOO6w6Oh?=. The header is not encoded as a whole; only the part enclosed between the two marks =? and ?= is encoded. What follows =? names the character set of the text, here GB2312, and the B after the next ? stands for Base64 encoding. With this analysis in mind, let's look at this MIME decoding function. (The function was provided by Sadly, the webmaster of PHPX.COM. I put it into a class and made a few modifications. Many thanks to him.)

    function decode_mime($string) {
        $pos = strpos($string, '=?');
        if (!is_int($pos)) {
            return $string;
        }
        $preceding = substr($string, 0, $pos);   // save any preceding text
        $search = substr($string, $pos + 2);     // everything after "=?"
        $d1 = strpos($search, '?');
        if (!is_int($d1)) {
            return $string;
        }
        $charset = substr($string, $pos + 2, $d1);   // the character set definition
        $search = substr($search, $d1 + 1);          // the part after the character set
        $d2 = strpos($search, '?');
        if (!is_int($d2)) {
            return $string;
        }
        $encoding = substr($search, 0, $d2);         // the encoding between two "?": q or b
        $search = substr($search, $d2 + 1);
        $end = strpos($search, '?=');                // the encoded text lies between $d2+1 and $end
        if (!is_int($end)) {
            return $string;
        }
        $encoded_text = substr($search, 0, $end);
        // +6 accounts for the six removed characters "=?", "?", "?" and "?="
        $rest = substr($string, strlen($preceding . $charset . $encoding . $encoded_text) + 6);
        switch ($encoding) {
            case 'Q':
            case 'q':
                // In headers, "_" stands for a space
                $decoded = quoted_printable_decode(str_replace('_', ' ', $encoded_text));
                // convert_cyr_string() was removed in PHP 8.0
                if (strtolower($charset) == 'windows-1251') {
                    $decoded = convert_cyr_string($decoded, 'w', 'k');
                }
                break;
            case 'B':
            case 'b':
                $decoded = base64_decode($encoded_text);
                if (strtolower($charset) == 'windows-1251') {
                    $decoded = convert_cyr_string($decoded, 'w', 'k');
                }
                break;
            default:
                // Unknown encoding: leave the encoded word unchanged
                $decoded = '=?' . $charset . '?' . $encoding . '?' . $encoded_text . '?=';
                break;
        }
        return $preceding . $decoded . $this->decode_mime($rest);
    }

This function recursively decodes a string that contains segments like the Subject above. Comments have been added to the code, and anyone with basic knowledge of PHP programming should be able to follow it. The function still does its decoding by calling the two built-in functions base64_decode() and quoted_printable_decode(), but it first needs a fair amount of string analysis on the email source. Fortunately, PHP's string operations are among the most convenient and flexible of any language. The final return statement, $preceding . $decoded . $this->decode_mime($rest), implements the recursion. Because the function actually lives in the MIME decoding class introduced later, it is called in the form $this->decode_mime($rest).

Now let's look at the body. This involves some MIME header fields, so here is a brief introduction first (readers who want to learn more can refer to the official MIME documentation).

MIME-Version: 1.0
Indicates the version of MIME used, usually 1.0.

Content-Type:
Defines the type of the body. This field is how we know what kind of content the body is. For example, text/plain is an unformatted text body, text/html is an HTML document, image/gif is an image in GIF format, and so on.
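As an illustration of reading the Content-Type field, the following sketch splits a header value into its type and its parameters. The function name parse_content_type() is hypothetical and not part of the class discussed in this article, and the parsing is simplified: it assumes that no parameter value contains a semicolon.

```php
<?php
// Hypothetical helper (not from the article's class): split a Content-Type
// value such as 'text/plain; charset="gb2312"' into type and parameters.
// Simplification: parameter values are assumed not to contain ";".
function parse_content_type(string $value): array
{
    $parts = explode(';', $value);
    // The first piece is the media type, e.g. "text/plain"
    $type = strtolower(trim(array_shift($parts)));

    $params = [];
    foreach ($parts as $part) {
        if (strpos($part, '=') === false) {
            continue;   // skip pieces that are not name=value pairs
        }
        [$name, $val] = explode('=', $part, 2);
        // Strip surrounding whitespace first, then optional double quotes
        $params[strtolower(trim($name))] = trim(trim($val), '"');
    }
    return ['type' => $type, 'params' => $params];
}

$info = parse_content_type('text/plain; charset="gb2312"');
echo $info['type'], "\n";               // text/plain
echo $info['params']['charset'], "\n";  // gb2312
```

The type tells the decoder how to treat the body, while parameters such as charset say how to interpret the decoded bytes.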
What this article needs to explain is the composite types commonly used in emails. The multipart type indicates that the body is composed of multiple parts, and its subtype describes the relationship between these parts. Three subtypes are used in emails:

multipart/alternative: the body consists of alternative versions, any one of which can be chosen. It is mainly used when the body exists in both text and HTML format, so that one of the two can be displayed. Mail clients that support HTML will generally display the HTML body, while those that do not will display the text body.

multipart/mixed: the parts of the document are mixed, which describes the relationship between the body and attachments. If the MIME type of the email is multipart/mixed, the email contains attachments.

multipart/related: the parts of the document are related, generally used to describe an HTML body and its associated images.

These composite types can be nested. For example, if an email contains an attachment and has a body in both HTML and text format, the structure of the email is:

Content-Type: multipart/mixed
    Part 1: Content-Type: multipart/alternative
        Body in text format
        Body in HTML format
    Part 2: Attachment
End of email

Since a composite type is composed of multiple parts, a delimiter is needed to separate them. This is what boundary="----=_NextPart_000_0007_01C03166.5B1E9510" in the email source file above describes. Every Content-Type of the form multipart/* carries such a description, which specifies the delimiter between the parts. The delimiter is an unusual string of characters chosen so that it cannot appear in the body. Within the document, "--" followed by the boundary marks the beginning of a part. At the end of the document, "--" followed by the boundary and another "--" marks the end of the document. Since composite types can be nested, there may be several boundaries in one email.

There is one more very important MIME header field:

Content-Transfer-Encoding: base64

It indicates the encoding method of this part of the document, which is the Base64 or QP (Quoted-Printable) introduced above. Only by reading this field can we choose the correct decoding method.

Due to space limitations, this is all the introduction to MIME given here. Below I will give a class for decoding MIME emails and briefly describe it.
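The boundary and Content-Transfer-Encoding rules described above can be sketched as follows. This is not the decoding class announced by the author; the function names split_multipart() and decode_part() are hypothetical, and the sketch makes simplifying assumptions: the boundary string never occurs inside a part's content, every part has at least one header line, and nested multipart bodies are not followed.

```php
<?php
// Hypothetical helper: split a multipart body into its parts.
// Assumes the boundary never occurs inside the content itself.
function split_multipart(string $body, string $boundary): array
{
    $parts = [];
    $chunks = explode('--' . $boundary, $body);
    array_shift($chunks);            // drop any text before the first delimiter
    foreach ($chunks as $chunk) {
        if (strncmp($chunk, '--', 2) === 0) {
            break;                   // "--boundary--" marks the end of the document
        }
        // Headers and content are separated by the first blank line
        $pieces = preg_split('/\r?\n\r?\n/', ltrim($chunk, "\r\n"), 2);
        $parts[] = ['headers' => $pieces[0], 'content' => $pieces[1] ?? ''];
    }
    return $parts;
}

// Hypothetical helper: decode one part according to Content-Transfer-Encoding.
function decode_part(array $part): string
{
    if (preg_match('/^Content-Transfer-Encoding:\s*(\S+)/mi', $part['headers'], $m)) {
        switch (strtolower($m[1])) {
            case 'base64':
                return base64_decode($part['content']);
            case 'quoted-printable':
                return quoted_printable_decode(rtrim($part['content'], "\r\n"));
        }
    }
    return $part['content'];         // 7bit, 8bit or unknown: return unchanged
}

// Made-up sample body with a short, made-up boundary "B1"
$body = "--B1\nContent-Type: text/plain\nContent-Transfer-Encoding: base64\n\n"
      . "xOO6w6Oh\n--B1--\n";
$parts = split_multipart($body, 'B1');
echo bin2hex(decode_part($parts[0])), "\n";   // c4e3bac3a3a1
```

For a nested message such as multipart/mixed containing multipart/alternative, the same splitting would have to be applied again to the content of the inner part, using that part's own boundary.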
