Home Backend Development PHP Tutorial 用 PHP 实现 POP3 邮件的解码(2)_PHP

用 PHP 实现 POP3 邮件的解码(2)_PHP

Jun 01, 2016 pm 12:32 PM
accomplish coding express mail part

POP3

MIME 编码方式简介


  MIME 编码方式简介

  Subject: =?gb2312?B?xOO6w6Oh?=

  这里是邮件的主题,可是因为编码了,我们看不出是什么内容,其原来的文本是:“你好!”我们先看看 MIME 编码的两种方法。

  对邮件进行编码最初的原因是因为 Internet 上的很多网关不能正确传输8 bit 内码的字符,比如汉字等。编码的原理就是把 8 bit 的内容转换成 7 bit 的形式以能正确传输,在接收方收到之后,再将其还原成 8 bit 的内容。

  MIME 是“多用途网际邮件扩充协议”的缩写,在 MIME 协议之前,邮件的编码曾经有过 UUENCODE 等编码方式 ,但是由于 MIME 协议算法简单,并且易于扩展,现在已经成为邮件编码方式的主流,不仅是用来传输 8 bit 的字符,也可以用来传送二进制的文件 ,如邮件附件中的图像、音频等信息,而且扩展了很多基于MIME 的应用。从编码方式来说,MIME 定义了两种编码方法Base64与QP(Quote-Printable) :

  Base 64 是一种通用的方法,其原理很简单,就是把三个Byte的数据用 4 个Byte表示,这样,这四个Byte 中,实际用到的都只有前面6 bit,这样就不存在只能传输 7bit 的字符的问题了。Base 64的缩写一般是“B”,像这封信中的Subject 就是用的 Base64 编码。

  另一种方法是QP(Quote-Printable) 方法,通常缩写为“Q”方法,其原理是把一个 8 bit 的字符用两个16进制数值表示,然后在前面加“=”。所以我们看到经过QP编码后的文件通常是这个样子:=B3=C2=BF=A1=C7=E5=A3=AC=C4=FA=BA=C3=A3=A1。

  在 PHP 里,系统有两个函数可以很方便地实现解码:base64_decode()与quoted_printable_decode(),前者可用于base64 编码的解码,后者是用于 QP 编码方法的解码。

  现在我们再来看看Subject: =?gb2312?B?xOO6w6Oh?= 这一主题的内容,这不是一段完整的编码,只有部分是编码了的,这个部分用 =? ?= 两个标记括起来,=? 后面说明的是这段文字的字符集是 GB2312 ,然后一个 ? 后面的一个 B 表示的是用的 Base64 编码。通过这段分析,我们来看一下这个 MIME 解码的函数:(该函数由 PHPX.COM 站长 Sadly 提供,本人将其放入一个类中,并做了少量的修改,在此致谢)

  function decode_mime($string) {

   $pos = strpos($string, ‘=?‘);

   if (!is_int($pos)) {

     return $string;

   }

   $preceding = substr($string, 0, $pos); // save any preceding text

   $search = substr($string, $pos+2); /* the mime header spec says this is the longest a single encoded word can be */

   $d1 = strpos($search, ‘?‘);

   if (!is_int($d1)) {

     return $string;

   }

   $charset = substr($string, $pos+2, $d1); //取出字符集的定义部分

   $search = substr($search, $d1+1); //字符集定义以后的部分=>$search;

   $d2 = strpos($search, ‘?‘);

   if (!is_int($d2)) {

     return $string;

   }

   $encoding = substr($search, 0, $d2); ////两个? 之间的部分编码方式 :q 或 b 

   $search = substr($search, $d2+1);

   $end = strpos($search, ‘?=‘); //$d2+1 与 $end 之间是编码了 的内容:=> $endcoded_text;

   if (!is_int($end)) {

     return $string;

   }

   $encoded_text = substr($search, 0, $end);

   $rest = substr($string, (strlen($preceding . $charset . $encoding . $encoded_text)+6)); //+6 是前面去掉的 =????= 六个字符

   switch ($encoding) {

   case ‘Q‘:

   case ‘q‘:

     //$encoded_text = str_replace(‘_‘, ‘%20‘, $encoded_text);

     //$encoded_text = str_replace(‘=‘, ‘%‘, $encoded_text);

     //$decoded = urldecode($encoded_text);

   $decoded=quoted_printable_decode($encoded_text);

     if (strtolower($charset) == ‘windows-1251‘) {

     $decoded = convert_cyr_string($decoded, ‘w‘, ‘k‘);

     }

     break;

   case ‘B‘:

   case ‘b‘:

     $decoded = base64_decode($encoded_text);

     if (strtolower($charset) == ‘windows-1251‘) {

     $decoded = convert_cyr_string($decoded, ‘w‘, ‘k‘);

     }

     break;

   default:

     $decoded = ‘=?‘ . $charset . ‘?‘ . $encoding . ‘?‘ . $encoded_text . ‘?=‘;

     break;

   }

   return $preceding . $decoded . $this->decode_mime($rest);

  }

  这个函数用了递归的方法来实现一段包含有如上的 Subject 段的字符的解码。程序中已经加上了注释。相信有点PHP 编程基础的人都能够看得明白。该函数也是调用的base64_decode()与quoted_printable_decode()两个系统函数实现的解码,但是需要对邮件源文件进行大量的字符串的分析。不过,PHP 的字符串操作可以算是所有语言里最为方便自由的。函数的最后return $preceding . $decoded . $this->decode_mime($rest); 实现递归解码,因为这个函数实际上是放在后面要介绍的一个 MIME解码的类中的,所以用了 $this->decode_mime($rest)这种形式的调用方法。

  下面我们来看正文。这里关系到 MIME 的一些头信息,我们先做一个简单的介绍(如果读者有兴趣了解更多的内容,请参考 MIME 的官方文档)。

  MIME-Version: 1.0

  表示使用的 MIME 的版本号,一般是1.0;

  Content-Type: 定义了正文的类型,我们实际上是通过这个标识来知道正文内是什么类型的文件,比如:text/plain 表示的是无格式的文本正文,text/html 表示的 Html 文档,image/gif 表示的是 gif 格式的图片等等。在本文中特别要说明一下的是邮件中常用到的复合类型。multipart 类型表示正文是由多个部分组成的,后面的子类型说明的是这些部分之间的关系,邮件中用到的三个类型有,multipart/alternative:表示正文由两个部分组成,可以选择其中的任意一个。主要作用是在征文同时有 text 格式和 html 格式时,可以在两个正文中选择一个来显示,支持 html 格式的邮件客户端软件一般会显示其 HTML 正文,而不支持的则会显示其 Text 正文;multipart/mixed :表示文档的多个部分是混合的,指正文与附件的关系。如果邮件的 MIME 类型是multipart/mixed,即表示邮件带有附件;multipart/related :表示文档的多个部分是相关的,一般用来描述 Html 正文与其相关的图片。

  这些复合类型又是可以嵌套使用的,比如说一个带有附件的邮件,同时有 html 与 text 两种格式的正文,则邮件的结构是:

  Content-Type: multipart/mixed

   部分一:

   Content Type : multipart/alternative:

   Text 正文;

   Html 格式的正文 

  部分二:

   附件

  邮件结束符;

  由于复合类型由多个部分组成,因此,需要一个分隔符来分隔这多个部分,这就是上面的邮件源文件中的boundary="----=_NextPart_000_0007_01C03166.5B1E9510"所描述的,对于每一个Contect type :multipart/* 的内容,都会有这么一个说明,表示多个部分之间的分隔,这个分隔符是正文中不可能出现的一串古字符的组合,在文档中,以 "--" 加上这个boundary 来表示一个部分的开始,在文档的结束,以"--"加boundary再在最后加上 "--" 来表示文档的结束。由于复合类型是可以嵌套使用的,因此,邮件中可能会多个 boundary 。

  还有一个最重要的 MIME 头标签:

  Content-Transfer-Encoding: base64 它表示了这个部分文档的编码方式,也就是我们上面所介绍的Base64或QP(Quote-Printable)。我们只有识别了这个说明,才能用正确的解码方式实现对其解码。

  限于篇幅,对于 MIME 的介绍就只说到这里。下面我将给出一个解码MIME邮件的类,并对其做简要说明。

作者:陈俊清
转载:中华网
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
2 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Best Graphic Settings
2 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. How to Fix Audio if You Can't Hear Anyone
2 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Outlook emails lost from control panel in Windows 11 Outlook emails lost from control panel in Windows 11 Feb 29, 2024 pm 03:16 PM

Is the Outlook mail icon missing from Windows 11's Control Panel? This unexpected situation has caused confusion and concern among some individuals who rely on OutlookMail for their communication needs. Why don't my Outlook emails show up in Control Panel? There may be several possible reasons why there are no Outlook mail icons in Control Panel: Outlook is not installed correctly. Installing Office applications from the Microsoft Store does not add the Mail applet to Control Panel. The location of the mlcfg32.cpl file in Control Panel is missing. The path to the mlcfg32.cpl file in the registry is incorrect. The operating system is not currently configured to run this application

How to implement dual WeChat login on Huawei mobile phones? How to implement dual WeChat login on Huawei mobile phones? Mar 24, 2024 am 11:27 AM

How to implement dual WeChat login on Huawei mobile phones? With the rise of social media, WeChat has become one of the indispensable communication tools in people's daily lives. However, many people may encounter a problem: logging into multiple WeChat accounts at the same time on the same mobile phone. For Huawei mobile phone users, it is not difficult to achieve dual WeChat login. This article will introduce how to achieve dual WeChat login on Huawei mobile phones. First of all, the EMUI system that comes with Huawei mobile phones provides a very convenient function - dual application opening. Through the application dual opening function, users can simultaneously

Word mail merge prints blank page Word mail merge prints blank page Feb 19, 2024 pm 04:51 PM

If you find that blank pages appear when printing a mail merge document using Word, this article will help you. Mail merge is a convenient feature that allows you to easily create personalized documents and send them to multiple recipients. In Microsoft Word, the mail merge feature is highly regarded because it helps users save time manually copying the same content for each recipient. In order to print the mail merge document, you can go to the Mailings tab. But some Word users have reported that when trying to print a mail merge document, the printer prints a blank page or doesn't print at all. This may be due to incorrect formatting or printer settings. Try checking the document and printer settings and make sure to preview the document before printing to ensure the content is correct. if

How to implement the WeChat clone function on Huawei mobile phones How to implement the WeChat clone function on Huawei mobile phones Mar 24, 2024 pm 06:03 PM

How to implement the WeChat clone function on Huawei mobile phones With the popularity of social software and people's increasing emphasis on privacy and security, the WeChat clone function has gradually become the focus of people's attention. The WeChat clone function can help users log in to multiple WeChat accounts on the same mobile phone at the same time, making it easier to manage and use. It is not difficult to implement the WeChat clone function on Huawei mobile phones. You only need to follow the following steps. Step 1: Make sure that the mobile phone system version and WeChat version meet the requirements. First, make sure that your Huawei mobile phone system version has been updated to the latest version, as well as the WeChat App.

PHP Programming Guide: Methods to Implement Fibonacci Sequence PHP Programming Guide: Methods to Implement Fibonacci Sequence Mar 20, 2024 pm 04:54 PM

The programming language PHP is a powerful tool for web development, capable of supporting a variety of different programming logics and algorithms. Among them, implementing the Fibonacci sequence is a common and classic programming problem. In this article, we will introduce how to use the PHP programming language to implement the Fibonacci sequence, and attach specific code examples. The Fibonacci sequence is a mathematical sequence defined as follows: the first and second elements of the sequence are 1, and starting from the third element, the value of each element is equal to the sum of the previous two elements. The first few elements of the sequence

Knowledge graph: the ideal partner for large models Knowledge graph: the ideal partner for large models Jan 29, 2024 am 09:21 AM

Large language models (LLMs) have the ability to generate smooth and coherent text, bringing new prospects to areas such as artificial intelligence conversation and creative writing. However, LLM also has some key limitations. First, their knowledge is limited to patterns recognized from training data, lacking a true understanding of the world. Second, reasoning skills are limited and cannot make logical inferences or fuse facts from multiple data sources. When faced with more complex and open-ended questions, LLM's answers may become absurd or contradictory, known as "illusions." Therefore, although LLM is very useful in some aspects, it still has certain limitations when dealing with complex problems and real-world situations. In order to bridge these gaps, retrieval-augmented generation (RAG) systems have emerged in recent years. The core idea is

Master how Golang enables game development possibilities Master how Golang enables game development possibilities Mar 16, 2024 pm 12:57 PM

In today's software development field, Golang (Go language), as an efficient, concise and highly concurrency programming language, is increasingly favored by developers. Its rich standard library and efficient concurrency features make it a high-profile choice in the field of game development. This article will explore how to use Golang for game development and demonstrate its powerful possibilities through specific code examples. 1. Golang’s advantages in game development. As a statically typed language, Golang is used in building large-scale game systems.

PHP Game Requirements Implementation Guide PHP Game Requirements Implementation Guide Mar 11, 2024 am 08:45 AM

PHP Game Requirements Implementation Guide With the popularity and development of the Internet, the web game market is becoming more and more popular. Many developers hope to use the PHP language to develop their own web games, and implementing game requirements is a key step. This article will introduce how to use PHP language to implement common game requirements and provide specific code examples. 1. Create game characters In web games, game characters are a very important element. We need to define the attributes of the game character, such as name, level, experience value, etc., and provide methods to operate these

See all articles