Use PHP to determine whether a file is UTF-8 encoded (checking for a BOM)

WBOY

Released: 2016-07-21 14:53:53

UTF-8 encoded files come in two kinds: with a BOM and without one. A file with a BOM is easy to identify, while a file without one is more troublesome, so I wrote a function to make the judgment. The code is as follows:

//Return 1 means pure ASCII (that is, no byte is greater than 127)
//Return 2 means UTF-8 (with or without a BOM)
//Return 0 means another encoding, typically a GB encoding
//Return false means the text is shorter than 3 bytes and cannot be judged

function TestUtf8($text)
{
    $len = strlen($text);
    if ($len < 3) return false;

    //A UTF-8 BOM is the byte sequence EF BB BF at the very start of the text
    if (substr($text, 0, 3) === "\xEF\xBB\xBF") return 2;

    $lastch = 0;
    $good = 0;
    $bad = 0;
    $notAscii = 0;
    for ($i = 0; $i < $len; $i++)
    {
        $ch = ord($text[$i]);

        if ($ch >= 0x80) $notAscii++;

        if (($ch & 0xC0) == 0x80)
        {
            //Current byte looks like a UTF-8 continuation byte (10xxxxxx)
            if (($lastch & 0xC0) == 0xC0)
            {
                //It follows a lead byte (11xxxxxx): consistent with UTF-8
                $good += 1;
            }
            else if (($lastch & 0x80) == 0)
            {
                //It follows an ASCII byte: not valid UTF-8
                $bad += 1;
            }
        }
        else if (($lastch & 0xC0) == 0xC0)
        {
            //A lead byte was not followed by a continuation byte
            $bad += 1;
        }
        $lastch = $ch;
    }

    if ($notAscii == 0)
    {
        return 1;
    }
    else if ($good >= $bad)
    {
        return 2;
    }
    else
    {
        return 0;
    }
}
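
The function above guesses the encoding by counting byte pairs that look right or wrong for UTF-8, so mixed or unusual input can be misjudged. As a point of comparison, here is a minimal sketch of a stricter check. It is not the original author's method: the helper names hasUtf8Bom and classifyEncodingStrict are hypothetical, and the sketch assumes that a full UTF-8 validity check is acceptable in place of the heuristic. It relies on PCRE's "u" modifier, under which preg_match returns false when the subject is not well-formed UTF-8.

```php
<?php
// Hypothetical helpers for illustration; they are not part of the original article.

const UTF8_BOM = "\xEF\xBB\xBF";

// True when the text starts with the UTF-8 BOM bytes EF BB BF.
function hasUtf8Bom(string $text): bool
{
    return strncmp($text, UTF8_BOM, 3) === 0;
}

// Uses the same return codes as TestUtf8:
// 1 = pure ASCII, 2 = UTF-8, 0 = some other encoding (for example GBK).
function classifyEncodingStrict(string $text): int
{
    if (hasUtf8Bom($text)) {
        return 2;
    }

    // No byte in the range 0x80-0xFF means the text is plain ASCII.
    if (preg_match('/[\x80-\xFF]/', $text) === 0) {
        return 1;
    }

    // With the "u" modifier PCRE validates the subject as UTF-8 before
    // matching; preg_match returns false if the bytes are malformed.
    return preg_match('//u', $text) === 1 ? 2 : 0;
}
```

To test a file, read it into a string first, for example with file_get_contents, and pass the string to classifyEncodingStrict. Unlike TestUtf8, this sketch also accepts inputs shorter than three bytes.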
