©
This document uses PHP Chinese website manual Release
(PHP 4 >= 4.2.0, PHP 5)
mb_ereg — Regular expression match with multibyte support
$pattern
, string $string
[, array $regs
] )Executes the regular expression match with multibyte support.
pattern
The search pattern.
string
The search string .
regs
Contains a substring of the matched string .
Executes the regular expression
match with multibyte support, and returns 1 if matches are found.
If the optional regs
parameter was specified, the function
returns the byte length of matched part, and the array regs
will contain the substring of matched
string. The function returns 1 if it matches with the empty
string. If no matches are found or an error happens, FALSE
will be
returned.
Note:
mb_regex_encoding() 指定的内部编码或字符编码将会当作此函数用的字符编码。
[#1] mb_ereg() seems unable to Use "named sub [2015-05-06 13:42:19]
mb_ereg() seems unable to Use "named subpattern".
preg_match() seems a substitute only in UTF-8 encoding.
<?php
$text = 'multi_byte_string';
$pattern = '.*(?<name>string).*'; // "?P" causes "mbregex compile err" in PHP 5.3.5
if(mb_ereg($pattern, $text, $matches)){
echo '<pre>'.print_r($matches, true).'</pre>';
}else{
echo 'no match';
}
?>
This code ignores "?<name>" in $pattern and displays below.
Array
(
[0] => multi_byte_string
[1] => string
)
$pattern = '/.*(?<name>string).*/u';
if(preg_match($pattern, $text, $matches)){
instead of lines 2 & 3
displays below (in UTF-8 encoding).
Array
(
[0] => multi_byte_string
[name] => string
[1] => string
)
[#2] Riikka K [2014-07-13 16:23:18]
While hardly mentioned anywhere, it may be useful to note that mb_ereg uses Oniguruma library internally. The syntax for the default mode (ruby) is described here:
http://www.geocities.jp/kosako3/oniguruma/doc/RE.txt
[#3] pressler at hotmail dot de [2012-11-19 22:50:53]
Note that mb_ereg() does not support the \uFFFF unicode syntax but uses \x{FFFF} instead:
<?PHP
$text = 'Peter is a boy.'; // english
$text = '???? ?? ???.'; // arabic
//$text = '???? ??? ???.'; // hebrew
mb_regex_encoding('UTF-8');
if(mb_ereg('[\x{0600}-\x{06FF}]', $text)) // arabic range
//if(mb_ereg('[\x{0590}-\x{05FF}]', $text)) // hebrew range
{
echo "Text has some arabic/hebrew characters.";
}
else
{
echo "Text doesnt have arabic/hebrew characters.";
}
?>
[#4] Jon [2009-04-11 04:22:19]
Hebrew regex tested on PHP 5, Ubuntu 8.04.
Seems to work fine without the mb_regex_encoding lines (commented out).
Didn't seem to work with \uxxxx (also commented out).
<?php
echo "Line ";
//mb_regex_encoding("ISO-8859-8");
//if(mb_ereg(".*([\u05d0-\u05ea]).*", $this->current_line))
if(mb_ereg(".*([?-?]).*", $this->current_line))
{
echo "has";
}
else
{
echo "doesn't have";
}
echo " Hebrew characters.<br>";
//mb_regex_encoding("UTF-8");
?>