(PHP 4, PHP 5)
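eregi — Case insensitive regular expression match
int eregi ( string $pattern , string $string [, array &$regs ] )
This function is identical to ereg() except that it ignores case distinction when matching alphabetic characters.
Example #1 eregi() example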
<?php
$string = 'XYZ';
if (eregi('z', $string)) {
    echo "'$string' contains a 'z' or 'Z'!";
}
?>
See also ereg(), ereg_replace(), eregi_replace(), stripos() and stristr().
[#1] info at o08 dot com [2010-04-14 21:12:53]
Because eregi is deprecated as of PHP 5.3, you can replace it with stristr if you only need a simple search.
For editors with a regular-expression search-and-replace function:
eregi\(([^,]*),([^)]*)\)
stristr(\2,\1)
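Since the ereg extension was removed entirely in PHP 7.0, a simple case-insensitive search can also be written with stripos() or with preg_match() and the i modifier. A minimal sketch; the helper names contains_ci and contains_ci_regex are made up for illustration:

```php
<?php
// Hypothetical helper: case-insensitive substring test without eregi().
function contains_ci(string $haystack, string $needle): bool
{
    // stripos() returns an offset (possibly 0) or false, so compare with !==.
    return stripos($haystack, $needle) !== false;
}

// Hypothetical helper: the same test with PCRE.
// preg_quote() escapes regex metacharacters so the needle is matched literally.
function contains_ci_regex(string $haystack, string $needle): bool
{
    return preg_match('/' . preg_quote($needle, '/') . '/i', $haystack) === 1;
}

var_dump(contains_ci('XYZ', 'z'));
var_dump(contains_ci_regex('XYZ', 'z'));
```

stripos() is usually enough when the needle is a plain string; the preg_match() form is only worth it when the needle really is a pattern.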
[#2] julia_kean25 at yahoo dot com [2009-02-10 14:15:30]
I use this in my app_config.php file to sanitize each request:
<?php
// app_config.php
function sanitize_request($methods, $array)
{
// methods: trim ; addslashes ; stripslashes ; etc...
// array : $_GET ; $_POST ; etc...
foreach ($methods as $function) {
$array = array_map($function, $array);
}
return $array;
}
if ( ! get_magic_quotes_gpc() )
{
$methods = array('trim', 'addslashes');
$_GET = sanitize_request($methods, $_GET);
$_POST = sanitize_request($methods, $_POST);
$_COOKIE = sanitize_request($methods, $_COOKIE);
$_REQUEST = sanitize_request($methods, $_REQUEST);
}
?>
It currently only trims and adds slashes to the request, but it would be nice to be able to add the strip_tags function too.
[#3] jrgpmaster at gmail dot com [2008-09-13 01:04:53]
Here is a simple way of checking whether the visitor of your page is a search engine or a normal person. It does this by checking whether the user agent returned by $_SERVER['HTTP_USER_AGENT'] contains one of the keywords that search engines' user agents usually contain.
<?php
//check if user is a bot of some sort
function is_bot()
{
$bots = array('google','yahoo','msn');
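//takes the list above and returns (google)|(yahoo)|(msn)
//note: the separator goes first; the reversed argument order is not accepted by current PHP versions
$regex = '('.implode(')|(', $bots).')';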
return eregi($regex, $_SERVER['HTTP_USER_AGENT']);
}
?>
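On PHP 7 and later, where eregi() no longer exists, the same idea could be sketched with preg_match(). The function name is_bot_ua and the choice to pass the user agent as a parameter (so it can be tested without $_SERVER) are my own assumptions, not part of the note above:

```php
<?php
// Hypothetical rewrite of is_bot(): keyword list and user agent are parameters.
function is_bot_ua(string $userAgent, array $bots = ['google', 'yahoo', 'msn']): bool
{
    // Escape each keyword, then join them into an alternation: google|yahoo|msn
    $escaped = array_map(function ($bot) {
        return preg_quote($bot, '/');
    }, $bots);
    $regex = '/' . implode('|', $escaped) . '/i';

    // preg_match() returns 1 on a match, 0 on no match.
    return preg_match($regex, $userAgent) === 1;
}

var_dump(is_bot_ua('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'));
```

Keep in mind that the user agent string is supplied by the client, so this is a hint, not proof.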
[#4] sumit270 at gmail dot com [2008-06-12 09:45:18]
Prevent XSS attacks
<?php
// Prevent any possible XSS attacks via $_GET.
foreach ($_GET as $check_url) {
if ((eregi("<[^>]*script*\"?[^>]*>", $check_url)) || (eregi("<[^>]*object*\"?[^>]*>", $check_url)) ||
(eregi("<[^>]*iframe*\"?[^>]*>", $check_url)) || (eregi("<[^>]*applet*\"?[^>]*>", $check_url)) ||
(eregi("<[^>]*meta*\"?[^>]*>", $check_url)) || (eregi("<[^>]*style*\"?[^>]*>", $check_url)) ||
(eregi("<[^>]*form*\"?[^>]*>", $check_url)) || (eregi("\([^>]*\"?[^)]*\)", $check_url)) ||
(eregi("\"", $check_url))) {
die ();
}
}
unset($check_url);
?>
[#5] Jeff Morris [2008-05-30 11:06:26]
Email Address RegEx -- The Final Frontier?
Inspired by bobocop's stalwart effort. Cheers for that matey!
Contrary to most folks' expectation, a quoted @ character is permitted in the
local part of an email address. So strictly speaking bobocop's test result for
'@exam@exam.com' is ...inconclusive?
The RFC prohibits control characters in the address. So it's no coincidence
that most header-related exploits try to inject control characters into the
fields sent to the server. If we're validating client-side, we need to ensure
user input is restricted to the printable code set. And in the spirit of not
trusting anything inbound, we need to filter again server-side. It's handy
to have the same regex working at both ends.
My variant of bobocop's regex is listed below. Note the mask for the local
part matches any printable character *excluding the dot*. The dot is reserved
as a label separator. Bobocop's regex enforces that role while ensuring the
local part does not start or end with a dot.
Outside of the 7-bit ASCII and dot rules, the RFC says 'anything goes' in the
local part. Them's the breaks folks.
All we need to realise is that our endeavours are limited, and the nearest
we'll get to validating an email address is finding an MX record in DNS.
Whatever, don't go probing mail servers with test emails, you might get more
than you bargained for. That's sp@mmer territory, that is.
If you want to positively vet a mail server, consider running a check against
sbl-xbl.spamhaus.org. Search for the checkdnsrr function page on this site and
read the comments for good info.
Anyhoo, here's the modded regex builder:
//the variables
$local = '[\x20-\x2D\x2F-\x7E]';
$alnum = 'a-z0-9';
$domain = "([$alnum]([-$alnum]*[$alnum]+)?)";
//the array
$arr = array();
$arr['start'] = '^';
$arr['local'] = "$local+(\.$local+)*";
$arr['at'] = '@';
$arr['domain'] = "($domain{1,63}\.)+";
$arr['tld'] = "[$alnum]{2,6}";
$arr['end'] = '$';
//the regex
$regex = implode('',$arr);
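For anyone on a PHP version without eregi(), the builder above can be tried with preg_match() roughly like this. It is only a sketch: the wrapper function build_email_regex, the / delimiters and the i modifier are my additions, and the pieces are concatenated explicitly instead of interpolated:

```php
<?php
// Hypothetical wrapper around the regex builder from the note above.
function build_email_regex(): string
{
    $local  = '[\x20-\x2D\x2F-\x7E]';   // printable ASCII except the dot
    $alnum  = 'a-z0-9';
    $domain = '([' . $alnum . ']([-' . $alnum . ']*[' . $alnum . ']+)?)';

    $arr = array();
    $arr['start']  = '^';
    $arr['local']  = $local . '+(\.' . $local . '+)*';
    $arr['at']     = '@';
    $arr['domain'] = '(' . $domain . '{1,63}\.)+';
    $arr['tld']    = '[' . $alnum . ']{2,6}';
    $arr['end']    = '$';

    // The pattern contains no literal "/", so it is safe as the delimiter.
    return '/' . implode('', $arr) . '/i';
}

$regex = build_email_regex();
var_dump(preg_match($regex, 'joe@example.com'));
var_dump(preg_match($regex, 'joe..bloggs@example.com'));
```

The nested quantifiers in the domain part can backtrack heavily on long non-matching input, so it may be wise to cap the input length before matching.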
[#6] ted devito [2008-05-02 16:29:55]
RE: validate a url
--------------------
based on "ian at hyperborea dot co dot uk" below...
original date: 10-Nov-2004 03:15
I added a test for http(s?) and ftp as well as a trailing slash on urls that don't specify a page.
now it allows...
http://test.com/
https://www.test.com
$domain = "(http(s?):\/\/|ftp:\/\/)*([[:alpha:]][-[:alnum:]]*[[:alnum:]])" .
"(\.[[:alpha:]][-[:alnum:]]*[[:alpha:]])+";
$dir = "(/[[:alpha:]][-[:alnum:]]*[[:alnum:]])*";
$trailingslash = "(\/?)";
$page = "(/[[:alpha:]][-[:alnum:]]*\.[[:alpha:]]{3,5})?";
$getstring = "(\?([[:alnum:]][-_%[:alnum:]]*=[-_%[:alnum:]]+)" .
"(&([[:alnum:]][-_%[:alnum:]]*=[-_%[:alnum:]]+))*)?";
$pattern = "^".$domain.$dir.$trailingslash.$page.$getstring."$";
[#7] mbfreight atthe gmail place [2008-01-03 14:53:40]
keran at kiwi-interactive dot com wrote (5ish years ago) 07-Mar-2003 08:21
$feedback = "Error: $email isn't a valid mail address!";
return $feedback;
-- and --
$feedback = "Error: $domain isn't a valid domain!";
return $feedback;
I've been crushed with patching up XSS and anytime you get user input, it's best to just not show it back to them if possible. The auditor loves throwing these at me: >"><script>alert(123)</script><" in the url, in forms, everywhere. Some looking around and you can find and build an amazing testing string.
I have found that using htmlentities($user_input) isn't enough, either. There are a few tricks that can help like
// from http://us3.php.net/manual/en/function.strip-tags.php
while($input != strip_tags($input)) {
$input = strip_tags($input);
}
In my case, I'm starting off by testing for <[tag]> as well as keyword() and then do some preg_replace ing.
[#8] shwetank dot sharma86 at gmail dot com [2007-12-07 05:13:07]
Here is some notation for building the expression passed to eregi(exp, string).
First of all, the syntax:
[ ] brackets define a set of characters
e.g. [a-z], [0-9]
{ } braces define a repetition range
e.g. {1,3}
If you want an expression that accepts a number of at most
three digits:
"^[0-9]{1,3}$"
and for a number of exactly three digits:
"^[0-9]{3}$"
If you have any problems, mail me.
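The two expressions can be tried out with preg_match() as follows. This is a small sketch; the function names are made up, and the D modifier is my addition so that $ does not also match before a trailing newline:

```php
<?php
// Hypothetical helper: true when $s consists of one to three digits.
function is_up_to_three_digits(string $s): bool
{
    return preg_match('/^[0-9]{1,3}$/D', $s) === 1;
}

// Hypothetical helper: true when $s consists of exactly three digits.
function is_exactly_three_digits(string $s): bool
{
    return preg_match('/^[0-9]{3}$/D', $s) === 1;
}

var_dump(is_up_to_three_digits('42'));
var_dump(is_exactly_three_digits('42'));
```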
[#9] c00lways at gmail dot com [2007-03-10 18:10:54]
hodsfords:
I love your expression,
and I've come up with a solution that does not need a fixed repeat count such as {1,3} for the domain.
It accepts any number of domain parts, but requires at least one .com / .xxx suffix.
$exp = "^[a-z0-9]+[a-z0-9\?\.\+-_]*" .
"@[a-z0-9_-]+(\.[a-z0-9_-]+)*\.[a-z]+$";
[#10] m at tthew dot org dot uk [2006-11-07 14:47:16]
This example checks for a valid IP address or CIDR notation address range. (Thanks Walo for just the start I needed.)
The reg exp is too long to post in the code. It is:
^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}(/[0-9]{1,2}){0,1}$
(substitute for EXPR)
<?php
function checkValidIp($cidr) {
// Checks for a valid IP address or optionally a cidr notation range
// e.g. 1.2.3.4 or 1.2.3.0/24
if(!eregi("EXPR", $cidr)) {
$return = FALSE;
} else {
$return = TRUE;
}
if ( $return == TRUE ) {
$parts = explode("/", $cidr);
$ip = $parts[0];
// $parts[1] only exists when a "/" was present
$netmask = isset($parts[1]) ? $parts[1] : "";
$octets = explode(".", $ip);
foreach ( $octets AS $octet ) {
if ( $octet > 255 ) {
$return = FALSE;
}
}
if ( ( $netmask != "" ) && ( $netmask > 32 ) ) {
$return = FALSE;
}
}
return $return;
}
?>
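On newer PHP versions the same check could be sketched without eregi(), using filter_var() for the address part. The function name is_valid_ipv4_or_cidr is an assumption of mine, and filter_var() may be stricter than the expression above on some inputs:

```php
<?php
// Hypothetical helper: accepts "1.2.3.4" or "1.2.3.0/24" (IPv4 only).
function is_valid_ipv4_or_cidr(string $cidr): bool
{
    // Split into address and optional prefix length.
    $parts = explode('/', $cidr, 2);

    // filter_var() returns false when the address is not valid IPv4.
    if (filter_var($parts[0], FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
        return false;
    }

    // No "/" present: a plain address is fine.
    if (!isset($parts[1])) {
        return true;
    }

    // The prefix length must be one or two digits and at most 32.
    return preg_match('/^[0-9]{1,2}$/D', $parts[1]) === 1 && (int) $parts[1] <= 32;
}

var_dump(is_valid_ipv4_or_cidr('1.2.3.0/24'));
```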
[#11] benjohnson{-at-}maine{-dot-}rr{-dot-}com [2006-09-11 14:33:40]
It's probably worth noting that eregi() (and most likely, the related variations) appears to have a 255-character limit with respect to the length of the input it will attempt to parse.
If you try to do something like
if (!eregi("^[a-zA-Z0-9]{0,256}$", $text)) { ...
eregi() will return FALSE, irrespective of whether or not the input matches the pattern.
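For comparison, the PCRE functions accept a bound of 256 without complaint, so the same test can be sketched with preg_match(). The function name is made up for illustration:

```php
<?php
// Hypothetical helper: true when $text is at most 256 alphanumeric characters.
function is_alnum_up_to_256(string $text): bool
{
    // The D modifier keeps "$" from matching before a trailing newline.
    return preg_match('/^[a-zA-Z0-9]{0,256}$/D', $text) === 1;
}

var_dump(is_alnum_up_to_256(str_repeat('a', 256)));
var_dump(is_alnum_up_to_256(str_repeat('a', 257)));
```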
[#12] markus dot sipila at no dot spam dot iki dot fi dot invalid [2006-08-02 14:29:56]
One more comment about email validation and usability of validators.
The fact that RFC 2822 allows a broader set of characters in email addresses than is typically used makes things quite challenging usability-wise.
A very common usability problem with email validators is that they do not accept all valid addresses (such as foo{bar}.baz!@example.com). An almost equally common problem is that the validator only checks that the syntax is valid and passes addresses like foo#@example.com without any warning. Even though foo#@example.com is syntactically valid, it might just as well be a typo of foo@example.com.
I resolved this usability challenge by doing the validation in two phases. In the first phase the address is checked against a strict validator that rejects exotic characters like { or |. Most addresses pass this validation.
Those that don't are checked with a second validator that allows all RFC-compliant addresses. If the address passes only this second check, the validator shows a message saying that the address is syntactically valid but recommending a double check for typos.
An example without regexps:
<?php
if (eregi($normal, $email)) {
echo("The address $email is valid and looks normal.");
}
else if (eregi($validButRare, $email)) {
echo("The address $email looks a bit strange but it is syntactically valid. You might want to check it for typos.");
}
else {
echo("The address $email is not valid.");
}
?>
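The two-phase idea can be sketched on current PHP as below. Both checks are stand-ins of my own choosing, not the patterns from the linked article: the "normal" pattern is an illustrative assumption, and filter_var() with FILTER_VALIDATE_EMAIL plays the role of the broader validator even though it is not a complete RFC 2822 parser.

```php
<?php
// Hypothetical two-phase classifier: returns 'normal', 'valid but unusual' or 'invalid'.
function classify_email(string $email): string
{
    // Phase 1: conservative pattern (an assumption, only letters, digits and . _ -).
    $normal = '/^[a-z0-9]+([._-][a-z0-9]+)*@[a-z0-9]+([.-][a-z0-9]+)*\.[a-z]{2,}$/iD';
    if (preg_match($normal, $email) === 1) {
        return 'normal';
    }

    // Phase 2: PHP's built-in validator stands in for the broader check.
    if (filter_var($email, FILTER_VALIDATE_EMAIL) !== false) {
        return 'valid but unusual';
    }

    return 'invalid';
}

echo classify_email('foo@example.com'), "\n";
echo classify_email('foo#@example.com'), "\n";
```

Returning a label instead of echoing inside the function leaves the wording of the warning to the caller.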
The full article with the regexps and demo can be found at http://www.iki.fi/markus.sipila/pub/emailvalidator.php
[#13] tim at rocketry dot org [2006-04-29 21:32:03]
The easiest way I've found to validate a properly formed email address is this:
if(!eregi("^[[:alnum:]][a-z0-9_.-]*@[a-z0-9.-]+\.[a-z]{2,4}$", $_POST['EmailAddress'])) {
echo "<p>Not a valid email address</p>\n";
}
It basically just wants to see some alphanumeric characters + an @ sign + a . + 2 to 4 alpha characters. So far it has done what I need for quite a while now.. Hope that helps someone. :)
Tim
[#14] info at stenschke dot com [2006-02-03 04:39:59]
I needed a function to find hyperlinks whose link text is a URL exceeding a given maximum length.
Here is my function, which finds overly long hyperlinks and inserts <br />s into the link text where needed:
function breakTooLongLinks($text,$maxLen) {
//find hyperlinks that contain too many chars & insert <br>s where necessary
$pattern= '[>]www[.].*
'(\.[_a-z0-9-]+)*'.
'@'.
'[a-z0-9-]+'.
'(\.[a-z0-9-]{2,})+'.
'$';
foreach ($email as $example) {
if (eregi($regex, $example)) {
echo $example . ' matched<br>';
} else {
echo $example . ' not matched<br>';
}
}
?>
[#32] banerian at u dot washington dot edu [2003-09-16 18:55:17]
It should be noted that in the function validateEmail, the final verification relies on the mail server responding to the test $email with a reply code of 250. Nowadays, many/most/all servers will reply 250/Ok for any user@their.domain regardless of whether or not the userid actually exists. Mail servers tend to accept the mail (if it passes other checks), and if the user really does not exist, they just "bounce" the mail back to the sender. Thus the validateEmail function can give false positives... an unfortunate victim of having to deal with spam.
[#33] Hodfords [2003-09-06 02:17:42]
Most of the ereg expressions for emails that I have seen do not
cope with malformed addresses like :-
joe@something.co..uk
and addresses such as
joe@something.co.uk+1234
This eregi expression, which we put together, works as far as I can tell. It should be just one line, but php.net did not let us submit it that way, because it said that even after using wordwrap() the line was too long :-
$ereg_string =
"^[\'+\\./0-9A-Z^_\`a-z{|}~\-]+@" .
"[a-zA-Z0-9_\-]+(\.[a-zA-Z0-9_\-]+){1,3}$";
Adjust the number "3" right at the end to whatever number
you want.
A long domain name such as "@finance.uk.yahoo.co.uk" which
has 5 parts will require the number to be higher.
[#34] tomi at vacilando dot net [2003-06-26 11:03:52]
Keran's solution is a good one but I found it does not work for a small number of domains -- because of a bug in gethostbyname() that causes some valid domains not to resolve properly. The workaround is to test such domains with "www." added to the beginning. This script does all that and works perfectly for me. Enjoy!
// Function to validate emails.
function validate_email($email_raw)
{
// replace any ' ' and \n in the email
$email_nr = eregi_replace("\n", "", $email_raw);
$email = eregi_replace(" +", "", $email_nr);
$email = strtolower( $email );
// do the eregi to look for bad characters
if( !eregi("^[a-z0-9]+([_\\.-][a-z0-9]+)*". "@([a-z0-9]+([\.-][a-z0-9]+))*$",$email) ){
// okay not a good email
$feedback = 'Error: "' . $email . '" is not a valid e-mail address!';
return $feedback;
} else {
// okay now check the domain
// split the email at the @ and check what's left
$item = explode("@", $email);
$domain = $item["1"];
if ( ( gethostbyname($domain) == $domain ) )
{
if ( gethostbyname("www." . $domain) == "www." . $domain )
{
$feedback = 'Error: "' . $domain . '" is most probably not a valid domain!';
return $feedback;
}
// ?
$feedback = "valid";
return $feedback;
} else {
$feedback = "valid";
return $feedback;
}
}
}
[#35] steve at brainwashstudios dot com [2003-04-10 10:22:50]
why not match some filenames?
if (!ereg("^[a-zA-Z0-9]+\\.(gif|jpg|png)$", $match)) {
echo "Invalid Filename";
}else{
if (file_exists($match)){
echo "blah";
}
}
I suppose that would work off the top of my head but I have yet to test it, so you may want to check it first.
[#36] keran at kiwi-interactive dot com [2003-03-07 20:21:33]
I couldn't get any of the email validation items above to actually work (maybe I'm thick) :)
So I adapted a couple and came up with this, and it seems to work
// function to validate email
function validate_email($email_raw) {
// replace any ' ' and \n in the email
$email_nr = eregi_replace("\n", "", $email_raw);
$email = eregi_replace(" +", "", $email_nr);
// do the eregi to look for bad characters
if( !eregi("^[a-z0-9]+([_\\.-][a-z0-9]+)*".
"@([a-z0-9]+([\.-][a-z0-9]+))*$",$email) ){
// okay not a good email
$feedback = "Error: $email isn't a valid mail address!";
return $feedback;
} else {
// okay now check the domain
// split the email at the @ and check what's left
$item = explode("@", $email);
$domain = $item["1"];
if( gethostbyname($domain) == $domain) {
$feedback = "Error: $domain isn't a valid domain!";
return $feedback;
} else {
$feedback = "valid";
return $feedback;
}
}
}
[#37] slavo at polovnictvo-mn dot sk [2003-02-19 17:30:55]
This is what I found (by Jon S. Stevens jon@clearink.com with
Copyright 1998 Jon S. Stevens, Clear Ink)
function validateEmail ($email){
global $SERVER_NAME;
$return = array(false, "" );
list ($user, $domain) = split( "@", $email, 2);
$arr = explode( ".", $domain);
$count = count ($arr);
$tld = $arr[$count - 2] . "." . $arr[$count - 1];
if(checkdnsrr($tld, "MX")) {
if(getmxrr($tld, $mxhosts, $weight)) {
for($i = 0; $i < count($mxhosts); $i++){
$fp = fsockopen($mxhosts[$i], 25);
if ($fp){
$s = 0;
$c = 0;
$out = "";
set_socket_blocking($fp, false);
do {
$out = fgets($fp, 2500);
if(ereg( "^220", $out)){
$s = 0;
$out = "";
$c++;
}
else if(($c > 0) && ($out == "")){
break;
}
else {
$s++;
}
if($s == 9999) {
break;
}
} while($out == "");
set_socket_blocking($fp, true);
fputs($fp, "HELO $SERVER_NAME\n");
$output = fgets ($fp, 2000);
fputs($fp, "MAIL FROM: <info@" . $tld . ">\n" );
$output = fgets($fp, 2000);
fputs($fp, "RCPT TO: <$email>\n");
$output = fgets($fp, 2000);
if(ereg( "^250", $output )) {
$return[0] = true;
}
else {
$return[0] = false;
$return[1] = $output;
}
fputs ($fp, "QUIT\n");
fclose($fp);
if($return[0] == true){
break;
}
}
}
}
}
return $return;
}
----
hope it helps you...
[#38] webmaster at textedit dot co dot uk [2003-01-19 11:03:17]
I notice a lot of queries involving regular expressions and is_float() regarding currency validation.
If you have a version of php that uses the std C ctype.h functions then it will ALWAYS work with the following code.
function check_price($price)
{
$data=split('[.]',$price);
if ( count($data) != 2 )
return "false";
if ( ctype_digit($data[0]) && ctype_digit($data[1]) && $data[0][0] != '0')
return "true";
else
return "false";
}
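The same rule (digits, exactly one dot, more digits, and no leading zero) can also be written as a single expression. A small sketch; the name is_price is made up, and unlike check_price() it returns a real boolean instead of the strings "true" and "false":

```php
<?php
// Hypothetical helper: true for prices such as "12.50", false for "0.99", "12" or ".50".
function is_price(string $price): bool
{
    // [1-9][0-9]* : integer part that does not start with 0
    // \.[0-9]+    : a dot followed by at least one digit
    return preg_match('/^[1-9][0-9]*\.[0-9]+$/D', $price) === 1;
}

var_dump(is_price('12.50'));
```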
Peter Lorimer
http:www.textedit.co.uk
webmaster@textedit.co.uk
[#39] X-Istence.com [2003-01-17 18:48:50]
To check email, i use the following code:
if (!eregi ("^([a-z0-9_]|\\-|\\.)+@(([a-z0-9_]|\\-)+\\.)+[a-z]{2,4}$", $email)) {
echo "Invalid Email Address";
}
else {
echo "Valid Email Address";
}
[#40] respectthepinguin at yahoo dot com [2003-01-03 23:30:52]
validating an email can be pretty tough... however this....
^([[:alnum:]]|_|\.|-)+@([[:alnum:]]|\.|-)+(\.)([a-z]{2,4})$
makes this task much easier. (:)
[#41] BKDotCom at hotmail dot com [2002-09-30 11:56:36]
in regards to using gethostbyname() to validate email addresses:
Large flaw:
Doesn't work if the host has a NS & MX record, but no A record.
[#42] henrik jensen<hj at this_netwerk dot dk> [2002-07-27 10:05:51]
I was looking for a regex to check if a file is an image file; this seems to work.
Note: it does not check for illegal filesystem names, it only looks at the filename extension.
if (eregi ("(.)+\\.(jp(e){0,1}g$|gif$|png$)",$filename)){
// This is an imagefile
}
[ remove this_ from emailaddress ]
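Without eregi(), an extension check like this could be sketched with pathinfo() instead of a regular expression. The function name and the extension list are my own assumptions, and like the expression above it looks only at the name, not at the file contents:

```php
<?php
// Hypothetical helper: true when the filename ends in a known image extension.
function has_image_extension(string $filename): bool
{
    // pathinfo() returns the part after the last dot, or '' when there is none.
    $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    return in_array($ext, ['jpg', 'jpeg', 'gif', 'png'], true);
}

var_dump(has_image_extension('holiday.JPG'));
```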
[#43] trosho at vip dot bg [2002-06-09 04:47:06]
Please note that eregi() is case sensitive if you use cyrillic chars
[#44] re at lloc dot de [2002-03-06 07:36:35]
An additional note to one of these expressions above: I use
"^[a-z0-9]+([_.-][a-z0-9]+)*@([a-z0-9]+([.-][a-z0-9]+)*)+\\.[a-z]{2,4}$"
[#45] tuxx at tuxx-home dot at [2002-02-27 03:11:08]
A small note to one of these expressions above:
Inside a character class ([...]) one does not need to escape the fullstop, therefore
[-\\._]
would become
[-._]
[#46] steve at fish2find dot co dot uk [2002-02-26 07:40:05]
Hope this helps with some validation problems. Simple text validation:
//$firstname = "somename"; // valid
//$firstname = "somenam3"; // not valid
//let's validate a field entry & trim off any white space
$firstname = trim($firstname);
//set length of the field to a max of 12 characters
$len = "0,12";
$field = $firstname;
//call function
if (is_valid($field, $len)){
//if field entry valid then set
$valid_firstname = $field;
//set record to yes
$valid_record = "TRUE";
echo "$field is a valid name";
}
else{
//set record to null
$valid_record = "FALSE";
echo " $field is not a valid name";
}
//process to see if valid record
if ($valid_record == "FALSE"){
echo " & is not a valid record";
}
else{ //assume record is valid
echo " & is a valid record";
}
//validate field entry function
function is_valid($field, $len) {
if(eregi("^[[:alpha:]]{{$len}}$", $field)) return TRUE;
else return FALSE;
}
[#47] ruben dot no dot spam at artek dot no dot spam dot es [2002-02-08 12:28:54]
Well, this can be improved a little. According to the previous expression, these email addresses would be accepted:
user@domain.e
user@domain.123
user@domain-ltd
I suggest this regexp:
if( !eregi( "^" .
"[a-z0-9]+([_\\.-][a-z0-9]+)*" . //user
"@" .
"([a-z0-9]+([\.-][a-z0-9]+)*)+" . //domain
"\\.[a-z]{2,}" . //sld, tld
"$", $email, $regs)
)
...
[#48] mmc at nospam dot dk [2002-01-27 06:56:51]
I'm not sure about "\." being the same as "." above.
Anyway, the mentioned regex would not recognize .museum-names, and generally isn't future safe.
Also, it doesn't verify that usernames and hostnames cannot start with "-._".
I would recommend a more general (=future safe) expression and then instead check the hostname. An example:
if( !eregi("^[a-z0-9]+([_\\.-][a-z0-9]+)*"
."@([a-z0-9]+([\.-][a-z0-9]+))*$",
$mail, $regs) )
{
echo "Error: '$mail' isn't a valid mail address!\n";
}
elseif( gethostbyname($regs[2]) == $regs[2] )
{
echo "Error: Can't find the host '$regs[2]'!<br>\n";
}
Note: I had to split the regex for it to fit this note.
Also note: The reason I'm using gethostbyname() and not getmxrr() or such is that getmxrr() doesn't work on Win2000/XP.