Some pitfalls in PHP

1. Floating-point numbers cannot be compared directly for equality

For example, if (0.1 + 0.2 == 0.3) evaluates to false. PHP is implemented in C and stores floats as C doubles, and a binary floating-point format cannot represent most decimal fractions exactly. Almost every programming language behaves the same way. This is a consequence of the finite binary representation defined by IEEE 754, not a bug specific to PHP. Avoiding it altogether would take a different number representation; as far as I know, only Mathematica seems to have solved it.
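A common workaround is to compare floats within a small tolerance instead of testing for exact equality. The following is a minimal sketch; the helper name floatsEqual and the default tolerance of 1e-9 are assumptions chosen for illustration, and the right tolerance depends on the magnitude of the values being compared.

```php
<?php
// Hypothetical helper: treats two floats as equal when they differ by
// less than $epsilon. The default tolerance is an illustrative choice.
function floatsEqual(float $a, float $b, float $epsilon = 1e-9): bool
{
    return abs($a - $b) < $epsilon;
}

var_dump(0.1 + 0.2 == 0.3);             // bool(false)
var_dump(floatsEqual(0.1 + 0.2, 0.3));  // bool(true)
```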

2. To check whether two strings are identical, use === instead of ==

Why? Because == is a loosely typed comparison: PHP first checks whether both operands look like numbers and, if they do, compares them numerically. So which strings count as numbers? Not only plain digit strings. Scientific notation of the form XXeX also counts, and before PHP 7 so did hexadecimal strings starting with 0x. For example, '12e0'=='0x0C' is true in PHP 5 (hexadecimal strings are no longer treated as numeric from PHP 7 onward), while '12e0'=='12' is true in every version. When a number is compared with a string, even a non-numeric string that merely starts with digits can match: 12=='12 this string' is true before PHP 8.

So in these cases, strings that are not the same may be judged equal. The === operator compares type as well as value and performs no conversion, so it reliably tells you whether two strings are identical.
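The difference is easy to see with two numeric-looking strings. This short sketch uses only values that behave the same way across PHP 5, 7 and 8; the hexadecimal and leading-digit cases are left out because their results depend on the PHP version.

```php
<?php
// Both operands are numeric strings, so == compares them as numbers (12.0 vs 12).
var_dump('12e0' == '12');   // bool(true)

// === compares the strings byte for byte, with no conversion.
var_dump('12e0' === '12');  // bool(false)
var_dump('12' === '12');    // bool(true)
```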

As an aside, I have a complaint about Java: there, == cannot be used to test whether two strings are equal, because strings are objects and == only checks whether both sides are the same object...

3. The trim family of functions can remove too much

The basic use of the trim functions is to strip whitespace, newlines and similar characters from the ends of a string. Because they take an optional second parameter, many people also use them to strip a UTF-8 BOM, a file extension and so on, for example ltrim($str, "\xEF\xBB\xBF"); rtrim($str, ".txt");. Soon enough you will find that these calls remove more than intended. If you wanted to strip the extension, logtext.txt becomes logte instead of logtext. Why? Because the second parameter is not a complete string but a list of characters: the function keeps removing the leftmost/rightmost character for as long as it matches any character in that list.

So how do we remove an actual prefix or suffix? The usual advice online is to use regular expressions. I have wrapped this up in three functions for convenience. Each is named after the original PHP function with an extra s, for "string", and is called the same way as the original.

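function ltrims($str, $sub) {
    // Escape regex metacharacters so $sub is matched literally
    $sub = preg_quote($sub, '/');
    return preg_replace("/^{$sub}/", '', $str);
}
function rtrims($str, $sub) {
    $sub = preg_quote($sub, '/');
    return preg_replace("/{$sub}$/", '', $str);
}
function trims($str, $sub) {
    $sub = preg_quote($sub, '/');
    $str = preg_replace("/^{$sub}/", '', $str);
    return preg_replace("/{$sub}$/", '', $str);
}
function trimBOM($str) {
    return preg_replace("/^\xEF\xBB\xBF/", '', $str);
}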
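For comparison, here is a regex-free sketch that shows the character-list behaviour of rtrim next to a whole-suffix removal. The helper name stripSuffix is hypothetical and not part of the original article; it only illustrates the idea of matching the complete suffix.

```php
<?php
// Hypothetical helper: removes $suffix only when $str ends with that
// exact string, instead of treating it as a list of characters.
function stripSuffix(string $str, string $suffix): string
{
    $len = strlen($suffix);
    if ($len > 0 && $len < strlen($str) && substr($str, -$len) === $suffix) {
        return substr($str, 0, -$len);
    }
    return $str === $suffix ? '' : $str;
}

var_dump(rtrim('logtext.txt', '.txt'));        // string(5) "logte"
var_dump(stripSuffix('logtext.txt', '.txt'));  // string(7) "logtext"
```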


4. The client IP functions circulating on the Internet

A popular PHP function on the Internet to obtain the client IP address is as follows:

function getIP() {
    if (getenv('HTTP_CLIENT_IP')) {
        $ip = getenv('HTTP_CLIENT_IP');
    } elseif (getenv('HTTP_X_FORWARDED_FOR')) {
        $ip = getenv('HTTP_X_FORWARDED_FOR');
    } elseif (getenv('HTTP_X_FORWARDED')) {
        $ip = getenv('HTTP_X_FORWARDED');
    } elseif (getenv('HTTP_FORWARDED_FOR')) {
        $ip = getenv('HTTP_FORWARDED_FOR');
    } elseif (getenv('HTTP_FORWARDED')) {
        $ip = getenv('HTTP_FORWARDED');
    } else {
        $ip = $_SERVER['REMOTE_ADDR'];
    }
    return $ip;
}


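This function looks harmless, and plenty of open-source CMSs use it. In fact, it has serious problems. The first step is to understand what these getenv calls actually read and where the values come from. In short, they are HTTP headers. Some proxy servers put the address of the original request into a header so that our server can learn the visitor's real IP address. However, not every proxy does this, and proxies are not the only ones who can.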

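In practice these HTTP headers can be set to anything; curl, for example, lets you send arbitrary HTTP headers. If the result of this function is used for IP restrictions and the like, it is trivially bypassed. Worse, if later code does not validate the format of the IP address this function returns, it quietly opens a window for SQL injection. The safer approach is therefore to read only $_SERVER['REMOTE_ADDR'], which does not come from an HTTP header.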

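On PHP 5.2 and later you can check whether a value is a well-formed IP address with filter_var($ip, FILTER_VALIDATE_IP); on older versions you have to write your own regular expression.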
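Putting the two recommendations together, a minimal sketch might look like this. The function name getClientIp and the $server parameter (standing in for $_SERVER so the function can be exercised outside a web request) are assumptions for illustration, and the syntax requires PHP 7.1 or later. If the application sits behind a trusted reverse proxy, REMOTE_ADDR will be the proxy's address, which this sketch does not handle.

```php
<?php
// Hypothetical helper: reads only REMOTE_ADDR and validates its format.
// $server stands in for $_SERVER; pass $_SERVER in real code.
function getClientIp(array $server): ?string
{
    $ip = $server['REMOTE_ADDR'] ?? '';
    // filter_var() returns false when the value is not a valid IPv4/IPv6 address.
    return filter_var($ip, FILTER_VALIDATE_IP) !== false ? $ip : null;
}

var_dump(getClientIp(['REMOTE_ADDR' => '203.0.113.7']));       // string(11) "203.0.113.7"
var_dump(getClientIp(['REMOTE_ADDR' => "1.2.3.4' OR '1'='1"])); // NULL
```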

5. The variable left behind by foreach

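When you write something like foreach($someArr as $someL){ }, keep in mind that the last $someL stays alive until the enclosing function or method ends. When you iterate by reference, foreach($someArr as &$someL){ }, it stays alive as a reference, which means that any later use of a variable with the same name will modify the original data (much like a misused C pointer). To be safe, unset these variables after every foreach, especially one that iterates by reference.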

foreach ($someArr as &$someL) {
    // doSomething ...
}
unset($someL);
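The effect of the leftover reference can be reproduced in a few lines. In this sketch the function doubleAll and its $cleanUp flag are hypothetical, introduced only to show the same code with and without the unset.

```php
<?php
// Hypothetical demo: doubles every element by reference, then reuses the
// loop variable name for something unrelated.
function doubleAll(array $values, bool $cleanUp): array
{
    foreach ($values as &$value) {
        $value *= 2;
    }
    if ($cleanUp) {
        unset($value); // break the reference to the last element
    }
    // Without the unset above, this assignment overwrites the last element.
    $value = 0;
    return $values;
}

print_r(doubleAll([1, 2, 3], false)); // last element becomes 0: [2, 4, 0]
print_r(doubleAll([1, 2, 3], true));  // array is intact: [2, 4, 6]
```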


6. htmlspecialchars does not escape single quotes by default

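Many sites use this function as a general-purpose input filter, but by default it does not escape single quotes (the default only changed to include them in PHP 8.1). This makes XSS vulnerabilities very easy to introduce. It is hardly different from not escaping double quotes: as soon as the front end is even slightly sloppy (using single quotes around an attribute), the page is exposed. The example below is adapted from an answer by 梧桐雨 on Zhihu.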

  ' />


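Requiring double quotes everywhere and banning single quotes is not realistic. So this is mainly the back end's responsibility: single quotes must be escaped as well. Whenever we use this function, we must pass the extra flag: htmlspecialchars($data, ENT_QUOTES);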
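The difference between the two flag settings can be sketched as follows. This is an illustrative example rather than the one from the Zhihu answer: the payload in $name and the input markup are assumptions, and ENT_COMPAT is passed explicitly to reproduce the pre-8.1 default on any PHP version.

```php
<?php
// Illustrative payload: closes a single-quoted attribute and adds a handler.
$name = "' onmouseover='alert(1)";

// ENT_COMPAT (the default before PHP 8.1) leaves single quotes untouched.
$unsafe = "<input type='text' value='" . htmlspecialchars($name, ENT_COMPAT, 'UTF-8') . "' />";

// ENT_QUOTES escapes both double and single quotes.
$safe = "<input type='text' value='" . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . "' />";

echo $unsafe, "\n"; // <input type='text' value='' onmouseover='alert(1)' />
echo $safe, "\n";   // <input type='text' value='&#039; onmouseover=&#039;alert(1)' />
```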

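Many people have raised this issue with the ThinkPHP framework, because its default filter is htmlspecialchars with no flags, which does not escape single quotes. The official reply was that "the I function is not meant to be equivalent to preventing SQL injection; you can define your own filter function"... Seriously? If even the most basic protection is this weak, how many hidden risks does that leave behind? I strongly recommend that every user redefine the default filter function. My own is htmlspecialchars(trim($data), ENT_QUOTES); and better suggestions are welcome in the comments. I also very much hope the TP team fixes this.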

While on the subject of XSS, allow me a few more words. Consider the following example.

<?php $name='alert(1)'; ?>
<p id="XSS2"></p>
<script src="//cdn.batsing.com/jquery.js"></script>
<script>
$("#XSS2")[0].innerHTML = <?=$name?>;
$("#XSS2").html( <?=$name?> );
$("#XSS2")[0].innerHTML = "<?=$name?>";
$("#XSS2").html(" <?=$name?> ");
</script>


With $name set to alert(1), the 1st and 2nd lines of JS execute the value as code, which is an XSS vulnerability, while the 3rd and 4th lines only insert it as text. There is no good way to filter a string like alert(1) on the back end; the only effective method may be to wrap the data in quotation marks. The main responsibility still lies with the front end: when assigning to innerHTML or calling jQuery's html(), make sure the argument passed in is a string, otherwise it is no less dangerous than the eval function.
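One way to "add quotation marks at both ends" without being broken by quotes or tags inside the data is to let json_encode produce the JavaScript string literal. This is a sketch under that assumption, not the article's own method; the helper name jsString is hypothetical. Note that it only guarantees the value reaches JavaScript as a string: whatever is then assigned to innerHTML or passed to html() is still parsed as HTML.

```php
<?php
// Hypothetical helper: turns a PHP string into a quoted JavaScript string
// literal. The JSON_HEX_* flags encode <, >, &, ' and " as \uXXXX escapes
// so the value cannot close the surrounding <script> block or a quote.
function jsString(string $value): string
{
    $json = json_encode($value, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);
    // json_encode() returns false on failure (for example invalid UTF-8);
    // fall back to an empty JavaScript string in that case.
    return $json === false ? '""' : $json;
}

echo jsString('alert(1)'), "\n"; // "alert(1)"
```

In a template this would be used as $("#XSS2")[0].innerHTML = <?= jsString($name) ?>; with no hand-written quotes around the PHP tag.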