Table of Contents
asort 的问题
源码分析
SORT_REGULAR
SORT_STRING
总结
最后的测试
Home Backend Development PHP Tutorial php 中 SORT_REGULAR 和 SORT_STRING 的区别

php 中 SORT_REGULAR 和 SORT_STRING 的区别

Jun 23, 2016 pm 01:04 PM

asort 的问题

有一次在使用 php 基本函数 asort 的时候遇到了一个问题:

<?php    $arr = [        "nonce_str" => "441469",        "timestamp" => "1464334314"    ];    asort($arr);    var_dump($arr);?>
Copy after login

这样排序出来的结果是:

array(2) {    ["nonce_str"]=>    string(6) "441469"    ["timestamp"]=>    string(10) "1464334314"}
Copy after login

WTF?

不应该呀,为什么排序出来 4 在 1 的前面呢,字符串不应该是以字符串比较的方式来排序么?

查阅 php.net 参考文档得知 sort 类函数的第二个参数为 SORT_FLAG:

SORT_REGULAR - 正常比较单元(不改变类型)

SORT_NUMERIC - 单元被作为数字来比较

SORT_STRING - 单元被作为字符串来比较

SORT_LOCALE_STRING - 根据当前的区域(locale)设置来把单元当作字符串比较,可以用 setlocale() 来改变。

SORT_NATURAL - 和 natsort() 类似对每个单元以“自然的顺序”对字符串进行排序。 PHP 5.4.0 中新增的。

SORT_FLAG_CASE - 能够与 SORT_STRING 或 SORT_NATURAL 合并(OR 位运算),不区分大小写排序字符串。

看来默认的排序方式是所谓的 SORT_REGULAR。可是,“正常比较单元”是什么意思呢?“不改变类型”又是指什么,既然不改变类型,那么我的两个字符串就应该以字符串比较 strcmp 的顺序排列吧,也就是说,字符串 "1464334314" 应该在 "441469" 前面才对!带着这些疑问,我找到了 php 的源代码。

源码分析

http://git.php.net/?p=php-src.git;a=blob_plain;f=ext/standard/array.c;hb=42be298b3020337653cfcbdd87698b90006b2197

php-src.git/ext/standard/array.c, 887:

/* {{{ proto bool asort(array &array_arg [, int sort_flags])   Sort an array and maintain index association */PHP_FUNCTION(asort){    zval *array;    zend_long sort_type = PHP_SORT_REGULAR;    compare_func_t cmp;    if (zend_parse_parameters(ZEND_NUM_ARGS(), "a/|l", &array, &sort_type) == FAILURE) {        RETURN_FALSE;    }    cmp = php_get_data_compare_func(sort_type, 0);    if (zend_hash_sort(Z_ARRVAL_P(array), cmp, 0) == FAILURE) {        RETURN_FALSE;    }    RETURN_TRUE;}/* }}} */
Copy after login

可以看到,进行具体比较顺序控制的函数指针是 cmp,是通过向 php_get_data_compare_func 传入 sort_type 得到的,sort_type 也就是 SORT_REGULAR / SORT_STRING 这样的标记。

php-src.git/ext/standard/array.c, 632:

static compare_func_t php_get_data_compare_func(zend_long sort_type, int reverse) /* {{{ */{    switch (sort_type & ~PHP_SORT_FLAG_CASE) {        // ...        case PHP_SORT_STRING:            if (sort_type & PHP_SORT_FLAG_CASE) {                if (reverse) {                    return php_array_reverse_data_compare_string_case;                } else {                    return php_array_data_compare_string_case;                }            } else {                if (reverse) {                    return php_array_reverse_data_compare_string;                } else {                    return php_array_data_compare_string;                }            }            break;        // ...        case PHP_SORT_REGULAR:        default:            if (reverse) {                return php_array_reverse_data_compare;            } else {                return php_array_data_compare;            }            break;    }    return NULL;}/* }}} */
Copy after login

在这个函数中我们可以看到,SORT_REGULAR 采用了 php_array_data_compare 进行比较,而 SORT_STRING 采用了 php_array_data_compare_string 进行比较。

php-src.git/ext/standard/array.c, 370:

/* Numbers are always smaller than strings int this function as it * anyway doesn't make much sense to compare two different data types. * This keeps it consistent and simple. * * This is not correct any more, depends on what compare_func is set to. */static int php_array_data_compare(const void *a, const void *b) /* {{{ */{    // ...    if (compare_function(&result, first, second) == FAILURE) {        return 0;    }    ZEND_ASSERT(Z_TYPE(result) == IS_LONG);    return Z_LVAL(result);}/* }}} */
Copy after login

php-src.git/ext/standard/array.c, 465:

static int php_array_data_compare_string(const void *a, const void *b) /* {{{ */{    // ...    return string_compare_function(first, second);}/* }}} */
Copy after login

现在我们得到了两条调用链:

SORT_REGULAR -> php_get_data_compare_func -> php_array_data_compare -> compare_function;

SORT_STRING -> php_get_data_compare_func -> php_array_data_compare_string -> string_compare_function;

SORT_REGULAR

先从第一条的 compare_function 开始:

http://git.php.net/?p=php-src.git;a=blob_plain;f=Zend/zend_operators.c;hb=42be298b3020337653cfcbdd87698b90006b2197

php-src.git/Zend/zend_operators.c, 1818:

ZEND_API int ZEND_FASTCALL compare_function(zval *result, zval *op1, zval *op2) /* {{{ */{    // ...    while (1) {        switch (TYPE_PAIR(Z_TYPE_P(op1), Z_TYPE_P(op2))) {            // ...            case TYPE_PAIR(IS_STRING, IS_STRING):                if (Z_STR_P(op1) == Z_STR_P(op2)) {                    ZVAL_LONG(result, 0);                    return SUCCESS;                }                ZVAL_LONG(result, zendi_smart_strcmp(Z_STR_P(op1), Z_STR_P(op2)));                return SUCCESS;            // ...        }    }}/* }}} */
Copy after login

SORT_REGULAR -> php_get_data_compare_func -> php_array_data_compare -> compare_function -> zendi_smart_strcmp;

php-src.git/Zend/zend_operators.c, 2681:

ZEND_API zend_long ZEND_FASTCALL zendi_smart_strcmp(zend_string *s1, zend_string *s2) /* {{{ */{    int ret1, ret2;    int oflow1, oflow2;    zend_long lval1 = 0, lval2 = 0;    double dval1 = 0.0, dval2 = 0.0;    if ((ret1 = is_numeric_string_ex(s1->val, s1->len, &lval1, &dval1, 0, &oflow1)) &&        (ret2 = is_numeric_string_ex(s2->val, s2->len, &lval2, &dval2, 0, &oflow2))) {#if ZEND_ULONG_MAX == 0xFFFFFFFF        if (oflow1 != 0 && oflow1 == oflow2 && dval1 - dval2 == 0. &&            ((oflow1 == 1 && dval1 > 9007199254740991. /*0x1FFFFFFFFFFFFF*/)            || (oflow1 == -1 && dval1 < -9007199254740991.))) {#else        if (oflow1 != 0 && oflow1 == oflow2 && dval1 - dval2 == 0.) {#endif            /* both values are integers overflown to the same side, and the             * double comparison may have resulted in crucial accuracy lost */            goto string_cmp;        }        if ((ret1 == IS_DOUBLE) || (ret2 == IS_DOUBLE)) {            if (ret1 != IS_DOUBLE) {                if (oflow2) {                    /* 2nd operand is integer > LONG_MAX (oflow2==1) or < LONG_MIN (-1) */                    return -1 * oflow2;                }                dval1 = (double) lval1;            } else if (ret2 != IS_DOUBLE) {                if (oflow1) {                    return oflow1;                }                dval2 = (double) lval2;            } else if (dval1 == dval2 && !zend_finite(dval1)) {                /* Both values overflowed and have the same sign,                 * so a numeric comparison would be inaccurate */                goto string_cmp;            }            dval1 = dval1 - dval2;            return ZEND_NORMALIZE_BOOL(dval1);        } else { /* they both have to be long's */            return lval1 > lval2 ? 1 : (lval1 < lval2 ? -1 : 0);        }    } else {        int strcmp_ret;string_cmp:        strcmp_ret = zend_binary_strcmp(s1->val, s1->len, s2->val, s2->len);        return ZEND_NORMALIZE_BOOL(strcmp_ret);    }}/* }}} */
Copy after login

关键来了,这里我们可以看到,zendi_smart_strcmp 先判断需要比较的两个字符串是否是以数字的形式表现的类型。如果是,则将“数字”一样的字符串作为整型或浮点型进行比较,如果不是,则将字符串用 zend_binary_strcmp 进行比较。

php-src.git/Zend/zend_operators.c, 2791:

ZEND_API zend_uchar ZEND_FASTCALL _is_numeric_string_ex(const char *str, size_t length, zend_long *lval, double *dval, int allow_errors, int *oflow_info) /* {{{ */{    const char *ptr;    int digits = 0, dp_or_e = 0;    double local_dval = 0.0;    zend_uchar type;    zend_long tmp_lval = 0;    int neg = 0;    if (!length) {        return 0;    }    if (oflow_info != NULL) {        *oflow_info = 0;    }    /* Skip any whitespace     * This is much faster than the isspace() function */    while (*str == ' ' || *str == '\t' || *str == '\n' || *str == '\r' || *str == '\v' || *str == '\f') {        str++;        length--;    }    ptr = str;    if (*ptr == '-') {        neg = 1;        ptr++;    } else if (*ptr == '+') {        ptr++;    }    if (ZEND_IS_DIGIT(*ptr)) {        /* Skip any leading 0s */        while (*ptr == '0') {            ptr++;        }        /* Count the number of digits. If a decimal point/exponent is found,         * it's a double. Otherwise, if there's a dval or no need to check for         * a full match, stop when there are too many digits for a long */        for (type = IS_LONG; !(digits >= MAX_LENGTH_OF_LONG && (dval || allow_errors == 1)); digits++, ptr++) {check_digits:            if (ZEND_IS_DIGIT(*ptr)) {                tmp_lval = tmp_lval * 10 + (*ptr) - '0';                continue;            } else if (*ptr == '.' && dp_or_e < 1) {                goto process_double;            } else if ((*ptr == 'e' || *ptr == 'E') && dp_or_e < 2) {                const char *e = ptr + 1;                if (*e == '-' || *e == '+') {                    ptr = e++;                }                if (ZEND_IS_DIGIT(*e)) {                    goto process_double;                }            }            break;        }        if (digits >= MAX_LENGTH_OF_LONG) {            if (oflow_info != NULL) {                *oflow_info = *str == '-' ? -1 : 1;            }            dp_or_e = -1;            goto process_double;        }    } else if (*ptr == '.' && ZEND_IS_DIGIT(ptr[1])) {process_double:        type = IS_DOUBLE;        /* If there's a dval, do the conversion; else continue checking         * the digits if we need to check for a full match */        if (dval) {            local_dval = zend_strtod(str, &ptr);        } else if (allow_errors != 1 && dp_or_e != -1) {            dp_or_e = (*ptr++ == '.') ? 1 : 2;            goto check_digits;        }    } else {        return 0;    }    if (ptr != str + length) {        if (!allow_errors) {            return 0;        }        if (allow_errors == -1) {            zend_error(E_NOTICE, "A non well formed numeric value encountered");        }    }    if (type == IS_LONG) {        if (digits == MAX_LENGTH_OF_LONG - 1) {            int cmp = strcmp(&ptr[-digits], long_min_digits);            if (!(cmp < 0 || (cmp == 0 && *str == '-'))) {                if (dval) {                    *dval = zend_strtod(str, NULL);                }                if (oflow_info != NULL) {                    *oflow_info = *str == '-' ? -1 : 1;                }                return IS_DOUBLE;            }        }        if (lval) {            if (neg) {                tmp_lval = -tmp_lval;            }            *lval = tmp_lval;        }        return IS_LONG;    } else {        if (dval) {            *dval = local_dval;        }        return IS_DOUBLE;    }}/* }}} */
Copy after login

上面的代码是判断字符串是否为“数字”形式的比较函数。

SORT_STRING

然后我们看看第二条调用链:SORT_STRING -> php_get_data_compare_func -> php_array_data_compare_string -> string_compare_function;

php-src.git/Zend/zend_operators.c, 1729:

ZEND_API int ZEND_FASTCALL string_compare_function(zval *op1, zval *op2) /* {{{ */{    if (EXPECTED(Z_TYPE_P(op1) == IS_STRING) &&        EXPECTED(Z_TYPE_P(op2) == IS_STRING)) {        if (Z_STR_P(op1) == Z_STR_P(op2)) {            return 0;        } else {            return zend_binary_strcmp(Z_STRVAL_P(op1), Z_STRLEN_P(op1), Z_STRVAL_P(op2), Z_STRLEN_P(op2));        }    } else {        zend_string *str1 = zval_get_string(op1);        zend_string *str2 = zval_get_string(op2);        int ret = zend_binary_strcmp(ZSTR_VAL(str1), ZSTR_LEN(str1), ZSTR_VAL(str2), ZSTR_LEN(str2));        zend_string_release(str1);        zend_string_release(str2);        return ret;    }}/* }}} */
Copy after login

可以看到,SORT_STRING 最终也使用 zend_binary_strcmp 函数进行字符串比较。下面的代码,是 zend_binary_strcmp 的实现:

php-src.git/Zend/zend_operators.c, 2539:

ZEND_API int ZEND_FASTCALL zend_binary_strcmp(const char *s1, size_t len1, const char *s2, size_t len2) /* {{{ */{    int retval;    if (s1 == s2) {        return 0;    }    retval = memcmp(s1, s2, MIN(len1, len2));    if (!retval) {        return (int)(len1 - len2);    } else {        return retval;    }}/* }}} */
Copy after login

总结

经过以上分析我们可以得知,SORT_STRING 排序方式的底层实现是 C 语言的 memcmp,也就是说,它对两个字符串从前往后,按照逐个字节比较,一旦字节有差异,就终止并比较出大小。

而 SORT_REGULAR 会智能判断需排序对象的类型,如果两个字符串都是“纯数字”形式的字符串,会以比较整个字符串所代表的十进制整数、浮点数大小的形式进行排序。如果两个字符串不是“纯数字“形式的,才会和 SORT_STRING 一样。

因此,如果需要以字符串 strcmp 方式逐个字节从前往后比较来进行排序,在调用 php 的 sort 类函数的时候请务必使用 SORT_STRING 这个 flag,否则如果两个字符串都是”纯数字“形式的,就会按照它们所代表的数字大小进行排序。

而且需要注意的是,如果两个值的类型不同,那么这样的比较是毫无意义的,也可能会产生意想不到的结果。

最后的测试

最后,我们在欲排序的值最后添加了一个字符 "s",使它们不再是”纯数字“形式的字符串:

<?php    $arr = [        "nonce_str" => "441469s",        "timestamp" => "1464334314s"    ];    asort($arr);    var_dump($arr);?>
Copy after login

最后排序的结果变成了:

array(2) {    ["timestamp"]=>    string(11) "1464334314s"    ["nonce_str"]=>    string(7) "441469s"}
Copy after login

这才是我们想要的结果。

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Explain JSON Web Tokens (JWT) and their use case in PHP APIs. Explain JSON Web Tokens (JWT) and their use case in PHP APIs. Apr 05, 2025 am 12:04 AM

JWT is an open standard based on JSON, used to securely transmit information between parties, mainly for identity authentication and information exchange. 1. JWT consists of three parts: Header, Payload and Signature. 2. The working principle of JWT includes three steps: generating JWT, verifying JWT and parsing Payload. 3. When using JWT for authentication in PHP, JWT can be generated and verified, and user role and permission information can be included in advanced usage. 4. Common errors include signature verification failure, token expiration, and payload oversized. Debugging skills include using debugging tools and logging. 5. Performance optimization and best practices include using appropriate signature algorithms, setting validity periods reasonably,

Describe the SOLID principles and how they apply to PHP development. Describe the SOLID principles and how they apply to PHP development. Apr 03, 2025 am 12:04 AM

The application of SOLID principle in PHP development includes: 1. Single responsibility principle (SRP): Each class is responsible for only one function. 2. Open and close principle (OCP): Changes are achieved through extension rather than modification. 3. Lisch's Substitution Principle (LSP): Subclasses can replace base classes without affecting program accuracy. 4. Interface isolation principle (ISP): Use fine-grained interfaces to avoid dependencies and unused methods. 5. Dependency inversion principle (DIP): High and low-level modules rely on abstraction and are implemented through dependency injection.

How to automatically set permissions of unixsocket after system restart? How to automatically set permissions of unixsocket after system restart? Mar 31, 2025 pm 11:54 PM

How to automatically set the permissions of unixsocket after the system restarts. Every time the system restarts, we need to execute the following command to modify the permissions of unixsocket: sudo...

Explain the concept of late static binding in PHP. Explain the concept of late static binding in PHP. Mar 21, 2025 pm 01:33 PM

Article discusses late static binding (LSB) in PHP, introduced in PHP 5.3, allowing runtime resolution of static method calls for more flexible inheritance.Main issue: LSB vs. traditional polymorphism; LSB's practical applications and potential perfo

How to send a POST request containing JSON data using PHP's cURL library? How to send a POST request containing JSON data using PHP's cURL library? Apr 01, 2025 pm 03:12 PM

Sending JSON data using PHP's cURL library In PHP development, it is often necessary to interact with external APIs. One of the common ways is to use cURL library to send POST�...

Framework Security Features: Protecting against vulnerabilities. Framework Security Features: Protecting against vulnerabilities. Mar 28, 2025 pm 05:11 PM

Article discusses essential security features in frameworks to protect against vulnerabilities, including input validation, authentication, and regular updates.

Customizing/Extending Frameworks: How to add custom functionality. Customizing/Extending Frameworks: How to add custom functionality. Mar 28, 2025 pm 05:12 PM

The article discusses adding custom functionality to frameworks, focusing on understanding architecture, identifying extension points, and best practices for integration and debugging.

See all articles