今天偶然调试时,发现了base64编码时内存分配的BUG,为编码分配的缓冲区计算方式有隐患,偶尔出现缓冲区过小,导致后续堆内存被覆盖,访问越界. php-4.4.2/ext/standard/base64.c /* {{{ php_base64_encode */ PHPAPI unsigned char *php_base64_encode(const uns
今天偶然调试时,发现了base64编码时内存分配的BUG,为编码分配的缓冲区计算方式有隐患,偶尔出现缓冲区过小,导致后续堆内存被覆盖,访问越界.
php-4.4.2/ext/standard/base64.c
/* {{{ php_base64_encode */
PHPAPI unsigned char *php_base64_encode(const unsigned char *str, int length, int *ret_length)
{
const unsigned char *current = str;
unsigned char *p;
unsigned char *result;
if ((length 2) < 0 || ((length 2) / 3) >= (1 << (sizeof(int) * 8 - 2))) {
if (ret_length != NULL) {
*ret_length = 0;
}
return NULL;
}
result = (unsigned char *)safe_emalloc(((length 2) / 3) * 4, sizeof(char), 1);
p = result;
while (length > 2) { /* keep going until we have less than 24 bits */
*p = base64_table[current[0] >> 2];
*p = base64_table[((current[0] & 0x03) << 4) (current[1] >> 4)];
*p = base64_table[((current[1] & 0x0f) << 2) (current[2] >> 6)];
*p = base64_table[current[2] & 0x3f];
current = 3;
length -= 3; /* we just handle 3 octets of data */
}
/* now deal with the tail end of things */
if (length != 0) {
*p = base64_table[current[0] >> 2];
if (length > 1) {
*p = base64_table[((current[0] & 0x03) << 4) (current[1] >> 4)];
*p = base64_table[(current[1] & 0x0f) << 2];
*p = base64_pad;
} else {
*p = base64_table[(current[0] & 0x03) << 4];
*p = base64_pad;
*p = base64_pad;
}
}
if (ret_length != NULL) {
*ret_length = (int)(p - result);
}
*p = '/0';
return result;
}
我觉得计算方式应改为如下:
/* Account the result buffer size and alloc the memory for it. */
if ((length % 3) != 0)
{
padnum = 3 - length % 3;
}
retsize = (length padnum) ((length padnum) / 3) 1; // 正确的大小
稍微解释一下,因为BASE64需要将3个8位字节转换成4个6位的元组,4个6位元组每一组都可以用编码表中的一个ASCII码表示,这样的话,即就是每3个字节会多出一个字节,所以最终编码应该多出((length padnum) / 3)个. 原理就是如此,而standard中默认的编码内存分配计算得有隐患.