This article first describes the structure of the qqwry.dat file. Based on that structure, we can then write code that reads the qqwry.dat IP database and looks up the information we want.
First, let's look at the layout of the QQWry.dat file and how to interpret it.
1. File structure
The file is divided into three parts:
1. File header, 8 bytes;
2. Data record area, variable length;
3. Index area, whose length is an integer multiple of 7 bytes.
2. File header
The 8-byte file header consists of two 4-byte values: the start address and the end address of the index area. Because the end address points at the last 7-byte index entry rather than past it, the total number of records is the difference between the two addresses divided by 7, plus 1.
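The header arithmetic can be sketched in PHP as follows. This is a minimal illustration, not the article's original code: the function name `qqwry_read_header` is made up for the example, and it assumes both addresses are unsigned 32-bit little-endian integers (unpack format `V`).

```php
<?php
// Hypothetical helper: reads the 8-byte header from an open file handle and
// returns the index bounds plus the record count.
// Assumption: both values are unsigned 32-bit little-endian integers.
function qqwry_read_header($f): array
{
    fseek($f, 0);
    $raw = fread($f, 8);
    if ($raw === false || strlen($raw) !== 8) {
        throw new RuntimeException('file too short for an 8-byte header');
    }
    $h = unpack('Vfirst/Vlast', $raw);
    return [
        'first' => $h['first'],                              // offset of the first index entry
        'last'  => $h['last'],                               // offset of the last index entry
        'count' => intdiv($h['last'] - $h['first'], 7) + 1,  // 7 bytes per entry
    ];
}
```

With a handle from `fopen('QQWry.dat', 'rb')`, `qqwry_read_header($f)['count']` would give the number of index entries.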
3. Record area
The starting position of each entry in the record area is obtained from the index area. Each entry stores the end IP address of a range followed by the region strings; every string is terminated by 0x00.
4. Index area
To find the region for an IP, the key is to locate the index entry whose start address covers that IP. Each index entry is 7 bytes: the first 4 bytes are the start IP of a range, and the last 3 bytes are the file offset of the corresponding IP record. In the IP record, the first 4 bytes are the end IP; the data that follows takes one of two forms, 0x01 mode and 0x02 mode.
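As a sketch of the index lookup, the code below decodes a 7-byte entry and uses a binary search to find the entry covering a given IP. The function names are hypothetical. Two assumptions go beyond what the article states: values are little-endian, and index entries are sorted in ascending order of start IP. The article's own code walks the entries sequentially instead.

```php
<?php
// Hypothetical helper: decodes one 7-byte index entry at file position $pos.
function qqwry_read_index_entry($f, int $pos): array
{
    fseek($f, $pos);
    $raw = fread($f, 7);
    if ($raw === false || strlen($raw) !== 7) {
        throw new RuntimeException('truncated index entry');
    }
    // Pad the 3-byte offset with a zero high byte so 'V' can decode it.
    return unpack('Vstart/Voffset', $raw . "\0");
}

// Hypothetical helper: returns the last entry whose start IP is <= $ip.
// Assumption: entries between $first and $last are sorted by start IP.
function qqwry_find_index($f, int $first, int $last, int $ip): array
{
    $lo = 0;
    $hi = intdiv($last - $first, 7);
    while ($lo < $hi) {
        $mid = intdiv($lo + $hi + 1, 2);   // round up so the loop always shrinks
        $entry = qqwry_read_index_entry($f, $first + $mid * 7);
        if ($entry['start'] <= $ip) {
            $lo = $mid;
        } else {
            $hi = $mid - 1;
        }
    }
    return qqwry_read_index_entry($f, $first + $lo * 7);
}
```

The `offset` field of the returned entry is where the matching IP record starts. On 64-bit PHP, `ip2long()` gives a non-negative integer that can be passed as `$ip`.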
0x01 mode: the 5th byte of the IP record is 0x01, and the next 3 bytes are the offset of the country and region data, which holds both the country string and the region string. That is:
------------------------------------------------------------
End IP (4 bytes) | 0x01 | 3-byte redirect 0xNNNNNN -> file offset of the country and region data
------------------------------------------------------------
0x02 mode: the 5th byte of the IP record is 0x02, and the next 3 bytes are the offset of the country data; the region data is the string that follows, ending with 0x00. That is:
------------------------------------------------------------
End IP (4 bytes) | 0x02 | 3-byte redirect 0xNNNNNN -> file offset of the country data | Region string | 0x00
------------------------------------------------------------
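To make the two modes concrete, here is a small PHP sketch that decodes only the head of an IP record, without following the redirect. The function names and the returned array shape are invented for illustration, and little-endian byte order is assumed. Treating any other flag value as an inline country string is also an assumption, since the article describes only the two redirect modes.

```php
<?php
// Hypothetical helper: reads a 3-byte little-endian offset at the current position.
function qqwry_read_offset3($f): int
{
    $raw = fread($f, 3);
    return unpack('V', $raw . "\0")[1];   // pad to 4 bytes for 'V'
}

// Hypothetical helper: decodes the end IP, the mode byte, and the redirect target
// of the IP record that starts at $offset.
function qqwry_read_record_head($f, int $offset): array
{
    fseek($f, $offset);
    $endIp = unpack('V', fread($f, 4))[1];
    $flag = ord(fread($f, 1));            // the 5th byte of the record
    if ($flag === 0x01 || $flag === 0x02) {
        return ['end_ip' => $endIp, 'mode' => $flag, 'target' => qqwry_read_offset3($f)];
    }
    // Assumption: no redirect, so the country string starts at the 5th byte.
    return ['end_ip' => $endIp, 'mode' => 0, 'target' => $offset + 4];
}
```

For mode 0x01 the `target` is where both strings live; for mode 0x02 it is where the country string lives, while the region follows at `$offset + 8`.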
The country and region data reached through 0x01 mode may itself contain a redirect. It takes one of two forms:
------------------------------------------------------------
Country string | 0x00 | Region string | 0x00
------------------------------------------------------------
or
------------------------------------------------------------
Country string | 0x00 | 0x02 | 3-byte redirect 0xNNNNNN -> file offset of the region string
------------------------------------------------------------
The former case is simple: read the two strings directly. In the latter case, you need to follow the redirect to the offset of the region string and then read up to the terminating 0x00.
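Putting the record rules together, the following PHP sketch resolves an IP record to its country and region strings. It follows only the cases described above; all function names are hypothetical, little-endian offsets are assumed, and the final fallback (both strings stored inline when no flag byte is present) is an assumption the article does not cover. Strings are returned as raw bytes, so any character-set conversion is left to the caller.

```php
<?php
// Reads a 0x00-terminated string starting at $offset.
function rec_cstring($f, int $offset): string
{
    fseek($f, $offset);
    $s = '';
    while (($ch = fgetc($f)) !== false && $ch !== "\0") {
        $s .= $ch;
    }
    return $s;
}

// Reads the 3-byte little-endian offset stored at $offset.
function rec_offset3($f, int $offset): int
{
    fseek($f, $offset);
    return unpack('V', fread($f, 3) . "\0")[1];
}

// Reads a region field: an inline string, or 0x02 plus a 3-byte offset to the string.
function rec_region($f, int $offset): string
{
    fseek($f, $offset);
    if (fread($f, 1) === "\x02") {
        return rec_cstring($f, rec_offset3($f, $offset + 1));
    }
    return rec_cstring($f, $offset);
}

// Resolves the IP record at $offset to [country, region].
function rec_resolve($f, int $offset): array
{
    fseek($f, $offset + 4);               // skip the 4-byte end IP
    $flag = fread($f, 1);
    if ($flag === "\x01") {
        // Country and region both live at the redirect target.
        $target = rec_offset3($f, $offset + 5);
        $country = rec_cstring($f, $target);
        return [$country, rec_region($f, $target + strlen($country) + 1)];
    }
    if ($flag === "\x02") {
        // Country is redirected; the region follows the 3-byte offset.
        $country = rec_cstring($f, rec_offset3($f, $offset + 5));
        return [$country, rec_region($f, $offset + 8)];
    }
    // Assumption: no flag byte, so both strings are stored inline.
    $country = rec_cstring($f, $offset + 4);
    return [$country, rec_region($f, $offset + 4 + strlen($country) + 1)];
}
```

Given the record offset taken from an index entry, `rec_resolve($f, $offset)` returns a two-element array with the country first and the region second.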
The main purpose of referencing strings by address is to avoid storing the same string more than once. The IP database file contains a great many identical strings, and a 3-byte offset takes far less space than another copy of the string.
PHP code for reading the QQWry.dat file:
    function bin2ip($bin){
        ...
    //--------------------------------------------------------
    ...
    $index_begin = implode('', unpack('L', $c));
    ...
    $ip_num = ($index_end - $index_begin) / 7 + 1;
    echo "index begin at: $index_begin\n";
    $output = '';
    for($i = 0; $i < $ip_num; $i++){
        // Point the file pointer at each index entry
        ...
        // Get the index data (7 bytes) of the IP record
        ...
        $ip3 = fread($f, 3); // IP record offset address
        $dataseek = implode('', unpack('L', $ip3 . chr(0)));
        // Seek to $dataseek in the record area to find the record
        ...
        $area = '';
        // Read a flag byte
        $flag = fread($f, 1);
        // If the area is redirected
        ...
The functions used most in this code are file operation functions such as fopen, fseek, and fread. Readers who need them can take a look.