Home Backend Development PHP Problem How to solve php gzip css garbled problem

How to solve php gzip css garbled problem

Sep 26, 2021 am 10:21 AM
css gzip php

php gzip css garbled solution: 1. Use the built-in zlib library; 2. Use CURL instead of "file_get_contents"; 3. Use the gzip decompression function to solve the garbled problem.

How to solve php gzip css garbled problem

The operating environment of this article: Windows 7 system, PHP version 7.1, DELL G3 computer.

How to solve the php gzip css garbled problem?

Three solutions to php file_get_contents grabbing Gzip web page garbled

Using the file_get_contents() function to crawl web pages will cause garbled characters. There are two reasons that will cause garbled characters. One is the encoding problem, and the other is that Gzip is turned on on the target page. The following is how to avoid garbled characters if the Gzip function is turned on.

Capture The received content can be encoded ($content=iconv("GBK", "UTF-8//IGNORE", $content);). What we are discussing here is how to crawl the page with Gzip turned on. How to judge? The obtained header contains Content-Encoding: gzip indicating that the content is GZIP compressed. Use FireBug to check whether gzip is enabled on the page. The following is the header information of my blog viewed using firebug. Gzip is turned on.

The code is as follows:

Request header information original header information

Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding gzip, deflate
Accept-Language zh-cn,zh;q=0.8,en-us;q=0.5,en;q=0.3
Connection keep-alive
Cookie __utma=225240837.787252530.1317310581.1335406161.1335411401.1537; __utmz=225240837.1326850415.887.3.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=%E4%BB%BB%E4%BD%95%E9%A1%B9%E7%9B%AE%E9%83%BD%E4%B8%8D%E4%BC%9A%E9%82%A3%E4%B9%88%E7%AE%80%E5%8D%95%20site%3Awww.nowamagic.net; PHPSESSID=888mj4425p8s0m7s0frre3ovc7; __utmc=225240837; __utmb=225240837.1.10.1335411401
Host www.nowamagic.net
User-Agent Mozilla/5.0 (Windows NT 5.1; rv:12.0) Gecko/20100101 Firefox/12.0
Copy after login

Here are some solutions:

1. Use the built-in zlib library

If the server has installed the zlib library, you can easily solve the garbled code problem by using the following code.

The code is as follows:

$data = file_get_contents("compress.zlib://".$url);
Copy after login

2. Use CURL instead of file_get_contents

The code is as follows:

function curl_get($url, $gzip=false){
 $curl = curl_init($url);
 curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
 curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 10);
 if($gzip) curl_setopt($curl, CURLOPT_ENCODING, "gzip"); // 关键在这里
 $content = curl_exec($curl);
 curl_close($curl);
 return $content;
}
Copy after login

3. Use the gzip decompression function

The code is as follows:

function gzdecode($data) {
  $len = strlen($data);
  if ($len < 18 || strcmp(substr($data,0,2),"\x1f\x8b")) {
    return null;  // Not GZIP format (See RFC 1952)
  }
  $method = ord(substr($data,2,1));  // Compression method
  $flags  = ord(substr($data,3,1));  // Flags
  if ($flags & 31 != $flags) {
    // Reserved bits are set -- NOT ALLOWED by RFC 1952
    return null;
  }
  // NOTE: $mtime may be negative (PHP integer limitations)
  $mtime = unpack("V", substr($data,4,4));
  $mtime = $mtime[1];
  $xfl   = substr($data,8,1);
  $os    = substr($data,8,1);
  $headerlen = 10;
  $extralen  = 0;
  $extra     = "";
  if ($flags & 4) {
    // 2-byte length prefixed EXTRA data in header
    if ($len - $headerlen - 2 < 8) {
      return false;    // Invalid format
    }
    $extralen = unpack("v",substr($data,8,2));
    $extralen = $extralen[1];
    if ($len - $headerlen - 2 - $extralen < 8) {
      return false;    // Invalid format
    }
    $extra = substr($data,10,$extralen);
    $headerlen += 2 + $extralen;
  }
  $filenamelen = 0;
  $filename = "";
  if ($flags & 8) {
    // C-style string file NAME data in header
    if ($len - $headerlen - 1 < 8) {
      return false;    // Invalid format
    }
    $filenamelen = strpos(substr($data,8+$extralen),chr(0));
    if ($filenamelen === false || $len - $headerlen - $filenamelen - 1 < 8) {
      return false;    // Invalid format
    }
    $filename = substr($data,$headerlen,$filenamelen);
    $headerlen += $filenamelen + 1;
  }
  $commentlen = 0;
  $comment = "";
  if ($flags & 16) {
    // C-style string COMMENT data in header
    if ($len - $headerlen - 1 < 8) {
      return false;    // Invalid format
    }
    $commentlen = strpos(substr($data,8+$extralen+$filenamelen),chr(0));
    if ($commentlen === false || $len - $headerlen - $commentlen - 1 < 8) {
      return false;    // Invalid header format
    }
    $comment = substr($data,$headerlen,$commentlen);
    $headerlen += $commentlen + 1;
  }
  $headercrc = "";
  if ($flags & 1) {
    // 2-bytes (lowest order) of CRC32 on header present
    if ($len - $headerlen - 2 < 8) {
      return false;    // Invalid format
    }
    $calccrc = crc32(substr($data,0,$headerlen)) & 0xffff;
    $headercrc = unpack("v", substr($data,$headerlen,2));
    $headercrc = $headercrc[1];
    if ($headercrc != $calccrc) {
      return false;    // Bad header CRC
    }
    $headerlen += 2;
  }
  // GZIP FOOTER - These be negative due to PHP&#39;s limitations
  $datacrc = unpack("V",substr($data,-8,4));
  $datacrc = $datacrc[1];
  $isize = unpack("V",substr($data,-4));
  $isize = $isize[1];
  // Perform the decompression:
  $bodylen = $len-$headerlen-8;
  if ($bodylen < 1) {
    // This should never happen - IMPLEMENTATION BUG!
    return null;
  }
  $body = substr($data,$headerlen,$bodylen);
  $data = "";
  if ($bodylen > 0) {
    switch ($method) {
      case 8:
        // Currently the only supported compression method:
        $data = gzinflate($body);
        break;
      default:
        // Unknown compression method
        return false;
    }
  } else {
    // I&#39;m not sure if zero-byte body content is allowed.
    // Allow it for now...  Do nothing...
  }
  // Verifiy decompressed size and CRC32:
  // NOTE: This may fail with large data sizes depending on how
  //       PHP&#39;s integer limitations affect strlen() since $isize
  //       may be negative for large sizes.
  if ($isize != strlen($data) || crc32($data) != $datacrc) {
    // Bad format!  Length or CRC doesn&#39;t match!
    return false;
  }
  return $data;
}
Copy after login

Usage:

The code is as follows:

$html=file_get_contents(&#39;https://www.jb51.net/&#39;);
$html=gzdecode($html);
Copy after login

I will introduce these three methods, which should be able to solve most of the garbled crawling problems caused by gzip.

Recommended learning: "PHP Video Tutorial"

The above is the detailed content of How to solve php gzip css garbled problem. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Best Graphic Settings
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. How to Fix Audio if You Can't Hear Anyone
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Chat Commands and How to Use Them
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

How to write split lines on bootstrap How to write split lines on bootstrap Apr 07, 2025 pm 03:12 PM

There are two ways to create a Bootstrap split line: using the tag, which creates a horizontal split line. Use the CSS border property to create custom style split lines.

The Roles of HTML, CSS, and JavaScript: Core Responsibilities The Roles of HTML, CSS, and JavaScript: Core Responsibilities Apr 08, 2025 pm 07:05 PM

HTML defines the web structure, CSS is responsible for style and layout, and JavaScript gives dynamic interaction. The three perform their duties in web development and jointly build a colorful website.

How to insert pictures on bootstrap How to insert pictures on bootstrap Apr 07, 2025 pm 03:30 PM

There are several ways to insert images in Bootstrap: insert images directly, using the HTML img tag. With the Bootstrap image component, you can provide responsive images and more styles. Set the image size, use the img-fluid class to make the image adaptable. Set the border, using the img-bordered class. Set the rounded corners and use the img-rounded class. Set the shadow, use the shadow class. Resize and position the image, using CSS style. Using the background image, use the background-image CSS property.

How to resize bootstrap How to resize bootstrap Apr 07, 2025 pm 03:18 PM

To adjust the size of elements in Bootstrap, you can use the dimension class, which includes: adjusting width: .col-, .w-, .mw-adjust height: .h-, .min-h-, .max-h-

How to use bootstrap in vue How to use bootstrap in vue Apr 07, 2025 pm 11:33 PM

Using Bootstrap in Vue.js is divided into five steps: Install Bootstrap. Import Bootstrap in main.js. Use the Bootstrap component directly in the template. Optional: Custom style. Optional: Use plug-ins.

How to set up the framework for bootstrap How to set up the framework for bootstrap Apr 07, 2025 pm 03:27 PM

To set up the Bootstrap framework, you need to follow these steps: 1. Reference the Bootstrap file via CDN; 2. Download and host the file on your own server; 3. Include the Bootstrap file in HTML; 4. Compile Sass/Less as needed; 5. Import a custom file (optional). Once setup is complete, you can use Bootstrap's grid systems, components, and styles to create responsive websites and applications.

How can you prevent a class from being extended or a method from being overridden in PHP? (final keyword) How can you prevent a class from being extended or a method from being overridden in PHP? (final keyword) Apr 08, 2025 am 12:03 AM

In PHP, the final keyword is used to prevent classes from being inherited and methods being overwritten. 1) When marking the class as final, the class cannot be inherited. 2) When marking the method as final, the method cannot be rewritten by the subclass. Using final keywords ensures the stability and security of your code.

How to use bootstrap button How to use bootstrap button Apr 07, 2025 pm 03:09 PM

How to use the Bootstrap button? Introduce Bootstrap CSS to create button elements and add Bootstrap button class to add button text

See all articles