A headless browser is a browser that can run without a graphical interface. You can control a headless browser programmatically to perform various tasks automatically, such as running tests or taking screenshots of web pages.
The word "headless" comes from the term "headless computer". Wikipedia's entry on "headless computer" reads:
A headless system is a computer system or device that has been configured to operate without a monitor (the "head"), keyboard, and mouse. Headless systems are usually controlled over a network connection, although some headless devices require an RS-232 serial connection for device management. Servers usually run in headless mode to reduce operating costs.
Besides the two harmless use cases mentioned above, headless browsers can also be used to automate malicious tasks. The most common are scraping websites, generating fake traffic, and probing sites for vulnerabilities.
A very popular headless browser is PhantomJS. Because it is based on the Qt framework, it differs from common browsers in many ways, so there are many ways to detect it.
However, starting with Chrome 59, Google released a headless version of Google Chrome. Unlike PhantomJS, it is built on the regular Google Chrome rather than on another framework, which makes it hard for a program to tell whether it is running in a normal browser or a headless one.
Below, we will introduce several methods to determine whether a program is running in a normal browser or a headless browser.
Note: These methods have only been tested on four devices (2 Linux, 2 Mac). That said, there are certainly many other ways to detect headless browsers.
The first and most common way to determine the browser type is to check the user agent. On a Linux machine, the user agent of headless Chrome version 59 is:
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/59.0.3071.115 Safari/537.36"
So, we can detect whether it is a headless Chrome browser like this:
if (/HeadlessChrome/.test(window.navigator.userAgent)) { console.log("Chrome headless detected"); }
The user agent can also be obtained from the HTTP headers. However, it is easily faked in both cases.
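Since the same string is available both in the browser and in the User-Agent HTTP header, the check can be wrapped in one small function. The following is a minimal sketch; the helper name isHeadlessUserAgent is hypothetical and not part of any browser or library API.

```javascript
// Hypothetical helper: returns true when a user agent string
// contains the "HeadlessChrome" token shown above.
function isHeadlessUserAgent(userAgent) {
  // Treat a missing or non-string value as "not detected" instead of throwing.
  if (typeof userAgent !== "string") {
    return false;
  }
  return /HeadlessChrome/.test(userAgent);
}

// Browser usage: only runs where a window object exists.
if (typeof window !== "undefined" && isHeadlessUserAgent(window.navigator.userAgent)) {
  console.log("Chrome headless detected");
}
```

On a server, the same function could be given the value of the request's User-Agent header. In both cases the value is supplied by the client, so the result is only a hint.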
navigator.plugins returns an array-like list describing the plug-ins installed in the current browser. A regular Chrome browser usually has some default plug-ins, such as Chrome PDF viewer or Google Native Client. In headless mode, by contrast, there are no plug-ins and the list is empty.
if(navigator.plugins.length == 0) { console.log("It may be Chrome headless"); }
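When experimenting with this check, it can help to see which plug-ins a browser actually reports. The sketch below uses a hypothetical helper, pluginNames, that accepts any navigator-like object so it can also be exercised outside a browser.

```javascript
// Hypothetical helper: collects plug-in names from a navigator-like object.
function pluginNames(nav) {
  if (!nav || !nav.plugins) {
    return [];
  }
  // navigator.plugins is array-like, so Array.from can copy and map it.
  return Array.from(nav.plugins, function (plugin) {
    return plugin.name;
  });
}

// Browser usage: only runs where a window object exists.
if (typeof window !== "undefined") {
  var reportedPlugins = pluginNames(window.navigator);
  if (reportedPlugins.length === 0) {
    console.log("It may be Chrome headless");
  } else {
    console.log("Plug-ins: " + reportedPlugins.join(", "));
  }
}
```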
In Google Chrome, two JavaScript properties expose the browser's language settings: navigator.language and navigator.languages. The first is the language of the browser interface; the second returns an array of the user's preferred languages. In headless mode, however, navigator.languages returns an empty string.
if(navigator.languages == "") { console.log("Chrome headless detected"); }
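The comparison above relies on loose equality, under which both an empty string and an empty array compare equal to "". A more explicit version might look like the following sketch; hasNoLanguages is a hypothetical helper name, and treating a missing property as "no conclusion" is an assumption that mirrors the behaviour of the original check.

```javascript
// Hypothetical helper: true when a navigator-like object reports no languages.
function hasNoLanguages(nav) {
  var languages = nav && nav.languages;
  // Property missing entirely: draw no conclusion, as the original check does.
  if (languages === undefined || languages === null) {
    return false;
  }
  // Covers both the empty string and an empty array.
  return languages.length === 0;
}

// Browser usage: only runs where a window object exists.
if (typeof window !== "undefined" && hasNoLanguages(window.navigator)) {
  console.log("Chrome headless detected");
}
```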
WebGL provides a set of APIs that can perform 3D rendering in HTML canvas. Through these APIs, we can query the graphics driver vendor and renderer.
In regular Google Chrome on Linux, the renderer and vendor values we get are "Google SwiftShader" and "Google Inc.".
In headless mode, the values we get are "Mesa OffScreen", the name of a rendering technology that does not use any window system, and "Brian Paul", the original author of the open source Mesa graphics library.
var canvas = document.createElement('canvas'); var gl = canvas.getContext('webgl'); var debugInfo = gl.getExtension('WEBGL_debug_renderer_info'); var vendor = gl.getParameter(debugInfo.UNMASKED_VENDOR_WEBGL); var renderer = gl.getParameter(debugInfo.UNMASKED_RENDERER_WEBGL); if(vendor == "Brian Paul" && renderer == "Mesa OffScreen") { console.log("Chrome headless detected"); }
Not every version of the headless browser reports these same two values. At present, however, the headless browser reports "Mesa OffScreen" and "Brian Paul".
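The WebGL snippet above assumes that a WebGL context and the WEBGL_debug_renderer_info extension are always available; getContext and getExtension can both return null. The sketch below adds those guards and keeps the comparison in a separate function so the expected values are easy to update. The names isHeadlessWebGL and readWebGLInfo are hypothetical.

```javascript
// Pure comparison against the two values reported in the text.
function isHeadlessWebGL(vendor, renderer) {
  return vendor === "Brian Paul" && renderer === "Mesa OffScreen";
}

// Reads the unmasked vendor and renderer from a document-like object.
// Returns null when WebGL or the debug extension is unavailable.
function readWebGLInfo(doc) {
  var canvas = doc.createElement("canvas");
  var gl = canvas.getContext("webgl");
  if (!gl) {
    return null;
  }
  var debugInfo = gl.getExtension("WEBGL_debug_renderer_info");
  if (!debugInfo) {
    return null;
  }
  return {
    vendor: gl.getParameter(debugInfo.UNMASKED_VENDOR_WEBGL),
    renderer: gl.getParameter(debugInfo.UNMASKED_RENDERER_WEBGL)
  };
}

// Browser usage: only runs where a document object exists.
if (typeof document !== "undefined") {
  var webglInfo = readWebGLInfo(document);
  if (webglInfo && isHeadlessWebGL(webglInfo.vendor, webglInfo.renderer)) {
    console.log("Chrome headless detected");
  }
}
```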
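Modernizr detects which HTML and CSS features the current browser supports. I found that the only difference between regular Chrome and headless Chrome is that headless mode lacks the hairline feature, which tests whether hidpi/retina hairlines are supported.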
if(!Modernizr["hairline"]) { console.log("It may be Chrome headless"); }
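Finally, the last method I found, which also seems to be the most effective, is to check the width and height of an image that the browser fails to load.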
var body = document.getElementsByTagName("body")[0]; var image = document.createElement("img"); image.src = "http://iloveponeydotcom32188.jg"; image.setAttribute("id", "fakeimage"); body.appendChild(image); image.onerror = function(){ if(image.width == 0 && image.height == 0) { console.log("Chrome headless detected"); } }
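The same idea can be written as a reusable function that reports its result through a callback. This is a minimal sketch: the names isZeroSized and probeBrokenImage are hypothetical, and attaching the error handler before setting src is a common precaution rather than something the original code does.

```javascript
// Hypothetical helper: true when an image reports no size at all.
function isZeroSized(image) {
  return image.width === 0 && image.height === 0;
}

// Appends an image that cannot load and passes true to the callback
// when the broken image has zero width and height.
// `doc` is a document-like object; `url` should point at a host that does not exist.
function probeBrokenImage(doc, url, callback) {
  var image = doc.createElement("img");
  // Attach the handler first so it is in place before loading starts.
  image.onerror = function () {
    callback(isZeroSized(image));
  };
  image.src = url;
  doc.body.appendChild(image);
  return image;
}

// Browser usage: only runs where a document object exists.
if (typeof document !== "undefined") {
  probeBrokenImage(document, "http://iloveponeydotcom32188.jg", function (headless) {
    if (headless) {
      console.log("Chrome headless detected");
    }
  });
}
```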
Those are the detailed steps for detecting a headless browser.