Headless Chrome: A powerful tool for automated web testing and crawling
Key Takeaways
Running headless Chrome in Node.js requires the chrome-remote-interface module (for a simple abstraction of commands and notifications) and the chrome-launcher module (for launching Chrome from Node.js across multiple platforms).
Headless Chrome can use the captureScreenshot function to capture page screenshots while navigating a website.
In our line of work, we often need to replicate user journeys repeatedly to make sure our pages offer a consistent experience as we make changes to the site. Key to achieving this is building libraries of test scripts, so that we can run assertions against those journeys and keep a record of the results. This is where headless browsers come in: command-line tools that allow you to programmatically simulate user interaction with your website and capture the results for testing.
For years, many of us have used PhantomJS, CasperJS and other tools to do this. But, as often happens with love, our hearts can move on to another. Starting with Chrome 59 (Chrome 60 for Windows users), Chrome ships with its own headless browser. While it did not offer Selenium support at the time of writing, it uses Chromium and the Blink engine, which means it simulates the actual user experience in Chrome.
The code for this article can be found in our GitHub repository.
Run headless Chrome from the command line
Running headless Chrome from the command line is relatively easy. On a Mac, you can set an alias for Chrome and run it with the --headless command-line parameter:
alias chrome="/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome"
chrome --headless --disable-gpu --remote-debugging-port=9090 https://www.sitepoint.com/
On Linux, it's even easier:
google-chrome --headless --disable-gpu --remote-debugging-port=9090 https://www.sitepoint.com/
Here is what each parameter does:
--headless: Runs without a UI or display server dependencies.
--disable-gpu: Disables GPU hardware acceleration. This parameter was required at the time of writing.
--remote-debugging-port: Enables remote debugging over HTTP on the specified port.
You can also interact with the requested page. For example, to print document.body.innerHTML to standard output, you can do the following:
google-chrome --headless --disable-gpu --dump-dom http://endless.horse/
If you're curious about what else is possible, the complete list of parameters can be found here.
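The same command-line invocation can also be driven from a Node.js script. The following is only a minimal sketch: buildHeadlessArgs and fetchDom are hypothetical helper names, chromePath is an assumption you would set to the Chrome binary or alias for your platform, and only the flags already shown above are used.

```javascript
const { execFile } = require('child_process');

// Build the argument list for a headless run, using only the flags shown above.
function buildHeadlessArgs(url, { dumpDom = false, port } = {}) {
  const args = ['--headless', '--disable-gpu'];
  if (port !== undefined) args.push(`--remote-debugging-port=${port}`);
  if (dumpDom) args.push('--dump-dom');
  args.push(url); // the page to load always comes last
  return args;
}

// Run Chrome with --dump-dom and resolve with whatever it writes to stdout.
// chromePath is an assumption: pass the binary name or full path for your OS.
function fetchDom(chromePath, url) {
  return new Promise((resolve, reject) => {
    execFile(
      chromePath,
      buildHeadlessArgs(url, { dumpDom: true }),
      { maxBuffer: 10 * 1024 * 1024 }, // allow for large pages
      (err, stdout) => (err ? reject(err) : resolve(stdout))
    );
  });
}
```

On Linux, for example, calling fetchDom('google-chrome', 'http://endless.horse/') should resolve with the same markup that the --dump-dom command above prints.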
Run headless Chrome in Node.js
However, the focus of this article is not the command line, but how to run headless Chrome in Node.js. To do this, we need the following modules:
chrome-remote-interface: A JavaScript API that provides a simple abstraction of commands and notifications.
chrome-launcher: Allows us to launch Chrome from Node.js across multiple platforms.
Then we can set up our environment. This assumes that Node.js and npm are installed on your machine. If this is not the case, check out our tutorial.
mkdir headless
cd headless
npm init -y
npm install chrome-remote-interface --save
npm install chrome-launcher --save
Next, create an index.js file in the project folder:
const chromeLauncher = require('chrome-launcher');
const CDP = require('chrome-remote-interface');

(async function() {
  // Launch Chrome in headless mode and resolve once it is ready
  async function launchChrome() {
    return await chromeLauncher.launch({
      chromeFlags: [
        '--disable-gpu',
        '--headless'
      ]
    });
  }
  const chrome = await launchChrome();
  // Connect to the debugging port that chrome-launcher selected
  const protocol = await CDP({ port: chrome.port });

  // All subsequent code snippets go here
})();
The --disable-gpu flag was required at the time of writing, but may not be needed by the time you read this, as it is only a workaround (as recommended by Google). We use async/await to make sure our application waits for the headless browser to start before performing the next steps.
Next, we extract the domains we need from the protocol and enable them:
const { DOM, Page, Emulation, Runtime } = protocol;
await Promise.all([Page.enable(), Runtime.enable(), DOM.enable()]);
Explore Page
After initializing the session and enabling the domains, we can start navigating the website. We need a starting point, so we use the navigate method of the Page domain enabled above.
Once the page has loaded, we use the loadEventFired method to define the steps that replicate our user journey. In this example, we just get the content of the first paragraph.
If you run the script with node index.js, you should see the content of that paragraph in the output.
Go a step further - grab screenshot
This is good, but we can just as easily substitute any code into the script1 value (the expression evaluated in the page) to click links, fill in form fields, and run a series of interactions using query selectors. Each step can be stored in a JSON configuration file and loaded into your Node.js script to execute in sequence. The results of these scripts can be verified using a testing platform such as Mocha, allowing you to cross-reference whether the captured values meet UI/UX requirements.
To complement your test scripts, you will probably want to capture screenshots of the pages as you navigate the site. Fortunately, the Page domain has a captureScreenshot function that does exactly this.
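The navigation, evaluation and screenshot steps described in this article could be combined roughly as follows. This is a sketch under stated assumptions rather than the article's original code: captureJourney is a hypothetical helper, the expression in script1 is an assumed way of reading the first paragraph, and the promise form of loadEventFired (calling it without a callback) is used here instead of a callback. The Page and Runtime domains are passed in as arguments so the function can be exercised without a running browser.

```javascript
const fs = require('fs');

// Navigate to a URL, read the first paragraph, and save a PNG screenshot.
// `Page` and `Runtime` are the domain objects taken from the protocol client
// in index.js; passing them in is a design choice of this sketch.
async function captureJourney({ Page, Runtime }, url, screenshotPath) {
  // Ask for the load event before navigating so that it cannot be missed.
  const loaded = Page.loadEventFired();
  await Page.navigate({ url });
  await loaded;

  // Assumed expression: the text of the first <p> element on the page.
  const script1 = "document.querySelector('p').textContent";
  const { result } = await Runtime.evaluate({ expression: script1 });

  // captureScreenshot resolves with base64-encoded image data.
  const { data } = await Page.captureScreenshot({ format: 'png', fromSurface: true });
  fs.writeFileSync(screenshotPath, Buffer.from(data, 'base64'));

  return result.value;
}
```

Inside index.js, at the point marked for subsequent snippets, this could be called as console.log(await captureJourney({ Page, Runtime }, url, 'screenshot.png')), followed by protocol.close() and chrome.kill() to shut everything down.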
The fromSurface flag passed to captureScreenshot is, like --disable-gpu, a flag that was required for cross-platform support at the time of writing and may not be needed in future iterations.
Run the script using node index.js to see the results.
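The idea of keeping each step in a JSON configuration and executing the steps in sequence could look something like the sketch below. The step format (name, expression, expected) and the runSteps helper are hypothetical, invented here for illustration. The evaluate argument is assumed to behave like Runtime.evaluate, resolving with an object whose result.value holds the captured value.

```javascript
const { strictEqual } = require('assert');

// Hypothetical step format; in practice this would be loaded from a JSON file.
const steps = JSON.parse(`[
  { "name": "read heading", "expression": "document.querySelector('h1').textContent", "expected": "Welcome" },
  { "name": "click first link", "expression": "document.querySelector('a').click()" }
]`);

// Run the steps strictly one after another and collect the captured values.
// A step with an "expected" field is checked as soon as its value arrives.
async function runSteps(stepList, evaluate) {
  const results = [];
  for (const step of stepList) {
    const { result } = await evaluate({ expression: step.expression });
    if ('expected' in step) {
      strictEqual(result.value, step.expected, `step "${step.name}" failed`);
    }
    results.push({ name: step.name, value: result.value });
  }
  return results;
}
```

In a Mocha test, one option would be to await runSteps(steps, (params) => Runtime.evaluate(params)) inside an it block, so that a failed expectation is reported as a failing test.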
Conclusion
If you are writing automation scripts, now is the time to start using Chrome's headless browser. While it still isn't fully integrated into tools like Selenium, the benefits of simulating Chrome's rendering engine should not be underestimated. It is the best way to reproduce the user experience in a fully automated way.
I will provide you with some further reading materials:
Please share your experiences with headless Chrome in the comments below.