Quick Tip: Getting Started with Headless Chrome in Node.js

Headless Chrome: A powerful tool for automated web testing and crawling

Core points

Starting with Chrome version 59 (version 60 for Windows users), headless Chrome allows you to programmatically simulate user interaction with websites and capture results for testing. It uses the Chromium and Blink engines, so it reproduces the user experience of Chrome itself.
Running headless Chrome in Node.js requires the chrome-remote-interface module (a simple abstraction of commands and notifications) and the chrome-launcher module (for launching Chrome from Node.js across multiple platforms).
After initializing the session and enabling the domains needed for testing, you can navigate the website, replicate user journeys, and capture results. You can also use the captureScreenshot function to capture page screenshots while navigating the website.
While headless Chrome is not yet fully integrated into tools like Selenium, it renders JavaScript just as Chrome does. That makes it the best way to reproduce the user experience in a fully automated way, and a good fit for large-scale automated web crawling tasks.

In our line of work, we often need to replicate a user journey repeatedly to make sure our pages offer a consistent experience as we make changes to the site. The key to doing this is having libraries that let us script these tests, so that we can run assertions against them and keep a record of the results. This is where headless browsers come in: command-line tools that allow you to programmatically simulate user interaction with your website and capture results for testing.

For years, many of us have used PhantomJS, CasperJS, and similar tools to do this. But, as is often the case with love, our hearts can move on. Starting with Chrome version 59 (version 60 for Windows users), Chrome ships with its own headless browser. While it does not currently support Selenium, it uses the Chromium and Blink engines, so it simulates the actual user experience in Chrome.

The code for this article can be found in our GitHub repository.

Run headless Chrome from the command line

Running headless Chrome from the command line is relatively easy. On a Mac, you can set an alias for Chrome and run it with the --headless command line parameter:

alias chrome="/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome"
chrome --headless --disable-gpu --remote-debugging-port=9090 https://www.sitepoint.com/

On Linux, it's even easier:

google-chrome --headless --disable-gpu --remote-debugging-port=9090 https://www.sitepoint.com/

--headless: Runs without a UI or display server dependencies.
--disable-gpu: Disables GPU hardware acceleration. This flag is currently required.
--remote-debugging-port: Enables remote debugging over HTTP on the specified port.

You can also interact with the requested page. For example, to print document.body.innerHTML to standard output, you can run:

google-chrome --headless --disable-gpu --dump-dom http://endless.horse/

If you are curious about what else is possible, you can find the complete list of parameters here.
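As a bridge to the Node.js section, here is a minimal sketch of how the flags above could be assembled from code. The helper name headlessArgs and its options object are assumptions made for illustration; they are not part of Chrome or of any library.

```javascript
// Hypothetical helper: builds the argument list for a headless Chrome run.
// Only flags mentioned in this article are used.
function headlessArgs({ url, port, dumpDom = false }) {
  const args = ['--headless', '--disable-gpu'];
  if (port !== undefined) {
    // Enable remote debugging over HTTP on the given port
    args.push(`--remote-debugging-port=${port}`);
  }
  if (dumpDom) {
    // Print document.body.innerHTML to standard output
    args.push('--dump-dom');
  }
  args.push(url);
  return args;
}

console.log(headlessArgs({ url: 'https://www.sitepoint.com/', port: 9090 }).join(' '));
// prints: --headless --disable-gpu --remote-debugging-port=9090 https://www.sitepoint.com/
```

The resulting array could be passed to child_process.spawn together with the path to your Chrome binary, which differs per platform as shown above.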

Run headless Chrome in Node.js

However, the focus of this article is not on the command line, but on how to run headless Chrome in Node.js. To do this, we need the following modules:

chrome-remote-interface: A JavaScript API that provides a simple abstraction of commands and notifications.
chrome-launcher: Allows us to launch Chrome from within Node.js across multiple platforms.

Then we can set up our environment. This assumes that Node.js and npm are installed on your machine. If this is not the case, check out our tutorial.

mkdir headless
cd headless
npm init -y
npm install chrome-remote-interface --save
npm install chrome-launcher --save

After that, we want to instantiate a session with headless Chrome. Let's start by creating an index.js file in the project folder:

const chromeLauncher = require('chrome-launcher');
const CDP = require('chrome-remote-interface');

(async function() {
  // Launch Chrome in headless mode and wait until it is ready
  async function launchChrome() {
    return await chromeLauncher.launch({
      chromeFlags: [
        '--disable-gpu',
        '--headless'
      ]
    });
  }
  const chrome = await launchChrome();
  // Connect to the debugging port that chrome-launcher picked
  const protocol = await CDP({
    port: chrome.port
  });

  // All subsequent code snippets go here

})();

First, we require our dependencies and then create a self-invoking function that will instantiate the Chrome session. Note that the --disable-gpu flag is required at the time of writing, but may not be needed by the time you read this, as it is only a workaround (as recommended by Google). We use async/await to make sure our application waits for the headless browser to start before performing the next steps.

Next, we need to expose the domains required for our testing:

const {
  DOM,
  Page,
  Emulation,
  Runtime
} = protocol;
await Promise.all([Page.enable(), Runtime.enable(), DOM.enable()]);

The most important object here is Page: we will use it to access the content rendered to the UI. It is also where we specify where to navigate, which elements to interact with, and where we run our scripts.

Exploring the page

After initializing the session and defining the domains, we can start navigating the website. We want to select a starting point, so we use the Page domain enabled above to navigate to it. This will load the page.

We can then use the loadEventFired method to define the steps that replicate our user journey. In this example, we just get the content of the first paragraph.

If you run the script using node index.js, you should see the content of that paragraph as output.

Go a step further: grab a screenshot

This is good, but we can just as easily substitute any code for the script1 value (the script run in the page) to use query selectors to click links, fill in form fields, and run a series of interactions. Each step can be stored in a JSON configuration file and loaded into your Node.js script to execute in sequence. The results of these scripts can be verified using a testing platform such as Mocha, allowing you to check whether the captured values meet the UI/UX requirements.

As a supplement to the test script, you may want to capture screenshots of the page while navigating the website. Fortunately, the Page domain provides a captureScreenshot function that does exactly this.

The fromSurface flag is another flag that is required for cross-platform support at the time of writing, and may not be needed in future iterations.
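The navigation, script evaluation, and screenshot steps described above could be sketched together as follows. This is an illustration under stated assumptions rather than the article's original listing: the helper name runJourney and its return shape are made up here, and the protocol argument is assumed to be the client returned by chrome-remote-interface (named protocol in index.js).

```javascript
// Sketch only: runJourney is a hypothetical helper, not a library function.
// `protocol` is assumed to be the chrome-remote-interface client.
async function runJourney(protocol, url, expression) {
  const { Page, Runtime } = protocol;
  await Promise.all([Page.enable(), Runtime.enable()]);

  // Called without a callback, loadEventFired returns a promise.
  // Register it before navigating so the event cannot be missed.
  const loaded = Page.loadEventFired();
  await Page.navigate({ url });
  await loaded;

  // Run a script in the page and read back its value
  const { result } = await Runtime.evaluate({ expression });

  // captureScreenshot resolves with base64-encoded image data
  const { data } = await Page.captureScreenshot({
    format: 'png',
    fromSurface: true
  });

  return {
    text: result.value,
    screenshot: Buffer.from(data, 'base64')
  };
}
```

Inside the async function in index.js, this could be called as const { text, screenshot } = await runJourney(protocol, 'https://www.sitepoint.com/', "document.querySelector('p').textContent"), after which the buffer can be written to disk with fs.writeFileSync('screenshot.png', screenshot). The URL, selector, and file name are placeholders. Calling protocol.close() and chrome.kill() afterwards ends the session and shuts the browser down.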

Run the script with node index.js to see the result.
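The idea mentioned earlier, storing each step in a JSON configuration file and executing the steps in sequence, could look something like the sketch below. The step format (name, expression, expected), the sample values, and the runSteps helper are all assumptions made for illustration. The evaluate argument stands for any async function that runs an expression in the page and resolves with its value.

```javascript
// Hypothetical step format. In practice this would live in its own JSON
// file and be loaded with require() or fs.readFileSync().
const steps = JSON.parse(`[
  {
    "name": "read heading",
    "expression": "document.querySelector('h1').textContent",
    "expected": "Example Domain"
  },
  {
    "name": "click first link",
    "expression": "document.querySelector('a').click()"
  }
]`);

// Runs the steps one after another and records whether each captured
// value matches the expectation (steps without one always pass).
async function runSteps(stepList, evaluate) {
  const results = [];
  for (const step of stepList) {
    const value = await evaluate(step.expression);
    const passed = !('expected' in step) || value === step.expected;
    results.push({ name: step.name, value, passed });
  }
  return results;
}
```

With the Runtime domain enabled, evaluate could be written as async (expression) => (await Runtime.evaluate({ expression })).result.value. A test runner such as Mocha could then assert on the passed field of each result.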

Conclusion

If you are writing automation scripts, now is the time to start using Chrome's headless browser. While it is still not fully integrated into tools like Selenium, the benefit of using Chrome's own rendering engine cannot be overstated. This is the best way to reproduce the user experience in a fully automated way.

I'll leave you with some further reading:

API documentation: https://www.php.cn/link/fc56459a18776e2a100854c16a1fd78b
Getting started with headless Chrome: https://www.php.cn/link/ada77e9fac537039c9adb2787b9af7da

Let me know about your experiences with headless Chrome in the comments below.
