
A JavaScript scraper for the Wikipedia Academy Award List.

Susan Sarandon

Release: 2025-01-24 16:39:12


This tutorial demonstrates web scraping in JavaScript (Node.js): axios downloads the Wikipedia list of Academy Award-winning films, the Cheerio library parses the HTML, and the extracted table is saved to a CSV file.

First, install the required packages:

npm install cheerio axios


The Wikipedia page URL is:

const url = 'https://en.wikipedia.org/wiki/List_of_Academy_Award%E2%80%93winning_films';


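The code loads its dependencies, fetches the page's HTML using axios, then uses Cheerio to parse it. In a CommonJS file such as scraper.js, top-level await is not available, so the statements that use await belong inside an async function:

const axios = require('axios');
const cheerio = require('cheerio');
const fs = require('fs'); // used later to write the CSV file

const { data: html } = await axios.get(url);
const $ = cheerio.load(html);

const theadData = []; // header cells, one array per table body
const tableData = []; // every row that ends up in the CSV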


The script navigates the DOM, extracting data from table cells:

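// Collect header cells: one array of <th> texts per <tbody> on the page.
$('tbody').each((i, tbody) => {
  const headerCells = [];
  $(tbody).find('th').each((j, cell) => {
    // trim() strips surrounding whitespace, including trailing newlines
    headerCells.push($(cell).text().trim());
  });
  theadData.push(headerCells);
});

// Use the headers of the first table body as the first CSV row.
if (theadData.length) tableData.push(theadData[0]);

// Collect data cells: one array of <td> texts per table row.
$('table tr').each((i, row) => {
  const rowData = [];
  $(row).find('td').each((j, cell) => {
    rowData.push($(cell).text().trim());
  });
  // Rows without any <td> cells (header-only rows) are skipped.
  if (rowData.length) tableData.push(rowData);
});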
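
Cell text taken from a Wikipedia table may still carry bracketed footnote markers or line breaks inside the value, which trimming the ends does not remove. The sketch below shows one possible clean-up step. The helper name cleanCell and the sample strings are hypothetical and not part of the original script, and the bracket rule is an assumption that would also strip legitimate bracketed text.

```javascript
// Hypothetical helper (not in the original script): normalizes one cell's text.
function cleanCell(text) {
  return text
    .replace(/\[[^\]]*\]/g, '') // drop bracketed markers such as "[1]" or "[a]"
    .replace(/\s+/g, ' ')       // collapse newlines, tabs and repeated spaces
    .trim();                    // remove whitespace left at either end
}

// Made-up sample values, for illustration only.
console.log(cleanCell('Example Film\n[1]\n')); // prints: Example Film
console.log(cleanCell('  11\n(2) '));          // prints: 11 (2)
```

In the scraper this would be applied where a cell's text is read, for example cleanCell($(cell).text()) in place of $(cell).text().trim().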


Finally, the extracted data is formatted and saved to a CSV file using fs.writeFileSync, with semicolons as delimiters:

const csvContent = tableData.map((row) => row.join(';')).join('\n');
fs.writeFileSync('academy_awards.csv', csvContent, 'utf-8');
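
Joining fields with a plain join(';') produces a broken row whenever a cell itself contains a semicolon, a double quote, or a line break. A minimal sketch of the usual CSV quoting convention follows. The names escapeField and toCsv are hypothetical helpers, not part of the original script, and the sample rows are made up.

```javascript
// Hypothetical helpers (not in the original script).

// Quote a field only when it contains the delimiter, a double quote, or a line break.
function escapeField(value, delimiter) {
  const text = String(value);
  const needsQuotes =
    text.includes(delimiter) || text.includes('"') || /[\r\n]/.test(text);
  // Inside a quoted field, each double quote is written twice.
  return needsQuotes ? `"${text.replace(/"/g, '""')}"` : text;
}

// Turn an array of rows (arrays of fields) into CSV text.
function toCsv(rows, delimiter = ';') {
  return rows
    .map((row) => row.map((field) => escapeField(field, delimiter)).join(delimiter))
    .join('\n');
}

// Made-up sample rows, for illustration only.
const sampleRows = [
  ['Film', 'Notes'],
  ['Example; Film', 'said "hello"'],
];
console.log(toCsv(sampleRows));
```

With helpers like these, the csvContent line could become toCsv(tableData) while the fs.writeFileSync call stays unchanged.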


Run the script using:

node scraper.js


The resulting academy_awards.csv file contains the scraped data.
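
Because the script collects rows from every table on the page, and tables can contain merged cells, some rows may end up with a different number of fields than the header. The sketch below is one way to spot such rows in the generated text. The function name findRaggedLines and the sample text are hypothetical, and it assumes simple semicolon-separated lines with no quoted fields.

```javascript
// Hypothetical helper (not in the original script): returns the 1-based positions,
// counted among non-empty lines, of lines whose field count differs from the first line.
function findRaggedLines(csvText, delimiter = ';') {
  const lines = csvText.split('\n').filter((line) => line.length > 0);
  if (lines.length === 0) return [];
  const expected = lines[0].split(delimiter).length;
  const ragged = [];
  lines.forEach((line, index) => {
    if (line.split(delimiter).length !== expected) ragged.push(index + 1);
  });
  return ragged;
}

// Made-up sample text, for illustration only: the third line is one field short.
const sampleCsv = 'Film;Year;Awards;Nominations\nA;2000;1;2\nB;2001;3';
console.log(findRaggedLines(sampleCsv)); // prints: [ 3 ]
```

To check the real output, the text could be read back with fs.readFileSync('academy_awards.csv', 'utf-8') and passed to the function.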


This tutorial builds upon previous scraping tutorials using Go and Python.
