How to get and parse XML data using PHP crawler

王林
Release: 2023-06-13 16:32:01

Fetching and parsing XML data is a common task in web development. This article shows how a PHP crawler can fetch XML data from a URL and then parse it.

1. Obtain XML data

  1. cURL library

cURL is a widely used PHP extension for transferring data over HTTP and other protocols. You can use the following code to get XML data from a website:

$url = 'http://example.com/example.xml';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$xml = curl_exec($ch);
curl_close($ch);

Here we use curl_init() to create a cURL handle and set the CURLOPT_URL option to the target URL. Setting the CURLOPT_RETURNTRANSFER option to 1 makes curl_exec() return the response as a string instead of printing it directly. Note that curl_exec() returns false if the request fails, so the result should be checked before it is parsed.
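
The snippet above does not check for failures. As a minimal sketch, assuming a 10-second timeout and treating any HTTP status of 400 or above as an error, the request could be wrapped in a helper like the one below. The function name fetchXmlWithCurl() and its defaults are hypothetical and not part of the original example.

```php
<?php
// Hypothetical helper: fetch a URL with cURL and fail loudly on errors.
function fetchXmlWithCurl(string $url, int $timeoutSeconds = 10): string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body as a string
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow HTTP redirects
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeoutSeconds);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);

    $body = curl_exec($ch);
    if ($body === false) {
        // curl_error() describes the transport-level failure.
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }

    // Assumption: any status of 400 or above counts as a failure.
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ($status >= 400) {
        throw new RuntimeException("Unexpected HTTP status: $status");
    }

    // The handle is released automatically when $ch goes out of scope.
    return $body;
}
```

A caller would then write $xml = fetchXmlWithCurl('http://example.com/example.xml'); inside a try/catch block and only parse the result when no exception was thrown.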

  2. file_get_contents() function

Besides cURL, the file_get_contents() function can also fetch XML data from a URL, provided the allow_url_fopen setting is enabled in php.ini. For example:

$url = 'http://example.com/example.xml';
$xml = file_get_contents($url);
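
file_get_contents() returns false on failure, and by default it waits as long as the default_socket_timeout setting allows. A minimal sketch of a more defensive call, assuming a 10-second timeout and a made-up user agent string, might look like this. The helper name fetchXmlWithStreams() is hypothetical.

```php
<?php
// Hypothetical helper: fetch a URL (or local path) with file_get_contents().
function fetchXmlWithStreams(string $url, float $timeoutSeconds = 10.0): string
{
    // The "http" options only apply to http:// and https:// URLs.
    $context = stream_context_create([
        'http' => [
            'timeout'    => $timeoutSeconds,
            'user_agent' => 'example-xml-fetcher/1.0', // assumed value
        ],
    ]);

    // @ suppresses the warning so that failure is reported via the exception.
    $body = @file_get_contents($url, false, $context);
    if ($body === false) {
        throw new RuntimeException("Could not read $url");
    }
    return $body;
}
```

Compared with cURL, this approach needs less code but offers less control over the request, which is one reason to prefer cURL for anything beyond simple GET requests.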

2. Parse XML data

PHP provides several ways to parse XML data. Two of the most common are SimpleXML and DOMDocument.

  1. SimpleXML

SimpleXML is a very easy-to-use XML parser in PHP. We can use SimpleXML as follows:

$xml = simplexml_load_string($xml);

Here we use the simplexml_load_string() function to parse the XML string and convert it into a SimpleXMLElement object. If the string is not well-formed XML, the function returns false instead.
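
Because a crawler often receives malformed responses, it can help to collect the parser errors rather than let PHP emit warnings. The following is a minimal sketch using libxml_use_internal_errors() and libxml_get_errors(). The wrapper name parseXmlString() and the exception type are assumptions made for illustration.

```php
<?php
// Hypothetical wrapper: parse an XML string or throw with the parser errors.
function parseXmlString(string $source): SimpleXMLElement
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $element = simplexml_load_string($source);
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the earlier setting

    if ($element === false) {
        $messages = [];
        foreach ($errors as $error) {
            $messages[] = trim($error->message) . " (line {$error->line})";
        }
        throw new InvalidArgumentException('Invalid XML: ' . implode('; ', $messages));
    }
    return $element;
}
```

With this wrapper, a truncated document such as '<bookstore><book>' raises an exception that names the offending line, which is easier to log than a bare false.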

For example, suppose we have the following XML document:

<?xml version="1.0" encoding="UTF-8" ?>
<bookstore>
  <book>
    <title>PHP 7 Programming Blueprints</title>
    <author>Vikram Vaswani</author>
    <price>28.99</price>
  </book>
  <book>
    <title>Mastering PHP 7</title>
    <author>Chad Russell</author>
    <price>39.99</price>
  </book>
</bookstore>

We can use the following code to access and output this XML data:

foreach ($xml->book as $book) {
  echo "Title: " . $book->title . "<br>";
  echo "Author: " . $book->author . "<br>";
  echo "Price: " . $book->price . "<br>";
}

The output is as follows:

Title: PHP 7 Programming Blueprints
Author: Vikram Vaswani
Price: 28.99
Title: Mastering PHP 7
Author: Chad Russell
Price: 39.99
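
Note that $book->title and the other child values are SimpleXMLElement objects rather than plain strings. echo converts them implicitly, but for comparisons or arithmetic they should be cast first. The sketch below, which reuses the sample document above, casts the values explicitly and uses the xpath() method to select books by price. The variable names and the price threshold of 30 are chosen only for illustration.

```php
<?php
$source = <<<XML
<?xml version="1.0" encoding="UTF-8" ?>
<bookstore>
  <book>
    <title>PHP 7 Programming Blueprints</title>
    <author>Vikram Vaswani</author>
    <price>28.99</price>
  </book>
  <book>
    <title>Mastering PHP 7</title>
    <author>Chad Russell</author>
    <price>39.99</price>
  </book>
</bookstore>
XML;

$store = simplexml_load_string($source);

// Convert each <book> into a plain array with typed values.
$books = [];
foreach ($store->book as $book) {
    $books[] = [
        'title'  => (string) $book->title,
        'author' => (string) $book->author,
        'price'  => (float) $book->price,
    ];
}

// XPath: titles of the books that cost more than 30.
$expensive = [];
foreach ($store->xpath('//book[price > 30]/title') as $title) {
    $expensive[] = (string) $title;
}
```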

  2. DOMDocument

DOMDocument is another commonly used XML parser in PHP. We can use DOMDocument as follows:

$doc = new DOMDocument();
$doc->loadXML($xml);
$books = $doc->getElementsByTagName("book");

foreach ($books as $book) {
  $titles = $book->getElementsByTagName("title");
  $title = $titles->item(0)->nodeValue;

  $authors = $book->getElementsByTagName("author");
  $author = $authors->item(0)->nodeValue;

  $prices = $book->getElementsByTagName("price");
  $price = $prices->item(0)->nodeValue;

  echo "Title: " . $title . "<br>";
  echo "Author: " . $author . "<br>";
  echo "Price: " . $price . "<br>";
}

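Here we use the DOMDocument class to parse the XML document, and then use the getElementsByTagName() method to obtain specific elements. Note that $xml must be the raw XML string returned by cURL or file_get_contents(), not the SimpleXMLElement object created in the previous example. The final output is the same as with SimpleXML.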
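
Repeated getElementsByTagName() calls can become verbose. As an alternative sketch, the DOMXPath class can query the same document with XPath expressions. The shortened sample document and the variable names below are assumptions made for illustration.

```php
<?php
$source = '<bookstore>'
    . '<book><title>PHP 7 Programming Blueprints</title><price>28.99</price></book>'
    . '<book><title>Mastering PHP 7</title><price>39.99</price></book>'
    . '</bookstore>';

$doc = new DOMDocument();
if (!$doc->loadXML($source)) {
    throw new RuntimeException('Could not parse XML');
}

$xpath = new DOMXPath($doc);
$rows = [];
foreach ($xpath->query('/bookstore/book') as $book) {
    // evaluate() with string() or number() returns a scalar value,
    // resolved relative to the current <book> node.
    $rows[] = [
        'title' => $xpath->evaluate('string(title)', $book),
        'price' => (float) $xpath->evaluate('number(price)', $book),
    ];
}
```

DOMDocument is generally the better fit when the document must be modified or when namespaces and mixed content are involved, while SimpleXML is usually enough for read-only access to simple structures.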

3. Summary

In this article, we learned how to use a PHP crawler to obtain and parse XML data: the cURL extension and the file_get_contents() function fetch the XML, and SimpleXML and DOMDocument parse it. We hope this article is helpful to you.
