What's the Best Way to Parse RSS/Atom Feeds in PHP Using SimpleXML?

Patricia Arquette
Release: 2024-11-25 15:51:16

Best Way to Parse RSS/Atom Feeds with PHP

Magpie RSS is a popular PHP library for parsing RSS and Atom feeds, but it is known to fail on malformed feeds, so an alternative is often needed.

One recommended alternative is PHP's built-in SimpleXML extension. SimpleXML exposes an XML document, including an RSS or Atom feed, as an object tree that is easy to traverse. It does not repair malformed XML: when parsing fails, simplexml_load_file() returns false and libxml reports warnings and errors that the calling code can detect. If that happens, the feed source can be cleaned with a tool such as HTML Tidy and then parsed again.
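The detect-then-clean approach described above could look roughly like the following. This is a minimal sketch, not the article's own code: the function name loadFeedXml() is hypothetical, the Tidy options shown are one plausible configuration, and the retry only runs when the optional tidy extension is installed.

```php
<?php
// Hypothetical helper (not from the original article): parse a feed string,
// collecting libxml errors instead of letting them surface as PHP warnings.
function loadFeedXml(string $xml): ?SimpleXMLElement
{
    // Buffer libxml errors internally; remember the previous setting.
    $previous = libxml_use_internal_errors(true);

    $feed = simplexml_load_string($xml);

    // Assumption: the tidy extension may or may not be available.
    if ($feed === false && function_exists('tidy_repair_string')) {
        libxml_clear_errors();
        $repaired = tidy_repair_string(
            $xml,
            ['input-xml' => true, 'output-xml' => true, 'wrap' => 0],
            'utf8'
        );
        if (is_string($repaired)) {
            $feed = simplexml_load_string($repaired);
        }
    }

    // Log whatever the last parse attempt complained about.
    foreach (libxml_get_errors() as $error) {
        error_log(sprintf(
            'Feed XML error on line %d: %s',
            $error->line,
            trim($error->message)
        ));
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $feed === false ? null : $feed;
}
```

A caller would fetch the feed body itself (for example with file_get_contents()) and treat a null result as "feed unavailable". Whether Tidy can rescue a given feed depends on how badly it is broken.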

Here's a simple class using SimpleXML to parse an RSS feed:

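class BlogPost
{
    public $date;    // publication date exactly as given in the feed
    public $ts;      // publication date as a Unix timestamp
    public $link;

    public $title;
    public $text;
    public $summary; // shortened, tag-free version of $text
}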

class BlogFeed
{
    public $posts = array();

    public function __construct($file_or_url)
    {
        $file_or_url = $this->resolveFile($file_or_url);
        // simplexml_load_file() returns false when the feed cannot be
        // loaded or parsed; $posts then stays empty.
        if (!($x = simplexml_load_file($file_or_url)))
            return;

        foreach ($x->channel->item as $item)
        {
            $post = new BlogPost();
            $post->date  = (string) $item->pubDate;
            $post->ts    = strtotime((string) $item->pubDate);
            $post->link  = (string) $item->link;
            $post->title = (string) $item->title;
            $post->text  = (string) $item->description;

            // Create summary as a shortened body and remove images, 
            // extraneous line breaks, etc.
            $post->summary = $this->summarizeText($post->text);

            $this->posts[] = $post;
        }
    }

    private function resolveFile($file_or_url) {
        if (!preg_match('|^https?:|', $file_or_url))
            $feed_uri = $_SERVER['DOCUMENT_ROOT'] .'/shared/xml/'. $file_or_url;
        else
            $feed_uri = $file_or_url;

        return $feed_uri;
    }

    private function summarizeText($summary) {
        $summary = strip_tags($summary);

        // Truncate the summary to 100 bytes (strlen/substr count bytes,
        // so a multi-byte character at the cut point may be split)
        $max_len = 100;
        if (strlen($summary) > $max_len)
            $summary = substr($summary, 0, $max_len) . '...';

        return $summary;
    }
}
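The summarizeText() method above counts bytes, so it can split a multi-byte UTF-8 character at the cut point, and it leaves HTML entities and line breaks in place. A possible variant is sketched below. It assumes the mbstring extension is available and that feed text is UTF-8; the standalone function name summarize() is hypothetical.

```php
<?php
// Hypothetical standalone variant of summarizeText(); assumes UTF-8 input
// and the mbstring extension.
function summarize(string $html, int $maxLen = 100): string
{
    // Drop markup (including <img> tags), then decode entities like &amp;.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse runs of whitespace, including line breaks, into one space.
    // preg_replace() returns null on invalid UTF-8; keep the text as-is then.
    $collapsed = preg_replace('/\s+/u', ' ', $text);
    if ($collapsed !== null) {
        $text = $collapsed;
    }
    $text = trim($text);

    // Truncate by characters rather than bytes.
    if (mb_strlen($text, 'UTF-8') > $maxLen) {
        $text = rtrim(mb_substr($text, 0, $maxLen, 'UTF-8')) . '...';
    }

    return $text;
}
```

Because the length check uses mb_strlen(), a 100-character limit means 100 characters in any script, not 100 bytes.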

This class loads and parses an RSS feed, stores each item as a BlogPost, and builds a short plain-text summary of each post for display. It reads only the RSS 2.0 channel/item structure. SimpleXML does not repair malformed XML, so when a feed fails to parse, the constructor returns early and $posts stays empty.
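Atom feeds use a different layout: entries are direct children of a root feed element, links are carried in an href attribute, and the date is in an updated element. One way to cover both formats is sketched below. The function name extractEntries() and the array keys are hypothetical, and the sketch assumes the Atom namespace is declared as the default (unprefixed) namespace, which is the common case.

```php
<?php
// Hypothetical helper, not part of the original BlogFeed class: turn either
// an RSS 2.0 or an Atom document into a list of arrays with the same keys.
function extractEntries(SimpleXMLElement $feed): array
{
    $entries = [];

    if ($feed->getName() === 'feed') {
        // Atom: entries are direct children of the root <feed> element.
        foreach ($feed->entry as $entry) {
            // An entry may carry several <link> elements; prefer the one
            // pointing at the post itself (rel="alternate" or no rel).
            $link = '';
            foreach ($entry->link as $candidate) {
                $rel = (string) $candidate['rel'];
                if ($rel === '' || $rel === 'alternate') {
                    $link = (string) $candidate['href'];
                    break;
                }
            }

            // Fall back to <summary> when there is no <content>.
            $text = (string) $entry->content;
            if ($text === '') {
                $text = (string) $entry->summary;
            }

            $date = (string) $entry->updated;
            $entries[] = [
                'title' => (string) $entry->title,
                'link'  => $link,
                'date'  => $date,
                'ts'    => strtotime($date),
                'text'  => $text,
            ];
        }
        return $entries;
    }

    // RSS 2.0: items live under <rss><channel>.
    if (isset($feed->channel->item)) {
        foreach ($feed->channel->item as $item) {
            $date = (string) $item->pubDate;
            $entries[] = [
                'title' => (string) $item->title,
                'link'  => (string) $item->link,
                'date'  => $date,
                'ts'    => strtotime($date),
                'text'  => (string) $item->description,
            ];
        }
    }

    return $entries;
}
```

Checking the root element name keeps the format decision in one place. Feeds that bind Atom to a prefix, or RSS 1.0 (RDF) feeds, would need additional namespace handling that this sketch does not attempt.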

source:php.cn