How to detect if a page is an RSS or Atom feed

Posted by Pepper on Stack Overflow
Published on 2010-03-14T17:13:13Z Indexed on 2010/03/14 17:15 UTC

Hello,

I'm currently building a new online feed reader in PHP. One of the features I'm working on is feed auto-discovery: if a user enters a website URL, the script will detect that it's not a feed and look for the real feed URL by parsing the HTML for the proper tag.
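For context, the discovery step described above could look roughly like the sketch below. It assumes the "proper tag" is the conventional `<link rel="alternate">` element with an RSS or Atom MIME type; the function name `discoverFeedLinks` is hypothetical and not part of the original script.

```php
<?php
// Hypothetical helper (not from the original script): returns the feed URLs
// that an HTML page advertises through <link rel="alternate"> elements.
function discoverFeedLinks(string $html): array
{
    // Assumption: these two MIME types are the ones worth recognising.
    $feedTypes = ['application/rss+xml', 'application/atom+xml'];

    if (trim($html) === '') {
        return [];
    }

    // Real-world HTML is rarely valid, so collect parser errors quietly.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return [];
    }

    $feeds = [];
    foreach ($doc->getElementsByTagName('link') as $link) {
        // rel may hold several space-separated tokens, e.g. "alternate home".
        $rel  = preg_split('/\s+/', strtolower(trim($link->getAttribute('rel'))));
        $type = strtolower(trim($link->getAttribute('type')));
        $href = trim($link->getAttribute('href'));

        if (in_array('alternate', $rel, true)
            && in_array($type, $feedTypes, true)
            && $href !== '') {
            // The href may be relative; resolving it against the page URL
            // is deliberately left out of this sketch.
            $feeds[] = $href;
        }
    }

    return $feeds;
}
```

Using DOMDocument rather than a regular expression keeps the sketch tolerant of attribute order and quoting differences between sites.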

The problem is that the way I'm currently detecting whether the URL is a feed or a website only works part of the time, and I know it can't be the best solution. Right now I'm taking the cURL response and running it through simplexml_load_string(); if it can't be parsed, I treat it as a website. Here is the code:

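// @ suppresses parse warnings; simplexml_load_string() returns false on malformed XML
$xml = @simplexml_load_string( $site_found['content'] );

// compare strictly: an empty but valid XML element also evaluates to false with !$xml
if( $xml === false ) // could not be parsed as XML: a website, not a feed
{
    // handle website
}
else
{
    // parse feed
}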

Obviously, this isn't ideal. Also, when it runs into an HTML website that happens to parse as XML, it thinks it's a feed.

Any suggestions on a good way of telling a feed from a non-feed in PHP?
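One possible direction, offered as a sketch rather than a definitive answer: instead of treating anything that parses as a feed, inspect the root element of the parsed document. The function name `detectFeedType` and its return values are hypothetical. The sketch assumes RSS 2.0 (`<rss>`), Atom 1.0 (`<feed>` in the `http://www.w3.org/2005/Atom` namespace) and RSS 1.0 (`<rdf:RDF>`) are the formats of interest; older Atom 0.3 feeds use a different namespace and are not covered.

```php
<?php
// Hypothetical helper (not from the original script): returns 'rss', 'atom',
// 'rdf', or null when the content does not look like a feed.
function detectFeedType(string $content): ?string
{
    if (trim($content) === '') {
        return null;
    }

    // Collect parse errors internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($content);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($xml === false) {
        return null; // not well-formed XML, most likely an ordinary HTML page
    }

    // getName() returns the local name, so <rdf:RDF> yields "RDF".
    $root = strtolower($xml->getName());
    // Namespaces used by the root element itself.
    $namespaces = $xml->getNamespaces(false);

    if ($root === 'rss') {
        return 'rss';
    }
    if ($root === 'feed'
        && in_array('http://www.w3.org/2005/Atom', $namespaces, true)) {
        return 'atom';
    }
    if ($root === 'rdf'
        && in_array('http://www.w3.org/1999/02/22-rdf-syntax-ns#', $namespaces, true)) {
        return 'rdf';
    }

    return null; // well-formed XML (for example XHTML) with a non-feed root
}
```

With this, a call such as `detectFeedType($site_found['content'])` returning null would mean "handle as a website", so an XHTML page that parses cleanly is no longer mistaken for a feed. The response's Content-Type header could serve as an additional hint, although servers do not always set it accurately.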

Thanks,

Pepper http://feedingo.com

© Stack Overflow or respective owner