
TheGenieLab Blog

How to Parse RSS Feeds with PHP: An Example Using BBC News

16 April 2014 by Daniel Lewis




APIs (application programming interfaces) have arguably had one of the biggest influences on the web over the past 5 years.

The ability to leverage a company’s data and services, quite often for free (although it might be rate limited), to better serve your own users is an absolute blessing to developers. Just think about it for a second. Businesses that rely on their API usage would either not even be around today or wouldn’t have been able to grow anywhere near as fast without APIs being a ‘thing’.

Who uses APIs?

There are a number of different industries that rely heavily on API usage one way or another:

  • Social & blogging: Facebook, LinkedIn, WordPress, Flickr and Reddit
  • Email service providers: MailChimp, SendGrid, Mailgun, Vertical Response, Constant Contact
  • Payment services: Stripe, Balanced Payments, GoCardless, Paymill, Coinbase
  • Cloud storage: Google Drive, Dropbox, GitHub, Box, Maytech
  • Miscellaneous: Spotify, IMDB, Yoda Speak, Foursquare


Let’s also not forget that, at a very basic level, RSS feeds are a kind of API too. Thousands of businesses rely on parsing RSS feeds to serve content to users.

With Google Reader unfortunately being thrown into the bin last year, many feed readers sprang up to take advantage of the gaping hole that Google left in the market. Some examples of companies that sprang up or grew around this time:

  • Feedly (my favourite 1/2)
  • Flipboard (my favourite 2/2)
  • NewsBlur
  • Pulse
  • Tiny Tiny RSS

So we know these guys take advantage of RSS feeds around the web; they parse and organise hundreds of thousands of feeds daily. Obviously running a successful RSS reader business is not *quite* as simple as that, but that’s ultimately what it boils down to.

So what does parsing an RSS feed entail?

Parsing the BBC News API using PHP

You’ll be delighted to know that this is actually ridiculously simple; it only really requires a couple of lines of PHP! You’ll need:

  • A local web server (WAMP / MAMP / XAMPP will do just fine)
  • An RSS feed URL (http://feeds.bbci.co.uk/news/rss.xml)
  • Your brain

Assuming you can get up and running ready to write some PHP (if you can’t, check out this Stack Overflow thread), here’s one way you can do it (there are many other ways, some perhaps more efficient):

Loading up the RSS content

Since we have already checked and know that the RSS feed we’re using is in XML format, we can parse it with an XML parser. PHP 5 has a handy built-in class called SimpleXMLElement whose job is to parse XML-formatted content. Its constructor takes the data, an optional set of libxml options, and a flag saying whether the data is a URL rather than the XML itself. Let’s put in the BBC feed and see what happens:

// Arguments: the data, libxml options (0 = none), and true because the data is a URL
$newsoutput = new SimpleXMLElement('http://feeds.bbci.co.uk/news/rss.xml', 0, true);
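A remote feed can be unreachable or malformed, in which case the SimpleXMLElement constructor throws an exception. As a minimal sketch, assuming PHP 7.1 or later, the load could be wrapped in a small helper. The name loadFeed() and its null-on-failure behaviour are my own assumptions for illustration, not something the BBC feed or PHP prescribes:

```php
<?php
// Hypothetical helper: returns the parsed feed, or null if the source
// could not be loaded or is not valid XML.
function loadFeed(string $source, bool $isUrl = true): ?SimpleXMLElement
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    try {
        // 0 = no extra libxml options.
        return new SimpleXMLElement($source, 0, $isUrl);
    } catch (Exception $e) {
        return null;
    } finally {
        // Restore the previous libxml error-handling state.
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
    }
}
```

Calling loadFeed('http://feeds.bbci.co.uk/news/rss.xml') would then give either a SimpleXMLElement or null, so the page can show a friendly message instead of a fatal error.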

If we var_dump $newsoutput, here’s what we get:

This is actually ready to be parsed now, but my preference is to turn this from a SimpleXMLElement object into a normal array:

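// LIBXML_NOCDATA merges CDATA sections into plain text so they survive the conversion
$newsoutput = new SimpleXMLElement('http://feeds.bbci.co.uk/news/rss.xml', LIBXML_NOCDATA, true);
// Round-trip through JSON to get a plain associative array
$newsoutput = json_decode(json_encode($newsoutput), true);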

The dump of this is friendlier to work with and looks like this:
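As a rough illustration of the conversion, here is a sketch that runs the same two steps on a small, made-up RSS snippet rather than the live BBC feed. The sample XML, the example.com links and the variable names $sampleXml, $feed and $feedArray are all assumptions for demonstration only:

```php
<?php
// Made-up sample data standing in for the live feed.
$sampleXml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title><![CDATA[First headline]]></title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title><![CDATA[Second headline]]></title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>
XML;

// The third argument is false because we pass the XML itself, not a URL.
$feed = new SimpleXMLElement($sampleXml, LIBXML_NOCDATA, false);

// Same JSON round trip as above: object in, associative array out.
$feedArray = json_decode(json_encode($feed), true);

// Dump the first item to inspect its keys.
print_r($feedArray['channel']['item'][0]);
```

In the resulting array, XML attributes such as the RSS version end up under an '@attributes' key, and repeated <item> elements become a numerically indexed list.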

Now we’re ready to parse this and loop through the posts one-by-one. I think it’d be nicer if we were able to click a button to parse the feed though, and not parse immediately on page load.

I’ve added a form that POSTs back to itself with a hidden input called “submitted”. This is so we can do a check in PHP to see whether or not we’re ready to parse the feed by saying “if the button has been pressed, parse the feed, else just show the ‘parse now’ button”.

We can also use this conditional to change the button text from “parse now” to “refresh feed” depending on whether the feed has already been parsed and is in front of us.
The form methodology can be seen in the complete code I’ve put on GitHub.
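The form logic described above might be sketched roughly as follows. Only the hidden “submitted” input comes from the description; the function names feedRequested() and renderForm() and the exact markup are hypothetical and may differ from the complete code:

```php
<?php
// Hypothetical helper: true once the form has been posted back.
function feedRequested(array $post): bool
{
    return isset($post['submitted']);
}

// Hypothetical helper: builds the form, switching the button label
// depending on whether the feed has already been requested.
function renderForm(array $post): string
{
    $label = feedRequested($post) ? 'Refresh feed' : 'Parse now';

    // An empty action posts the form back to the current page.
    return '<form method="post" action="">'
        . '<input type="hidden" name="submitted" value="1">'
        . '<button type="submit">' . $label . '</button>'
        . '</form>';
}

echo renderForm($_POST);

if (feedRequested($_POST)) {
    // Load and output the feed here.
}
```

Passing $_POST in as an argument, rather than reading it inside the functions, keeps the two helpers easy to try out with a plain array.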

Outputting the feed news items

We’ve got all the data loaded on our server – now we need to loop through it and present it nicely. We can see from the above var_dumps that the data is multidimensional.

The individual news items are 2 levels deep:

  • @attributes
      o version
  • channel
      o title
      o link
      o description
      o language
      o lastBuildDate
      o copyright
      o image
          - url
          - title
          - link
          - width
          - height
      o ttl
      o item
          - title
          - description
          - link
          - guid
          - pubDate

The bit we need is “item”: each “item” array is an individual news item. Here’s how we loop through these:

foreach ($newsoutput['channel']['item'] as $item) {
    echo $item['title'];
}
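Two details are worth guarding against when looping over the converted array. Feed text should be escaped before it is echoed into HTML, and a feed containing only one item decodes to a single associative array rather than a list. The sketch below handles both, assuming PHP 7 or later. The name renderHeadlines() and the list markup are hypothetical choices for illustration:

```php
<?php
// Hypothetical helper: builds an HTML list of linked headlines from the
// array produced by the json_decode(json_encode(...), true) step.
function renderHeadlines(array $feed): string
{
    $items = $feed['channel']['item'] ?? [];

    // A single <item> decodes to one associative array instead of a
    // list of items, so wrap it to keep the loop uniform.
    if ($items !== [] && !isset($items[0])) {
        $items = [$items];
    }

    $html = "<ul>\n";
    foreach ($items as $item) {
        // Empty XML elements decode to arrays, so only accept strings.
        $title = is_string($item['title'] ?? null) ? $item['title'] : '';
        $link  = is_string($item['link'] ?? null) ? $item['link'] : '';

        // Escape feed content before placing it in HTML.
        $title = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
        $link  = htmlspecialchars($link, ENT_QUOTES, 'UTF-8');

        $html .= "  <li><a href=\"{$link}\">{$title}</a></li>\n";
    }

    return $html . "</ul>\n";
}
```

With the array from earlier, echo renderHeadlines($newsoutput); would print one linked list entry per news item.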

‘title’ could be swapped for any of the other keys in the item array above, such as ‘description’ or ‘link’. With some really simple Bootstrap styling applied using the wonderful bootstrapcdn.com, our finalised BBC News RSS reader looks like this:

So there we have it! Parsing an RSS feed with PHP is extremely simple. Now you’ve got the data on your server, you could always create a mashup with a different API.

For example, translate the headlines into Yoda Speak, or create a word cloud for each article. With APIs becoming so prevalent on the web, only your creativity can hold you back!



© TheGenieLab LLC

A Limited Liability Company

Incorporated in the State of Florida No. L1000082688

Head Office: 400 NW 26th Street • Miami, FL • 33127 • +1 305-762-0130

UK: +44 (0)3333 445 809
