Data Scraping / data extraction of news articles from RSS

Cancelled

Job Description

The goal of this project is to create a script (PHP preferred, but any language will be considered) that runs continuously, monitoring an RSS feed, visiting each link in the feed, and extracting the full article contents, including:

Headline
Subheading (if it exists in the article)
Author name
Date and time published
Image link(s)
Image source (name and outlet; for example, "James Smith/AP")
Video link(s)
Author bio (if it exists)
Tags
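
To illustrate the field list above, here is a minimal sketch in Python (chosen only because any language is acceptable; it is not the preferred PHP). It assumes the article pages expose Open Graph and standard meta tags such as og:title and article:published_time, which has not been verified for the target site. The names MetaTagParser and extract_article are hypothetical. Using og:description as the subheading is a rough stand-in, and image source and author bio are left out because they would most likely need site-specific selectors.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect <meta> property/name -> content pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {}  # key -> list of content values, in document order

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        key = attrs.get("property") or attrs.get("name")
        content = attrs.get("content")
        if key and content:
            self.meta.setdefault(key, []).append(content)


def extract_article(html):
    """Map meta tags onto the requested fields (assumed tag names).

    Single-valued fields that are missing become None; list fields become [].
    """
    parser = MetaTagParser()
    parser.feed(html)
    meta = parser.meta

    def first(*keys):
        # Return the first value found under any of the given keys.
        for key in keys:
            if meta.get(key):
                return meta[key][0]
        return None

    return {
        "headline": first("og:title"),
        "subheading": first("og:description", "description"),  # rough stand-in
        "author": first("author", "article:author"),
        "published": first("article:published_time"),
        "images": meta.get("og:image", []),
        "videos": meta.get("og:video", []),
        "tags": meta.get("article:tag", []),
    }
```

Calling extract_article(page_html) returns a plain dict, so optional fields such as the subheading simply come back as None when the page does not provide them.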


The purpose is to populate a mock-up of a news site with fresh news stories.

Here is the feed to analyze:

http://rssfeeds.usatoday.com/usatoday-NewsTopStories
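
As a rough sketch of the feed-reading step, the following Python code parses RSS 2.0 XML and keeps track of which links have already been seen, so each article is visited only once. It assumes the feed is standard RSS 2.0 with item, link, title and pubDate elements, which has not been checked against the live feed. The function names parse_feed, new_items and fetch_feed are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

FEED_URL = "http://rssfeeds.usatoday.com/usatoday-NewsTopStories"


def parse_feed(xml_text):
    """Return a list of {title, link, pub_date} dicts from RSS 2.0 XML."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        if not link:
            continue  # an item without a link cannot be visited
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": link,
            "pub_date": (item.findtext("pubDate") or "").strip(),
        })
    return items


def new_items(items, seen_links):
    """Keep only unseen items and record their links in seen_links."""
    fresh = [it for it in items if it["link"] not in seen_links]
    seen_links.update(it["link"] for it in fresh)
    return fresh


def fetch_feed(url=FEED_URL, timeout=30):
    """Download the feed and parse it (requires network access)."""
    with urlopen(url, timeout=timeout) as resp:
        return parse_feed(resp.read())
```

To run continuously, a loop could call fetch_feed, pass the result through new_items with a persistent set of seen links, process each fresh link, and then sleep for a few minutes before polling again.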

The data needs to be inserted in a MySQL database.
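
The storage step might look like the sketch below. The articles table, its columns, and the functions build_insert and save_article are all hypothetical; the real schema is up to whoever implements the job. It assumes a DB-API 2.0 cursor from a MySQL driver that accepts %s placeholders (PyMySQL and mysql-connector-python both do), stores list fields such as image links as JSON text, and keeps the publish date as text to sidestep date parsing.

```python
import json

# Hypothetical schema; table and column names are illustrative only.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS articles (
    id INT AUTO_INCREMENT PRIMARY KEY,
    url VARCHAR(255) NOT NULL,
    headline TEXT,
    subheading TEXT,
    author VARCHAR(255),
    published VARCHAR(64),
    images TEXT,
    videos TEXT,
    tags TEXT
)
"""

COLUMNS = ("url", "headline", "subheading", "author",
           "published", "images", "videos", "tags")


def build_insert(article, placeholder="%s"):
    """Return (sql, params) for inserting one article row.

    List-valued fields are serialized to JSON text; missing fields become NULL.
    """
    params = []
    for col in COLUMNS:
        value = article.get(col)
        if isinstance(value, (list, tuple)):
            value = json.dumps(list(value))
        params.append(value)
    marks = ", ".join([placeholder] * len(COLUMNS))
    sql = "INSERT INTO articles ({}) VALUES ({})".format(
        ", ".join(COLUMNS), marks)
    return sql, tuple(params)


def save_article(cursor, article, placeholder="%s"):
    """Insert one article using a parameterized query on the given cursor."""
    sql, params = build_insert(article, placeholder)
    cursor.execute(sql, params)
```

Passing values as query parameters instead of formatting them into the SQL string lets the driver handle quoting, which matters here because headlines and author bios routinely contain quotes and apostrophes.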

---
Skills: video