Data Extraction from Press Releases


Job Description

I have an RSS feed that provides a stream of links to press releases. I need a number of activities performed on the RSS feed.

1) Between 50% and 75% of the items in the RSS feed contain event-scheduling information (example: ). I need a process that filters out the press releases announcing results, so that only press releases scheduling an event are passed on to step 2.
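
A minimal sketch of how the filter in step 1 might work, assuming the press release title alone is enough to tell a scheduling announcement from a results announcement. The phrase patterns, the function name `is_scheduling_release`, and the sample titles are hypothetical and would need tuning against real feed items; a pre-built tool with equivalent rules would serve just as well.

```python
import re

# Hypothetical phrase patterns: scheduling releases tend to use
# future-oriented wording ("to report", "will host", "schedules").
SCHEDULING_PATTERNS = [
    r"\bto (report|announce|release|host|hold|webcast)\b",
    r"\bwill (report|announce|release|host|hold|webcast)\b",
    r"\b(schedules|sets date|announces date)\b",
]
SCHEDULING_RE = re.compile("|".join(SCHEDULING_PATTERNS), re.IGNORECASE)


def is_scheduling_release(title: str) -> bool:
    """Return True if the title looks like it schedules a future event."""
    return SCHEDULING_RE.search(title) is not None


if __name__ == "__main__":
    # Made-up titles for illustration only.
    titles = [
        "Example Corp to Report Second Quarter Financial Results on July 25",
        "Example Corp Reports Second Quarter Financial Results",
    ]
    for title in titles:
        print(is_scheduling_release(title), title)
```

A title-only heuristic like this will misclassify some items, so a spot check on a sample of the feed would be a sensible first step.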

2) For each of the press releases (via links in the RSS feed), I need data extracted. Specifically, I need the company stock ticker, the type of event, the start date/time, the webstreaming address, the phone number, and (if provided) the passcode.

In the example above, the information extracted would be:
Company: HomeAway, Inc.
Ticker: AWAY
Type of event: report financial results
Start date/time: July 25, 2012 4:30 p.m. Eastern Time
Webstream address:
Phone number: (877) 407-0789
Passcode: 397231
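
For illustration, the extraction in step 2 could be approximated with regular expressions, as in the sketch below. This is not the Alchemy API and not a required approach. The sample sentence is invented wording built around the values listed above, and the patterns and the name `extract_fields` are assumptions that would likely need adjusting for real press releases.

```python
import re

# Invented sample text; real press releases will be worded differently.
SAMPLE = (
    "HomeAway, Inc. (NASDAQ: AWAY) will report financial results on "
    "July 25, 2012 at 4:30 p.m. Eastern Time. To join the call, dial "
    "(877) 407-0789 and enter passcode 397231."
)

MONTHS = (
    "January|February|March|April|May|June|July|August|"
    "September|October|November|December"
)

# One capture group per field; all patterns are assumptions.
PATTERNS = {
    "ticker": r"\((?:NASDAQ|NYSE):\s*([A-Z.]+)\)",
    "date": r"((?:" + MONTHS + r") \d{1,2}, \d{4})",
    "time": r"(\d{1,2}:\d{2}\s*[ap]\.m\.)",
    "phone": r"(\(\d{3}\)\s*\d{3}-\d{4})",
    "passcode": r"passcode:?\s*(\d+)",
}


def extract_fields(text: str) -> dict:
    """Return the first match for each field, or "" when nothing matches."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text, re.IGNORECASE)
        fields[name] = match.group(1) if match else ""
    return fields


if __name__ == "__main__":
    for name, value in extract_fields(SAMPLE).items():
        print(f"{name}: {value}")
```

Fields that are absent from a release, such as a missing passcode, come back as empty strings rather than raising an error.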

3) I need the extracted information delivered to me in a CSV file.
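
As a rough picture of the CSV delivery in step 3, the sketch below writes one row per press release using Python's standard `csv` module. The column names and the function `rows_to_csv` are hypothetical; the single row reuses the example values given earlier, with the webstream address left empty as in the example.

```python
import csv
import io

# Hypothetical column names, one per requested field.
COLUMNS = [
    "company", "ticker", "event_type", "start",
    "webstream", "phone", "passcode",
]


def rows_to_csv(rows: list) -> str:
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    example_row = {
        "company": "HomeAway, Inc.",
        "ticker": "AWAY",
        "event_type": "report financial results",
        "start": "July 25, 2012 4:30 p.m. Eastern Time",
        "webstream": "",
        "phone": "(877) 407-0789",
        "passcode": "397231",
    }
    print(rows_to_csv([example_row]))
```

Values that contain commas, such as the company name and the date, are quoted automatically by the writer, so the file opens cleanly in a spreadsheet.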

Notes: I prefer to use "off the shelf" and pre-built software tools for as much of this project as I can. I am using Yahoo Pipes as the input RSS feed, and I want minimal (if any) web programming done to complete this task. For extracting the information from the press releases, I would prefer to use the Alchemy API, although I am open to other solutions as well.

To be considered for this job, you must do the following:
A) Write "Keyword" as the very first thing on the first line of your submission.
B) Tell me briefly how you would solve the problem, from start to finish. What tools/software would you use? Again, I strongly prefer pre-built tools/software over custom web development.

