OCR/Tesseract expert needed to convert many file types that are emailed in
Emails will arrive with attachments. Convert each attachment to text. Attachment formats will include: PDF, DOC, DOCX, PNG, JPG, TIF, GIF.
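As a rough sketch of how the formats above might be routed, the Python below groups attachments by extension and runs the Tesseract command-line tool on images. The function names and the "ocr" / "pdf" / "office" labels are hypothetical, not part of the brief. It assumes the tesseract binary is on the PATH; whether GIF input works depends on how Tesseract's image library was built. PDFs would likely need to be rasterized or text-extracted first, and DOC/DOCX files need a separate converter, so those two routes are only labeled here, not implemented.

```python
import subprocess
from pathlib import Path

# Hypothetical grouping of the formats named in the brief.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".gif"}
OFFICE_EXTS = {".doc", ".docx"}


def pick_converter(filename: str) -> str:
    """Return a label for the conversion route an attachment should take."""
    ext = Path(filename).suffix.lower()
    if ext in IMAGE_EXTS:
        return "ocr"
    if ext == ".pdf":
        return "pdf"
    if ext in OFFICE_EXTS:
        return "office"
    return "unsupported"


def ocr_image(path: str) -> str:
    """OCR one image with the Tesseract CLI, which prints the text to stdout
    when the output base name is given as 'stdout'."""
    result = subprocess.run(
        ["tesseract", path, "stdout"],
        capture_output=True,
        text=True,
        check=True,  # raise if Tesseract exits with an error
    )
    return result.stdout
```

For example, pick_converter("scan.TIF") returns "ocr", so that file would go straight to ocr_image.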
You will upload the generated text file to Amazon S3 and then post the name of that file to a URL.
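One possible shape for the upload-and-notify step is sketched below in Python, assuming the boto3 library for S3 and a plain form-encoded POST. The bucket, the key naming scheme, the callback URL, and the "filename" form field are all placeholders; the brief does not specify any of them.

```python
import urllib.parse
import urllib.request


def make_key(attachment_name: str, message_id: str) -> str:
    """Build an S3 object key for the text output (naming scheme is an assumption)."""
    stem = attachment_name.rsplit(".", 1)[0]
    return f"{message_id}/{stem}.txt"


def build_notification(key: str) -> bytes:
    """Form-encode the file name; the 'filename' field name is hypothetical."""
    return urllib.parse.urlencode({"filename": key}).encode("utf-8")


def upload_and_notify(text_path: str, bucket: str, key: str, callback_url: str) -> int:
    """Upload the text file to S3, then POST its name to the callback URL."""
    import boto3  # third-party; imported here so the helpers above work without it

    boto3.client("s3").upload_file(text_path, bucket, key)
    request = urllib.request.Request(
        callback_url, data=build_notification(key), method="POST"
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```

Posting only after the upload call returns means the receiving URL should never be told about a file that is not yet in the bucket.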
You will extract the emails from a Google Apps email account or from one pointing at the server, most likely a Google Apps account. You will therefore write a script that continuously checks for new emails and deletes each one after processing it. You can run it from cron every minute or as a constantly running script.
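A minimal sketch of the polling script, assuming the mailbox is reachable over IMAP and using only Python's standard library. The host, credentials, and the handle callback are placeholders. On a Google account, what "delete" actually does after an expunge depends on the account's IMAP settings, so that behavior should be checked rather than assumed.

```python
import email
import imaplib
import time
from email import policy


def extract_attachments(raw: bytes):
    """Return (filename, content bytes) for every attachment in a raw message."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    return [
        (part.get_filename(), part.get_payload(decode=True))
        for part in msg.iter_attachments()
    ]


def poll_once(host: str, user: str, password: str, handle) -> None:
    """Process every message in INBOX, then flag it deleted and expunge."""
    with imaplib.IMAP4_SSL(host) as imap:
        imap.login(user, password)
        imap.select("INBOX")
        _, data = imap.search(None, "ALL")
        for num in data[0].split():
            _, fetched = imap.fetch(num, "(RFC822)")
            for name, payload in extract_attachments(fetched[0][1]):
                handle(name, payload)  # e.g. convert, upload, notify
            # Flag only after handling succeeded, so a failure leaves the mail in place.
            imap.store(num, "+FLAGS", "\\Deleted")
        imap.expunge()


def run_forever(host: str, user: str, password: str, handle, interval: int = 60) -> None:
    """Constantly running variant; under cron, call poll_once once per run instead."""
    while True:
        poll_once(host, user, password, handle)
        time.sleep(interval)
```

With cron, overlapping runs are possible if one pass takes longer than a minute, so some form of lock would be worth adding; the long-running loop avoids that at the cost of needing a supervisor to restart it.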
Skills: pdf, amazon