Do not apply for this job unless you have worked with Amazon Data Pipeline.
I have a number of PHP scripts that connect various services to an RDS database for imports and exports, but I'm having trouble figuring out how to schedule them to run in Data Pipeline. We have two classes of PHP scripts:
1. Simple PHP scripts that move data between an external web service and an RDS database. There are no dependencies, other than that script 1 must run successfully before script 2 can. This could all be accomplished with cron scheduling, but it's unclear how to set up a pipeline to run a PHP script. Where do we host the script, for example?
2. More complex data processing. For example, we have an EC2 instance with httpd installed and a MySQL database for temporary storage while combining various sources of data. The end result is then pushed to an RDS database. However, it doesn't seem to be possible to have a preconfigured EC2 instance that turns on, runs for a few hours a day, and then goes dormant when not needed.
Can you help set up a preconfigured EC2 AMI that turns on, processes a file using Amazon Data Pipeline, and then turns off?
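
To make the goal concrete, here is a rough, untested sketch of the kind of pipeline definition we have in mind, written as a small Python helper that emits the definition JSON. Everything specific in it is an assumption for illustration: the AMI ID, the S3 bucket, the script names, the instance type, and the 3-hour limit are placeholders, and it assumes the AMI already has PHP and the AWS CLI installed and meets whatever requirements Data Pipeline places on custom AMIs. It uses the `Ec2Resource` fields `imageId` and `terminateAfter` to start and stop the instance, and `dependsOn` so script 2 only runs after script 1 succeeds.

```python
import json


def build_pipeline_definition(ami_id, script_bucket, instance_type="t2.small"):
    """Build a Data Pipeline definition as a dict.

    All argument values are hypothetical placeholders, not real resources.
    """

    def php_activity(activity_id, script_name, depends_on=None):
        # Assumption: scripts are hosted in S3 and the AMI ships with
        # both the AWS CLI and PHP, so the activity can fetch and run them.
        activity = {
            "id": activity_id,
            "name": activity_id,
            "type": "ShellCommandActivity",
            "runsOn": {"ref": "PhpWorker"},
            "command": (
                f"aws s3 cp s3://{script_bucket}/{script_name} /tmp/{script_name}"
                f" && php /tmp/{script_name}"
            ),
        }
        if depends_on:
            # Run only after the referenced activity has finished successfully.
            activity["dependsOn"] = {"ref": depends_on}
        return activity

    objects = [
        {
            "id": "Default",
            "name": "Default",
            "scheduleType": "cron",
            "schedule": {"ref": "DailySchedule"},
            "failureAndRerunMode": "CASCADE",
            "role": "DataPipelineDefaultRole",
            "resourceRole": "DataPipelineDefaultResourceRole",
        },
        {
            "id": "DailySchedule",
            "name": "DailySchedule",
            "type": "Schedule",
            "period": "1 day",
            "startAt": "FIRST_ACTIVATION_DATE_TIME",
        },
        {
            # Instance launched from the preconfigured AMI for each run and
            # terminated after the time limit (placeholder: 3 hours).
            "id": "PhpWorker",
            "name": "PhpWorker",
            "type": "Ec2Resource",
            "imageId": ami_id,
            "instanceType": instance_type,
            "terminateAfter": "3 hours",
        },
        php_activity("Script1", "script1.php"),
        php_activity("Script2", "script2.php", depends_on="Script1"),
    ]
    return {"objects": objects}


if __name__ == "__main__":
    definition = build_pipeline_definition("ami-00000000", "example-bucket")
    print(json.dumps(definition, indent=2))
```

If something along these lines is workable, the printed JSON could presumably be saved to a file and uploaded as the pipeline definition. Please tell us if this approach is wrong for our case.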