Website Scraper Script - Programmer Needed
Closed - This job posting has been filled and work has been completed.
We need a seasoned developer who can work with us to build this software and deliver a working beta within 1 week.
- On-time delivery is essential
- Ability to troubleshoot
- Make ongoing updates and add features
0. Here is our mock-up: http://screencast.com/t/yC1roeaO
1. We need to scrape a website like this from Archive.org: http://web.archive.org/web/2011020
- The content can be in html format.
- We have used software like http://www.httrack.com/page/2/en/i
- We can use a scraping tool like the ones above, but we want the process to be more automated and run from a single UI.
2. We need to have a UI that will allow us to make several updates.
- Remove dead images
- Apply the rel="nofollow" attribute to all outgoing links
- Preview, then clean up or remove sitewide broken images, bad links, broken scripts, etc.
- Remove archive.org banners/images.
3. Then we need the ability to upload the processed files to a different root domain.
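To give applicants a rough idea of the clean-up step in item 2, here is a minimal Python sketch using only the standard library. It is not a specification. It assumes the archived pages wrap the Archive.org banner in "BEGIN/END WAYBACK TOOLBAR INSERT" comments and prefix rewritten URLs with /web/<timestamp>/. The function names, the site_host parameter, and the is_dead callback are hypothetical and only for illustration. A production tool would more likely use a real HTML parser than regular expressions.

```python
import re
from urllib.parse import urlparse

# Assumption: the archive banner sits between these two HTML comments.
TOOLBAR_RE = re.compile(
    r"<!-- BEGIN WAYBACK TOOLBAR INSERT -->.*?<!-- END WAYBACK TOOLBAR INSERT -->",
    re.DOTALL,
)
# Assumption: rewritten URLs look like [http://web.archive.org]/web/<digits>[flags]/http://...
WAYBACK_PREFIX_RE = re.compile(
    r"(?:https?://web\.archive\.org)?/web/\d+[a-z_]*/(?=https?://)"
)
ANCHOR_RE = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
IMG_RE = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
HREF_RE = re.compile(r'(?<![\w-])href\s*=\s*["\']([^"\']*)["\']', re.IGNORECASE)
SRC_RE = re.compile(r'(?<![\w-])src\s*=\s*["\']([^"\']*)["\']', re.IGNORECASE)
REL_RE = re.compile(r'(?<![\w-])rel\s*=', re.IGNORECASE)


def strip_wayback(html):
    """Remove the archive banner and restore the original URLs."""
    html = TOOLBAR_RE.sub("", html)
    return WAYBACK_PREFIX_RE.sub("", html)


def add_nofollow(html, site_host):
    """Add rel="nofollow" to links whose host differs from site_host."""
    def fix(match):
        attrs = match.group(1)
        href = HREF_RE.search(attrs)
        if not href:
            return match.group(0)
        host = urlparse(href.group(1)).netloc.lower()
        # Relative links and links to our own host are left untouched.
        if not host or host == site_host.lower():
            return match.group(0)
        # Simplification: links that already carry a rel attribute are skipped.
        if REL_RE.search(attrs):
            return match.group(0)
        return f'<a{attrs} rel="nofollow">'
    return ANCHOR_RE.sub(fix, html)


def remove_dead_images(html, is_dead):
    """Drop <img> tags whose src is reported dead by the is_dead callback."""
    def fix(match):
        src = SRC_RE.search(match.group(0))
        return "" if src and is_dead(src.group(1)) else match.group(0)
    return IMG_RE.sub(fix, html)
```

A typical pass over one page would call strip_wayback first, then remove_dead_images with a callback that checks each image URL (for example with an HTTP request), then add_nofollow with the host name of the site being restored. The UI preview could show the page before and after these calls.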
Please show us your working portfolio so we know how skilled you are.