Crawl, scrape & API lookup - Python
Closed - This job posting has been filled.
This project is a trial test to ensure that we can build out scalability and ensure we have got the best approach.
Needed is someone to create a multi-threaded Python script that will have the following functionality:
1. Create new Project ID and Que task
2. Scrape search suggest wildcards (python)
3. Pull 10 results for each letter of the alphabet
4. Run across 10 search operators (plus their alphabet variables)
5. Collate list and remove duplicates (while in MySQL database)
6. Batch keywords by Regex Variables (variables have already been created)
7. Run through search volumes (probably require API)
8. Filter by search volume and categorize by Batches (in point 6)
The page will simply be an input field / past projects (hidden behind a username and password), when selecting past projects, results will be displayed in JQuery data table
We have access to proxies (if they are required) and a test server for you to use.
Obviously I will provide more details upon interview.