Site/Data Scraping to Develop Ecommerce Product Pages in Bulk

Cancelled

Job Description

The goal of this project is to automate the creation of thousands of high-quality, well-optimized (SEO) product pages for an e-commerce site by using two large data sets.

The 1st Data Set is provided by us and includes basic item descriptions, UPC numbers, item costs, quantities, and some additional data points. These come from inventory Excel files generated by our suppliers/distributors. The 2nd Data Set will be generated by scraping competitor websites for SEO content, product information, and images. It will then be combined with the 1st Data Set to create a comprehensive listing/page for each item. Afterwards, the listings will be uploaded to the website in bulk to create these pages.
The basic idea is to use the UPC (Universal Product Code) field as the unique key that ties and matches these two large data sets together. All of our distributors already provide UPC codes for everything in the 1st Data Set, and the competitor/manufacturer sites we will be parsing already display the UPC on the product page, which makes the matching easier.
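
As a rough illustration of the UPC matching described above, here is a minimal Python sketch. It is only one possible approach, not a required implementation. The function names (`normalize_upc`, `match_by_upc`) and the dictionary keys (`upc`, `cost`, `description`) are hypothetical, and it assumes 12-digit UPC-A codes whose leading zeros may have been dropped by Excel.

```python
def normalize_upc(raw):
    """Reduce a UPC value to a 12-digit string, or None if it has no digits.

    Assumption: codes are UPC-A (12 digits). Spreadsheets often store them
    as numbers, which drops leading zeros, so we pad them back.
    """
    # Excel-style floats such as 12345678905.0 become plain integers first.
    if isinstance(raw, float) and raw.is_integer():
        raw = int(raw)
    digits = "".join(ch for ch in str(raw) if ch.isdigit())
    return digits.zfill(12) if digits else None


def match_by_upc(inventory_rows, scraped_rows):
    """Join the two data sets on UPC.

    Returns (matched, unmatched): merged rows for inventory items that have
    scraped data, and inventory rows for which no scraped data was found.
    """
    scraped_by_upc = {}
    for row in scraped_rows:
        upc = normalize_upc(row.get("upc"))
        if upc:
            # Keep the first scraped row seen for each UPC.
            scraped_by_upc.setdefault(upc, row)

    matched, unmatched = [], []
    for row in inventory_rows:
        upc = normalize_upc(row.get("upc"))
        extra = scraped_by_upc.get(upc)
        if extra is None:
            unmatched.append(row)
        else:
            # Inventory values win when both rows define the same field.
            matched.append({**extra, **row, "upc": upc})
    return matched, unmatched
```

Returning the unmatched rows separately makes it easy to see which products still need a description or images from another source.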

The requirements for the final output:
1. Comprehensive and detailed URL for the product page that uses the product brand (e.g. Bushnell), the product family (e.g. Elite Tactical Riflescope), the product features (e.g. color, type of reticle, magnification), and the product part number (e.g. ET6245) to create a URL that looks something like http://www.website.com/bushnell-elite-tactical-6-24x50mm-mil-dot-reticle-matte-black-rifle-scope-ET6245.html
2. “Name” field – the title for the page, which will basically be the same string of text as the URL but without the hyphens, e.g. “Bushnell Elite Tactical 6-24x50mm Riflescope, Mil Dot Reticle, Matte Black, ET6245”
3. “Code” field – a product code built from parameters that we will provide. Generating it can be easily automated with those parameters. However, it is extremely important to get this right from the beginning so that products are classified by category properly. This will likely have to be done manually, and we can do that in house.
4. A price for the item – this will be derived by combining the 1st Data Set with the 2nd Data Set to match the product cost to the item; the cost will then simply be increased by a percentage to set the price. We can do that in house if we have to.
5. Product Description – this will be taken from the competitor websites or manufacturer websites utilizing the site scraping tool that we are requesting.
6. Images – some of our distributors already provide images in a file tied directly to the UPC, so it's fairly easy. The others will have to be pulled from competitor/manufacturer websites with site scraping.
7. UPC numbers – this will simply use the UPC numbers that we already have to load them into product pages
8. Manufacturer Part Numbers (MPN) – these are the part numbers that the manufacturers themselves use to identify the product. They already come in the Excel files provided in the 1st Data Set.
9. Google/Pricegrabber Taxonomy – this can easily be done in house if needed by using the “Code” field to determine what categories products need to be in.
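
To illustrate requirements 1, 2, and 4, here is a minimal Python sketch of how the URL, the “Name” field, and the price could be generated. It is an assumption-laden example, not a specification: the function names, the ordering of the parts, and the markup percentage are all hypothetical, and the base URL is the placeholder from the example above. Note that it places the features after the product family, so its URL differs slightly in word order from the example in requirement 1.

```python
import re
from decimal import Decimal, ROUND_HALF_UP


def slugify(text):
    """Lowercase the text and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_url(brand, family, features, mpn, base="http://www.website.com"):
    """Build a product URL from brand, family, features, and part number.

    Assumption: the part number is appended unchanged (e.g. 'ET6245'),
    as in the example URL, so it should not contain spaces.
    """
    slug = "-".join(slugify(part) for part in [brand, family, *features] if part)
    return f"{base}/{slug}-{mpn}.html"


def build_name(brand, family, features, mpn):
    """Build the page title from the same parts, separated by commas."""
    return ", ".join([f"{brand} {family}", *features, mpn])


def price_from_cost(cost, markup_percent):
    """Increase the cost by a percentage and round to whole cents.

    Decimal is used instead of float to avoid binary rounding surprises.
    """
    factor = (Decimal(100) + Decimal(str(markup_percent))) / Decimal(100)
    price = Decimal(str(cost)) * factor
    return price.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    features = ["Mil Dot Reticle", "Matte Black"]
    family = "Elite Tactical 6-24x50mm Riflescope"
    print(build_url("Bushnell", family, features, "ET6245"))
    print(build_name("Bushnell", family, features, "ET6245"))
    print(price_from_cost("100.00", 35))  # 35% is an arbitrary example markup
```

Keeping the URL and the name as two functions over the same inputs means both stay consistent with each other whenever the source data changes.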
This is the first stage of multiple product additions; we have at least dozens more to go through in a similar fashion. We want to find someone who is reliable in the long term and can put out quality work in a timely fashion. The payment for this job is per stage, not in total. The stages can range anywhere from 1,000 products to 50,000 products being added. The total data load is somewhere around 500,000 products.