Collate a 2-year history of horse racing results from a website that provides such information publicly.
Scripts for downloading and parsing should be written in Unix-compatible languages, and the result should be a flat CSV file of approximately 300,000 lines, each holding one 'start' (a race-runner combination).
Each row is expected to have about 20 pieces of information.
The website is in HTML and would need to be queried discreetly and sensibly (so as not to cause inconvenience or annoyance to the host).
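As a rough illustration of what "discreetly and sensibly" could mean in practice, here is a minimal Python sketch of a fetcher that waits a fixed interval between requests and identifies itself. The class name PoliteFetcher, the 5-second interval and the User-Agent string are all assumptions made for illustration; none of them come from the website or from this brief.

```python
import time
import urllib.request
from typing import Callable, Optional

# Hypothetical identification string; replace with real contact details.
USER_AGENT = "results-collator/0.1 (contact: you@example.com)"


def http_get(url: str, timeout: float = 30.0) -> bytes:
    """Fetch one page with the standard library, sending a User-Agent header."""
    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


class PoliteFetcher:
    """Enforce a minimum pause between the end of one request and the next."""

    def __init__(
        self,
        min_interval: float = 5.0,  # assumed value, tune to what the host tolerates
        get: Callable[[str], bytes] = http_get,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._get = get
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time the previous request finished

    def fetch(self, url: str) -> bytes:
        if self._last is not None:
            wait = self.min_interval - (self._clock() - self._last)
            if wait > 0:
                self._sleep(wait)
        try:
            return self._get(url)
        finally:
            # Record the time even on failure so retries are also spaced out.
            self._last = self._clock()
```

Saving each downloaded page to disk before parsing it would also mean that a parsing bug never forces a second download of the same page.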
Could you please have a look at the website: http://www.cheval-francais.eu/en/
[It should be in English, but there may be some French; use Google Translate.]
Required fields are:
race date, track, race number, runner number, horse identifier, jockey identifier, trainer identifier, order of finish (or disqualification), shoes worn (fore/hind), distance of race, race type (e.g. Monte?)
Plus any available of the following:
beaten margins (if available), run time (if available), prize money earned (if available), prize money of race (if available), state of track (if available), other
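The field list above could map onto a flat CSV layout along the following lines. This is only a sketch: the column names, their order, the split of shoes into two columns and the choice to reject rows with an empty required field are assumptions, not part of the brief. Optional fields that are not available are written as empty cells, and any 'other' data would need extra columns appended to OPTIONAL.

```python
import csv
import sys
from typing import Iterable, Mapping, TextIO

# Hypothetical column names derived from the required fields.
REQUIRED = [
    "racedate", "track", "racenumber", "runner_number",
    "horse_id", "jockey_id", "trainer_id", "finish_order",
    "shoes_fore", "shoes_hind", "distance", "race_type",
]
# Hypothetical column names for the "if available" fields.
OPTIONAL = [
    "beaten_margin", "run_time", "prize_earned", "race_prize", "track_state",
]
FIELDNAMES = REQUIRED + OPTIONAL


def write_starts(rows: Iterable[Mapping[str, object]], out: TextIO) -> int:
    """Write one CSV line per start and return the number of rows written."""
    writer = csv.DictWriter(
        out,
        fieldnames=FIELDNAMES,
        restval="",             # missing optional fields become empty cells
        extrasaction="ignore",  # silently drop keys that are not columns
        lineterminator="\n",    # Unix line endings
    )
    writer.writeheader()
    count = 0
    for row in rows:
        missing = [name for name in REQUIRED if row.get(name) in (None, "")]
        if missing:
            raise ValueError(f"row {count + 1} lacks required fields: {missing}")
        writer.writerow(row)
        count += 1
    return count


if __name__ == "__main__":
    # Placeholder values only, to show the shape of the output.
    demo = {name: "?" for name in REQUIRED}
    write_starts([demo], sys.stdout)
```

Here finish_order is kept as text rather than a number so that a disqualification marker can sit in the same column as a finishing position.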