nasdaq.com web data scraping
Closed - This job posting has been filled.
I would like to get the data from here:
For any arbitrary symbol (in the link above, the symbol is spy).
I would like the data printed out in a csv file with rows like this:
The last four columns are always 1,0,0,0.
The first column is the date. It can be passed in as a command line argument and doesn't need to be extracted from the page.
The second column is the time. Time should be something like 729 for 7:29AM EST, and 1630 for 4:30PM EST.
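To illustrate the time format, here is a minimal Python sketch. The function name minute_key is hypothetical, and it assumes the trade time has already been parsed into a datetime.time value in EST; how the page actually formats its timestamps is not covered here.

```python
from datetime import time


def minute_key(t: time) -> int:
    """Encode a time of day as HHMM with no leading zero, dropping seconds.

    Hypothetical helper: 07:29:58 becomes 729 and 16:30:00 becomes 1630.
    """
    return t.hour * 100 + t.minute
```

All trades that share the same minute_key value would then belong to the same output row.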
The next 5 columns are open, high, low, close, volume. For 729 for example, they would be derived by taking all of the trades listed for 7:29AM and extracting the open (first trade in 729), the close (last trade in 729), and the high/low, which are the highest and lowest priced trades in that minute. Finally, volume should be the sum of the volume of all trades in that minute.
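The per-minute roll-up could look roughly like the Python sketch below. The names aggregate_bars and to_csv_row are hypothetical, and the sketch assumes the trades have already been scraped into (minute, price, volume) tuples sorted oldest first. If the page lists the newest trade first, the list would need to be reversed before calling it.

```python
from typing import Iterable, List, Tuple

# (minute as HHMM, price, volume); assumed to be in chronological order
Trade = Tuple[int, float, int]
Bar = Tuple[int, float, float, float, float, int]


def aggregate_bars(trades: Iterable[Trade]) -> List[Bar]:
    """Collapse trades into one (minute, open, high, low, close, volume) bar per minute."""
    bars = {}
    for minute, price, volume in trades:
        if minute not in bars:
            # First trade of the minute sets open, high, low and close.
            bars[minute] = [price, price, price, price, volume]
        else:
            bar = bars[minute]
            bar[1] = max(bar[1], price)  # high
            bar[2] = min(bar[2], price)  # low
            bar[3] = price               # close is the latest trade seen
            bar[4] += volume             # running volume total
    return [(minute, *bars[minute]) for minute in sorted(bars)]


def to_csv_row(date: str, bar: Bar) -> list:
    """Build one output row: date, time, OHLCV, then the constant 1,0,0,0."""
    return [date, *bar, 1, 0, 0, 0]
```

Each row from to_csv_row can be passed to csv.writer(...).writerow to produce the file.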
Lastly, I would like data for the entire premarket session (4AM to 9:30AM) and the entire aftermarket session (4PM to 8PM). This means in the part of the page where it says: "In the drop-down select the time range to see more trades:", you will need to query the different time intervals they have listed there, and then grab data for every trade in that interval by going through the pages for each time interval.
Please sleep for 100ms between requests so we don't DDOS the nasdaq servers by accident (but make this a changeable parameter in the code).
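A rough Python sketch of walking every time interval and every page with a pause after each request is below. Everything here is an assumption for illustration: fetch_page is a hypothetical callable that downloads and parses one page of trades for one interval, and it is assumed to return an empty list once the page number runs past the last page. The actual nasdaq.com URLs and page layout are not shown.

```python
import time
from typing import Callable, Iterable, Iterator, List


def fetch_all_trades(
    intervals: Iterable[int],
    fetch_page: Callable[[int, int], List],
    delay_seconds: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator:
    """Yield every trade from every page of every interval, pausing after each request."""
    for interval in intervals:
        page = 1
        while True:
            rows = fetch_page(interval, page)  # hypothetical: one HTTP request
            sleep(delay_seconds)               # throttle; 0.1 s = 100ms by default
            if not rows:
                break                          # assumed end-of-pages signal
            yield from rows
            page += 1
```

The delay is a plain parameter, so it can be wired to a command line option, and the sleep function can be swapped out when testing.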
The output filename should be table_spy.csv in this case. The output directory should be given to the code as an input, along with the path to a text file containing symbols. The text file would look something like this:
where there is one symbol per row.
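For the inputs described above, a command line interface might be sketched in Python like this. The argument names, their order, and the --delay-ms option are assumptions, not part of the request; only the table_<symbol>.csv naming and the one-symbol-per-row file come from the description.

```python
import argparse
from pathlib import Path
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the date, symbols file, output directory and request delay."""
    parser = argparse.ArgumentParser(description="Scrape per-minute trade bars")
    parser.add_argument("date", help="value written to the first CSV column")
    parser.add_argument("symbols_file", type=Path, help="text file, one symbol per row")
    parser.add_argument("output_dir", type=Path, help="directory for the CSV files")
    parser.add_argument("--delay-ms", type=int, default=100,
                        help="pause between requests in milliseconds")
    return parser.parse_args(argv)


def read_symbols(path: Path) -> List[str]:
    """Return the non-empty lines of the symbols file, stripped of whitespace."""
    return [line.strip() for line in path.read_text().splitlines() if line.strip()]


def output_path(output_dir: Path, symbol: str) -> Path:
    """Build the output file path, e.g. table_spy.csv for the symbol spy."""
    return output_dir / f"table_{symbol}.csv"
```

An invocation could then look like: python premarket.py 20240102 symbols.txt ./out --delay-ms 100 (the script name is hypothetical).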
I would like separate scripts for premarket and aftermarket (they should be extremely similar). Both scripts should be able to run from the Linux command line. The language used does not matter, but please specify when responding what language you will use.
The budget for this job is $100, and I am seeking to have this done in the next 48 hours.