Scrape Canadian postal codes

Closed - This job posting has been filled and work has been completed.

Job Description

Canadian postal codes look like this: A1A 1B1 (first a letter, then a digit, then a letter, then a space, then a digit, etc.). The first three characters are called the FSA (Forward Sortation Area).
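
A rough sketch of how the format above could be checked in Python and the FSA pulled out. The name extract_fsa is just illustrative, and the pattern only follows the letter/digit layout described here; it does not check whether a given code is actually in use.

```python
import re

# Layout from the description: letter, digit, letter, space, digit, letter, digit.
POSTAL_CODE_RE = re.compile(r"^([A-Z][0-9][A-Z]) ([0-9][A-Z][0-9])$")


def extract_fsa(postal_code):
    """Return the three-character FSA, or None if the layout does not match."""
    match = POSTAL_CODE_RE.match(postal_code.strip().upper())
    return match.group(1) if match else None


print(extract_fsa("A1A 1B1"))
```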

I want to collect the Canadian FSAs, a sample of cities (1 or more) for each FSA, and the latitude/longitude coordinates for each of those cities.

This information could easily be collected from Wikipedia. Look here:

http://en.wikipedia.org/wiki/List_of_N_postal_codes_of_Canada

That page is a partial list of the FSAs in Canada.

All the FSAs with sample cities are available on Wikipedia. Each of those cities has lat/long info on its Wikipedia page.

For example:
http://en.wikipedia.org/wiki/Kitchener,_Ontario

NOTE: I need lat/long in WGS84 floating point (decimal degrees) format, not in degrees, minutes, seconds format. For this, you'll have to 'click through' from the city's Wikipedia page to its GeoHack coordinates page,

for example:
http://toolserver.org/~geohack/geohack.php?pagename=Kitchener,_Ontario&params=43_27_N_80_29_W_type:city_region:CA-ON

The coordinates I want are near the top of that page, on the second line from the top. In this case, they are: 43.45, -80.483333
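
As a cross-check, the same decimal values can be worked out from the params part of the GeoHack link instead of being read off the page. This is only a sketch: it assumes the params follow the layout in the example above (numbers, then N/S, then numbers, then E/W), which may not hold for every city, and the function names are made up for illustration.

```python
from urllib.parse import urlparse, parse_qs


def dms_to_decimal(parts, hemisphere):
    """Convert [degrees, minutes, seconds...] strings to signed decimal degrees."""
    value = sum(float(p) / 60 ** i for i, p in enumerate(parts))
    return -value if hemisphere in ("S", "W") else value


def geohack_url_to_decimal(url):
    """Return (latitude, longitude) parsed from the 'params' query field."""
    params = parse_qs(urlparse(url).query)["params"][0]
    coords, numbers = [], []
    for token in params.split("_"):
        if token in ("N", "S", "E", "W"):
            coords.append(dms_to_decimal(numbers, token))
            numbers = []
            if len(coords) == 2:
                break
        else:
            numbers.append(token)
    if len(coords) != 2:
        raise ValueError("unexpected params layout: " + params)
    return round(coords[0], 6), round(coords[1], 6)


url = ("http://toolserver.org/~geohack/geohack.php?pagename=Kitchener,_Ontario"
       "&params=43_27_N_80_29_W_type:city_region:CA-ON")
print(geohack_url_to_decimal(url))  # (43.45, -80.483333)
```

27 minutes is 27/60 = 0.45 degrees and 29 minutes is 29/60 = 0.483333 degrees, with west longitude written as negative, which matches the numbers shown on the page.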


RURAL:

Those Wikipedia lists of postal codes have a 'Rural' section with large numbers of small towns. For these sections, please deliver the locations of just 2 towns per FSA instead of all the towns.
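
If the collection is scripted, the rural rule could be applied with something like this. It's a sketch only: select_towns and its arguments are hypothetical names, and it assumes you already know whether an FSA came from a 'Rural' section.

```python
def select_towns(towns, is_rural, rural_limit=2):
    """Keep every town for a non-rural FSA, only the first `rural_limit` for a rural one."""
    return list(towns[:rural_limit]) if is_rural else list(towns)


print(select_towns(["Town A", "Town B", "Town C"], is_rural=True))
```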

QUALITY CONTROL:

I will inspect 10% of your results by hand. I will not pay for the contract if I find more than 1 mistake (an inaccurate coordinate or an invalid city).
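
A random 10% sample for this kind of spot check could be drawn roughly as follows. This is only an illustration; qc_sample and its parameters are invented names, and the hand inspection itself is not automated here.

```python
import math
import random


def qc_sample(rows, fraction=0.10, seed=None):
    """Return a random sample of about `fraction` of the rows (at least 1 if any exist)."""
    if not rows:
        return []
    size = max(1, math.ceil(len(rows) * fraction))
    return random.Random(seed).sample(rows, size)


rows = [("FSA%d" % i, "Town %d" % i) for i in range(20)]  # placeholder rows
print(len(qc_sample(rows, seed=1)))
```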

SIZE:

I'm expecting about 1500-2000 results (town,coordinates)

OUTPUT:

I'd like a spreadsheet (preferably CSV) where every row is a 'city'. Each row should have the FSA in the 1st column, the city name in the 2nd column, a link to that city's Wikipedia page in the 3rd column, the city's latitude in the 4th column, and its longitude in the 5th column.
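
For illustration, a row in that layout could be written with Python's csv module like this. The FSA value in the sample row is an assumed placeholder, not something taken from the lists above, and write_rows is a made-up name. The csv module quotes fields that contain commas (such as "Kitchener, Ontario"), so the columns stay aligned.

```python
import csv
import io


def write_rows(rows, out):
    """Write (fsa, city, wikipedia_url, latitude, longitude) rows, one city per row."""
    writer = csv.writer(out)
    for fsa, city, url, lat, lon in rows:
        writer.writerow([fsa, city, url, lat, lon])


sample = [
    # FSA below is a placeholder for illustration only.
    ("N2G", "Kitchener, Ontario",
     "http://en.wikipedia.org/wiki/Kitchener,_Ontario", 43.45, -80.483333),
]
buffer = io.StringIO()
write_rows(sample, buffer)
print(buffer.getvalue())
```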


Thanks in advance!