Create 2 excel files using databases

Create 2 excel files using databases

Closed - This job posting has been filled and work has been completed.

Job Description

Please produce two excel files using databases.

Step 1)
Download the “.com” zonefile
www.dnsfan.com/com.zone.gz (Warning this file is HUGE! 1.9GB)
Create a sub-set database from this with every “.com” that contains the word “realty”

Step 2)
Download a database with all cities in the world along with population. (over 1,000 people)

http://download.geonames.org/export/dump/cities1000.zip
http://download.geonames.org/export/dump/cities5000.zip (population listed)
http://download.geonames.org/export/dump/cities15000.zip (population listed)

Step 3)
Using both databases above, Break the “realty” list created from the first database into two separate excel files.
a) Exact match Geographic Locations.xls
COLUMN names City, Country, Region, Population, TotalCities, matches
For example Kirklandrealty.com matches “Kirkland” population 45,778
In the excel file put these three columns per city: “city, country, region, population”
Do not allow other characters in the city name, only {A through Z and hyphen}.
For cities that have special characters other than A through Z put three versions.
An example would be “rio de janeiro” put three rows:
No spaces version: Row 1) riodejaneiro, Brazil, “6,323,037”
Hyphen version: Row 2) rio-de-janeiro, Brazil, “6,323,037”
First word version: Row 3) rio, Brazil, “6,323,037”
An example for Marriott-Slaterville,UT would be
No spaces version: Row 1) marriottslaterville, USA, UT, “3,037”
Hyphen version: Row 2) marriott-slaterville, USA, UT, “3,037”
First word version: Row 3) marriott, USA, UT, “3,037”
Do not allow duplicate city names. Only allow one row per unique city name.
If there are two cities named moscow, please just list how many cities in the world have that same name in the TotalCities column. Combine populations in the population column.
SeattleRealty.com and RealtySeattle.com would only result in one row. However in demand column for Seattle you would put 2 rather than 1.
b) Not exact match.xls
COLUMN names Name, matches
Put “name” in a row, with no “.com”.
Realty101.com and 101Realty.com would only result in one row. However in matches column for 101 you would put 2 rather than 1.