Somehow, while farting around online the other night, I stumbled across a file containing every town in the world. Yes, EVERY TOWN. At least, that’s what the file owner says. (The data was created by MaxMind and is available at http://www.maxmind.com.) Not only does it contain the name and nation of each town, it also contains the latitude and longitude. And on top of all this neat stuff, it is also FREE.
Not one to pass up a bargain, I decided to go for it and download the file. It was compressed, of course, in the .txt.gz (gzip) format. The compressed file took about 30 seconds to download via my trusty Frontier FiOS 20Meg Internet connection. I know the math doesn’t work out on the data size/download time, but sometimes you’re hindered by the other end of your connection. At any rate, the downloaded file is 31.1 meg in size. I unzipped it, and the resulting text file is 123 meg.
Not sure what I’d find when I opened it, or even whether I could open it at all, I tried Notepad. Notepad successfully opened the file after about 4 or 5 minutes. When I saw the contents, I realized that the data was comma-delimited. In other words, each line is a record whose fields are separated by commas. This type of text file is very useful when you want to import the data into a spreadsheet or database. My first thought was to use Microsoft Excel, so I tried that. The file opened quickly, but a warning window popped up telling me that only a portion of the file had been opened. I checked the number of rows and saw that there were 65,536, the maximum allowable in Excel 2000. Only a very small percentage of the total file had been sucked into Excel. This was not good enough.
I then tried to open the file using Microsoft Access. This worked out nicely. It took seconds for the data to be absorbed into an Access table. The table is complete, holding each and every town in the world (all 2,699,354 of them) along with the nation and coordinates. I am very happy.
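For anyone who would rather script this than click through Access, here is a rough Python sketch of the same idea: read the gzipped, comma-delimited file one row at a time and count the rows, so neither Notepad nor Excel's row cap gets in the way. The function name count_rows is made up for illustration, and two things are assumptions rather than facts from the file itself: that the first line is a header row, and that the text is Latin-1 encoded.

```python
import csv
import gzip

EXCEL_2000_ROW_LIMIT = 65536  # the row cap Excel 2000 stopped at


def count_rows(path, encoding="latin-1"):
    """Count data rows in a gzipped, comma-delimited text file.

    Rows are streamed one at a time, so the whole file never has to
    fit in memory. The encoding default is an assumption.
    """
    with gzip.open(path, "rt", encoding=encoding, newline="") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # assumption: the first line is a header row
        return sum(1 for _ in reader)


def fits_in_excel_2000(row_count):
    """Return True if that many rows (plus a header) fit in one sheet."""
    return row_count + 1 <= EXCEL_2000_ROW_LIMIT
```

Calling count_rows on the downloaded .txt.gz file (whatever you named it) should give a figure you can compare against the Access table, and fits_in_excel_2000 on a count of 2,699,354 comes back False, which matches the warning Excel gave.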
But now what happens? What possible use can I make of such an overwhelming amount of data?
I enjoy looking at maps and discovering unusual and humorous town names, so I can spend some time doing that as I scroll through the table. There must be lots of strange and exotic town names out there. I’m excited.
One thing that threw me off at first is the identifier for the nation. Each identifier is two letters, and the data is sorted by nation (using this identifier) and then by town name. The first nation is “AD” and the last one is “ZW.” It turns out that the nation identifiers used are the Internet country codes. A list of these codes can be found here.
If you’re wondering (or if you’re not), the first town listed is Aixàs in Andorra. The last town is Zvishavane in Zimbabwe. The name of my home town, South Bend, appears several times, but the US state in which each one is located is not identified. You have to look the place up by its latitude and longitude to determine where it is and which state it is in. So the data could use some enhancement to make it really cool.
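Hunting down every copy of a town name is easy to script. The sketch below is only an illustration: the helper find_town and the column names Country, City, Latitude and Longitude are assumptions, so check them against the actual header line of the file. The three sample rows are made up to stand in for the real data, with rough coordinates.

```python
import csv
import io


def find_town(lines, name, name_col="City"):
    """Yield every row whose town name matches `name`, ignoring case.

    `lines` is any iterable of text lines that starts with a header row.
    The default column name is an assumption about the file layout.
    """
    wanted = name.casefold()
    for row in csv.DictReader(lines):
        if row[name_col].casefold() == wanted:
            yield row


# Tiny made-up sample standing in for the real file; coordinates are rough.
SAMPLE = io.StringIO(
    "Country,City,Latitude,Longitude\n"
    "us,South Bend,41.68,-86.25\n"
    "us,South Bend,46.66,-123.80\n"
    "us,Mishawaka,41.66,-86.16\n"
)

for match in find_town(SAMPLE, "south bend"):
    print(match["Country"], match["City"], match["Latitude"], match["Longitude"])
```

Each match comes back with its coordinates, which is exactly what you need to go figure out which state a given South Bend is in.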
As for the coordinates, Google Earth is a good tool for looking them up. You can type in a latitude and longitude and GE will take you right to the place. You can be a globetrotter without leaving your easy chair. What would really be fun is to run the coordinates of every town in the database. I wonder how long that will take me . . .
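One way to avoid typing coordinates in one at a time would be to hand Google Earth a KML file, which is the placemark format it can open. This is a minimal sketch, not a finished tool: the function towns_to_kml is a hypothetical helper, and it assumes you have already pulled (name, latitude, longitude) tuples out of the table. Note that KML lists longitude before latitude.

```python
from xml.sax.saxutils import escape


def towns_to_kml(towns):
    """Build a KML document string from (name, latitude, longitude) tuples."""
    parts = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<kml xmlns="http://www.opengis.net/kml/2.2">',
        "<Document>",
    ]
    for name, lat, lon in towns:
        # KML coordinates are written longitude first, then latitude.
        parts.append(
            "<Placemark><name>{}</name>"
            "<Point><coordinates>{},{}</coordinates></Point>"
            "</Placemark>".format(escape(name), lon, lat)
        )
    parts += ["</Document>", "</kml>"]
    return "\n".join(parts)
```

Writing the returned string to a file ending in .kml (encoded as UTF-8) and opening it in Google Earth should drop a pin on each town. Feeding it all 2,699,354 rows at once is probably asking for trouble, so one country at a time seems like the saner way to start.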