In this notebook, you will learn how to geocode different sorts of location data by making requests to several online APIs (Application Programming Interfaces) for the latitude and longitude co-ordinates associated with those locations.
The aim of the notebook is not to teach you formal approaches for working with APIs or the data that is returned from them. Instead, it's something to whet your curiosity: something to show you how, with a few lines of Python code, you can start to work with live, third-party data sources and online services to perform real-world programming tasks.
If something doesn't work: DON'T PANIC. You won't break your computer and you won't break the internet. And you won't fail the module if you just move on!
The location data we will consider includes:
#The requests library makes it easy to call URLs using Python
import requests
Postcodes are a widely used form of location data, typically capable of identifying a location to a resolution of a few hundred square metres.
There are several online services that will return geolocation information given a postcode.
To call the service, we construct a URL as defined for a particular API and make a request to that URL using the Python requests package.
Data is often returned from webservices using the JSON (JavaScript Object Notation) data format, although some APIs allow you to specify other formats such as XML.
(One advantage of the JSON response is that it can be immediately consumed by a JavaScript script called from inside a webpage.)
JSON and XML both allow data to be represented in a structured, tree-based, hierarchical format. The first API we will use, published via the postcodes.io website, structures its response data in the following way:
The result node is at the top of the tree with children postcode, latitude, longitude and so on. The codes child has further children, such as: admin and parish.
In Python, data structures of this form can be represented using the dict ("dictionary") structure, which you will meet elsewhere in the course.
The Python requests library has a method, .json(), that parses a correctly formed JSON response as a Python dict, or more generally, as a set of nested dicts. In this case, one dict structure may be nested inside another to support child, grandchild, great grandchild, and so on, levels of structure.
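As a rough illustration of what that parsing step does, the sketch below uses Python's standard json module on a hand-written JSON string rather than a live response. The string and its values are made up for illustration (the co-ordinates are placeholders), and the requests method is assumed to behave in a broadly similar way for a well-formed response.

```python
import json

# A hand-written JSON string standing in for the text of an API response.
# The values are illustrative placeholders, not real API output.
raw_text = '{"result": {"postcode": "MK7 6AA", "latitude": 52.0, "longitude": -0.7}}'

# json.loads() parses the JSON text into nested Python dicts
parsed = json.loads(raw_text)

print(type(parsed))            # the outer structure is a dict
print(type(parsed["result"]))  # ...and so is the nested "result" node
print(parsed["result"]["postcode"])
```

The point to notice is that, once parsed, the data is an ordinary Python object: no special JSON handling is needed to look inside it.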
The contents of different levels of the nested dict data structure can be accessed by using a form of associative, relative addressing. For example, if the variable mypostcode is set to the dict shown above, we could access the contents of the main result part of the data structure by writing: mypostcode["result"].
To obtain the value of items in deeper nested parts of the data structure, we simply add further levels of relative addressing. To fetch the value of the postcode, we need to specify the path to it via the result node: mypostcode["result"]["postcode"]. To obtain the value of the parish in the codes part of the data structure, we specify the path to it as mypostcode["result"]["codes"]["parish"].
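The addressing scheme just described can be tried out on a small hand-built dict. This is only a sketch: the key names follow the description in the text (with the nested node named codes), while the values are placeholders rather than real output from the API.

```python
# An illustrative stand-in for the dict described above.
# The key names follow the text; the values are placeholders, not real API output.
mypostcode = {
    "result": {
        "postcode": "MK7 6AA",
        "latitude": 52.0,
        "longitude": -0.7,
        "codes": {
            "admin": "ADMIN_CODE_PLACEHOLDER",
            "parish": "PARISH_CODE_PLACEHOLDER",
        },
    }
}

# One level down: the whole "result" node (itself a dict)
result = mypostcode["result"]

# Two levels down: a value inside "result"
postcode_value = mypostcode["result"]["postcode"]

# Three levels down: a value inside the nested "codes" node
parish_value = mypostcode["result"]["codes"]["parish"]

print(postcode_value)
print(parish_value)
```

Each extra pair of square brackets steps one level further down the tree.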
Run the following cell to call the postcodes.io API with a particular postcode.
See if you can make sense of the result that is returned.
postcode = 'MK7 6AA'
r=requests.get('https://api.postcodes.io/postcodes/{PC}'.format(PC=postcode))
r.json()
Try rerunning the previous cell using different postcodes - can the service locate your home postcode?
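If you want to try lots of postcodes, it can help to wrap the URL pattern in a small function. The sketch below is one possible way of doing that using only the standard library; the function name build_postcode_url is made up for this example, and percent-encoding the postcode by hand is an extra precaution rather than something the cell above requires.

```python
from urllib.parse import quote

def build_postcode_url(postcode):
    """Return the postcodes.io lookup URL for a postcode (hypothetical helper)."""
    # Percent-encode the postcode so that, for example, a space becomes %20
    return 'https://api.postcodes.io/postcodes/{PC}'.format(PC=quote(postcode.strip()))

url = build_postcode_url('MK7 6AA')
print(url)
```

The resulting URL could then be passed to requests.get() in the same way as before.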
postcodes.io JSON data
Once we have retrieved the data from the API, and cast it as a Python data object, we can look inside it programmatically.
For example, we can find the latitude and longitude values.
#Obtain the lat/long of a postcode
lat=r.json()['result']['latitude']
lon=r.json()['result']['longitude']
#Display the result
print(lat,lon)
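If a postcode is not recognised, the response may not contain a result node with co-ordinates in it, in which case the square-bracket lookups above would raise an error. The following sketch shows one defensive way of pulling out the values. The function name extract_lat_lon is hypothetical, the dicts are hand-written stand-ins for r.json(), and the shape of the failed response is an assumption made for illustration.

```python
def extract_lat_lon(data):
    """Return (latitude, longitude) from a parsed response, or None if unavailable."""
    # .get() returns None rather than raising KeyError when a key is missing
    result = data.get('result')
    if not isinstance(result, dict):
        return None
    lat = result.get('latitude')
    lon = result.get('longitude')
    if lat is None or lon is None:
        return None
    return (lat, lon)

# Hand-written examples standing in for r.json(); the values are placeholders
found = {'result': {'postcode': 'MK7 6AA', 'latitude': 52.0, 'longitude': -0.7}}
not_found = {'error': 'placeholder error message'}  # assumed shape of a failed lookup

print(extract_lat_lon(found))
print(extract_lat_lon(not_found))
```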
Having access to the latitude and longitude means we can start to make use of that information, for example by plotting it on a map.
You may recall how we previously used the folium package to generate interactive maps from Python code within a notebook.
We can do a similar thing again here.
!pip3 install folium
#Plot the lat long of a postcode on a map
#We need to import the following packages to access the maps
import folium
#Create a map centered on the postcode location at a particular zoom level
mymap = folium.Map(location=[lat, lon], zoom_start=15)
#Create a popup message using Python string formatting to create the label based on variable values
popupstr = 'Location of {PC}: ({lat},{lon})'.format(PC=postcode, lat=lat,lon=lon)
#Display a marker for the location
folium.Marker([lat, lon], popup=popupstr).add_to(mymap)
mymap
As well as geolocating postcodes, we can also geocode complete (or partial) addresses. One API that supports address-based geocoding is the Google Maps geocoding API.
Once again, we need to construct a URL according to a pattern defined by the API documentation. Then we can make a request to that URL and hopefully get the geocoded data back as a response.
address='Open University, Walton Hall, Milton Keynes, MK7 6AA, UK'
#Note: this API may now require an API key, passed as an additional 'key' parameter
r = requests.get("https://maps.googleapis.com/maps/api/geocode/json", params={'address': address, 'sensor': "false"})
r.json()
Try rerunning the previous cell with an address that is familiar to you. Does the API find it?
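The response from the Google geocoding service is nested more deeply than the postcode one. As far as the public documentation describes it, matches are returned in a results list, with each match holding its co-ordinates under geometry and then location, using the keys lat and lng. The sketch below works on a hand-written dict with placeholder values rather than a live response, and the helper name first_location is made up for this example.

```python
def first_location(geocode_response):
    """Return (lat, lng) of the first geocoding result, or None if there is none."""
    results = geocode_response.get('results', [])
    if not results:
        return None
    # Follow the path results -> first item -> geometry -> location
    location = results[0]['geometry']['location']
    return (location['lat'], location['lng'])

# Hand-written stand-ins for r.json(); real responses contain many more fields
sample_response = {
    'status': 'OK',
    'results': [
        {
            'formatted_address': 'PLACEHOLDER ADDRESS',
            'geometry': {'location': {'lat': 52.0, 'lng': -0.7}},
        }
    ],
}
empty_response = {'status': 'ZERO_RESULTS', 'results': []}

print(first_location(sample_response))
print(first_location(empty_response))
```

Note the mix of list indexing (results[0]) and dict addressing in the same path.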
Import the time library and add the statement time.sleep(1) inside the loop to pause its execution for one second during each iteration.
Use a folium map object to display several markers, one for each of your (looped) postcodes. Inside the postcode loop add a corresponding marker to the map. Don't forget to render the map from the last line of code in the cell.
As well as looking up geolocation data for a postal address, we can also try to look up a location based on the IP address of a computer. There are several websites that allow you to look up the IP address of the device you are using to connect to the internet, and several webservices too.
I'm going to use a simple service from Amazon Web Services that returns an IP address terminated by an end of line (\n) character. By using the requests library, I can call the URL, access the data response (text) and then strip (.strip()) the end-of-line whitespace character from it.
myIPaddress=requests.get('http://checkip.amazonaws.com/').text.strip()
myIPaddress
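To see what the .strip() step is doing without making a request, here is a small sketch using a hand-written string in place of the service's response. The address is a placeholder taken from a range reserved for documentation, and the check with the standard ipaddress module is an optional extra that is not part of the original cell.

```python
import ipaddress

# A stand-in for the .text of the response: an address followed by an end-of-line character
raw_response_text = '203.0.113.7\n'   # placeholder address from a documentation-only range

# .strip() removes leading and trailing whitespace, including the trailing \n
cleaned = raw_response_text.strip()

print(repr(raw_response_text))
print(repr(cleaned))

# ipaddress.ip_address() raises a ValueError if the text is not a valid IP address
parsed_ip = ipaddress.ip_address(cleaned)
print(parsed_ip.version)
```

Using repr() makes the otherwise invisible end-of-line character show up in the printed output.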
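#We can construct a URL based around the IP address of the machine making the request as follows:
#(Note: the freegeoip.net service may no longer be available; if the request fails, try another IP geolocation service.)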
url='https://freegeoip.net/json/{IP}'.format(IP=myIPaddress)
url
r=requests.get(url)
r.json()
The result may surprise you, for example if the notebook and the Python process associated with it are running on a server hosted in the cloud. In this case, try looking up the IP address associated with the computer you are using to access the internet. You can find this IP by visiting the link: http://checkip.amazonaws.com/.
The Google geolocation API can be used to look up the geographical locations (latitude and longitude co-ordinates) of cell towers and wifi hotspots based on their unique IDs.
To call the Google webservice to look up the geographical locations of cell towers or wifi hotspots from their IDs, you will need to get a Google Geolocation API token: visit https://developers.google.com/maps/documentation/geocoding/get-api-key and follow the instructions on how to get a key for the geolocation API.
When you have obtained your key, use it to set the googleMapsAPIkey variable below.
googleMapsAPIkey="YOUR_KEY_HERE"
Once you have set your Google API key, run the following cell to look up the details of a particular cell tower:
#Add your cell tower details here.
#You can find them using an app such as the OpenSignal app
postjson = {
    "cellTowers": [
        {
            "mobileCountryCode": 234,
            "mobileNetworkCode": 15,
            "locationAreaCode": 714,#979,
            "cellId": 1671#42333969
        }
    ]
}
url='https://www.googleapis.com/geolocation/v1/geolocate?key={}'.format(googleMapsAPIkey)
print(postjson)
r = requests.post(url, json=postjson)
r.json()
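A successful response from the geolocation service is, as far as its public documentation describes it, a small dict with a location node (holding lat and lng) and an accuracy value in metres, while a failed request returns an error node instead. The sketch below summarises either case. It runs on hand-written dicts with placeholder values rather than real responses, and the function name describe_geolocation is made up for this example.

```python
def describe_geolocation(response):
    """Summarise a parsed geolocation response as a short string."""
    # A failed request is assumed to carry an "error" node with a "message" inside it
    if 'error' in response:
        return 'Lookup failed: {}'.format(response['error'].get('message', 'unknown error'))
    loc = response['location']
    return 'lat={lat}, lng={lng}, accuracy={acc}m'.format(
        lat=loc['lat'], lng=loc['lng'], acc=response.get('accuracy'))

# Hand-written stand-ins for r.json(); the values are placeholders
ok_response = {'location': {'lat': 52.0, 'lng': -0.7}, 'accuracy': 1000.0}
failed_response = {'error': {'code': 400, 'message': 'placeholder error message'}}

print(describe_geolocation(ok_response))
print(describe_geolocation(failed_response))
```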
As well as services that provide access to directories that try to associate IP addresses with physical locations, there are also databases that try to associate the MAC addresses of wifi routers with physical locations.
If your computer has wifi enabled, you can use a low-level command on your computer that identifies in-range wifi routers and provides administrative information about them.
STILL NEEDS TESTING & REFINING - TO DO
Note that to call the Google webservice to look up the geographical locations of cell towers or wifi hotspots from their IDs, you will need to get a Google Geolocation API token: visit https://developers.google.com/maps/documentation/geocoding/get-api-key and follow the instructions on how to get a key for the geolocation API.
When you have obtained your key, use it to set the googleMapsAPIkey variable below.
Also note that the code may look a little bit involved. But DON'T PANIC, you don't need to be able to write, or even read, this sort of code for the purposes of this course.
googleMapsAPIkey="YOUR_KEY_HERE"
import sys
import requests
#http://stackoverflow.com/a/9859202/454773
def isInt_str(v):
    #Return True if the value v looks like an integer when written as a string
    v = str(v).strip()
    return v=='0' or (v if v.find('..') > -1 else v.lstrip('-+').rstrip('0').rstrip('.')).isdigit()
#/System/Library/PrivateFrameworks/Apple80211.framework/Resources/airport
import subprocess
def getWifiMacAddresses():
    #autodetect platform and then report based on this?
    print(sys.platform)
    macAddr={}
    #For Mac:
    if sys.platform=='darwin':
        results = subprocess.check_output(["/System/Library/PrivateFrameworks/Apple80211.framework/Resources/airport", "-s"])
        results = results.decode("utf-8").split("\n")
        #Skip the header row and any blank lines
        for l in [x.strip() for x in results[1:] if x.strip()!='']:
            #We could use a regular expression - or we can construct our parser a step at a time...
            fields=l.split(' ')
            macAddress=fields[1]
            strength=fields[2]
            if isInt_str(strength):
                macAddr[fields[0]]={'macAddress':macAddress,
                                    'signalStrength':int(strength)}
    #For Windows (tested after darwin, which also contains the string 'win'):
    elif 'win' in sys.platform:
        results = subprocess.check_output(["netsh", "wlan", "show", "network", "mode=bssid"])
        #check_output returns bytes, so decode before treating the output as text
        results = results.decode("utf-8", errors="ignore").replace("\r","").split("\n")
        ssid='UNKNOWN'
        for l in results[4:]:
            if l.startswith('SSID'):
                ssid=':'.join(l.split(':')[1:]).strip()
            if 'BSSID' in l:
                #Rejoin on ':' so the whole MAC address is kept, not just its first part
                macAddr[ssid]={'macAddress':':'.join(l.split(':')[1:]).strip()}
                ssid='UNKNOWN'
    elif 'linux' in sys.platform:
        #linux?
        #! apt-get -y install wireless-tools
        #results = subprocess.check_output(["iwlist","scanning"])
        #via PP - linux text - TO DO
        # apt-get -y install wireless-tools then run iwlist scanning to display the details of wireless access points your computer can see.
        #apt-get -y install wireless-tools gave me "Could not open the lock file ..."
        #However when I checked in the Ubuntu Software Centre wireless-tools was already installed. I think non-expert users may use the Software Centre to install additional applications.
        #iwlist just give you a not very helpful usage list. What works directly is:
        #iwlist wlan0 scan
        pass
    return macAddr
postjson={'wifiAccessPoints':[]}
hotspots=getWifiMacAddresses()
for h in hotspots:
    postjson['wifiAccessPoints'].append(hotspots[h])
    print(h,hotspots[h])
print('JSON posted to Google service: ',postjson)
url='https://www.googleapis.com/geolocation/v1/geolocate?key={}'.format(googleMapsAPIkey)
r = requests.post(url, json=postjson)
r.json()
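If the wifi scan does not work on your machine, you can still see the shape of the data that gets posted to the service. The sketch below builds the same sort of payload from a hand-written dict of hotspots. The network names, MAC addresses (taken from a range reserved for documentation) and signal strengths are all made up for illustration, and no request is actually sent.

```python
import json

# Hypothetical scan results, keyed by network name, in the shape built by the macOS branch above
hotspots = {
    'ExampleNetworkA': {'macAddress': '00:00:5e:00:53:01', 'signalStrength': -60},
    'ExampleNetworkB': {'macAddress': '00:00:5e:00:53:02', 'signalStrength': -75},
}

# The service expects a list of access point records under the wifiAccessPoints key
postjson = {'wifiAccessPoints': list(hotspots.values())}

# Pretty-print the payload as the JSON text that would be sent
print(json.dumps(postjson, indent=2))
```

The network names are only used as dict keys on our side; just the access point records themselves end up in the payload.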
In this notebook, you have learned how to geocode several different sorts of location identifier - postcodes, postal addresses, IP addresses and maybe even the MAC addresses of any wifi routers in view of your computer.
You have also seen how we can take the JSON data returned from the geolocation services and parse it as a Python dict that we can then start to work with as data, for example, by plotting markers associated with identified locations on an interactive map.