Understanding what geocoding is, how it works, and its applications to everyday life.
Geocoding is the transformation of a human-readable address into a latitude and longitude coordinate. For example, 99 Upper Lewes Road, Brighton, England would be transformed into (50.83565°, -0.127616°), which then enables computer systems such as GPS navigation apps to help us humans navigate to that location.

The geocoder itself is a piece of software which performs the transformation from address to location. Companies like Google, Mapbox, and Naurt run this software in a cloud environment and provide it as an online service where you can upload an address to receive a latitude and longitude response.
The most common use of geocoding is navigation. Navigation algorithms can’t work with human-readable addresses; they need latitude and longitude coordinates. A geocoder is therefore the first step in any navigation process, from Google Maps to your TomTom Sat-Nav.
In a similar vein to navigation is last mile delivery. Instead of navigating to a single address, last mile delivery drivers have to navigate to multiple addresses along a planned route. Planning an optimal route requires accurate geocoding, as any wasted time can result in lost profits across the fleet. Once en route, the driver needs the best information possible on where to park and where to drop the parcel off at each address. This data can save them minutes on each delivery, which makes a big difference across the entire fleet.
If a delivery driver is en route to you, it’s likely you bought the product online. Whenever you buy something online, there are always too many forms to fill in. Luckily, address fields have become more automated thanks to geocoders and similar tools. You may have been able to look up your address from a street number and postal code, or even scroll through all addresses within your postal code and select one. If you entered it manually, a validity check was almost certainly carried out on the address to check that it really exists. These processes have been designed to reduce friction for the customer and reduce delivery confusion for the company.
Where did you first see the product you ordered? Possibly in an advertisement? Both physical and online ads are targeted based on location. When you’ve previously ordered a product the company has collected your address, geocoded it into a location, and stored it. They’re then able to go ahead and plot it alongside all their other customers’ locations to spot geospatial trends in sales, marketing, customer satisfaction, and any other metric they may be interested in.
When insuring your house, car, or any other property, insurance companies will need to know where you live. Again, instead of a location, you’ll provide your address which then gets converted into a location by a geocoder. The insurance company will then go away and compare your location to geospatial data on natural disasters and crime rates to arrive at a reasonable insurance premium.
Geocoders are structured around one core component - a database. The database is a collection of addresses, towns, cities, administrative boundaries, roads, and any other geospatial data you may be interested in searching, such as bus stops, shops, parks, landmarks, or rivers. On top of the database there are three generic services, parsing, searching, and ranking, which help you find what you’re looking for.
When an address is entered into the geocoder it could be in any format, so it needs to be standardised. First, the address is parsed: the language is detected, common abbreviations such as cl are expanded into the full word (close), and any invalid characters are removed. The address is then tokenised, classifying each part as a street name, town, postal code, and so forth. With this, the address can be reformatted into any standard needed. In the UK, addresses have the format street number, street name, as in 99 Upper Lewes Road. However, in Estonia, it would be written street name, street number, as in Upper Lewes Road, 99. With the knowledge of the country’s standards in hand and a tokenised address, it’s time to search the database.
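The parsing steps above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular geocoder is implemented: the abbreviation table, the simplified UK postcode pattern, the token names, and the `parse_address` and `format_address` helpers are all assumptions made for the example, and language detection is left out entirely.

```python
import re

# Hypothetical abbreviation table; a production geocoder would hold one per language.
ABBREVIATIONS = {"cl": "close", "rd": "road", "ave": "avenue"}

# Simplified UK postcode pattern, for illustration only.
UK_POSTCODE = re.compile(r"\b([A-Z]{1,2}\d[A-Z\d]?)\s*(\d[A-Z]{2})\b", re.IGNORECASE)


def parse_address(raw: str) -> dict:
    """Clean a UK-style address string and classify its parts into tokens."""
    tokens = {}
    match = UK_POSTCODE.search(raw)
    if match:
        tokens["postal_code"] = f"{match.group(1)} {match.group(2)}".upper()
        raw = raw[:match.start()] + raw[match.end():]
    # Remove anything that is not a letter, digit, whitespace, or comma.
    cleaned = re.sub(r"[^\w\s,]", "", raw).lower()
    parts = [p.strip() for p in cleaned.split(",") if p.strip()]
    if not parts:
        return tokens
    # Expand abbreviations in the street part, then split off a leading number.
    words = [ABBREVIATIONS.get(w, w) for w in parts[0].split()]
    if words[0].isdigit():
        tokens["street_number"] = words.pop(0)
    tokens["street_name"] = " ".join(words)
    if len(parts) > 1:
        tokens["town"] = parts[1]
    return tokens


def format_address(tokens: dict, country: str) -> str:
    """Re-emit the street line in the order the given country expects."""
    number = tokens.get("street_number", "")
    street = tokens["street_name"].title()
    if not number:
        return street
    if country == "EE":  # street name first, as in the Estonian example
        return f"{street}, {number}"
    return f"{number} {street}"


tokens = parse_address("99 Upper Lewes Road, Brighton, England")
print(format_address(tokens, "GB"))  # 99 Upper Lewes Road
print(format_address(tokens, "EE"))  # Upper Lewes Road, 99
```

Real parsers tend to rely on trained models or much larger rule sets, since addresses rarely arrive with tidy commas between the components.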
Due to the hierarchical nature of an address, it’s possible to perform a search that narrows down the results with each address component. For instance, if a postal code is present in the entered address, the geospatial area to search can be narrowed down. If a street is then present, it can further narrow the search to addresses only on that street. Now in an ideal world there’d be an exact match to the street and street number, but this isn’t always the case. A user may misspell a name, a road name could have st in it which could expand to saint or street, or someone may have recently renamed their house. With this in mind, the search needs to be flexible, resistant to typos, and able to return multiple possible addresses.
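As a rough sketch of this narrowing search, the example below filters a tiny in-memory list by postal code, then by a typo-tolerant street match, and finally prefers an exact street number while still returning alternatives when there is none. The records, the postcodes, the `search` function, and the 0.8 similarity threshold are all made up for illustration; fuzzy matching here uses `difflib.SequenceMatcher` from the Python standard library, which is just one of many possible techniques.

```python
from difflib import SequenceMatcher

# Hypothetical records with made-up postcodes, standing in for the geocoder's database.
DATABASE = [
    {"number": "97", "street": "upper lewes road", "postal_code": "ZZ1 1AA"},
    {"number": "99", "street": "upper lewes road", "postal_code": "ZZ1 1AA"},
    {"number": "99", "street": "station road", "postal_code": "ZZ1 1AA"},
    {"number": "99", "street": "upper lewes road", "postal_code": "ZZ9 9ZZ"},
]


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means the strings are identical."""
    return SequenceMatcher(None, a, b).ratio()


def search(tokens: dict, database: list, min_similarity: float = 0.8) -> list:
    candidates = database
    # 1. A postal code narrows the geospatial area to search.
    if "postal_code" in tokens:
        candidates = [r for r in candidates if r["postal_code"] == tokens["postal_code"]]
    # 2. A street narrows it further; fuzzy matching tolerates typos.
    if "street_name" in tokens:
        candidates = [
            r for r in candidates
            if similarity(r["street"], tokens["street_name"]) >= min_similarity
        ]
    # 3. Prefer an exact street number, otherwise return every plausible address.
    exact = [r for r in candidates if r["number"] == tokens.get("street_number")]
    return exact or candidates


query = {"street_number": "99", "street_name": "uper lewes raod", "postal_code": "ZZ1 1AA"}
for record in search(query, DATABASE):
    print(record)
```

Despite the two typos in the street name, the query still resolves to a single record. Asking for a number that is not in the data, such as 101, returns both known addresses on the street for the ranking step to sort out.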
With a list of possible results from the database it’s now time to rank them in order of relevance. Within geocoders, there are two types of relevance: textual and locational. Textual relevance refers to how well the searched address matches the address in the database. This is affected by typos, ambiguities in the address, and the language used to search. On the other hand, locational relevance refers to how close the results are to a user-supplied location, the location of the user’s device, or where they’re currently looking on the map. If you’re searching for 10 Main Street in Australia it’s very unlikely you’d want a result from the USA. The search results need to be intelligently ranked using a combination of both types of relevance, with the most relevant then returned to the user.
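One simple way to combine the two kinds of relevance is a weighted score. The sketch below is an assumption-heavy illustration rather than a description of any real ranking algorithm: the `rank` function, the 0.7 text weight, the rough bounding box for Australia, and the candidate records with approximate coordinates are all hypothetical. Locational relevance is reduced to "is the result inside the area the user is looking at", and the box is assumed not to cross the antimeridian.

```python
from difflib import SequenceMatcher


def textual_relevance(query: str, address: str) -> float:
    """How closely the searched text matches the stored address, in [0, 1]."""
    return SequenceMatcher(None, query.lower(), address.lower()).ratio()


def in_viewport(lat: float, lon: float, viewport: tuple) -> bool:
    """viewport is (south, west, north, east) in degrees."""
    south, west, north, east = viewport
    return south <= lat <= north and west <= lon <= east


def rank(query: str, candidates: list, viewport: tuple = None, text_weight: float = 0.7) -> list:
    """Order candidates by a weighted mix of textual and locational relevance."""
    def score(candidate: dict) -> float:
        textual = textual_relevance(query, candidate["address"])
        if viewport is None:
            return textual
        inside = in_viewport(candidate["lat"], candidate["lon"], viewport)
        locational = 1.0 if inside else 0.0
        return text_weight * textual + (1 - text_weight) * locational

    return sorted(candidates, key=score, reverse=True)


# Hypothetical candidates with rough coordinates.
CANDIDATES = [
    {"address": "10 Main Street", "country": "US", "lat": 39.8, "lon": -89.6},
    {"address": "10 Main Street", "country": "AU", "lat": -33.9, "lon": 151.2},
]
AUSTRALIA = (-44.0, 112.0, -10.0, 154.0)  # approximate bounding box

best = rank("10 main street", CANDIDATES, viewport=AUSTRALIA)[0]
print(best["country"])  # AU
```

Both candidates match the text equally well, so the viewport decides the order. A smoother locational score, for example one that decays with distance from the user’s device, would be a natural refinement.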
It’s also worth mentioning that standard geocoders will often use interpolation to fill in addresses when no data is present for them. Let’s say you have the location of an address at the start of the road and the location of an address at the end of the road. You can use the road’s shape to fill in the numbers of the missing addresses and their locations. At Naurt we avoid using this type of result as it often degrades the quality of the response, and instead focus on obtaining extensive address coverage for each region we operate in. This leads us nicely onto the next section: Accuracy.
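To make the idea of interpolation concrete, here is a deliberately simplified sketch. It assumes the road is a straight line between the two known addresses and that house numbers are evenly spaced along it, whereas a real geocoder would follow the road’s actual shape. The `interpolate_address` function and the coordinates in the usage example are hypothetical.

```python
def interpolate_address(number: int, start: tuple, end: tuple) -> tuple:
    """Estimate (lat, lon) for a house number between two known addresses.

    start and end are (house_number, lat, lon) tuples. Straight-line
    interpolation is a simplification of following the road's shape.
    """
    n0, lat0, lon0 = start
    n1, lat1, lon1 = end
    if n0 == n1:
        raise ValueError("the two known addresses must have different numbers")
    if not min(n0, n1) <= number <= max(n0, n1):
        raise ValueError("number lies outside the known range")
    t = (number - n0) / (n1 - n0)  # fraction of the way along the road
    return (lat0 + t * (lat1 - lat0), lon0 + t * (lon1 - lon0))


# Number 5 is estimated halfway between numbers 1 and 9.
print(interpolate_address(5, (1, 50.0000, -0.1000), (9, 50.0008, -0.1000)))
```

The estimate is only as good as those two assumptions, which is why interpolated results can land on the wrong side of the street or in front of a neighbouring building.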
To speak about accuracy, it’s useful to separate the geocoder into the parts described in the previous section: the underlying database and the search services.
Data accuracy is how far the returned location is from the true location of the address, assuming the correct address was matched. The vast majority of geocoders will return a “rooftop geocode”, meaning the returned location will be somewhere within the outline of the building. This is often the best case scenario and while it’s accurate, it’s certainly not precise, as large buildings which contain multiple addresses (flats, apartment complexes, offices) will return a single location for all addresses. With this level of fidelity it’s possible for a driver to get within the vicinity of the building but not always possible for a pedestrian to gain access to the building with ease.
Data accuracy also varies by locale. Houses in rural areas are surveyed less often due to their remoteness, so their data is older and often lacking at the street number level. Urban areas suffer a similar problem but at the sub-address level. In a city, the street number is likely known but conversions of buildings into flats or apartments lead to outdated data being present within the geocoder.
Instead of returning rooftop geocodes, Naurt returns the parking spot and building entrance for each address. Large building complexes are given individual entrance locations for each sub-address where possible. To ensure the highest quality of data, Naurt never falls back to street-level or administrative-level geocodes. Building entrances and parking spots are usually accurate to within 10 metres, though this does vary with building complexity.
Search accuracy is how relevant the returned address is to the searched address. In an ideal world, every search would return the location of the intended address; however, address data is messy and rarely matches 100% with the address stored in the database. This can lead to perceived data inaccuracy when in fact, it’s simply the wrong address being returned. Search accuracy varies more between geocoders than data accuracy and can be very difficult to quantify. It may be that the search service itself is not effective, or it may be that the geocoder doesn’t have data in the region searched and is instead returning a different address. Luckily, it is also possible to bypass addresses altogether.
Reverse geocoding is where a latitude and longitude pair is transformed into nearby addresses. This is the reverse of geocoding (often called forward geocoding), and can be useful if you want your users to drop a pin on the map or know what data is available near their current location. If you’re in a region which has poor search accuracy because of a lack of address data or translation problems, then reverse geocoding with distance limits could be more effective.
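A minimal sketch of reverse geocoding with a distance limit is shown below. It scans a small in-memory list and keeps the addresses within a chosen radius, nearest first, using the haversine formula for great-circle distance. The `reverse_geocode` function, the 50 metre default, and the two neighbouring records are assumptions made for the example; a production service would use a spatial index rather than checking every record.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def reverse_geocode(lat: float, lon: float, database: list,
                    max_distance_m: float = 50.0, limit: int = 5) -> list:
    """Return up to `limit` (distance_m, address) pairs within the radius, nearest first."""
    hits = []
    for record in database:
        distance = haversine_m(lat, lon, record["lat"], record["lon"])
        if distance <= max_distance_m:
            hits.append((distance, record["address"]))
    hits.sort()
    return hits[:limit]


# The first record uses the coordinates from the example at the top of this
# article; the other two are made-up neighbours for illustration.
DATABASE = [
    {"address": "99 Upper Lewes Road", "lat": 50.83565, "lon": -0.127616},
    {"address": "Hypothetical neighbour", "lat": 50.83580, "lon": -0.127616},
    {"address": "Hypothetical distant address", "lat": 50.84565, "lon": -0.127616},
]

for distance, address in reverse_geocode(50.83565, -0.127616, DATABASE):
    print(f"{distance:.1f} m  {address}")
```

Tightening `max_distance_m` trades recall for confidence: a small radius returns only addresses the pin is practically on top of, while a larger one is more forgiving of an imprecise pin.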
Geocoders are widely available as online APIs. A quick Google search reveals hundreds of websites offering geocoding across every country on the planet. However, most of these geocoders are built on top of free and open-source datasets which have patchy coverage. Good providers will combine datasets from several sources, both paid and free, to ensure good coverage and quality. The most popular providers, Google and Mapbox, do exactly this as well as generating data internally. At Naurt, we’ve opted for this route too and doubled down, specialising our geocoder for delivery. We don’t offer points of interest such as “the Statue of Liberty”, which would be of no value for our delivery clients - we offer complete addresses and a parking spot and building entrance for each one.
For an easy-to-follow guide on how to use a geocoder, sign up for a free Naurt account and follow our guide here.
Geocoding has become ingrained in everyday life, whether you’re conscious of it or not. As the reliance on geocoders has grown, the quality and coverage of geocoders have grown too. For the general public, geocoders are now accurate enough that the focus is on coverage. Ensuring every building is available is more important for Google Maps than making sure each building is mapped to the micro-level. Conversely, for businesses in the delivery vertical, the micro-level is where the opportunity is. Knowing the location of sub-addresses or units within a building enables faster deliveries with a higher rate of customer satisfaction. Over the next five years, this data will be more widely integrated into geocoders, along with more detailed data about each address. Delivery and insurance companies would love to know what type of building the address is in, what floor it’s on, what door is best to enter through, whether there’s parking nearby, or whether gates secure the entrance. Fortunately, these are exactly the questions Naurt aims to answer through a simple geocoding API.