Address validation is a complex process regardless of the geography. This has been addressed in other best practice articles such as “Understanding Address Validation”. Each country or region holds addresses in various formats, and AddressDoctor provides a single interface to validate against over 250 countries. Understanding the basics for each country will enable the business to optimize the performance of their address validation solution.
This article describes United Kingdom address localizations and related best practices, including the UK address format and the importance of maintaining good data quality using real time and batch interactions.
Informatica provides address validation through the methods described below, which can then be implemented and integrated according to the needs of the business.
The UK address structure is a very granular dataset where a single postal code covers a relatively small geographical area (when compared to postal codes of other countries). This makes having an accurate postal code even more important, while the underlying address validation set is more volatile.
The data provider for the AddressDoctor UK database is the Royal Mail. They are responsible for maintaining the UK Postal Address File (PAF) covering England, Scotland, Wales, Northern Ireland, Jersey, Guernsey, and the Isle of Man. The data is updated each month and can be downloaded from AddressDoctor. Subscribers are typically notified when an update is available.
The PAF contains over 28 million addresses with roughly 1.8 million postcodes. These are a mixture of residential and organization addresses. The Royal Mail makes tens of thousands of minor changes each month so it is important to update the data as soon as it is available.
The Royal Mail does not create addresses; it only amends source addresses to serve its main function, mailing. It takes addresses from the UK local authorities (local government bodies) and appends a postcode to the full address as a requirement for mailing. Each local authority is responsible for street naming and for building numbering and naming.
The PAF also has a unique key for each address in the database, called the Unique Delivery Point Reference Number (UDPRN). The UDPRN can be found in the GBR supplementary information.
The elements below need to be considered when delivering letters to a UK address. Not all elements are required for every address. The Royal Mail PAF guide provides a detailed explanation of exactly which elements are needed when.
| Address Section | Element | Example | Is Required |
| --- | --- | --- | --- |
| Premise | Sub Building Name/Number | Flat 1 | Yes, if applicable |
| Premise | Building Name | Rose Cottage | Not required if there is a Building Number |
| Premise | Number | 11 | Yes, if applicable |
| Premise | Organisation Name | Cath’s Cafe | Yes, if applicable |
| Premise | PO Box Number | 6 | Yes, if applicable |
| Street | Dependent Thoroughfare | Chestnut Court | Yes, if applicable |
| Street | Thoroughfare | Cypress Rd | Yes, if applicable |
| Locality | Double Dependent Locality | Tyre Industrial Estate | Yes, if applicable |
| Locality | Dependent Locality | Blantyre | Yes, if applicable |
| Locality | Post Town | GLASGOW | Yes, if applicable |
| Locality | County | Surrey | No |
| Postcode | Postcode | SW1P 3UX | Yes, always |
A full GBR address will have at least one item from each of the address sections below, but it is not uncommon for more than one item per section to be required.
PREMISE
STREET
LOCALITY
POSTCODE
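The section rule above can be expressed as a simple completeness check. The sketch below is illustrative only: the `UKAddress` class, its field names, and `missing_sections` are hypothetical names chosen to mirror the element table, not part of AddressDoctor or the PAF. It only tests whether each section has at least one populated element; it does not validate the values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UKAddress:
    """Hypothetical container mirroring the address element table."""
    # Premise
    sub_building: Optional[str] = None
    building_name: Optional[str] = None
    building_number: Optional[str] = None
    organisation_name: Optional[str] = None
    po_box_number: Optional[str] = None
    # Street
    dependent_thoroughfare: Optional[str] = None
    thoroughfare: Optional[str] = None
    # Locality
    double_dependent_locality: Optional[str] = None
    dependent_locality: Optional[str] = None
    post_town: Optional[str] = None
    county: Optional[str] = None
    # Postcode
    postcode: Optional[str] = None


# Which fields belong to which address section (assumed grouping, taken from the table).
SECTIONS = {
    "PREMISE": ("sub_building", "building_name", "building_number",
                "organisation_name", "po_box_number"),
    "STREET": ("dependent_thoroughfare", "thoroughfare"),
    "LOCALITY": ("double_dependent_locality", "dependent_locality",
                 "post_town", "county"),
    "POSTCODE": ("postcode",),
}


def missing_sections(address: UKAddress) -> List[str]:
    """Return the sections that have no populated element."""
    missing = []
    for section, fields in SECTIONS.items():
        if not any((getattr(address, name) or "").strip() for name in fields):
            missing.append(section)
    return missing
```

For example, `missing_sections(UKAddress(postcode="SW1P 3UX"))` reports the premise, street, and locality sections as missing, which flags the record for review before it is sent for validation.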
The postal code and building information are the most heavily weighted parts of the address in the UK. The postcode will take a piece of mail to the street, and the building information will then take the mail to the letter box.
The postcode in the UK is made up of alphanumeric characters. Only the formats below are considered valid (“A” representing alpha characters and “9” representing numeric).
AA9A 9AA
A9A 9AA
A9 9AA
A99 9AA
AA9 9AA
AA99 9AA
There are some postcodes that are an exception to the rule. These exceptions are listed below.
| Postcode | Description |
| --- | --- |
| GIR 0AA | Postcode for a national bank conceived in the 1960s/1970s called GiroBank |
| SAN TA1 | Postcode for Father Christmas in Reindeerland |
| ASCN 1ZZ | Ascension Island |
| BIQQ 1ZZ | British Antarctic Territory |
| FIQQ 1ZZ | Falkland Islands |
| STHL 1ZZ | Saint Helena |
| SIQQ 1ZZ | South Georgia and the South Sandwich Islands |
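A format check along these lines can be sketched in a few lines of Python. This is a minimal illustration, not how AddressDoctor performs validation: the function name `has_valid_postcode_format` is hypothetical, and the check covers only the six layouts and the listed exceptions. It assumes the outward and inward parts are separated by whitespace, and it confirms only that a value is well formed, not that the postcode exists in the PAF.

```python
import re

# The six valid layouts: "A" is a letter, "9" is a digit.
_LAYOUTS = ("AA9A 9AA", "A9A 9AA", "A9 9AA", "A99 9AA", "AA9 9AA", "AA99 9AA")

# Exceptions to the rule, accepted verbatim.
_EXCEPTIONS = {
    "GIR 0AA", "SAN TA1", "ASCN 1ZZ", "BIQQ 1ZZ",
    "FIQQ 1ZZ", "STHL 1ZZ", "SIQQ 1ZZ",
}

_TOKEN = {"A": "[A-Z]", "9": "[0-9]", " ": " "}


def _layout_to_regex(layout: str) -> str:
    """Translate a layout such as 'AA9A 9AA' into a regular expression."""
    return "".join(_TOKEN[ch] for ch in layout)


_PATTERN = re.compile("(?:" + "|".join(_layout_to_regex(l) for l in _LAYOUTS) + ")")


def has_valid_postcode_format(postcode: str) -> bool:
    """True if the value matches a valid layout or a listed exception."""
    # Upper-case and collapse runs of whitespace to a single space.
    normalised = " ".join(postcode.upper().split())
    return normalised in _EXCEPTIONS or _PATTERN.fullmatch(normalised) is not None
```

Values that fail this check, such as a postcode with the space missing or with a digit where a letter is expected, are candidates for the data quality review described in this article.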
Simple validation of the postcode format can highlight various data quality issues that need to be evaluated. The first part of the postcode, before the space, is called the outward code. It is used by sorting machines to direct the mail to the specific delivery office. In the case of missing outward codes, town and locality information can be used to determine the right geographic location.
The inward code, after the space, is used to assign the “postal walk” so the delivery person knows the street of the address. In the case of missing inward codes, street names can be used to direct the mail to the correct location.
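The outward and inward split can be illustrated with a short sketch. The names `PostcodeParts`, `split_postcode`, and `fallback_elements` are hypothetical. The sketch assumes the inward code always has the form `9AA`, as in the standard layouts, so it will not classify the special-case postcodes correctly when only one part is present.

```python
import re
from typing import List, NamedTuple, Optional

# In the standard layouts the inward code is always digit-letter-letter.
_INWARD = re.compile(r"[0-9][A-Z]{2}")


class PostcodeParts(NamedTuple):
    outward: Optional[str]
    inward: Optional[str]


def split_postcode(postcode: str) -> PostcodeParts:
    """Split a postcode on whitespace into outward and inward codes."""
    tokens = postcode.upper().split()
    if len(tokens) == 2:
        return PostcodeParts(tokens[0], tokens[1])
    if len(tokens) == 1:
        # A lone token is treated as inward only if it looks like 9AA.
        if _INWARD.fullmatch(tokens[0]):
            return PostcodeParts(None, tokens[0])
        return PostcodeParts(tokens[0], None)
    # Empty input or more than two tokens: treat as unparseable.
    return PostcodeParts(None, None)


def fallback_elements(parts: PostcodeParts) -> List[str]:
    """Name the address elements that can compensate for a missing part."""
    hints = []
    if parts.outward is None:
        hints.append("post town / locality")
    if parts.inward is None:
        hints.append("street name")
    return hints
```

A record holding only `G72` would therefore be flagged as missing its inward code, with the street name suggested as the element to rely on.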
Regardless of which version of the engine is implemented, the minimum information needed to validate an address to a high level is the postcode and building information. These address elements are critical if the fast completion engine is being used to capture addresses in front end systems or user interfaces.
It is common for retail websites and call centers in the UK to implement solutions like AddressDoctor Fast Complete. Many of these let a user enter only the building name or number and the postcode, so that Fast Complete can auto-populate the remainder of the address, saving time and improving the customer experience.
When using the batch or bulk verification engine with address data stored in a discrete or atomic format, it is a best practice to implement the discrete address template in the Address Validation (AV) transformation. This combination will typically provide the best validation rate. Unfortunately, address data is not always stored in this format. Some systems store address data in a hybrid or multiline format. In those instances, the hybrid and then the multiline AV transformation templates will produce the next best results, in that order.
Addresses in the UK can change in a variety of ways including changing street identifiers, re-coding postcodes, and renaming/numbering addresses. This means that as soon as the data is captured, the data can degrade. To maintain a high level of quality, front end and back end address validation must be utilized. The process for this is shown in the process flow in the figure below.
The bulk address (re)validation process will need to align with the business requirements. It can be run each time an update is done or during predefined periods. The more frequently the process is invoked, up to the PAF update frequency, the higher the level of data quality that can be achieved.
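One way to decide which records a bulk run should pick up is to compare each record's last validation date with the date of the most recent PAF release. The sketch below is an assumption about how such a rule might be written; `needs_revalidation`, `records_due`, and the record layout are hypothetical and are not part of any Informatica product.

```python
from datetime import date
from typing import Iterable, List, Optional, Tuple

# A record is modelled here as (record_id, last_validated_date or None).
Record = Tuple[str, Optional[date]]


def needs_revalidation(last_validated: Optional[date],
                       latest_paf_release: date) -> bool:
    """True if the record was never validated or predates the latest release."""
    return last_validated is None or last_validated < latest_paf_release


def records_due(records: Iterable[Record],
                latest_paf_release: date) -> List[str]:
    """Return the ids of the records a bulk revalidation run should process."""
    return [record_id for record_id, last_validated in records
            if needs_revalidation(last_validated, latest_paf_release)]
```

Running this selection after every monthly update gives the highest achievable quality; running it less often trades quality for lower processing cost, in line with the business requirements.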
The figure above is a common example of address data flowing into a target system. There will be other sources, such as bulk data feeds and data migrations, that will need to follow a similar process. The address validation process flow will need to account for these as well.
A common practice seen in UK addresses is the use of “vanity” address items. This is where customers have enhanced the address over and above the strict postal presentation, including information that is not required for mailing but reflects personal or aesthetic preference. In the UK, this is often seen in three scenarios.
The first scenario is where a customer would like to have a locality associated with them that is not part of the official address. An example is “Mayfair” in London; officially the address might be just outside the border. Due to the high granularity of the postcode, this information is not required for mailing but is good for customer service.
The second scenario is county information, which the data industry now refers to as “Former Postal Counties”. The Royal Mail PAF stopped providing county information in the year 2000. This can have adverse effects on some validation tools, which may remove this information from the results. Many residents who are familiar with the old county-based sorting system like to see county information on the letter.
The third scenario, and by far the most popular residential vanity item, is the “Vanity House Name”. In this scenario, the resident has given their house a name that is optional for posting a letter. “The Rose Cottage, 13 High Street” is an example: “13 High Street” alone is enough to identify the property when supplied with a postcode.
Address validation is available in both real-time point of entry and batch processing. Both approaches often need to be used together to ensure data is captured accurately and routinely cleansed to stop it from degrading. Address validation cannot be painted with a single brush, and understanding the needs of the business is critical to the decision-making process and to developing the address validation business strategy.
For example, if the key requirement is to use the address elements for deduplication, using the strict postal address will increase matches because the same address is always represented in the same way. Most single customer view projects will use the strict postal address for this task.
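To illustrate why the strict postal form helps matching, the sketch below builds a simple match key from the postcode and the premise elements only, so vanity items cannot cause two representations of the same property to differ. The `match_key` function and its parameters are hypothetical, and real matching logic is usually more involved; this assumes the inputs have already been validated into discrete elements.

```python
def _norm(value: str) -> str:
    """Upper-case and collapse whitespace."""
    return " ".join(value.upper().split())


def match_key(postcode: str, building_number: str = "",
              building_name: str = "", sub_building: str = "") -> str:
    """Build a deduplication key from strict postal elements.

    The building name is used only when there is no building number,
    mirroring the rule that a name is not required if a number exists.
    """
    premise = _norm(building_number) or _norm(building_name)
    compact_postcode = _norm(postcode).replace(" ", "")
    return "|".join((compact_postcode, premise, _norm(sub_building)))
```

With this key, a record captured as “The Rose Cottage, 13 High Street” and another captured as “13 High Street” at the same postcode produce the same value and can be treated as candidate duplicates.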
If the business objective is to reduce mailing costs, then Informatica Address Validation can help by formatting the address in the correct way for the postal authorities. It can also use GBR supplementary information to return extra information on an address, such as the Delivery Point Suffix (DPS), which can be used to achieve mailing discounts.
If client centricity is part of a wider initiative, then use Informatica to retain vanity address items separately from the strict postal information. AddressDoctor can output information in “Recipient Lines” and “Residue Lines”. Keeping information separate like this will help with accurate reporting and ensure that the business stays in touch with the customer by retaining information they may have provided.
Further information on AddressDoctor address formats can be found on the AddressDoctor website.
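One possible way to hold the two kinds of information side by side is sketched below. The `StoredAddress` class and its fields are hypothetical and do not correspond to AddressDoctor output fields; the example values are placeholders taken from this article. The point is only that the strict postal lines stay untouched while vanity items are kept alongside them for customer-facing use.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoredAddress:
    # Strict postal part, as produced by validation.
    street_lines: List[str]
    post_town: str
    postcode: str
    # Vanity items supplied by the customer; kept out of matching and mailing.
    vanity_house_name: Optional[str] = None
    vanity_locality: Optional[str] = None
    former_county: Optional[str] = None

    def postal_lines(self) -> List[str]:
        """Lines for mailing and deduplication: strict postal form only."""
        return [*self.street_lines, self.post_town, self.postcode]

    def display_lines(self) -> List[str]:
        """Lines for customer-facing use, with vanity items restored."""
        lines: List[str] = []
        if self.vanity_house_name:
            lines.append(self.vanity_house_name)
        lines.extend(self.street_lines)
        if self.vanity_locality:
            lines.append(self.vanity_locality)
        lines.append(self.post_town)
        if self.former_county:
            lines.append(self.former_county)
        lines.append(self.postcode)
        return lines
```

Reporting and matching would read `postal_lines()`, while correspondence that should reflect the customer's preferences would read `display_lines()`.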
Conduct a search on the Royal Mail Website for “programmers guide”. This document used the PAF Programmers Guide edition 7 v5 for reference.