How big is the Eircode database?

Ireland’s new postcode scheme, Eircode, is officially launched today. It’s been a long time coming and, like it or lump it, we’ll all be using it soon.

There’s been plenty of debate and criticism of Eircode in recent months, some of it valid, some of it misplaced. However, one claim caught my attention: the suggestion that Eircode is impractical for use with portable GPS navigators because the full country-wide database would require 2 GB of storage and exhaust their flash memory.

That sounds like a lot. Let’s see if it holds up to scrutiny.

A mapping table

A GPS navigator is already perfectly good at taking GPS co-ordinates and telling you how to get there. All it needs is a way to convert an Eircode to the corresponding location. The most straightforward approach is a mapping table that stores GPS co-ordinates for each possible Eircode.

Eircode has two parts: a three character routing key, corresponding to a broad geographic area, and a four character unique ID within that area. Let’s create a routing key table that stores each routing key in use along with an index into our co-ordinate table. This co-ordinate table will store the GPS co-ordinates for all the IDs in each routing area.

We can also save some space by not storing the full Eircode with each entry, just the unique ID portion:

[Figure: Eircode mapping table]
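The two-table lookup described above can be sketched in a few lines of Python. This is only an illustration: the Eircodes and co-ordinates are invented sample data, and the names ROUTING_TABLE, COORD_TABLE and lookup are my own, not part of any official Eircode product.

```python
from bisect import bisect_left

# Hypothetical sample data, invented purely for illustration.
# Routing key table: (routing key, index of the area's first co-ordinate entry),
# sorted by routing key.
ROUTING_TABLE = [("A65", 0), ("D02", 2), ("D6W", 5)]
ROUTING_KEYS = [key for key, _ in ROUTING_TABLE]

# Co-ordinate table: (unique ID, latitude, longitude), grouped by routing area
# and sorted by unique ID within each area.
COORD_TABLE = [
    ("F4E2", 53.9100, -6.4100),   # A65
    ("R2F0", 53.9200, -6.4200),   # A65
    ("AF30", 53.3390, -6.2590),   # D02
    ("X285", 53.3398, -6.2603),   # D02
    ("Y006", 53.3410, -6.2620),   # D02
    ("K7N9", 53.3100, -6.3000),   # D6W
]

def lookup(eircode):
    """Return (latitude, longitude) for an Eircode, or None if it is unknown."""
    code = eircode.replace(" ", "").upper()
    routing_key, unique_id = code[:3], code[3:]

    # Step 1: find the routing key, which gives the start of its block.
    pos = bisect_left(ROUTING_KEYS, routing_key)
    if pos == len(ROUTING_KEYS) or ROUTING_KEYS[pos] != routing_key:
        return None
    start = ROUTING_TABLE[pos][1]
    # The block ends where the next routing area begins.
    if pos + 1 < len(ROUTING_TABLE):
        end = ROUTING_TABLE[pos + 1][1]
    else:
        end = len(COORD_TABLE)

    # Step 2: binary search for the unique ID inside that block only.
    ids = [entry[0] for entry in COORD_TABLE[start:end]]
    i = bisect_left(ids, unique_id)
    if i == len(ids) or ids[i] != unique_id:
        return None
    _, lat, lon = COORD_TABLE[start + i]
    return lat, lon
```

Because the unique IDs are only searched within one routing area, the full seven-character Eircode never needs to be stored with each entry.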

How much space will these tables occupy? Let’s consider the Eircode character set, which is a mix of uppercase letters and digits. Letters that are often confused with numbers are omitted, so the full set of characters that may appear in an Eircode is:

A C D E F H K N P R T V W X Y
0 1 2 3 4 5 6 7 8 9

That’s 15 letters and 10 digits for a total of 25 unique characters. Each character can be encoded using five bits (32 possible values), leaving seven values spare for other purposes.

The routing key has further restrictions: the first character must be a letter (15 possibilities), the second a digit (10 possibilities), and the third a digit or ‘W’ (11 possibilities, to allow the infamous D6W area code). Each of these fits in just four bits (16 possible values), for a total of 12 bits to store all three. Let’s round this up to 16 bits (two bytes) for simplicity.
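As a rough sketch of that four-bits-per-character packing, here is one way it could look in Python. The ordering of symbols within each group is my own assumption; any fixed ordering would do.

```python
LETTERS = "ACDEFHKNPRTVWXY"   # 15 permitted letters: fits in 4 bits
DIGITS = "0123456789"         # 10 digits: fits in 4 bits
THIRD = DIGITS + "W"          # 11 symbols for the third position: fits in 4 bits

def pack_routing_key(key):
    """Pack a 3-character routing key into a 12-bit integer."""
    first, second, third = key
    return (LETTERS.index(first) << 8) | (DIGITS.index(second) << 4) | THIRD.index(third)

def unpack_routing_key(value):
    """Reverse pack_routing_key."""
    return (LETTERS[(value >> 8) & 0xF]
            + DIGITS[(value >> 4) & 0xF]
            + THIRD[value & 0xF])
```

The packed value always fits in 12 bits, so storing it in a 16-bit field leaves four bits unused.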

A routing table entry also needs an index. We’ll use 32-bit indices, supporting up to 4 billion discrete entries – far more than we’ll ever need.

Together, the routing key plus index need 48 bits (6 bytes) of storage. With 139 routing areas initially planned for Eircode, that’s just 6 * 139 = 834 bytes of storage for the entire routing key table. So far, so good.
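To double-check the 6-byte figure, here is a small sketch using Python’s struct module. The field layout (a 16-bit key followed by a 32-bit index, little-endian, no padding) is just one plausible choice, not a prescribed format.

```python
import struct

# "<" selects little-endian with no alignment padding,
# "H" is an unsigned 16-bit key, "I" is an unsigned 32-bit index.
ENTRY = struct.Struct("<HI")
ROUTING_AREAS = 139

def pack_entry(packed_key, index):
    """Serialise one routing table entry into 6 bytes."""
    return ENTRY.pack(packed_key, index)

table_bytes = ENTRY.size * ROUTING_AREAS
print(ENTRY.size, table_bytes)
```

Without the "<" prefix, struct would use the platform’s native alignment and could pad each entry to 8 bytes, so the prefix matters for the size estimate.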

The co-ordinate table

Now let’s consider the co-ordinate table. The 4-character unique IDs encode to five bits for each character, so the whole ID takes 20 bits. Rounding up to 24 bits (3 bytes) leaves a few spare for other uses.
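Here is a minimal sketch of the five-bits-per-character packing for the unique ID. The symbol ordering in ALPHABET (digits first, then letters) is an arbitrary assumption on my part.

```python
ALPHABET = "0123456789ACDEFHKNPRTVWXY"   # the 25 permitted characters

def pack_unique_id(unique_id):
    """Pack a 4-character unique ID into 20 bits, stored in 3 bytes."""
    value = 0
    for ch in unique_id:
        value = (value << 5) | ALPHABET.index(ch)
    return value.to_bytes(3, "big")

def unpack_unique_id(data):
    """Reverse pack_unique_id."""
    value = int.from_bytes(data, "big")
    return "".join(ALPHABET[(value >> shift) & 0x1F] for shift in (15, 10, 5, 0))
```

The top four bits of the three bytes are always zero, which is the spare room mentioned above.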

What about the GPS co-ordinates? Ireland and its surrounding islands fit in a box defined by these points:

Position  Location                    Latitude    Longitude
North     Inishtrahull, Co Donegal    55.430019    -7.233810
South     Fastnet Rock, Co Cork       51.384167    -9.600278
East      Lambay Island, Co Dublin    53.491000    -6.017000
West      Tearaght Island, Co Kerry   52.075667   -10.651278

So that’s from 51°N to 56°N latitude and 6°W to 11°W longitude – five degrees in each case. Using millionths of a degree, that’s a range of 0-5 million discrete integer values, which gives accuracy to approximately 0.1 metres on the ground: more than enough to pinpoint any individual building and well beyond the precision of consumer GPS equipment.

24 bits is enough to encode any value in the range 0-16 million, so 48 bits (6 bytes) stores the latitude and longitude co-ordinates of any point in Ireland as an offset from 51°N, 6°W (measured northwards and westwards respectively).
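A sketch of this offset encoding might look like the following. The function names and the byte order are my own choices, and longitude follows the usual convention of negative values for west.

```python
ORIGIN_LAT = 51.0     # 51 degrees north
ORIGIN_LON = -6.0     # 6 degrees west
SCALE = 1_000_000     # millionths of a degree

def encode_position(lat, lon):
    """Encode a position as two 24-bit offsets from 51N, 6W (6 bytes in total)."""
    lat_off = round((lat - ORIGIN_LAT) * SCALE)   # measured northwards
    lon_off = round((ORIGIN_LON - lon) * SCALE)   # measured westwards
    if not (0 <= lat_off < 1 << 24 and 0 <= lon_off < 1 << 24):
        raise ValueError("position outside the encodable box")
    return lat_off.to_bytes(3, "big") + lon_off.to_bytes(3, "big")

def decode_position(data):
    """Reverse encode_position, returning (latitude, longitude) in degrees."""
    lat_off = int.from_bytes(data[:3], "big")
    lon_off = int.from_bytes(data[3:], "big")
    return ORIGIN_LAT + lat_off / SCALE, ORIGIN_LON - lon_off / SCALE
```

For example, encode_position(55.430019, -7.233810) packs Inishtrahull into 6 bytes, and decode_position recovers it to within a millionth of a degree.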

With each ID code occupying 3 bytes and each pair of co-ordinates 6 bytes, every entry in the co-ordinate table uses 3 + 6 = 9 bytes. At the moment, there are approximately 2.2 million unique addresses in Ireland, so the table needs 9 * 2.2 million = 19.8 million bytes (about 18.9 MB).

Adding this to the routing key table size of less than 1 KB, we can confidently say that the Eircode co-ordinate database will occupy less than 20 MB of storage. On modern devices, this is a very modest amount of storage, equivalent to around 10 JPEG photos or songs.
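The arithmetic so far is easy to reproduce. The figures below are the ones used in this post (139 routing areas, roughly 2.2 million addresses), with 1 MB taken as 2^20 bytes.

```python
ROUTING_ENTRY_BYTES = 6      # 2-byte routing key + 4-byte index
ROUTING_AREAS = 139
COORD_ENTRY_BYTES = 3 + 6    # 3-byte unique ID + 6-byte position
ADDRESSES = 2_200_000        # approximate number of unique addresses
MB = 2**20

routing_bytes = ROUTING_ENTRY_BYTES * ROUTING_AREAS
coord_bytes = COORD_ENTRY_BYTES * ADDRESSES
total_mb = (routing_bytes + coord_bytes) / MB
print(f"{routing_bytes} + {coord_bytes} bytes = {total_mb:.1f} MB")
```

The routing key table is a rounding error next to the co-ordinate table, so the total stays just under 19 MB.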

Adding Addresses

A GPS navigator should be able to convert co-ordinates into an approximate address using its existing database if needed. However, for user convenience, it may want to display the official address for an entered postcode. How much extra space would it take to add a full address to each entry in the co-ordinate table?

This is a bit harder to calculate accurately, without a copy of the entire Eircode address database on hand. However, we can make an educated guess. Let’s presume a 64 character alphabet (enough for uppercase, lowercase, digits and some punctuation), so each character can be stored in 6 bits. We’ll also assume that the average address length is 50 characters. Of course, some are shorter and some are longer.

50 characters of 6 bits each is 300 bits, which rounds up to 38 bytes. We’ll add another 4 bytes of overhead to index the table for quicker access. That gives 42 bytes per Eircode. Multiplying by 2.2 million entries gives 92.4 million bytes, so the whole address database would occupy about 88 MB.
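Here is the same estimate as a few lines of Python, using the assumptions stated above (6 bits per character, 50 characters per address on average, a 4-byte index) and taking 1 MB as 2^20 bytes.

```python
import math

BITS_PER_CHAR = 6
AVG_ADDRESS_CHARS = 50       # assumed average address length
INDEX_BYTES = 4
ADDRESSES = 2_200_000
MB = 2**20

address_bytes = math.ceil(AVG_ADDRESS_CHARS * BITS_PER_CHAR / 8)   # 300 bits, rounded up
entry_bytes = address_bytes + INDEX_BYTES
total_bytes = entry_bytes * ADDRESSES
total_mb = total_bytes / MB
print(f"{entry_bytes} bytes per entry, {total_mb:.0f} MB in total")
```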

This is a very simple implementation. Since the full address database contains a lot of repetition and redundancy, it should compress very well. To test this, I used a database with 3,000 addresses distributed throughout Dublin and the rest of Ireland. The average address length turned out to be 40 characters, so with 6-bit encoding, 3,000 * 40 * 6 / 8 = 90,000 bytes (90 KB) is needed for the database. This compressed down to 34 KB using standard ZIP encoding, a compression ratio of 2.6:1.
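If you want to try a similar experiment yourself, a rough sketch follows. It uses Python’s zlib module, which implements DEFLATE, the method normally used inside ZIP files. The sample addresses are invented and far more repetitive than real ones, so the ratio this produces says nothing about the real database; substitute your own address list.

```python
import math
import zlib

def six_bit_size(addresses):
    """Bytes needed to store all addresses at 6 bits per character."""
    total_chars = sum(len(a) for a in addresses)
    return math.ceil(total_chars * 6 / 8)

def deflate_size(addresses):
    """Bytes needed after DEFLATE compression of the newline-joined text."""
    text = "\n".join(addresses).encode("utf-8")
    return len(zlib.compress(text, 9))

# Hypothetical sample data: replace with a real address list.
sample = [f"{n} Main Street, Rathmines, Dublin 6" for n in range(1, 501)]

ratio = six_bit_size(sample) / deflate_size(sample)
print(f"6-bit: {six_bit_size(sample)} bytes, "
      f"deflate: {deflate_size(sample)} bytes, ratio {ratio:.1f}:1")
```

Note that this compresses the plain text rather than the 6-bit packed form, which is a simplification on my part.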

Assuming a similar level of compression for the entire Eircode database, and treating the index values separately, the compressed address data would take 38 bytes * 2.2M / 2.6 = 32.2 million bytes (about 30.7 MB), plus 4 bytes * 2.2M = 8.8 million bytes (about 8.4 MB) for the index: 39 MB in total.
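The same calculation in Python, again taking 1 MB as 2^20 bytes and assuming the 2.6:1 ratio from my small sample holds for the full database (which is a guess, not a measurement):

```python
ADDRESSES = 2_200_000
ADDRESS_BYTES = 38           # average 6-bit encoded address
INDEX_BYTES = 4              # index stored uncompressed
COMPRESSION_RATIO = 2.6      # assumed, extrapolated from the 3,000-address sample
MB = 2**20

compressed_mb = ADDRESS_BYTES * ADDRESSES / COMPRESSION_RATIO / MB
index_mb = INDEX_BYTES * ADDRESSES / MB
total_mb = compressed_mb + index_mb
print(f"{compressed_mb:.1f} MB + {index_mb:.1f} MB = {total_mb:.0f} MB")
```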

So, a full set of Eircode co-ordinate and address data at present should take no more than 20 + 39 = 59 MB – a far cry from the 2 GB scaremongering.

And in fact, this is very close to the figure mentioned by Pat Donnelly of autoaddress.ie in his comprehensive rebuttal to Brian Lucey’s original post criticising Eircode.

What next?

The debate about Eircode will no doubt continue for months to come. Let’s make sure it’s informed by facts rather than speculation.
