21 October 2015
Established in 1993, Geo Strategies Ltd offers industry-strength Data Quality Management services, including geocoding. Throughout its history, the company has maintained an uncompromising focus on quality and productivity, supplying industries with relevant digital products and services delivered to full international standards.
The business challenge
Geo Strategies provides business consultancy in Central and Eastern Europe, part of which is geocoding, the process of attaching geographic coordinates to an address. However, the company faced a problem in Romania, where address databases had historically been compiled to a very poor standard, with no standardised spellings. The business challenge facing Geo Strategies was to cleanse and standardise large address databases from Romania, so that it could geocode those addresses quickly and efficiently.
It was not an obvious challenge to undertake as an IT project, as it required considerable understanding of the complexities of a foreign language. However, Dr Peter Lane from the School of Computer Science, building on his skills as an analyst, was able to design and build pattern-matching algorithms to replicate the syntax of the Romanian language.
How did the University help?
As an SME, Geo Strategies found its partnership with the University of Hertfordshire a far better match than working with larger universities closer to home. The University's ability to focus on the specific needs of SMEs and tailor working patterns to suit them was a major consideration for the company.
Ultimately, the project broke down into three phases. Phase one saw the creation of an IT solution that, based on an understanding of the Romanian address system’s syntax, decomposed address information into its logical components (county, street etc.). These components were then compared with a standard lexicon of address names and numbers for geocoding.
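To give a feel for what decomposing an address into logical components can look like, here is a minimal Python sketch. It is not the project's actual code: the abbreviation table, the comma-separated input format, the tiny lexicon and the function names `decompose` and `in_lexicon` are all assumptions made for illustration.

```python
import re

# Hypothetical abbreviation table; a production lexicon would be far larger.
STREET_TYPES = {
    "str": "Strada",
    "strada": "Strada",
    "bd": "Bulevardul",
    "bulevardul": "Bulevardul",
    "calea": "Calea",
}

# Assumed county marker: "jud." or "judetul"/"județul" followed by the county name.
COUNTY_RE = re.compile(r"^(?:jud\.?|judetul|județul)\s+(.+)$", re.IGNORECASE)
# A leading word (optionally ending in a full stop) followed by the street name.
STREET_RE = re.compile(r"^([A-Za-z]+)\.?\s+(.+)$")


def decompose(address):
    """Split a comma-separated address into labelled components."""
    parts = {"street_type": None, "street_name": None,
             "locality": None, "county": None}
    for field in (f.strip() for f in address.split(",")):
        if not field:
            continue
        county = COUNTY_RE.match(field)
        if county:
            parts["county"] = county.group(1).strip()
            continue
        street = STREET_RE.match(field)
        if (street and parts["street_name"] is None
                and street.group(1).lower() in STREET_TYPES):
            parts["street_type"] = STREET_TYPES[street.group(1).lower()]
            parts["street_name"] = street.group(2).strip()
            continue
        # Anything unrecognised is treated as the locality (first one wins).
        if parts["locality"] is None:
            parts["locality"] = field
    return parts


# Hypothetical standard lexicon of (street type, street name, locality) entries.
LEXICON = {("Strada", "Victoriei", "Cluj-Napoca")}


def in_lexicon(parts):
    """Check the decomposed components against the standard lexicon."""
    key = (parts["street_type"], parts["street_name"], parts["locality"])
    return key in LEXICON
```

Calling `decompose("Str. Victoriei, Cluj-Napoca, jud. Cluj")` would label the street type, street name, locality and county separately, after which `in_lexicon` performs an exact lookup. Real address data rarely arrives this cleanly separated, which is where the later phases come in.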
Phase two extended the solution to extract and standardise the numbering system used throughout Eastern Europe. This was particularly demanding and fraught with errors, as it is based on a complex system (including Street number, Block number, Entrance number, Floor, Apartment number) aligned with widespread, high-rise apartment blocks.
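The numbering fields listed above can be pulled out of free text with pattern matching. The Python sketch below is one possible approach under stated assumptions, not the method used in the project: it assumes the common Romanian markers Nr., Bl., Sc., Et. and Ap., and the function name `extract_numbering` is hypothetical.

```python
import re

# Assumed field markers: Nr. (street number), Bl. (block), Sc. (entrance),
# Et. (floor), Ap. (apartment). Each marker must be followed by a full stop
# or whitespace, so that a word such as "Blaj" is not read as "Bl." + "aj".
FIELD_PATTERNS = {
    "street_number": r"\bnr(?:\.\s*|\s+)([0-9]+[A-Za-z]?)",
    "block": r"\bbl(?:\.\s*|\s+)([A-Za-z0-9]+)",
    "entrance": r"\bsc(?:\.\s*|\s+)([A-Za-z0-9]+)",
    "floor": r"\bet(?:\.\s*|\s+)([A-Za-z0-9]+)",
    "apartment": r"\bap(?:\.\s*|\s+)([0-9]+)",
}

COMPILED = {name: re.compile(pattern, re.IGNORECASE)
            for name, pattern in FIELD_PATTERNS.items()}


def extract_numbering(address):
    """Return each numbering field found in the address, or None if absent."""
    result = {}
    for name, regex in COMPILED.items():
        match = regex.search(address)
        result[name] = match.group(1) if match else None
    return result
```

For an input such as `"Str. Mihai Eminescu Nr. 12, Bl. A3, Sc. 2, Et. 4, Ap. 17"`, each field is returned under its own key, and missing fields come back as `None`. A real system would need many more marker variants and validation rules than this sketch covers.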
Phase three devised a system of ‘fuzzy matching’ to correct misspelt words in the Romanian language. This was an intricate task as Romanian is a morphologically rich language with many opportunities for abbreviations and misspelling.
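As a rough illustration of fuzzy matching, the Python sketch below strips diacritics and then uses the standard library's `difflib.get_close_matches` to find the nearest lexicon entry. This is a generic similarity technique swapped in for illustration; the case study does not say which algorithm the project used. The sample lexicon, the 0.8 cutoff and the function names `normalise` and `correct` are assumptions.

```python
import difflib
import unicodedata


def normalise(text):
    """Lower-case, strip diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
    return " ".join(stripped.lower().split())


# Hypothetical lexicon of standard street names.
LEXICON = ["Mihai Eminescu", "Ștefan cel Mare", "Victoriei", "Nicolae Bălcescu"]


def correct(name, lexicon=LEXICON, cutoff=0.8):
    """Return the closest standard spelling, or None if nothing is close enough."""
    index = {normalise(entry): entry for entry in lexicon}
    matches = difflib.get_close_matches(
        normalise(name), list(index), n=1, cutoff=cutoff)
    return index[matches[0]] if matches else None
```

With this sketch, `correct("Mihai Emineschu")` resolves to the standard `"Mihai Eminescu"`, and `correct("Stefan cel Mare")` recovers the spelling with diacritics. The cutoff is a trade-off: lower values catch more misspellings but risk merging names that are genuinely different.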
Real business benefits
The project resulted in an IT solution which allowed Geo Strategies to standardise and geocode a large database of 70 million Romanian addresses and identify duplicates, leaving approximately 5 million individual addresses. This revealed that there was, originally, massive duplication due to spelling or data-entry errors. As a result, the company, equipped with an efficient address-matching, de-duplication and geocoding system, can now expand its consultancy services far beyond its original client base to the whole business community.
Finding such a vast duplication of records was an unexpected discovery from the project, which has allowed the company to start offering a new de-duplication service to large organisations (supermarkets etc.) helping them achieve a Single Customer View (SCV), a vital component of modern customer-centricity.
“We are delighted with the results of collaborating with the University of Hertfordshire. We had an unusual project that spanned a variety of academic disciplines and were able to work with first-class academics, who both understood the true nature of the problem and were able to devise an efficient IT solution.”
Bill Metcalf, Managing Director, Geo Strategies Ltd