Datafiniti Tackles Incomplete Postal Code and Address Formats for 60 Countries With AddressParser

Datafiniti
Knowledge from Data: The Datafiniti Blog
Sep 17, 2018


Datafiniti’s mission is to help companies gain instant access to valuable data sourced from thousands of websites. We curate and clean the data we collect so that businesses can quickly put our products to work in their operations, analytics, and applications. Our love for and curiosity about data does not stop there: we also develop in-house solutions to better meet specific customer needs. Learn more in this latest update about how we provide location data with our Business Data and People Data products.

There are 195 countries across the globe, each with its own postal code and address formats. Without a standard way to handle these variations, developers lose valuable time and face quality control issues. With today’s release of AddressParser, customers will begin to see broader, more complete address coverage for more than 60 countries.

“International customers increasingly drive our sales, but in the past, we’ve built our systems for U.S. data,” said Will Hudgins, data engineer. “This project will help us tackle and deliver data, which has been a thorn in our side, for diverse customer needs.”

AddressParser is an intelligent system that takes an unstructured raw address string and parses it into components, including the street address, city, province, postalCode, and country fields as available in Datafiniti’s Business Data and People Data schema. When address information is incomplete, the system can infer missing values and identify whether an address is U.S. or non-U.S. based on the combination of postalCode, city, and/or province.
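To make the idea of inferring a missing country concrete, here is a minimal Python sketch. It is not Datafiniti’s implementation: the `ParsedAddress` class, the `infer_country` function, and the small lookup tables are all hypothetical, and the postal code patterns are simplified. Only the field names postalCode, city, province, and country come from the description above.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Simplified, illustrative postal code patterns (assumption: real formats
# have more variants, and many more countries share the 5-digit shape).
POSTAL_PATTERNS = {
    "US": re.compile(r"\d{5}(-\d{4})?"),
    "DE": re.compile(r"\d{5}"),
    "GB": re.compile(r"[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}"),
}

# Tiny hypothetical lookup tables; a real system would need full gazetteers.
US_STATES = {"TX", "NY", "WA"}
CITY_HINTS = {"BERLIN": "DE", "AUSTIN": "US", "LONDON": "GB"}


@dataclass
class ParsedAddress:
    address: Optional[str] = None
    city: Optional[str] = None
    province: Optional[str] = None
    postalCode: Optional[str] = None
    country: Optional[str] = None


def infer_country(record: ParsedAddress) -> Optional[str]:
    """Guess the country from postalCode, province, and city, or return None."""
    if record.country:
        return record.country

    # Start with every country whose postal code pattern fits.
    postal = (record.postalCode or "").strip().upper()
    candidates = {
        code for code, pattern in POSTAL_PATTERNS.items()
        if pattern.fullmatch(postal)
    }

    # A known U.S. state abbreviation narrows the candidates to the U.S.
    province = (record.province or "").strip().upper()
    if province in US_STATES:
        candidates &= {"US"}

    # A recognized city narrows the candidates to that city's country.
    city_hint = CITY_HINTS.get((record.city or "").strip().upper())
    if city_hint:
        candidates &= {city_hint}

    # Only answer when exactly one candidate is left.
    return next(iter(candidates)) if len(candidates) == 1 else None
```

In this sketch a five-digit code on its own is ambiguous between the U.S. and Germany, so the function returns `None` until a province or city settles the question. That mirrors why the combination of fields matters more than any single one.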

“For example, in Italy and Spain they often have prepositions in the city names, but sometimes they don’t,” said Hudgins. “There’s a lot of domain-specific knowledge that has to be understood and applied to make this system work.”

The most exciting part of this project for Hudgins was the ability to apply artificial intelligence techniques to his work. AddressParser uses constraint propagation, treating each address like a puzzle. The process starts with a large set of possible solutions and gradually strips candidates away by applying rules tied to the problem. Once all rules, or constraints, are applied, AddressParser can settle on a single value for each field of an address and compile all of the information into one usable data format.
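The elimination process described above can be sketched in a few lines of Python. This is a toy illustration of constraint propagation under stated assumptions, not AddressParser’s actual rules: the comma-based splitting, the `RULES` table, the `KNOWN_COUNTRIES` set, and the `parse` function are all hypothetical.

```python
FIELDS = {"address", "city", "postalCode", "country"}

# Hypothetical country list; a real system would use a complete one.
KNOWN_COUNTRIES = {"united kingdom", "united states", "italy", "spain"}


def has_digit(text: str) -> bool:
    return any(ch.isdigit() for ch in text)


# Illustrative per-field rules: each says whether a segment COULD be that field.
RULES = {
    "address": lambda s: has_digit(s),
    "city": lambda s: not has_digit(s),
    "postalCode": lambda s: has_digit(s) and len(s) <= 10,
    "country": lambda s: s.lower() in KNOWN_COUNTRIES,
}


def parse(raw: str) -> dict:
    """Assign each comma-separated segment to at most one address field."""
    segments = [part.strip() for part in raw.split(",") if part.strip()]

    # Step 1: give every segment the set of fields its rules still allow.
    domains = [{f for f in FIELDS if RULES[f](seg)} for seg in segments]

    # Step 2: propagate. Once a segment has exactly one possible field,
    # no other segment may use that field. Repeat until nothing changes.
    changed = True
    while changed:
        changed = False
        for i, domain in enumerate(domains):
            if len(domain) != 1:
                continue
            (field,) = domain
            for j, other in enumerate(domains):
                if j != i and field in other:
                    other.discard(field)
                    changed = True

    # Step 3: keep only the segments that ended with a single field.
    return {
        next(iter(domain)): seg
        for seg, domain in zip(segments, domains)
        if len(domain) == 1
    }
```

For the input `"221B Baker Street, London, NW1 6XE, United Kingdom"`, the rules alone leave "NW1 6XE" as either an address or a postalCode, and "United Kingdom" as either a city or a country. Propagation resolves both: the first segment can only be the address and "London" can only be the city, so those fields are removed from the remaining segments, leaving one value per field.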

“That AI technique is the linchpin of this project and brings it together to make it work,” said Hudgins.

Are you a current Datafiniti customer? No action is required to start experiencing this upgrade. Broader, more complete coverage of Western and European Union addresses will automatically become visible through the API and Portal as we update our data sets with the new technology. Up next, the team hopes to expand coverage to more of the globe, including India, Mexico, and countries in Africa.

“The more we develop this idea, the more it’s a game changer for Datafiniti and our customers,” said Hudgins.

No matter your industry, Datafiniti has over 100 million records across our Business, People, Product, and Property Data products to take your company to the next level. Are you ready to get started? Click Create a Free Account below to begin on your own, or Request a Demo to talk to a member of our sales team.

If you have any questions, you can contact your customer success representative or email our team at support@datafiniti.co.

Written by Nicholle Shaver, content marketer.
