MetroGIS: Serving the Minneapolis/St. Paul Metropolitan Area
 

Guidelines & Issues for working with Address Data

2010 MetroGIS Address Points Database Specifications

The MetroGIS Address Workgroup has created address point database specifications based on the draft National Address Data Standard.  The MetroGIS specifications are in draft format and will continue to change modestly.  Unresolved issues are highlighted in the draft.  We share this draft because so many people have asked for it.

1997 Address Guidelines
MetroGIS created a set of address data guidelines back in 1997.  While somewhat dated, these guidelines are still useful to many organizations, so we have chosen to keep them on our web site.

Preface
Introduction
Types of Addresses
Elements of a Street Address
Developing Address Data Bases
Address Parsing and Field Concatenation
Relationship Between Parcels and Addresses
Geocoding and Address Matching
Bibliography
Appendix

This paper is intended to provide information and guidance to anyone working with address data. It explains the important issues involved in incorporating address data into a GIS. It also describes potential pitfalls and provides specific examples to help the user understand the issues surrounding the use of address data.

MetroGIS Standards Advisory Team
July 1997

Preface

What is a "Standard"? What is a "Guideline"?
In April of 1993, the GIS Standards Committee of the Minnesota Governor's Council on Geographic Information developed a project plan which included the following standards related terminology. This information provides a useful frame of reference for the discussion of addressing issues and guidelines that follows.

Policy
A high-level overall plan, defining a course or method of action and embracing the general goals and acceptable procedures of a governmental body, to guide and determine present and future decisions. The following concepts are methods to implement policy.

Standard

  • a definite rule, principle or measure,
  • established or formally sanctioned by an authorized body
  • requiring adherence from all organizations statutorily obligated to do so

Convention

  • a general agreement about basic principles or procedures
  • some written description of the agreement is usually prepared
  • a convention can become a standard if formally sanctioned by an authorizing body

Guideline

  • a recommendation that is developed to detail a proposed practice or procedure developed for local ('in-house') use
  • no formal or semi-formal agreement to comply with a guideline is necessary
  • a guideline can evolve to a convention if adopted by a larger user community

Introduction

Why are addresses important?
A Geographic Information System (GIS) relies on the framework that it is built upon. A critical element of this framework is its resolution, or the level to which we can reference unique entities and map them in some way. In the real world, addresses are the most commonly used and smallest unique identifier. They are often used as the primary link between individuals and locations. Addresses are also much more user-friendly than other identifiers such as property identification numbers (PINs). For example, a citizen who wishes to extract data from a GIS would find it much more intuitive to query a data base using an address than by most other means.

What is the benefit to using a standard address format?
Because many datasets are geographically referenced by an address, defining and using a standard address format will increase the ease with which these datasets can be incorporated into the GIS for mapping and analysis. And because addresses are so often used as a means of communication between and within organizations, standardizing addresses will increase an organization's ability to share these datasets with other organizations. Standard addresses can also increase the efficiency of automated applications. For example, they may make locating addresses on an E-911 system more efficient and accurate, or usable over a wider area covering several communities.

Standardizing addresses may also save you money on bulk mailings. The U.S. Postal Service offers reduced bulk mailing costs for those organizations which utilize and adhere to the USPS address standard (discussed below). Furthermore, software products are available which can read addresses and convert them to the appropriate USPS standard, when possible. Even if a bulk mailing is not the intent, these software products may assist in standardizing an address data base which is known to have errors or inconsistencies.

This document is intended to be a guideline, not a standard.

The recommendations outlined in this document are intended to provide information for those who wish to work with addresses and are in no way meant to be adopted as a set of mandatory rules. This guide should help those who wish to create new data bases which contain or access address data. The information should also assist those who work with existing data bases and intend to transfer or share data.

Many different agencies have adopted a variety of address rules for their data bases for good reasons. There is no advantage in undertaking the expense of recreating existing data bases when guidelines can be followed which allow for the data to be transferred into formats which can be used by many other agencies in a wide array of applications.

The U.S. Postal Service Standards
Several different aspects of address information can be considered for standardization. These range from the use of capitalization and punctuation to data base design and address matching procedures. While this document provides some guidelines in a variety of areas, the U.S. Postal Service has developed detailed standards that deal with address "format" and "content" to help the users and developers of address information. We strongly recommend that developers of address information consider the USPS standards when developing or modifying address datasets. Copies of these addressing standards may be obtained from the U.S. Postal Service National Customer Support Center at 1-800-238-3150 or via the Internet at www.usps.gov. Request Postal Addressing Standards (Pub. 28).


Types of Addresses

There are two main types of addresses used:

  • Postal or Mailing Addresses
  • Situs Addresses

In both cases, many different systems have been adopted by different organizations such as the U.S. Postal Service, U.S. West, NSP, Counties, Cities, etc.

POSTAL/MAILING ADDRESSES

  • These are used to contact individuals or organizations.
  • These are used for legal contacts, billings and notifications, for example.
  • As recorded by local government agencies, these addresses are often out of state or even out of the country, and so may not be the same as their related site-specific or situs addresses. (see below)
  • These addresses differentiate among parties of interest in land, for example:
    • property owners for tax notifications
    • business addresses of corporate entities or tenants which differ from the property owner
    • lease holders
  • Post Office Box addresses do not actually record the street details as part of the address for the purposes of mailing. (See the "Elements of a Street Address" section for more details.)

SITUS ADDRESSES

  • These are not used to contact individuals, but to relate features to a specific location.
  • These are site-specific location or service addresses used for such things as:
    • emergency response
    • location/identification of local government infrastructure such as city owned land
    • location of suites, shops or offices within a shopping center or business park
  • These addresses can be used in a variety of ways, for example:
    • Intersection addresses: These may be identified, processed and coded differently from more typical street addresses. (See the "Elements of a Street Address" section for more details.)
    • Landmark addresses: These can be used for geocoding purposes through the use of an address alias. (See the "Elements of a Street Address" section for more details.)
    • Land use identification: Land use information can be attached to these addresses to indicate whether a property is vacant.
  • Situs addresses are a common data element used in local government. Assigning an address to every building and undeveloped parcel (e.g. park, undeveloped lot) can be useful for such things as directing someone to a park or getting an emergency vehicle to a vacant parcel. However, giving an address to vacant parcels also has a drawback in that the address may change once the parcel is developed.

Elements of a Street Address

The following is a breakdown of the key elements that make up a typical street address. Please note that the U.S. Postal Service has developed addressing standards that include capitalization, punctuation and abbreviation of addresses. While the examples below adhere to those standards, they are only a subset of what is available in the Postal Service standards.

  1. Street number. (3186 PILOT KNOB RD)
    The street number is typically an integer value, but it may also include alpha characters, (e.g., 142A or 216 1/2). How these addresses will be located depends upon your geocoding software.
  2. Prefix direction. (156 E 18TH ST)
    The location of a direction designation may vary within an address. Some software products require the directional field to be placed in the prefix position and others in the suffix position in order to ensure the best results in address matching. (e.g., N 1ST AVE or 1ST AVE N).
    USPS street direction standard abbreviations: N, S, E, W, NE, SE, NW, SW
  3. Street name. (3334 CEDAR AVE)
    Care should be taken to ensure that street names are not abbreviated or misspelled. In some cases streets may be known by more than one name. In these cases an alias or cross reference may be needed. Streets with numeric names may need to be entered as 1ST ST rather than FIRST ST.
  4. Street type. (3334 CEDAR AVE)
    Street types need to be entered using USPS recommended abbreviations.
  5. Suffix direction. (1200 34TH ST W)
    (see prefix direction)
  6. Unit Number (14955 GALAXIE AVE STE 300)
    Some common unit designators are APT (Apartment), STE (Suite), DEPT (Department), and the # sign. (See Postal Addressing Standards)
  7. City (MINNEAPOLIS MN 55406)
    Spell city names in their entirety when possible. When it is not possible, use the 13 character abbreviations from the USPS City State File.
  8. State (MINNEAPOLIS MN 55406)
    Use 2 letter USPS State Abbreviations.
  9. Zip Code (MINNEAPOLIS MN 55406)
    Zip Code or Zip+4 Number

Other Types of Addresses

  • Intersections are a common type of address. You may want to identify, process and code these differently than street addresses. For example, one common method is to separate the intersecting streets with a space - forward slash - space (e.g., PILOT KNOB RD / 150TH ST W).
  • Landmarks are another useful address type. Because a landmark name is not itself in a geocodable format, it may be useful to store it as an alias for a street address (e.g., DAKOTA COUNTY COURTHOUSE).
  • A Post Office Box, with no street address. (e.g., PO BOX 146, HAMPTON MN 55031)
  • These are just a few examples. For questions concerning the proper coding of these and other addresses, consult the USPS Postal Address Standards booklet.
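
As a rough illustration of coding these address types differently, the sketch below sorts an address string into one of the types described above. The function name and the returned labels are hypothetical, and the rules are deliberately simple: a leading "PO BOX", the space - forward slash - space intersection separator, or a leading street number. Anything else is treated as a possible landmark.

```python
import re


def classify_address(text: str) -> str:
    """Return a rough address type for an address string (illustrative rules only)."""
    # Normalize case and collapse repeated whitespace before testing.
    cleaned = " ".join(text.upper().split())
    if cleaned.startswith("PO BOX "):
        return "po_box"
    # Intersections use the space - forward slash - space separator.
    if " / " in cleaned:
        return "intersection"
    # A street address starts with a number, optionally followed by a letter (142A).
    if re.match(r"^\d+[A-Z]?\s", cleaned):
        return "street"
    return "landmark_or_other"


print(classify_address("PILOT KNOB RD / 150TH ST W"))
print(classify_address("DAKOTA COUNTY COURTHOUSE"))
```

A fractional number such as 216 1/2 is still classified as a street address, because the slash in "1/2" has no surrounding spaces.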

Other Things to Consider

  • To help resolve duplicate matching addresses (e.g., 150 MAIN ST) which may exist in more than one location in a City or County, you may need to use additional information (e.g., city or ZIP code).
  • Be aware that for bulk mailers, the Postal Service has started using the concept of Zip plus four plus two. The plus two is the two right-most characters of the house number. The Zip + 4 + 2 values are then sorted by ASCII value, with the odd and even + 2 values separated into their own lists within each unique Zip + 4 group. In theory, this yields a mail sort in the actual delivery order, so the bundle of mail can be given directly to the mail carrier without preprocessing. The real message here is that if your organization does bulk mailings, make the effort to understand the Zip + 4 + 2 sorting.
  • Any particular geocoding software may allow some flexibility in matching addresses, but careful data entry following these recommendations will help ensure accurate matching!
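
The Zip + 4 + 2 sort described above can be sketched as follows. This is only an illustration of the description in this guideline, not the Postal Service's actual procedure. The function names are hypothetical, house numbers are assumed to be all digits, and padding a one-digit house number with a leading zero is an assumption made here.

```python
from collections import defaultdict


def plus2(house_number: str) -> str:
    """Two right-most characters of the house number (zero-padded: an assumption)."""
    return house_number.strip()[-2:].rjust(2, "0")


def sort_for_delivery(records):
    """Group (zip4, house_number) pairs by Zip + 4, split odd/even, sort each list."""
    groups = defaultdict(lambda: {"odd": [], "even": []})
    for zip4, house_number in records:
        key = plus2(house_number)
        # Assumes an all-digit house number so that parity is well defined.
        side = "odd" if int(key) % 2 else "even"
        groups[zip4][side].append(key)
    for sides in groups.values():
        sides["odd"].sort()   # string sort, i.e. by ASCII value
        sides["even"].sort()
    return dict(groups)


mail = [("55124-8579", "3186"), ("55124-8579", "3101"),
        ("55124-8579", "3190"), ("55124-8579", "3155")]
print(sort_for_delivery(mail))
```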

Developing Address Data Bases

In addition to having standardized addresses, it is also important to have a well designed address data base. In this section you will find helpful tips and a good example of a single file for storing addresses. For storing large amounts of address information, or for using address data with a variety of other datasets, a relational data model is highly recommended.

Address Standardization Software
Several software products are available which can read addresses and convert them to the appropriate USPS standard, when possible. While this can save you money on bulk mailings, it can also assist in standardizing an address data base which is known to have errors or inconsistencies. Two examples of these products are Acumail and Postal Soft.

Address Tips from URISA
The Urban and Regional Information Systems Association (URISA) develops and presents workshops on a variety of topics related to GIS. Below are some tips from a workshop on addresses presented by Peirce Eichelberger at GIS/LIS96 in Denver, CO.

  • Many problems can be avoided by developing the proper data base model up front
  • Maintain a cross reference of street name synonyms (e.g. both Main St. and Hwy. 5). This will require a link between the tabular data base and the street layer in the GIS.
  • Normalize the data base, e.g. enter a given street name only once (spelled correctly). Thus, if there are 100 parcels on the street, the street name is only in the database once. When entering the address, the system should look up the street name and put the foreign key with the parcel (see Appendix C).
  • Have a domain table (see Appendix C) for street types (e.g. RD, ST, AV, etc.) and have a data entry application that allows only these to be entered (e.g. it could automatically change AVE or Ave to AV).
  • The previous two practices help people do their job more accurately and faster. They will actually thank you for implementing these things.
  • Suggest data standards for everyone working with addresses.
  • The addressing application, or at least the data base, should be outside of the GIS software (e.g. not in INFO tables). Otherwise too much of it is hidden and not accessible to all of the numerous non-GIS applications for addresses.
  • Below is an example of a single file for address data. A relational data model is highly recommended for storing large amounts of data.

Dakota Co. Address File Example

FIELD NAME   FIELD DESCRIPTION              FIELD LENGTH

ST_NUMB      HOUSE NUMBER                   10 CHARACTER
ST_PDIR      PREFIX STREET DIRECTION        2 CHARACTER
ST_NAME      STREET NAME                    20 CHARACTER
ST_TYPE      STREET TYPE                    4 CHARACTER
ST_SDIR      SUFFIX STREET DIRECTION        2 CHARACTER
CITY         CITY                           27 CHARACTER
UNIT         UNIT NUMBER                    9 CHARACTER
STATE        STATE                          2 CHARACTER
ZIP_CODE     ZIP CODE INCLUDES PLUS 4       10 CHARACTER
PLUS2        ZIP CODE EXTENSION             2 CHARACTER

(THE NEXT 4 COULD BE COMBINED INTO A SINGLE FIELD CALLED "OTHER_ADD")

ST_INT       INTERSECTION                   45 CHARACTER
LANDMARK     BUILDING NAME                  45 CHARACTER
PO_BOX       POST OFFICE BOX                12 CHARACTER
OTHERADD     NON STANDARD                   45 CHARACTER
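
As a contrast to the single file above, the normalization and domain-table tips might be sketched with Python's built-in sqlite3 module as shown below. The table names (street_type, street, address), the helper functions, and the three sample street types are hypothetical and not part of the Dakota County design. The st_numb, st_name and st_type columns borrow their names from the file above.

```python
import sqlite3


def build_address_db() -> sqlite3.Connection:
    """Create an in-memory data base with a street-type domain table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
    conn.executescript("""
        CREATE TABLE street_type (abbr TEXT PRIMARY KEY);
        CREATE TABLE street (
            street_id INTEGER PRIMARY KEY,
            st_name   TEXT NOT NULL,
            st_type   TEXT NOT NULL REFERENCES street_type(abbr),
            UNIQUE (st_name, st_type)
        );
        CREATE TABLE address (
            address_id INTEGER PRIMARY KEY,
            st_numb    TEXT NOT NULL,
            street_id  INTEGER NOT NULL REFERENCES street(street_id)
        );
    """)
    # Domain table: only these street types may be entered (sample values).
    conn.executemany("INSERT INTO street_type VALUES (?)",
                     [("RD",), ("ST",), ("AVE",)])
    return conn


def add_address(conn, st_numb, st_name, st_type):
    """Store the street name once, then attach its key to the address."""
    conn.execute("INSERT OR IGNORE INTO street (st_name, st_type) VALUES (?, ?)",
                 (st_name, st_type))
    (street_id,) = conn.execute(
        "SELECT street_id FROM street WHERE st_name = ? AND st_type = ?",
        (st_name, st_type)).fetchone()
    conn.execute("INSERT INTO address (st_numb, street_id) VALUES (?, ?)",
                 (st_numb, street_id))


conn = build_address_db()
add_address(conn, "3334", "CEDAR", "AVE")
add_address(conn, "3400", "CEDAR", "AVE")  # reuses the existing CEDAR AVE row
```

Entering a street type that is not in the domain table (for example AVENUE) raises sqlite3.IntegrityError, which is the kind of data entry check the tips above describe.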


Address Parsing and Field Concatenation

Field Concatenation
The parts of an address being stored in separate fields of a database may need to be appended or joined together. The process of putting these address fields together is called concatenation. It is easier to put fields of data together or concatenate them, than it is to break fields of data apart or parse them.

Fields are Stored as Character Fields
Because these fields need to be concatenated together, the fields should be stored as character fields. If the address fields are stored as numbers, the fields will be added like numbers instead of appended together like characters.

Another reason for storing the address fields as character fields is that they contain non-numeric data. For example, ZIP_CODE will contain a dash between the first five digits and the ZIP+4 portion of the zip code, e.g. 55124-8579.
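
A minimal sketch of field concatenation, assuming each record is a Python dictionary keyed by the field names from the Dakota Co. address file example (the function name and the field order are assumptions):

```python
# Field order follows the street address elements: number, prefix direction,
# street name, street type, suffix direction, unit.
FIELD_ORDER = ["ST_NUMB", "ST_PDIR", "ST_NAME", "ST_TYPE", "ST_SDIR", "UNIT"]


def concatenate_address(record: dict) -> str:
    """Join the non-empty character fields with single spaces."""
    parts = [record.get(field, "").strip() for field in FIELD_ORDER]
    return " ".join(part for part in parts if part)


record = {"ST_NUMB": "1200", "ST_PDIR": "", "ST_NAME": "34TH",
          "ST_TYPE": "ST", "ST_SDIR": "W", "UNIT": ""}
print(concatenate_address(record))

# Why character fields matter: numbers are added, characters are appended.
print(1200 + 34)            # numeric addition gives 1234
print("1200" + " " + "34")  # character concatenation gives "1200 34"
```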

Software may Determine Need
The software being used may determine how the address data will need to be formatted. For example, to perform address matching in Arc/Info or ArcView, the address fields need to be one field of data.

Address Parsing
Address parsing is the process of taking an address entered as a single field of data and breaking it into the component fields of the address.
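
To show why parsing is harder than concatenation, here is a simplified sketch of a parser for street addresses. It is not how any particular product works. The function name is hypothetical, the street type set is a tiny sample rather than the USPS list, and the output keys reuse the field names from the Dakota Co. address file example.

```python
DIRECTIONS = {"N", "S", "E", "W", "NE", "SE", "NW", "SW"}
STREET_TYPES = {"AVE", "RD", "ST"}            # tiny illustrative subset
UNIT_DESIGNATORS = {"APT", "STE", "DEPT", "#"}


def parse_street_address(text: str) -> dict:
    """Split a one-field street address into component fields (simplified)."""
    fields = dict.fromkeys(
        ["ST_NUMB", "ST_PDIR", "ST_NAME", "ST_TYPE", "ST_SDIR", "UNIT"], "")
    tokens = text.upper().split()
    # Everything from the first unit designator onward is the unit.
    for i, token in enumerate(tokens):
        if token in UNIT_DESIGNATORS:
            fields["UNIT"] = " ".join(tokens[i:])
            tokens = tokens[:i]
            break
    if tokens and tokens[0][0].isdigit():
        fields["ST_NUMB"] = tokens.pop(0)
    # Only treat a token as a direction or type if a street name would remain.
    if len(tokens) > 1 and tokens[0] in DIRECTIONS:
        fields["ST_PDIR"] = tokens.pop(0)
    if len(tokens) > 1 and tokens[-1] in DIRECTIONS:
        fields["ST_SDIR"] = tokens.pop()
    if len(tokens) > 1 and tokens[-1] in STREET_TYPES:
        fields["ST_TYPE"] = tokens.pop()
    fields["ST_NAME"] = " ".join(tokens)
    return fields


print(parse_street_address("156 E 18TH ST"))
print(parse_street_address("14955 GALAXIE AVE STE 300"))
```

Even this small sketch has to guess. A fractional house number such as 216 1/2 would leave "1/2" in the street name, and a street that is itself named for a direction or a street type is ambiguous. Cases like these are the reason it is easier to store the parts separately and concatenate them when needed.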


Relationship Between Parcels and Addresses

County governments (and some cities) are responsible for identifying parcels for the purposes of property taxation or the recording of an interest in a piece of land. In order to uniquely identify each piece of land, a PIN or Property Identification Number is assigned.

In many cases, there is a direct correlation between the PIN and the property situs address. Such simple cases occur with single family residences where one house is owner-occupied and the PIN can be linked to the situs address which will also be used as the postal address.

There are many exceptions to this situation though, such as non-homestead residential properties where the PIN relates to a situs address, but a different postal address may be used for taxation purposes.

Many PINs relate to parcels where multiple tenancy exists, either in business or for residences. In these cases, there is a one-to-many relationship, with the PIN linked to the corporate taxable property, for example. The same PIN, though, should be attached to each of the numerous tenant addresses, which themselves may be situs addresses and/or postal addresses.

PINs may also relate to undeveloped parcels. Such parcels could nevertheless be given a situs address, which helps local government determine future addressing accurately and direct emergency response to a particular location in the community.

PINs can relate to non-contiguous land which is under the same ownership, such as a farm with separate land holdings. In this case, the PIN must be linked to the multiple situs addresses of each separate piece of land and also to the postal address for the entire taxable entity.


Geocoding and Address Matching

Geocoding is the process of creating geographic coordinates for geographically referenced tabular data. In other words, a geocoding process will allow one to derive precise coordinates on the surface of the earth for things like

  • addresses
  • mile posts
  • parcel identification numbers (PINs), and
  • public land survey information (PLS).

One form of geocoding is address matching. Because addresses are the geographic identifier for many databases (for example a database of city residents), one can map a variety of information in such a data base by using an address matching process. Through this process, an address can be matched to data in a GIS and a geographic location or coordinate (longitude and latitude or X and Y value) can be assigned.

Street Centerlines
One type of GIS layer is a "street centerline" layer. While this layer sounds as though it is used to map the yellow line in the middle of roads, it is in fact used for defining the location of roads in general and also to define the location of addresses. This is done by assigning the range of addresses that exist on any particular segment of a road or street. A centerline layer will store geographic coordinates for the end points of the street segments, designating them as a from point and a to point. Then, if the "from" and "to" address on each side of the street is defined (address range), addresses can be geocoded by interpolating between these points and creating coordinates at an offset left or right of the centerline.

In this example 1151 State St. is an interpolated point based on the address ranges.

Thus, in order to perform address matching functions with a street centerline layer, for each street segment that layer must contain information about street number, street name, prefix direction, suffix direction, street type, and address ranges based on four fields:

  1. from address on the left side of the street
  2. from address on the right side of the street
  3. to address on the left side of the street
  4. to address on the right side of the street.
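
The interpolation described above can be sketched as follows. This is a minimal illustration, not the method of any particular GIS package. The Segment class, the function names and the coordinates are hypothetical. It assumes planar X and Y coordinates, a straight segment, and odd and even house numbers on opposite sides of the street.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Segment:
    from_xy: Tuple[float, float]   # coordinates of the "from" end point
    to_xy: Tuple[float, float]     # coordinates of the "to" end point
    from_left: int
    to_left: int
    from_right: int
    to_right: int


def _in_range(n: int, a: int, b: int) -> bool:
    return min(a, b) <= n <= max(a, b)


def geocode_on_segment(seg: Segment, house_number: int,
                       offset: float = 0.0) -> Optional[Tuple[float, float]]:
    """Interpolate a house number along a segment; None if it does not match."""
    # Pick the side whose range has the same parity and contains the number.
    if (house_number % 2 == seg.from_left % 2
            and _in_range(house_number, seg.from_left, seg.to_left)):
        start, end, side = seg.from_left, seg.to_left, 1.0
    elif (house_number % 2 == seg.from_right % 2
            and _in_range(house_number, seg.from_right, seg.to_right)):
        start, end, side = seg.from_right, seg.to_right, -1.0
    else:
        return None
    # Fraction of the way from the "from" point to the "to" point.
    t = 0.5 if end == start else (house_number - start) / (end - start)
    (x0, y0), (x1, y1) = seg.from_xy, seg.to_xy
    dx, dy = x1 - x0, y1 - y0
    x, y = x0 + t * dx, y0 + t * dy
    # Shift perpendicular to the centerline: (-dy, dx) points to the left.
    length = math.hypot(dx, dy)
    if length > 0 and offset:
        x += side * offset * (-dy / length)
        y += side * offset * (dx / length)
    return (x, y)


state_st = Segment((0.0, 0.0), (100.0, 0.0), 1101, 1201, 1100, 1200)
print(geocode_on_segment(state_st, 1151, offset=10.0))
```

With these made-up ranges, 1151 falls halfway along the left side of the segment, so the point lands at the segment midpoint, offset 10 units to the left of the centerline.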

Unfortunately addresses are often entered into databases (coded) inconsistently. Spelling errors and non-standard abbreviations affect the number of hits (matches) on a particular database. Addresses that are not matched can be processed one at a time, and spelling errors and non-standard abbreviations can be fixed during this process of reject processing.

Address Ranges, Actual vs. Theoretical
When putting address ranges in a database of a street centerline layer, a decision must be made on whether to use actual address ranges or theoretical address ranges. For example, a given street segment might have three houses with the house numbers of 123, 135 and 143 (actual), while the city has designated that houses along this street should be numbered between 100 and 200 (theoretical address range). The actual address range is much more accurate and reflects reality (123 - 143), but if a new house is built or a new address is otherwise added, the address range may need to be modified. If theoretical ranges are used, then no maintenance of the address range is needed.

Your intended use of the data should help determine which method is preferable. Your existing data sources may also be a factor in your decision. For example, you might not have the actual address ranges. Conversely, there may be no theoretical address ranges in some areas.
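
To see how the choice of range changes the interpolated position, the short sketch below computes how far along the segment house number 135 would be placed under each range from the example above. The function name is hypothetical, and simple linear interpolation is assumed.

```python
def fraction_along(house_number: int, range_from: int, range_to: int) -> float:
    """Fraction of the segment length at which a house number is placed."""
    if range_to == range_from:
        return 0.5  # a one-address range: place the point mid-segment
    return (house_number - range_from) / (range_to - range_from)


# House number 135 under the actual range (123 - 143): 12 / 20
print(fraction_along(135, 123, 143))
# House number 135 under the theoretical range (100 - 200): 35 / 100
print(fraction_along(135, 100, 200))
```

The same house is placed 60 percent of the way along the segment with the actual range, but only 35 percent of the way along with the theoretical range.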

Address Points
Address ranges, by their very nature, are imprecise. They can only offer an approximation of the actual location of a specific address. Address points, however, can be considerably more accurate with regard to geography. For example, in the diagram below, 1145 State St. would appear to the left of the midpoint of the street segment with an interpolated address range. The address point shows its actual location which is to the right of the street segment midpoint.

Centroids of parcels may provide address points that are suitable for some purposes. Remember though, that one parcel can have many addresses, and one address can cover many parcels. Again, the use of the dataset should help determine what type of dataset is needed.

Geocoding Other Types of Data
Mile posts work very similarly to addresses. A mile post number can be assigned to each line segment in a street centerline layer. Through an interpolation process, like geocoding by address, a geographic coordinate can be created for a mile post. The difference between mile posts and addresses is that mile posts do not have left and right fields coded. This is because mile posts normally apply to the centerline or physical road surface while addresses generally apply to residences or businesses located back from the road surface.

A Parcel Identification Number (PIN) is a unique identifier associated with a defined piece of land for purposes of tracking taxation information. Sometimes non-taxation related databases will also include PINs (e.g. city owned land, crime locations, etc.). Since a PIN is generally associated with a point or polygon in a parcel GIS layer, any database with the PIN in it can be geocoded by matching the PINs. Geocoding by PINs assures that the related databases are assigned to the correct parcel. The down side to geocoding with PINs is that PINs in the GIS parcel layer change over time, and some are even retired (no longer used). Because of this, all of the databases that use PINs must be maintained (updated) in conjunction with the GIS parcel layer in order to keep the geocoding process accurate.
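
Geocoding by PIN amounts to a lookup against the parcel layer. In the sketch below, the function name, the record layout and the PIN values are made up for illustration. It shows how a record carrying a retired PIN falls out as a reject, which is why the related databases need to be maintained together with the parcel layer.

```python
def geocode_by_pin(records, parcel_points):
    """Attach parcel coordinates to records by PIN; return (matched, rejects)."""
    matched, rejects = [], []
    for record in records:
        point = parcel_points.get(record["PIN"])
        if point is None:
            # PIN not in the current parcel layer (changed or retired).
            rejects.append(record)
        else:
            matched.append({**record, "x": point[0], "y": point[1]})
    return matched, rejects


# Hypothetical parcel layer: PIN -> coordinate of the parcel point
parcel_points = {"01-001": (512.0, 204.5), "01-002": (530.0, 204.5)}
records = [{"PIN": "01-001", "use": "city owned land"},
           {"PIN": "01-999", "use": "crime location"}]  # retired PIN

matched, rejects = geocode_by_pin(records, parcel_points)
print(len(matched), "matched,", len(rejects), "rejected")
```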

In the Public Land Survey System (PLS), the aliquot parts of a section, like quarter or quarter-quarter, can be used for geocoding as well. The PLS GIS layer has the sections (square miles) of townships broken down into these aliquot parts. The fields used for this are: section number, township number, range number, quarter section, and quarter-quarter section (forty). The number of quarters will determine how accurate an interpolated geographic coordinate will be. For instance, a quarter section code will result in about 1320 feet of possible error while a quarter-quarter section code will result in about 660 feet of possible error.
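
The error figures above follow from simple arithmetic. The sketch below assumes a nominal section of exactly one mile (5280 feet) on a side and measures the error from the center of the aliquot part to the middle of its edge. The distance to a corner would be larger. The function name is hypothetical.

```python
SECTION_SIDE_FT = 5280  # nominal section: one mile on a side (an assumption)


def center_to_edge_error_ft(quarter_levels: int) -> float:
    """Distance from the center of an aliquot part to the middle of its edge.

    quarter_levels = 1 is a quarter section, 2 is a quarter-quarter section.
    """
    side = SECTION_SIDE_FT / (2 ** quarter_levels)  # each level halves the side
    return side / 2


print(center_to_edge_error_ft(1))  # quarter section: 2640 ft side, 1320 ft
print(center_to_edge_error_ft(2))  # quarter-quarter section: 1320 ft side, 660 ft
```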


Bibliography

Postal Addressing Standards, U.S. Postal Service, Publication 28, August 1995

Copies of these addressing standards may be obtained from the U.S. Postal Service National Customer Support Center at 1-800-238-3150 or via the Internet at www.usps.gov. Request Postal Addressing Standards (Pub. 28).

Eichelberger, Peirce. "The Importance of Addresses: The Locus of GIS" in 1993 URISA Proceedings pp. 212-222

Appendix

Example of creating mailing labels by buffering a parcel using ArcView and Microsoft Word.

Page last updated on February 10, 2010.