ArcGIS API for Python

Part 4 - Batch Geocoding

Introduction

The batch_geocode() function in the arcgis.geocoding module geocodes an entire list of addresses in one call. Geocoding many addresses at once is also known as bulk geocoding. You can use this function to find the following types of locations:

  • Street addresses (e.g. 27488 Stanford Ave, Bowden, North Dakota, or 380 New York St, Redlands, CA 92373)
  • Administrative place names, such as city, county, state, province, or country names (e.g. Seattle, Washington, State of Mahārāshtra, or Liechtenstein)
  • Postal codes (e.g. 92591 or TW9 1DN)

Batch sizes (max and suggested batch sizes)

The geocoder limits the number of addresses that can be geocoded in a single batch request. The MaxBatchSize property defines this limit. For instance, if MaxBatchSize=2000 and 3000 addresses are sent as input, only the first 2000 will be geocoded. The SuggestedBatchSize property is also useful, as it specifies the optimal number of addresses to include in a single batch request.

Both of these properties can be determined by querying the geocoder:

In [1]:
from arcgis.gis import GIS
from arcgis.geocoding import get_geocoders, batch_geocode
In [2]:
gis = GIS(profile="your_enterprise_profile")
In [3]:
# use the first of GIS's configured geocoders
geocoder = get_geocoders(gis)[0]
In [4]:
print("For current geocoder:")
print(" - MaxBatchSize: " + str(geocoder.properties.locatorProperties.MaxBatchSize))
print(" - SuggestedBatchSize: " + str(geocoder.properties.locatorProperties.SuggestedBatchSize))
For current geocoder:
 - MaxBatchSize: 1000
 - SuggestedBatchSize: 150
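
Because a request larger than MaxBatchSize is truncated, a long address list needs to be split before it is sent. The following is a minimal sketch and not part of the arcgis API: chunk_addresses and geocode_in_batches are hypothetical helper names, and the geocode_fn argument stands in for a call such as batch_geocode. The batch size of 150 matches the SuggestedBatchSize printed above, and the sample input is made of placeholder strings.

```python
def chunk_addresses(addresses, batch_size):
    """Yield successive slices of `addresses`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(addresses), batch_size):
        yield addresses[start:start + batch_size]


def geocode_in_batches(addresses, batch_size, geocode_fn):
    """Call `geocode_fn` once per chunk and concatenate the results.

    `geocode_fn` is a placeholder for any callable that takes a list of
    addresses and returns a list of results (for example, batch_geocode).
    """
    results = []
    for chunk in chunk_addresses(addresses, batch_size):
        results.extend(geocode_fn(chunk))
    return results


# Illustrative input: 400 placeholder strings, not real addresses.
sample = ["address {}".format(i) for i in range(400)]
chunk_sizes = [len(chunk) for chunk in chunk_addresses(sample, 150)]
print(chunk_sizes)  # [150, 150, 100]
```

With a live connection, a call along the lines of geocode_in_batches(addresses, geocoder.properties.locatorProperties.SuggestedBatchSize, batch_geocode) should keep every request within the suggested size, assuming batch_geocode returns a list as in the examples on this page.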

Batch geocode single line addresses, multi-line addresses

The batch_geocode() function supports searching for lists of places and addresses. Each address in the list can be specified as a single line of text (single field format), or in multi-field format with the address components separated into multiple parameters.

The code snippet below calls help() on the batch_geocode function (imported earlier) to display its signature and parameters along with a brief description:

In [6]:
help(batch_geocode)
Help on function batch_geocode in module arcgis.geocoding._functions:

batch_geocode(addresses, source_country=None, category=None, out_sr=None, geocoder=None, as_featureset=False, match_out_of_range=True, location_type='street', search_extent=None, lang_code='EN', preferred_label_values=None)
    The batch_geocode() function geocodes an entire list of addresses.
    Geocoding many addresses at once is also known as bulk geocoding.
    
    =========================     ================================================================
    **Argument**                  **Description**
    -------------------------     ----------------------------------------------------------------
    addresses                     required list of strings or dictionaries.
                                  A list of addresses to be geocoded.
                                  For passing in the location name as a single line of text -
                                  single field batch geocoding - use a string.
                                  For passing in the location name as multiple lines of text
                                  multifield batch geocoding - use the address fields described
                                  in the Geocoder documentation.
                                  The maximum number of addresses that can be geocoded in a
                                  single request is limited to the SuggestedBatchSize property of
                                  the locator.
                                  Syntax:
                                  addresses = ["380 New York St, Redlands, CA",
                                    "1 World Way, Los Angeles, CA",
                                    "1200 Getty Center Drive, Los Angeles, CA",
                                    "5905 Wilshire Boulevard, Los Angeles, CA",
                                    "100 Universal City Plaza, Universal City, CA 91608",
                                    "4800 Oak Grove Dr, Pasadena, CA 91109"]
    
                                  OR
    
                                  addresses= [{
                                       "Address": "380 New York St.",
                                       "City": "Redlands",
                                       "Region": "CA",
                                       "Postal": "92373"
                                   },{
                                       "Address": "1 World Way",
                                       "City": "Los Angeles",
                                       "Region": "CA",
                                       "Postal": "90045"
                                   }]
    -------------------------     ----------------------------------------------------------------
    source_country                optional string, The source_country parameter is
                                  only supported by geocoders published using StreetMap
                                  Premium locators.
                                  Added at 10.3 and only supported by geocoders published
                                  with ArcGIS 10.3 for Server and later versions.
    -------------------------     ----------------------------------------------------------------
    category                      The category parameter is only supported by geocode
                                  services published using StreetMap Premium locators.
    -------------------------     ----------------------------------------------------------------
    out_sr                        optional dictionary, The spatial reference of the
                                  x/y coordinates returned by a geocode request. This
                                  is useful for applications using a map with a spatial
                                  reference different than that of the geocode service.
    -------------------------     ----------------------------------------------------------------
    as_featureset                 optional boolean, if True, the result set is
                                  returned as a FeatureSet object, else it is a
                                  dictionary.
    -------------------------     ----------------------------------------------------------------
    geocoder                      Optional, the geocoder to be used. If not specified,
                                  the active GIS's first geocoder is used.
    -------------------------     ----------------------------------------------------------------
    match_out_of_range            Optional, A Boolean which specifies if StreetAddress matches should
                                  be returned even when the input house number is outside of the house
                                  number range defined for the input street.
    -------------------------     ----------------------------------------------------------------
    location_type                 Optional, Specifies if the output geometry of PointAddress matches
                                  should be the rooftop point or street entrance location. Valid values
                                  are rooftop and street.
    -------------------------     ----------------------------------------------------------------
    search_extent                 Optional, a set of bounding box coordinates that limit the search
                                  area to a specific region. The input can either be a comma-separated
                                  list of coordinates defining the bounding box or a JSON envelope
                                  object.
    -------------------------     ----------------------------------------------------------------
    lang_code                     Optional, sets the language in which geocode results are returned.
                                  See the table of supported countries for valid language code values
                                  in each country.
    -------------------------     ----------------------------------------------------------------
    preferred_label_values        Optional, allows simple configuration of output fields returned
                                  in a response from the World Geocoding Service by specifying which
                                  address component values should be included in output fields. Supports
                                  a single value or a comma-delimited collection of values as input.
                                  e.g. ='matchedCity,primaryStreet'
    =========================     ================================================================
    
    :returns:
       dictionary or FeatureSet

The addresses parameter takes a list of addresses to be geocoded. Each item can be given in one of two forms:

  • a single line of text (single field batch geocoding): use a string.
  • multiple fields (multifield batch geocoding): use a dictionary keyed by the address fields described in Part 3.

The Geocoder provides localized versions of the input field names in all the locales it supports.
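
Multi-field input is often assembled from tabular data. The sketch below uses only the standard library to turn CSV rows into the list-of-dictionaries form shown in the help text. The field names Address, City, Region, and Postal come from that help text. The column names street, city, state, and zip, the rows_to_multifield helper, and the inline CSV are assumptions made for illustration.

```python
import csv
import io

# Hypothetical mapping from CSV column names to the geocoder's input fields.
COLUMN_TO_FIELD = {
    "street": "Address",
    "city": "City",
    "state": "Region",
    "zip": "Postal",
}


def rows_to_multifield(rows):
    """Convert an iterable of row dictionaries into multi-field address dictionaries."""
    return [
        {field: row[column].strip() for column, field in COLUMN_TO_FIELD.items()}
        for row in rows
    ]


# Inline CSV standing in for a file on disk.
csv_text = """street,city,state,zip
380 New York St.,Redlands,CA,92373
1 World Way,Los Angeles,CA,90045
"""

multifield_addresses = rows_to_multifield(csv.DictReader(io.StringIO(csv_text)))
print(multifield_addresses[0])
```

The resulting list has the same shape as the multi-field example in the help text, so it could be passed to batch_geocode() directly.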

Single Line Addresses

In [7]:
addresses = ["380 New York St, Redlands, CA", 
             "1 World Way, Los Angeles, CA",
             "1200 Getty Center Drive, Los Angeles, CA", 
             "5905 Wilshire Boulevard, Los Angeles, CA",
             "100 Universal City Plaza, Universal City, CA 91608",
             "4800 Oak Grove Dr, Pasadena, CA 91109"]
In [8]:
results = batch_geocode(addresses)
In [12]:
map0 = gis.map("Los Angeles", 9)
map0
In [10]:
for address in results:
    map0.draw(address['location'])
    print(address['score'])
100
100
100
100
100
98.18

Each match has keys for score, location, attributes and address:

In [11]:
results[0].keys()
Out[11]:
dict_keys(['address', 'location', 'score', 'attributes'])
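
Since each match carries a score, a common follow-up is to separate confident matches from those that need manual review. The following is a minimal sketch, assuming results shaped like the dictionaries above. The split_by_score helper is hypothetical and the threshold of 99 is arbitrary. The mock_results list is hand-written for illustration: only the two score values are taken from the output above, and the addresses and coordinates are placeholders.

```python
def split_by_score(results, min_score):
    """Return (accepted, needs_review) lists based on each match's score."""
    accepted = [r for r in results if r["score"] >= min_score]
    needs_review = [r for r in results if r["score"] < min_score]
    return accepted, needs_review


# Mock matches with the same keys as a batch_geocode() result (placeholder values).
mock_results = [
    {"address": "match A", "location": {"x": 0.0, "y": 0.0}, "score": 100, "attributes": {}},
    {"address": "match B", "location": {"x": 0.0, "y": 0.0}, "score": 98.18, "attributes": {}},
]

accepted, needs_review = split_by_score(mock_results, 99)
print(len(accepted), len(needs_review))  # 1 1
```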

Multi-line Addresses

The earlier example showed how to call batch_geocode() with single line addresses. The following example illustrates how to call batch_geocode() with a list of multi-field addresses.

In [13]:
addresses= [{
                "Address": "380 New York St.",
                "City": "Redlands",
                "Region": "CA",
                "Postal": "92373"
            },{
                "Address": "1 World Way",
                "City": "Los Angeles",
                "Region": "CA",
                "Postal": "90045"
            }]
In [16]:
results = batch_geocode(addresses)
In [19]:
map1 = gis.map("Los Angeles", 9)
map1