ArcGIS API for Python

Part 2 - Locating Addresses

Geocode Address

The geocode() function supports searching for places and addresses either in a single field format or in a multi-field format with the address components separated into multiple parameters.

The code below imports the geocode function and displays its signature and parameters, along with a brief description:

In [1]:
from arcgis.geocoding import geocode
In [10]:
help(geocode)
Help on function geocode in module arcgis.geocoding._functions:

geocode(address, search_extent=None, location=None, distance=None, out_sr=None, category=None, out_fields='*', max_locations=20, magic_key=None, for_storage=False, geocoder=None, as_featureset=False, match_out_of_range=True, location_type='street', lang_code=None, source_country=None)

Single-line address

geocode() requires only the address parameter. Optional parameters such as search_extent can be added to fine-tune the search results, as shown later in this section.

  • The address parameter specifies the location to be geocoded. For a single-line address, this is a string containing, for example, a street address, place name, postal code, or POI.
In [2]:
from arcgis.gis import GIS
gis = GIS(profile = "your_enterprise_profile") # log in with your own enterprise or AGOL profile configuration

As seen in Part 1, geocoding a single-line address returns a list of matched results. The more detail the address string provides, the more refined the result set becomes.

In [3]:
single_line_address = "380 New York Street, Redlands, CA 92373"
In [4]:
# geocode the single line address
esrihq = geocode(single_line_address)
len(esrihq)
Out[4]:
1

If we loosen up on the single-line address definition, e.g. remove the city and state name, then the length of the returned result set increases to 20.

In [5]:
single_line_address = "380 New York Street"
In [6]:
# geocode the single line address
esrihq = geocode(single_line_address)
len(esrihq)
Out[6]:
20

Example of World Series POI

As another example, compare the single-line search string Disneyland, USA with Disneyland. The former returns matches in the USA only, while the latter returns matches from around the world.

In [26]:
disney = geocode("Disneyland, USA")
len(disney)
Out[26]:
3
In [38]:
for i in range(3):
    print(disney[i]['attributes']['LongLabel'])
    print(" - ",disney[i]['attributes']['Score'], 
          " - ", disney[i]['attributes']['Addr_type'], 
          " - ", disney[i]['attributes']['Type'])
Disneyland, 1313 S Disneyland Dr, Anaheim, CA, 92802, USA
 -  100  -  POI  -  Amusement Park
Disneyland, Anaheim, CA, USA
 -  100  -  Locality  -  Neighborhood
Disneyland, 1313 S Harbor Blvd, Anaheim, CA, 92802, USA
 -  100  -  POI  -  Business Facility
In [39]:
disney = geocode("Disneyland")
len(disney)
Out[39]:
20
In [40]:
for i in range(20):
    if disney[i]['attributes']['Type']== "Amusement Park":
        print(disney[i]['attributes']['LongLabel'])
        print(" - ",disney[i]['attributes']['Score'], 
              " - ", disney[i]['attributes']['Addr_type'])
Disneyland, 1313 S Disneyland Dr, Anaheim, CA, 92802, USA
 -  100  -  POI
Disney Land, 7th Street, Gandhipuram, Coimbatore, Tamil Nadu, 641012, IND
 -  100  -  POI
Disney Land, Kalawad Road, Mota Mava, Rajkot, Gujarat, 360005, IND
 -  100  -  POI
Disneyland, Boulevard du Parc, 77700, Coupvray, Seine-et-Marne, Ile-de-France, FRA
 -  100  -  POI
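
The filtering loop above can also be written as a small reusable helper. The sketch below is illustrative only: filter_by_type is a hypothetical name, and sample_candidates is a hand-copied subset of the "Disneyland, USA" output shown earlier (attributes trimmed to the fields used here), so it runs without calling the geocoding service.

```python
# Sample candidates copied from the "Disneyland, USA" output above.
# Only the attributes used in this sketch are kept.
sample_candidates = [
    {"attributes": {"LongLabel": "Disneyland, 1313 S Disneyland Dr, Anaheim, CA, 92802, USA",
                    "Score": 100, "Addr_type": "POI", "Type": "Amusement Park"}},
    {"attributes": {"LongLabel": "Disneyland, Anaheim, CA, USA",
                    "Score": 100, "Addr_type": "Locality", "Type": "Neighborhood"}},
    {"attributes": {"LongLabel": "Disneyland, 1313 S Harbor Blvd, Anaheim, CA, 92802, USA",
                    "Score": 100, "Addr_type": "POI", "Type": "Business Facility"}},
]


def filter_by_type(candidates, place_type):
    """Return the candidates whose attributes['Type'] equals place_type.

    Hypothetical helper; it works on any list shaped like a geocode() result.
    """
    return [c for c in candidates if c["attributes"].get("Type") == place_type]


parks = filter_by_type(sample_candidates, "Amusement Park")
for candidate in parks:
    attrs = candidate["attributes"]
    print(attrs["LongLabel"], "-", attrs["Score"], "-", attrs["Addr_type"])
```

Iterating over the list directly, rather than over range(20), also avoids an IndexError when the service returns fewer candidates than expected.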

Multi-field address

Alternatively, the address can be specified in a multi-field format using a dict containing the various address fields accepted by the corresponding geocoding service.

To support addresses in many different countries, which may use different addressing formats, the geocode() method uses standardized field names for submitting address components. To find out which fields your address needs, see the `addressField` introduction for more information.

The Geocoder's address field property specifies the various address fields accepted by it when geocoding addresses. The neighborhood, city, subregion, and region parameters represent typical administrative divisions within a country. They may have different contexts for different countries, and not all administrative divisions are used in all countries. For instance, with addresses in the United States, only the city (city) and region (state) parameters are used; for addresses in Mexico, the neighborhood parameter is used for districts (colonias) within a city, city for municipalities (municipios), and the region parameter for states (estados); Spain uses all four administrative divisions.

For example, if the address field of a geocoding service resource includes fields with the following names: Address, City, Region and Postal, then the address argument is of the form below.

In [3]:
multi_field_address = { 
                        "Address" : "380 N Y St",
                        "City" : "Redlands",
                        "Region" : "CA",
                        "Postal" : 92373
                      }
In [4]:
# geocode the multi_field_address
esrihq1 = geocode(multi_field_address)
len(esrihq1)
Out[4]:
2

The returned result set contains two dict objects, and we can see from below that each dict object contains 'address', 'location', 'score', 'attributes', and 'extent' keys. The x and y coordinates can be found in both location and attributes properties.

In [16]:
esrihq1[0].keys()
Out[16]:
dict_keys(['address', 'location', 'score', 'attributes', 'extent'])
In [13]:
esrihq1[0]['location'], esrihq1[0]['extent']
Out[13]:
({'x': -117.19568252432872, 'y': 34.05723700023128},
 {'xmin': -117.19587199429185,
  'ymin': 34.056237000231285,
  'xmax': -117.19387199429184,
  'ymax': 34.05823700023128})

We can also go one level deeper and check out the address type, X and Y coordinates from its attributes field for more details.

In [18]:
esrihq1[0]['attributes']['Addr_type'], esrihq1[0]['attributes']['X'], esrihq1[0]['attributes']['Y']
Out[18]:
('PointAddress', -117.19568252432872, 34.05723700023128)
In [19]:
esrihq1[1]['attributes']['Addr_type'], esrihq1[1]['attributes']['X'], esrihq1[1]['attributes']['Y']
Out[19]:
('StreetAddress', -117.18686179838916, 34.05653921319909)
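
As a small illustration of the point that the coordinates appear in both places, the sketch below pulls the address type and coordinates out of one candidate and checks that location and attributes agree. summarize_candidate is a hypothetical helper, and sample_candidate is copied by hand from the first result shown above, so no service call is needed.

```python
def summarize_candidate(candidate):
    """Return (Addr_type, x, y), reading the coordinates from 'location'."""
    location = candidate["location"]
    return candidate["attributes"]["Addr_type"], location["x"], location["y"]


# Values copied from esrihq1[0] in the outputs above (other keys omitted).
sample_candidate = {
    "location": {"x": -117.19568252432872, "y": 34.05723700023128},
    "attributes": {"Addr_type": "PointAddress",
                   "X": -117.19568252432872,
                   "Y": 34.05723700023128},
}

addr_type, x, y = summarize_candidate(sample_candidate)

# For this sample, 'location' and 'attributes' report the same coordinates.
same_point = (x == sample_candidate["attributes"]["X"]
              and y == sample_candidate["attributes"]["Y"])
print(addr_type, x, y, same_point)
```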

Example of an Indian address

Photo of Taj Mahal (Source: Lonely Planet)

In [14]:
india_mfield_address = { 
                        "Address" : "Taj Mahal",
                        "City": "Agra",
                        'District': 'Taj Ganj',
                        "Region" : 'Uttar Pradesh',
                        "Country": "IND",
                        "Postal" : 282001
                      }
In [15]:
# geocode the multi_field_address
taj_mahal = geocode(india_mfield_address)
len(taj_mahal)
Out[15]:
7

Example of a Mexican address

Photo of San Miguel de Allende (Source: National Geographic)

In [16]:
mex_mfield_address = { 
                        "Address" : "San Miguel de Allende",
                        "City": 'San Miguel de Allende',
                        "Region" : 'Guanajuato',
                        "Country": "MEX"
                      }
In [17]:
# geocode the multi_field_address
san_miguel = geocode(mex_mfield_address)
len(san_miguel)
Out[17]:
20

Example of a Spanish Address

Photo of Sagrada Familia (Source: https://www.culturalplaces.com/)

In [15]:
esp_mfield_address = { 
                        "Address" : "Sagrada Familia",
                        'Nbrhd': 'Sagrada Familia',
                        'District': 'Barcelona',
                        'City': 'Barcelona',
                        'Region': 'Catalunya',
                        "Country": "ESP"
                      }
In [16]:
# geocode the multi_field_address
sag_fam = geocode(esp_mfield_address)
len(sag_fam)
Out[16]:
20
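
Because the accepted field names depend on the geocoding service, it can help to compare the keys of a multi-field address against the service's list before geocoding. The sketch below is a minimal illustration, not part of the API: check_address_fields is a hypothetical helper, and accepted_fields reuses the four names from the earlier example (Address, City, Region, Postal) rather than the list a real service would report.

```python
def check_address_fields(address, accepted_fields):
    """Split the keys of a multi-field address into (known, unknown) names.

    The comparison is an exact string match against accepted_fields.
    """
    accepted = set(accepted_fields)
    known = [key for key in address if key in accepted]
    unknown = [key for key in address if key not in accepted]
    return known, unknown


# Assumption: the four field names used in the earlier example.
# A real geocoding service defines its own list.
accepted_fields = ["Address", "City", "Region", "Postal"]

us_address = {"Address": "380 N Y St", "City": "Redlands",
              "Region": "CA", "Postal": 92373}
india_address = {"Address": "Taj Mahal", "City": "Agra", "District": "Taj Ganj",
                 "Region": "Uttar Pradesh", "Country": "IND", "Postal": 282001}

print(check_address_fields(us_address, accepted_fields))
print(check_address_fields(india_address, accepted_fields))
```

Relative to this four-name list, the second dictionary contains extra keys (District and Country). Whether a given service accepts them depends on its own address fields.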

Search for street intersections

The following example illustrates how to search for a street intersection. An intersection is where two streets cross each other, and hence an intersection search consists of the intersecting street names plus the containing administrative division or postal code. For example, redlands blvd and new york st 92373 is a valid intersection search, as is redlands blvd & new york st redlands ca.

In [70]:
intersection = "redlands blvd and new york st 92373"
In [26]:
multi_field_intersection = { 
    "Address" : "redlands blvd & new york st",
    "City" : "Redlands",
    "Region" : "CA"
    }
In [29]:
map2 = gis.map("Esri, Redlands, CA", 15)
map2
Out[29]:
In [27]:
# geocode the intersection address and plot the location of the first geocode result on the map
# either of the two intersection address formats can be used; they give identical results:

# intersection_result = geocode(intersection)[0]
intersection_result = geocode(multi_field_intersection)[0]
In [28]:
popup = { 
            "title" : "redlands blvd and new york st", 
            "content" : intersection_result['address']
        }
map2.draw(intersection_result['location'], popup)

Understanding the geocoded result

The output fields

The geocode() method returns a list of dict objects. We can look at the first entry of the list (e.g. intersection_result) to see which keys each dict object contains:

In [80]:
intersection_result.keys()
Out[80]:
dict_keys(['address', 'location', 'score', 'attributes', 'extent'])
In [83]:
intersection_result['attributes'].keys()
Out[83]:
dict_keys(['Loc_name', 'Status', 'Score', 'Match_addr', 'LongLabel', 'ShortLabel', 'Addr_type', 'Type', 'PlaceName', 'Place_addr', 'Phone', 'URL', 'Rank', 'AddBldg', 'AddNum', 'AddNumFrom', 'AddNumTo', 'AddRange', 'Side', 'StPreDir', 'StPreType', 'StName', 'StType', 'StDir', 'StPreDir1', 'StPreType1', 'StName1', 'StType1', 'StDir1', 'StPreDir2', 'StPreType2', 'StName2', 'StType2', 'StDir2', 'BldgType', 'BldgName', 'LevelType', 'LevelName', 'UnitType', 'UnitName', 'SubAddr', 'StAddr', 'Block', 'Sector', 'Nbrhd', 'District', 'City', 'MetroArea', 'Subregion', 'Region', 'RegionAbbr', 'Territory', 'Zone', 'Postal', 'PostalExt', 'Country', 'LangCode', 'Distance', 'X', 'Y', 'DisplayX', 'DisplayY', 'Xmin', 'Xmax', 'Ymin', 'Ymax', 'ExInfo'])

See below for the descriptions for all of the fields that can be returned by geocode():

  • address: Complete matching address returned for findAddressCandidates and geocodeAddresses geocode requests.
  • location: The point coordinates of the output match location as specified by the x and y properties. The spatial reference of the x and y coordinates is defined by the spatialReference output field. Always returned by default for findAddressCandidates and geocodeAddresses geocode requests only.
  • score: A number from 1–100 indicating the degree to which the input tokens in a geocoding request match the address components in a candidate record. A score of 100 represents a perfect match, while lower scores represent decreasing match accuracy.
  • attributes: A dict object containing Loc_name, Status, Score, Match_addr, LongLabel, ShortLabel, Addr_type etc. key-value pairs.
  • extent: The display extent of a feature returned by the geocoding service.
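
Since score ranges from 1 to 100, a common pattern is to keep only the best candidate above some threshold. The sketch below is one way to do that on a list shaped like the geocode() result. best_candidate, the min_score default of 90, and the sample candidates (including their scores) are all made up for illustration.

```python
def best_candidate(candidates, min_score=90):
    """Return the highest-scoring candidate with score >= min_score, or None.

    When several candidates share the top score, the first one in the list wins.
    """
    eligible = [c for c in candidates if c["score"] >= min_score]
    return max(eligible, key=lambda c: c["score"], default=None)


# Made-up candidates with only the keys this sketch needs.
candidates = [
    {"address": "candidate A", "score": 97.5},
    {"address": "candidate B", "score": 100},
    {"address": "candidate C", "score": 82},
    {"address": "candidate D", "score": 100},
]

top = best_candidate(candidates)
print(top["address"] if top else "no candidate met the threshold")
```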

Read into a DataFrame

The previous example explained how to get results as a FeatureSet using the as_featureset = True parameter and how to map the FeatureSet on the Map Widget. Next, we will convert the FeatureSet to a DataFrame and see how the results look.

In [55]:
disney_fset = geocode("Disneyland", as_featureset = True)
disney_fset
Out[55]:
<FeatureSet> 20 features
In [56]:
disney_fset.sdf.head()
Out[56]:
Loc_name Status Score Match_addr LongLabel ShortLabel Addr_type Type PlaceName Place_addr ... Y DisplayX DisplayY Xmin Xmax Ymin Ymax ExInfo OBJECTID SHAPE
0 World M 100 Disneyland Disneyland, 1313 S Disneyland Dr, Anaheim, CA,... Disneyland POI Amusement Park Disneyland 1313 S Disneyland Dr, Anaheim, California, 92802 ... 33.81533 -117.91895 33.81277 -117.92395 -117.91395 33.80777 33.81777 1 {"x": -117.92369994070933, "y": 33.81532991704...
1 World M 100 Disneyland, Anaheim, California Disneyland, Anaheim, CA, USA Disneyland Locality Neighborhood Disneyland Anaheim, California ... 33.81530 -117.92370 33.81530 -117.93570 -117.91170 33.80330 33.82730 2 {"x": -117.92369999999994, "y": 33.81530000000...
2 World T 100 Disney Land Disney Land, 7th Street, Gandhipuram, Coimbato... Disney Land POI Amusement Park Disney Land 7th Street, Gandhipuram, Coimbatore, Tamil Nad... ... 11.01845 76.96622 11.01845 76.96122 76.97122 11.01345 11.02345 3 {"x": 76.96622000000008, "y": 11.0184500000000...
3 World T 100 Disney Land Disney Land, Guntur Road, Prakasam, Andhra Pra... Disney Land POI Convention Center Disney Land Guntur Road, Prakasam, Andhra Pradesh, 523001 ... 15.51213 80.04665 15.51213 80.04165 80.05165 15.50713 15.51713 4 {"x": 80.04665000000006, "y": 15.5121300000000...
4 World T 100 Disney Land Disney Land, Kalawad Road, Mota Mava, Rajkot, ... Disney Land POI Amusement Park Disney Land Kalawad Road, Mota Mava, Rajkot, Gujarat, 360005 ... 22.27105 70.74842 22.27105 70.74342 70.75342 22.26605 22.27605 5 {"x": 70.74842000000007, "y": 22.2710500000000...

5 rows × 59 columns

In [90]:
disney_fset.sdf.Addr_type
Out[90]:
0          POI
1     Locality
2          POI
3          POI
4          POI
5          POI
6          POI
7          POI
8          POI
9          POI
10         POI
11         POI
12         POI
13         POI
14         POI
15         POI
16         POI
17         POI
18         POI
19         POI
Name: Addr_type, dtype: object
In [57]:
disney_fset.sdf.Score
Out[57]:
0     100
1     100
2     100
3     100
4     100
5     100
6     100
7     100
8     100
9     100
10    100
11    100
12    100
13    100
14    100
15    100
16    100
17    100
18    100
19    100
Name: Score, dtype: int64

Plot the accuracy of different results as a bar plot

Take the entries shown in disney_fset.sdf as an example. A score of 100 represents a perfect match, while lower scores represent decreasing match accuracy. Here every candidate has a score of 100 (see the Score column above), so all bars in the plot have the same height.

In [58]:
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_axes([0,0,1,1])
addrs = disney_fset.sdf.Match_addr
scores = disney_fset.sdf.Score
ax.bar(addrs,scores)
plt.show()
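
One caveat with the plot above: several candidates share the same Match_addr (for example, "Disney Land" appears more than once), and matplotlib places identical string categories at the same x position, so those bars are drawn on top of each other. A minimal workaround, sketched below with a hypothetical unique_labels helper and labels copied from the first five rows above, is to prefix each label with its row position.

```python
def unique_labels(labels):
    """Prefix each label with its position so repeated labels stay distinct."""
    return [f"{index}: {label}" for index, label in enumerate(labels)]


# Match_addr values copied from the first five rows shown above.
match_addrs = [
    "Disneyland",
    "Disneyland, Anaheim, California",
    "Disney Land",
    "Disney Land",
    "Disney Land",
]

labels = unique_labels(match_addrs)
print(labels)

# Possible use with the plot above (not executed here):
# ax.bar(unique_labels(addrs), scores)
```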

Advanced Options

The geocode() method provides the following advanced options for configuring and fine-tuning the search results:

  • Limiting searches to a set of bounding boxes (search_extent param)
  • Defining an origin point location for the search, to be used with a distance (location & distance params)
  • Customizing the number of geocode results (max_locations param)
  • Limiting searches to an exact address range (match_out_of_range param)
  • Limiting searches to a particular country (source_country param)
  • Getting results in a desired coordinate system (out_sr param)
  • Customizing output fields in the geocoded results (out_fields param)
  • Choosing street coordinates vs. rooftop coordinates (location_type param)
  • Geocoding in languages other than English (lang_code param)
  • Specifying whether the results of the operation will be persisted (for_storage param)
  • magic_key param. More explanations can be found in Part 7.
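
As a rough sketch of how a few of these options combine, the wrapper below forwards max_locations and source_country (both parameter names appear in the geocode() signature shown earlier) to whatever geocoding function it is given. geocode_with_options and fake_geocode are hypothetical names; fake_geocode is a stand-in that only echoes its inputs so the sketch runs offline, and the country value "USA" is an assumed example.

```python
def geocode_with_options(geocode_fn, address, country=None, max_results=5):
    """Call geocode_fn with a result limit and an optional country filter."""
    options = {"max_locations": max_results}
    if country is not None:
        options["source_country"] = country
    return geocode_fn(address, **options)


def fake_geocode(address, **kwargs):
    """Stand-in for arcgis.geocoding.geocode; it only echoes its inputs."""
    return [{"address": address, "options": kwargs}]


result = geocode_with_options(fake_geocode, "Disneyland", country="USA", max_results=3)
print(result)
```

In a notebook with an active GIS connection, passing the real geocode function imported from arcgis.geocoding in place of fake_geocode would apply the same options to a live request.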

search_extent parameter

search_extent: An optional set of bounding box coordinates that limits the search area to a specific region. This is especially useful for applications in which a user will search for places and addresses only within the current map extent.

You can specify the spatial reference of the search_extent coordinates, which is necessary if the map spatial reference is different than that of the geocoding service; otherwise, the spatial reference of the coordinates is assumed to be the same as that of the geocoding service.

The input can either be a comma-separated list of coordinates defining the bounding box or a JSON envelope object. The spatial reference of the bounding box coordinates can be included if an envelope object is used.
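
The two input forms described above can be sketched as follows. extent_to_string and extent_contains are hypothetical helpers, the xmin, ymin, xmax, ymax ordering of the comma-separated form is an assumption to check against your geocoding service's documentation, and the coordinates are the Esri headquarters extent and location shown earlier in this section.

```python
def extent_to_string(extent):
    """Format an envelope dict as 'xmin,ymin,xmax,ymax' (assumed ordering)."""
    return ",".join(str(extent[key]) for key in ("xmin", "ymin", "xmax", "ymax"))


def extent_contains(extent, point):
    """Return True if the point's x and y fall inside the envelope (inclusive)."""
    return (extent["xmin"] <= point["x"] <= extent["xmax"]
            and extent["ymin"] <= point["y"] <= extent["ymax"])


# Extent and location copied from the esrihq1[0] output earlier in this section.
esri_extent = {"xmin": -117.19587199429185, "ymin": 34.056237000231285,
               "xmax": -117.19387199429184, "ymax": 34.05823700023128}
esri_location = {"x": -117.19568252432872, "y": 34.05723700023128}

extent_string = extent_to_string(esri_extent)
print(extent_string)
print(extent_contains(esri_extent, esri_location))
```

extent_contains is only a local sanity check that a known point lies inside the box before the box is used as a filter. The service itself decides which candidates fall within search_extent.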

as_featureset parameter

as_featureset: An optional boolean. If True, the result set is returned as a FeatureSet object. Otherwise, it is returned as a list of dict objects.

Example using search_extent and as_featureset

The example below uses the bounding box of Esri headquarters as the search_extent, which filters the geocoded results down to 1 (from the original count of 20).

In [9]:
# geocode the single line address
esrihq_fset = geocode(single_line_address, 
                      search_extent = {'xmin': -117.19587199429185,
                                       'ymin': 34.056237000231285,
                                       'xmax': -117.19387199429184,
                                       'ymax': 34.05823700023128},
                      as_featureset = True)
esrihq_fset
Out[9]:
<FeatureSet> 1 features
In [10]:
esrihq_fset.features[0].attributes["Match_addr"]
Out[10]:
'380 New York St, Redlands, California, 92373'
In [13]:
map0 = gis.map("Redlands, CA")
map0