Find closest facilities

Find Closest Facilities finds the given number of facilities that are reachable from each incident within the specified travel time or travel distance, and returns the best routes between the incidents and the chosen facilities. When finding closest facilities, you can specify whether the direction of travel is toward or away from the facilities. For example, the tool can be used to find the closest fire stations to fire incidents, the closest healthcare providers to a person's address, or the closest hospitals for emergency response.

Usage notes

  • Find Closest Facilities requires two point DataFrames representing the incidents and the facilities.

  • A network dataset is required to run any network analysis tool. It must be locally accessible to all nodes in your Spark cluster. Use setNetwork() to load the network dataset from a mobile map package or a mobile geodatabase.

  • A travel mode refers to the mode of transportation, such as driving or walking. Use setTravelMode() to choose a mode defined in the network datasource, or a custom travel mode defined in a JSON string. By default, the tool uses the default travel mode in the network datasource.

  • Use the setTravelDirection() setter to specify the direction of travel to or from the facilities.

    • ToFacilities—The closest facilities are searched along the network from the incidents to the facilities within the specified impedance cutoff. This is the default.

    • FromFacilities—The closest facilities are searched along the network from the facilities to incidents within the specified impedance cutoff.

    The travel direction you should use depends on your use case. For example, in emergency response, FromFacilities is commonly used to calculate the travel time or travel distance from the facilities (e.g., hospitals, fire stations, police stations) to the locations of the emergency. In retail store management, ToFacilities is commonly used to calculate the travel time or travel distance from customers' addresses to the retail stores to analyze the proximity of the facilities.

  • You must set the impedance cutoff using setCutoff(). It accepts a single cutoff value, which sets the maximum travel distance or travel time when searching for facilities for each incident. The impedance cutoff must use the same type of unit (time or distance) as the travel mode. For example, if the travel mode is driving time, the impedance cutoff should be time based.

    There are two types of cutoff value supported in the Find Closest Facilities tool.

    • Distance cutoff—Specify the maximum traveling distance between incidents and facilities. For example, when analyzing walking distance from schools (incident DataFrame) to subway stations (facility DataFrame), a cutoff value of 1 mile (e.g., setCutoff(1, "mile")) means that the tool will search for the closest subway stations within 1 mile walking from each school.

    • Time cutoff—Specify the maximum traveling time between incidents and facilities. For example, when analyzing driving time from fire stations (facility DataFrame) to fire incidents (incident DataFrame), a cutoff value of 15 minutes (e.g., setCutoff(15, "minutes")) means the tool will search for the closest fire stations within a 15-minute drive of the fire incidents.

    The cutoff value must be positive. If the unit is missing, the tool will use the distance or time unit defined in the travel mode.
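The two cutoff examples above can be sketched as setter chains. This is a minimal sketch under stated assumptions: `RecordingTool` is a hypothetical stand-in that only records calls, so the chains can be run without a Spark cluster or a network dataset. In a real session you would pass `FindClosestFacilities()` from `geoanalytics.tools` instead. The travel mode names "Walking Distance" and "Driving Time" are also assumptions and must match modes defined in your network dataset.

```python
class RecordingTool:
    """Hypothetical stand-in for FindClosestFacilities that records setter calls."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Reached only for attributes that do not exist, i.e. the setter names.
        def setter(*args):
            self.calls.append((name, args))
            return self  # return the tool so setters can be chained
        return setter


def schools_to_subway(tool):
    # Distance cutoff: travel from schools (incidents) to subway stations (facilities).
    return (tool.setTravelMode("Walking Distance")  # assumed travel mode name
                .setTravelDirection("ToFacilities")
                .setCutoff(1, "mile"))


def fire_stations_to_incidents(tool):
    # Time cutoff: travel from fire stations (facilities) to fire incidents.
    return (tool.setTravelMode("Driving Time")  # assumed travel mode name
                .setTravelDirection("FromFacilities")
                .setCutoff(15, "minutes"))


walk = schools_to_subway(RecordingTool())
drive = fire_stations_to_incidents(RecordingTool())
print(walk.calls)
print(drive.calls)
```

Note that each chain pairs a distance-based travel mode with a distance cutoff, or a time-based travel mode with a time cutoff.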

  • Use the setNumFacilities() setter to specify the maximum number of closest facilities to find per incident. If there are multiple facilities with an equal travel cost to an incident, Find Closest Facilities will break ties by randomly selecting one or more records from the equidistant facilities to ensure the specified number of closest facilities. For example, if you are interested in finding two closest facilities when there are three facilities that are equidistant from the incident, two of the three facility records will be randomly selected and returned in the output.

  • The impedance cutoff can result in fewer facilities being returned than the specified number of facilities. For example, if you request the three closest facilities within a specified travel distance but only two facilities are within the impedance cutoff, only those two facilities will be returned in the output. When no facilities are found within the specified cutoff, the tool returns null values in the output for that incident.
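The interaction between the cutoff, the number of facilities, and random tie-breaking can be illustrated in plain Python. This is only a sketch of the documented behavior for a single incident, not how the tool is implemented: the function name `select_closest`, the input dictionary of precomputed travel costs, and the choice to treat the cutoff as inclusive are all assumptions made for illustration.

```python
import random


def select_closest(costs, cutoff, num_facilities, rng=None):
    """Return up to num_facilities (facility_id, cost) pairs with cost <= cutoff.

    costs maps a facility id to its travel cost for one incident.
    Treating the cutoff as inclusive is an assumption of this sketch.
    """
    if rng is None:
        rng = random.Random()
    reachable = [(fid, cost) for fid, cost in costs.items() if cost <= cutoff]
    # Shuffle, then stable-sort by cost: facilities with equal cost
    # end up in random order, so ties are broken randomly.
    rng.shuffle(reachable)
    reachable.sort(key=lambda pair: pair[1])
    return reachable[:num_facilities]


# Three equidistant facilities, two requested: two are picked at random.
print(select_closest({"A": 3.0, "B": 3.0, "C": 3.0}, cutoff=5, num_facilities=2))

# Three requested, but only two are within the cutoff.
print(select_closest({"A": 2.0, "B": 4.0, "C": 9.0}, cutoff=5, num_facilities=3))
```

In this sketch an incident with no reachable facilities yields an empty list, whereas the tool itself reports that case with null values in the output.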

  • Find Closest Facilities can return the best routes or straight lines between the incidents and the chosen facilities. You can also choose not to return line geometry for better performance. Use setRouteGeometry() to set the route geometry for the output DataFrame:

    • AlongNetwork—The true shape of the resulting route that is based on the streets along the network.

    • StraightLines—A straight line between the incident and the identified facility.

    • NoLines—No line geometry will be returned.

    When the output route geometry is set to AlongNetwork or StraightLines, the primary geometry field of the output DataFrame is the route geometry field. When it is set to NoLines, there is no primary geometry field in the output DataFrame.

  • Use the accumulateAttributes() setter to specify cost attributes to accumulate along the network. The cost attributes are defined in the network dataset. One or more Total_[Cost] columns will be returned, where Cost is the name of the cost attribute. For example, if the available cost attributes in the network dataset are Kilometers, Minutes, and WalkTime, you can accumulate all attributes by calling accumulateAttributes("Kilometers", "Minutes", "WalkTime"). In this case, three output fields (Total_Kilometers, Total_Minutes, and Total_WalkTime) will be returned, representing the cost along the network between each incident and its identified facility.
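The Total_[Cost] naming rule can be expressed as a small helper, which may be handy when selecting the accumulated columns from the output. The helper name `total_column_names` is hypothetical and not part of GeoAnalytics Engine; it only applies the naming pattern described above.

```python
def total_column_names(*attributes):
    """Map cost attribute names to the documented Total_[Cost] column names."""
    return [f"Total_{name}" for name in attributes]


attributes = ("Kilometers", "Minutes", "WalkTime")
columns = total_column_names(*attributes)
print(columns)  # ['Total_Kilometers', 'Total_Minutes', 'Total_WalkTime']
```

The same tuple could then be passed to the setter as accumulateAttributes(*attributes), so the requested attributes and the expected column names stay in sync.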

  • When the travel mode is configured with traffic data, you can specify the time and day of the week at which the routes will begin or end using setTime(day_of_week, time, time_zone = "UTC", usage = "Departure"). The interpretation of the time depends on whether the time usage is set to the departure time or the arrival time of the route.

    • Departure—The time is interpreted as the departure time from the facility or incident. This is the default.
    • Arrival—The time is interpreted as the arrival time at the facility or incident. This option is useful if you want to know when to depart from a location so that you arrive at the destination at the time specified in setTime().

    Traffic information will not be used in network analysis when setTime() is not used, or when the setter is used but no traffic data is configured with the travel mode.

  • GeoAnalytics Engine supports setting a specific time on a generic weekday using setTime(day_of_week, time, time_zone = "UTC", usage = "Departure").

    day_of_week is a string representing the day of the week. Acceptable values are:

    • Sunday
    • Monday
    • Tuesday
    • Wednesday
    • Thursday
    • Friday
    • Saturday

    time is the time of day when traffic will be modeled. It can be provided in two formats:

    • A string in the format "HH:mm:ss", for example, "14:30:00".
    • A datetime.time object.

    time_zone is an optional string representing the time zone. The default option is Coordinated Universal Time (UTC). You can specify a time zone ID in the following formats to use local time.

    • UTC offset—a fixed offset from UTC. For example, "UTC-05:00" represents the time zone that is five hours behind UTC. GeoAnalytics Engine does not account for Daylight Saving Time (DST). Only Standard Time (SDT) is used for UTC offset.
    • Time zone identifier—a standardized string that uniquely identifies a time zone region (e.g., "America/New_York"). For a comprehensive list of time zone identifiers, refer to this list of tz database time zones.

    If you specify a local time zone, whether the time zone is of the facilities or incidents depends on both the travel direction and time usage.

    • If travel direction is set to from facilities and usage set to Departure, this is the time zone of the facilities.
    • If travel direction is set to from facilities and usage set to Arrival, this is the time zone of the incidents.
    • If travel direction is set to toward facilities and usage set to Departure, this is the time zone of the incidents.
    • If travel direction is set to toward facilities and usage set to Arrival, this is the time zone of the facilities.

    For example, to set the arrival time for Friday at 8:15 a.m. in the "America/New_York" time zone, you can use setTime("Friday", "08:15:00", "America/New_York", "arrival"), setTime("Friday", datetime.time(8,15), "America/New_York", "arrival"), or setTime("Friday", "13:15:00", "UTC", "arrival").
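The time zone rules and the UTC equivalence in the example above can be checked with a short standard-library sketch. Both helper names (`time_zone_refers_to` and `offset_to_utc`) are hypothetical and not part of GeoAnalytics Engine. The conversion assumes a fixed offset in the "UTC-05:00" style and ignores Daylight Saving Time, in line with the UTC offset behavior described above.

```python
import datetime
import re


def time_zone_refers_to(travel_direction, usage):
    """Return whose local time zone applies: 'facilities' or 'incidents'."""
    from_facilities = travel_direction == "FromFacilities"
    route_start = "facilities" if from_facilities else "incidents"
    route_end = "incidents" if from_facilities else "facilities"
    # A departure time is measured where the route starts,
    # an arrival time where the route ends.
    return route_start if usage == "Departure" else route_end


def offset_to_utc(time_of_day, utc_offset):
    """Convert a time of day at a fixed offset such as 'UTC-05:00' to UTC."""
    match = re.fullmatch(r"UTC([+-])(\d{2}):(\d{2})", utc_offset)
    if match is None:
        raise ValueError(f"expected 'UTC+HH:MM' or 'UTC-HH:MM', got {utc_offset!r}")
    sign = 1 if match.group(1) == "+" else -1
    offset = sign * datetime.timedelta(hours=int(match.group(2)),
                                       minutes=int(match.group(3)))
    # Anchor on an arbitrary date so the arithmetic can cross midnight.
    # Only the time is returned; the day of the week may shift in that case.
    local = datetime.datetime.combine(datetime.date(2000, 1, 1), time_of_day)
    return (local - offset).time()


print(time_zone_refers_to("FromFacilities", "Arrival"))      # incidents
print(offset_to_utc(datetime.time(8, 15), "UTC-05:00"))      # 13:15:00
```

The second call reproduces the example: 8:15 a.m. at five hours behind UTC corresponds to 13:15:00 UTC.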

  • The analysis will always be completed in the coordinate system of the network dataset. If the incident DataFrame or facility DataFrame is in a different coordinate system than the network dataset, it will automatically be transformed to the coordinate system of the network dataset.

    Learn more about coordinate systems and transformations.

Limitations

  • Network analysis requires a network dataset from a mobile map package or a mobile geodatabase. Loading network data from a file geodatabase is not supported. Using a network service, such as the ArcGIS Online network analysis service, is not supported.

  • GeoAnalytics Engine does not support adding barriers in network analysis.

Results

The following fields are included in the output DataFrame:

  • All fields from the incident DataFrame
  • All fields from the facility DataFrame

In addition, the following fields representing the travel cost are included for each output record:

Field | Description
Rank | The rank of the closest facilities. The rank is given according to ascending-order travel distance or time.
TravelTime | The travel time in minutes between the incident and the identified facility.
TravelDistance | The travel distance in meters between the incident and the identified facility.

Both travel time and travel distance are calculated along the network. If you accumulate cost attributes, your output will have one or more fields named Total_[Cost] representing the accumulated travel cost along the network between the incident and identified facility.

If you set the output route geometry to AlongNetwork or StraightLines, your output will have a linestring field named route_geometry representing the true shape or straight line between the incident and the identified facility.

Performance notes

Improve the performance of the Find Closest Facilities tool by doing one or more of the following:

  • Only analyze the records in your area of interest. You can pick the records of interest by using one of the following SQL functions:

    • ST_Intersection—Clip to an area of interest represented by a polygon. This will modify your input records.
    • ST_BboxIntersects—Select records that intersect an envelope.
    • ST_EnvIntersects—Select records having an envelope that intersects the envelope of another geometry.
    • ST_Intersects—Select records that intersect another dataset or area of interest represented by a polygon.
  • Set the route geometry to NoLines instead of AlongNetwork or StraightLines if you are only interested in determining the total travel time or travel distance between the incidents and facilities.
  • Use smaller values for setCutoff() and setNumFacilities().

Similar capabilities

Syntax

For more details, go to the GeoAnalytics Engine API reference for find closest facilities.

Setter | Description | Required
run(incidents_df, facilities_df) | Runs the Find Closest Facilities tool using the provided DataFrames. | Yes
setNetwork(path) | Sets the network data source from a mobile map package or a mobile geodatabase. | Yes
setTravelMode(travel_mode) | Sets the travel mode. By default, the tool uses the default travel mode in the network datasource. | No
setTravelDirection(travel_direction) | Sets the direction of travel to or from the facilities. Choose from 'ToFacilities' (default) or 'FromFacilities'. | No
setCutoff(cutoff, unit=None) | Sets the maximum travel distance or travel time searching for facilities for each incident. By default, it is in the unit of the impedance attribute used by the travel mode. | Yes
setNumFacilities(count) | Sets the number of facilities to find. The default is 1. | No
setRouteGeometry(route_geometry) | Sets the route geometry representing the route between the incident and the identified facility. Choose from 'AlongNetwork' (default), 'StraightLines', or 'NoLines'. | No
setTime(day_of_week, time, time_zone = "UTC", usage = "Departure") | Sets the time at which the routes begin or end. | No
accumulateAttributes(*attributes) | Accumulates cost attributes along the network between the incident and the identified facility. No accumulated cost is returned by default. | No

Examples

Run Find Closest Facilities

Python
# Log in
import geoanalytics
geoanalytics.auth(username="myusername", password="mypassword")

# Imports
from geoanalytics.tools import FindClosestFacilities
from pyspark.sql import functions as F
import matplotlib.pyplot as plt

# Create a facility DataFrame
facilities_url = r"https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/SDFireStations/FeatureServer/0"
facilities_df = spark.read.format("feature-service").load(facilities_url) \
                     .select("FACILITYID", "FULLADDR", "PHONE", "ACTIVE", "shape")

# Create an incident DataFrame
incidents_url = r"https://services1.arcgis.com/Ua5sjt3LWTPigjyD/arcgis/rest/services/Public_School_Location_201819/FeatureServer/0"
incidents_df = spark.read.format("feature-service").load(incidents_url) \
                    .filter(F.col("NMCNTY") == 'San Diego County') \
                    .select("NAME", "STREET", "CITY", "STATE", "ZIP", "shape")

# Access the Network Dataset
# This needs to be accessible to the machine that is running the Find Closest Facilities tool.
# If running on a cluster, it needs to be accessible to all nodes in the cluster.
california_network = r"/data/California.mmpk"

# Run the Find Closest Facilities tool
result = FindClosestFacilities() \
        .setNetwork(california_network) \
        .setTravelMode("trucking time") \
        .setTravelDirection("FromFacilities") \
        .setCutoff(5, "minutes") \
        .setNumFacilities(2) \
        .setRouteGeometry("AlongNetwork") \
        .run(incidents_df, facilities_df) \
        .where(F.col("TravelTime").isNotNull())
result.sort("NAME","Rank").show(5)
Result
+--------------------+---------------+---------+-----+-----+--------------------+----+------------------+------------------+----------+--------------------+------------+------+--------------------+--------------------+
|                NAME|         STREET|     CITY|STATE|  ZIP|               shape|Rank|        TravelTime|    TravelDistance|FACILITYID|            FULLADDR|       PHONE|ACTIVE|              shape1|      route_geometry|
+--------------------+---------------+---------+-----+-----+--------------------+----+------------------+------------------+----------+--------------------+------------+------+--------------------+--------------------+
|                ALBA|4041 Oregon St.|San Diego|   CA|92104|{"x":-117.1348499...|   1|2.7688534783676917|1002.3922083982721|        14|    4011 32nd Street|619-533-4300|   Yes|{"x":-117.1249271...|{"paths":[[[-117....|
|    Adams Elementary|  4672 35th St.|San Diego|   CA|92116|{"x":-117.1180409...|   1|1.7452089199273755| 695.6191880289591|        18|  4676 Felton Street|619-533-4300|   Yes|{"x":-117.1220022...|{"paths":[[[-117....|
|    Adams Elementary|  4672 35th St.|San Diego|   CA|92116|{"x":-117.1180409...|   2| 4.719640574786083| 1959.747748980926|        14|    4011 32nd Street|619-533-4300|   Yes|{"x":-117.1249271...|{"paths":[[[-117....|
|Albert Einstein A...|   3035 Ash St.|San Diego|   CA|92102|{"x":-117.1384500...|   1| 2.156189057040781| 737.2832135729769|        11|     945 25th Street|619-533-4300|   Yes|{"x":-117.1400096...|{"paths":[[[-117....|
|Albert Einstein A...|   3035 Ash St.|San Diego|   CA|92102|{"x":-117.1384500...|   2| 4.247430230528537| 1439.274596232985|         7|944 Cesar E. Chav...|619-533-4300|   Yes|{"x":-117.1450304...|{"paths":[[[-117....|
+--------------------+---------------+---------+-----+-----+--------------------+----+------------------+------------------+----------+--------------------+------------+------+--------------------+--------------------+
only showing top 5 rows

Plot results

Python
# Plot the closest fire stations near public schools in San Diego
colors = ['#e34a33', '#fdcc8a']
cmap = plt.cm.colors.ListedColormap(colors)

# Plot the true routes
result_plot = result.st.plot(cmap_values="Rank",
                             cmap=cmap, is_categorical=True,
                             basemap="light",
                             figsize=(15,15),
                             legend=True,
                             legend_kwds = {"title": "Rank of closest fire stations"})

# Plot the public school in green
result.st.plot(geometry="shape", facecolor = "#1a9641", marker_size=30, ax=result_plot)
# Plot the closest fire stations near public schools
result.st.plot(geometry="shape1",
               cmap_values="Rank",
               cmap=cmap, is_categorical=True,
               marker_size=30,
               ax=result_plot)
result_plot.set_title("Closest fire stations and routes to public schools in San Diego")
result_plot.set_xlabel("X (Degrees)")
result_plot.set_ylabel("Y (Degrees)")
Plotting example for a Find Closest Facilities result.

Version table

Release | Notes
1.3.0 | Python tool introduced
1.4.0 | Added support for setting start time.
1.5.0 | Added support for loading the network dataset using SparkContext.addFile.
