Find Dwell Locations

Find Dwell Locations determines dwell locations from time-sequential points in a track. Dwell locations are defined as sequential observations with little or no movement over a certain period of time.

Depending on the field of application, this may be referred to as stay points or idle detection. Tracks are identified by one or more track fields. The result DataFrame contains the dwell locations as points, a convex hull of the dwell locations, or a mean center point of the dwell. The result also contains the count of points within a dwell location, the start and end time of the dwell, the duration of the dwell, and any additional statistics that have been calculated. Each track can have 0, 1, or more dwell locations.

Figure: Find Dwell Locations workflow.

Usage notes

Term | Description
Dwell location | Locations representing a track that has been stationary relative to the specified time and distance values. The output represents dwell locations as points, convex hulls, or mean centers.
Observation | A point in a track.
Track | A sequence of point observations that are time enabled with time type instant. Rows are determined to be in the sequence by a track identifier field and are ordered by time. For example, a city can have a fleet of snow plow trucks that record their location every 10 minutes. The vehicle ID can represent the distinct tracks.
Geodesic | A line drawn on a sphere. A geodesic line drawn on the globe represents the curvature of the earth's geoid.
Planar | A straight-line distance as measured on a flat surface (that is, a Cartesian plane). This is also referred to as Euclidean distance.
Instant | A single moment in time represented by a start time and no end time.
Interval | A duration of time represented by a start time and an end time.
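
To make the geodesic and planar terms concrete, the following is a minimal Python sketch that compares the two kinds of distance. It is an illustration only, not the tool's implementation: the function names are hypothetical, and the geodesic calculation uses the haversine formula on a sphere with an assumed mean Earth radius, which is a simplification of a true geodesic calculation.

```python
import math

# Assumption: mean Earth radius in meters, used for a spherical approximation.
EARTH_RADIUS_M = 6_371_008.8


def planar_distance(x1, y1, x2, y2):
    """Euclidean distance between two points on a flat (Cartesian) plane."""
    return math.hypot(x2 - x1, y2 - y1)


def geodesic_distance(lon1, lat1, lon2, lat2):
    """Great-circle (haversine) distance in meters between lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Two points on the equator, one degree apart across the international date line.
print(round(geodesic_distance(179.5, 0, -179.5, 0)))  # about 111195 meters
print(planar_distance(179.5, 0, -179.5, 0))           # 359.0 (degrees, misleading)
```

The second result shows why a planar calculation on unwrapped longitude values misjudges tracks that cross the international date line.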
  • The input DataFrame must have time-enabled points that represent an instant in time.

  • Dwell locations are defined as sequential observations with little or no movement over a certain period of time.

  • Results are points representing instants in time, or polygons representing an interval in time. The start and end of the interval are determined by the time of the first and last points in a dwell.

  • Dwell locations can only be detected in tracks with more than one point.

  • Tracks are represented by the unique combination of one or more track fields. Specify one or more fields to identify tracks using setTrackFields().

  • By default, dwell locations are created using a geodesic method for calculating distance. It is recommended that you use geodesic distance in the following circumstances:

    • Tracks cross the international date line—When using the geodesic method, input DataFrames that cross the international date line will have tracks that correctly cross the international date line. This is the default. The input DataFrame's geometry must have a projected coordinate system that supports wrapping around the international date line, for example, a global projection such as World Cylindrical Equal Area. You can transform your data to a projected coordinate system by using ST_Transform.

      Learn more about coordinate systems and transformations

    • The dataset is not in a local projection—If the input DataFrame is in a local projection, use the planar distance method. For example, use the planar method to examine dwell locations in a single state. The input DataFrame must be set to a spatial reference local to the dataset.

  • Output dwell locations can be represented in four ways. The following table shows an example of each:

Output type | Description | Example
All points | Every point is returned even if it is not part of a dwell. The resulting points have time type instant. Only a count statistic is calculated for this output type. The count represents the number of rows that belong to a single dwell. Non-dwell points will have a count of 0. | [Image: output all]
Dwell points | Only points that are part of a dwell are returned. The resulting points have time type instant. Only a count statistic is calculated for this output type. The count represents the number of points that belong to a single dwell. | [Image: output dwells]
Mean centers | Each dwell has a single point returned representing the mean center of the dwell in distance and time. The resulting points have time type interval. The count of points in the dwell is always calculated. You can optionally calculate statistics on this type of dwell result. By default, no statistics are calculated. | [Image: output meancenters]
Convex hulls | Each dwell is represented by a convex hull polygon of the dwell result. The resulting rows have time type interval. The count of rows in the dwell is always calculated. You can optionally calculate statistics on this type of dwell result. By default, no statistics are calculated. | [Image: output convex]
  • You can split tracks using the following methods:

    • setTimeSplit() - Splits tracks based on a time between observations. Applying a time split breaks up any track when input data is farther apart than the specified time. For example, if you have five rows with the same track identifier and the times of [01:00, 02:00, 03:30, 06:00, 06:30] and set a time split of 2 hours, any rows that are measured more than 2 hours apart will be split. In this example, the result would be a track with [01:00, 02:00, 03:30] and [06:00, 06:30], because the difference between 03:30 and 6:00 is greater than 2 hours.

    • setTimeBoundarySplit() - Splits tracks based on defined time intervals. Applying a time boundary split segments tracks at a defined interval. For example, if you set the time boundary to 1 day, starting at 9:00 a.m. on January 1, 1990, each track will be truncated at 9:00 a.m. every day. This split accelerates computing time, as it creates smaller tracks for analysis. If splitting by a recurring time boundary makes sense for your analysis, it is recommended for big data processing.

    • setDistanceSplit() - Splits tracks based on a distance between observations. Applying a distance split breaks up any track when input data is farther apart than the specified distance. For example, if you set a distance split of 5 kilometers, sequential rows greater than 5 kilometers apart will be part of a different track.

    • setSplitExpression() - Splits tracks based on an Arcade expression. Applying a split expression splits tracks based on values, geometry, or time values. For example, you can split tracks when a field value is more than double the previous value in a track. To do this with an example field named WindSpeed, you can use the following expression: var speed = TrackFieldWindow("WindSpeed", -1, 1); 2 * speed[0] < speed[1]. Tracks will split when two times the previous value (speed[0]) is less than the current value (speed[1]).

  • You can apply none, one, two, three, or four split options at the same time. All of the examples below use a gap split. The results, assuming you apply a time split of six hours, a time boundary of one day, and distance split of 16 kilometers, are as follows:

Five examples of input points (yellow) with varying time and distance splits are shown.

Split option | Description
Six input points with a time and location | Input rows with the same identifier. The distance between the points is marked on top of the dotted line, and the time of each point measurement is marked below the points. There are four splits on the timeline. The red splits represent the time boundary split of one day starting at 12:00 a.m. The blue split represents the distance split when the distance between two points is greater than 16 kilometers. The purple split represents the time split when the temporal distance between two sequential points is more than six hours.
1 | Example with no time split and no distance split.
2 | Example with a time split of six hours. Any rows more than six hours apart are split into separate tracks.
3 | Example with a time boundary of one day, starting at midnight. At each one-day interval starting from the specified time (here 12:00 a.m.), a track is created.
4 | Example with a distance split of 16 kilometers. Any rows more than 16 kilometers apart (the rows at 05:00 a.m. and 06:00 a.m.) are split into separate tracks.
5 | Example with a time split of six hours and a time boundary of one day starting at 12:00 a.m. Any rows more than six hours apart or that intersect with the time duration split at 12:00 a.m. are split into separate tracks.
6 | Example with a time split of six hours and a distance split of 16 kilometers. Any rows more than six hours apart (the rows at 06:00 a.m. and 7:00 p.m.) or farther than 16 kilometers apart are split into separate tracks.
7 | Example with a distance split of 16 kilometers and a time boundary of one day starting at 12:00 a.m. Any rows greater than 16 kilometers apart or that intersect with the time duration split at 12:00 a.m. are split into separate tracks.
8 | Example with a distance split of 16 kilometers, a time split of six hours, and a time boundary of one day starting at 12:00 a.m. Any rows farther than 16 kilometers apart, or more than six hours apart, or that intersect with the time duration split at 12:00 a.m. are split into separate tracks.
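
The time split and split expression rules described for setTimeSplit() and setSplitExpression() can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the tool's implementation: split_track, the helper t, and the row dictionaries are hypothetical, the calendar date is arbitrary, and the WindSpeed values are made up for the example.

```python
from datetime import datetime, timedelta


def split_track(rows, should_split):
    """Split a time-ordered list of rows wherever should_split(previous, current) is true."""
    segments = []
    for row in rows:
        if segments and not should_split(segments[-1][-1], row):
            segments[-1].append(row)   # continue the current track segment
        else:
            segments.append([row])     # start a new track segment
    return segments


def t(hhmm):
    """Build a timestamp on an arbitrary example date."""
    return datetime.strptime("1990-01-01 " + hhmm, "%Y-%m-%d %H:%M")


# Time split of 2 hours on the observation times used in the setTimeSplit() example.
rows = [{"time": t(s)} for s in ["01:00", "02:00", "03:30", "06:00", "06:30"]]
by_time = split_track(rows, lambda prev, cur: cur["time"] - prev["time"] > timedelta(hours=2))
print([[r["time"].strftime("%H:%M") for r in seg] for seg in by_time])
# [['01:00', '02:00', '03:30'], ['06:00', '06:30']]

# Value-based split: break the track when the value more than doubles.
wind = [{"WindSpeed": v} for v in [10, 12, 30, 31]]
by_speed = split_track(wind, lambda prev, cur: 2 * prev["WindSpeed"] < cur["WindSpeed"])
print([[r["WindSpeed"] for r in seg] for seg in by_speed])
# [[10, 12], [30, 31]]
```

Expressing each rule as a predicate on the previous and current row mirrors how several split options can be combined: a track breaks wherever any one of the predicates is true.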
  • When choosing parameters to calculate dwell locations, consider the type of observation and the scale of dwell that you want to find. The following are examples of how to specify dwell calculations:

    • Ship DataFrame that has vesselID and tripID fields.

      • Use the vesselID and tripID fields as the identifiers to calculate dwell locations along distinct routes.

      • Use a time tolerance of 1 hour and a distance tolerance of 1 nautical mile to discover where vessels stay within 1 nautical mile for at least 1 hour.

    • Animal tracker DataFrame that has an animalID field.

      • Use the animalID field as the identifier to compare dwell locations of specific animals.

      • To determine the range of an animal, use a time tolerance of 3 days, and a distance tolerance of 10 miles to discover animal habitats of interest.

      • For a smaller area of interest, use a time tolerance of 2 hours, and a distance tolerance of 100 meters.

Limitations

  • Inputs must be point DataFrames with time-enabled records of time type instant.

  • Any points that do not have time will not be included in the analysis.

  • When calculating the convex hull and a dwell location is completely stationary (one unique location) or composed of two unique points, a small value based on the tolerance of the spatial reference used in an analysis will be used as the width, height, or diameter to create output polygons instead of convex hulls. These polygons can be used for visualization and do not represent the spatial extent of the dwell. Examples of these cases are described in the following table:

Input case | Description | Example
Coincident (one spatially-unique location) | If the input rows are stacked (coincident), the resulting convex hull will be an invalid polygon. In this example, the coincident input rows are represented by the red dot in the center of the yellow polygon. The yellow polygon represents the output convex hull result for coincident points. The blue polygon represents what a true convex hull looks like when there are four noncoincident points in a single dwell location. | [Image: case coincident]
Colinear (two spatially-unique locations) | If the input rows are in a line (most common with two spatially-unique locations), the resulting convex hull will be an invalid polygon. In this example, colinear points are represented by red dots in the yellow polygon. The yellow polygon represents the output convex hull result for colinear points. | [Image: case colinear]

Results

In addition to the fields from the input DataFrame, the following fields are included for output records:

Field | Description
dwell_geometry | The output geometry for the dwell. The geometry will either be a point (for 'DwellPoints', 'AllPoints', or 'DwellMeanCenters') or polygon ('DwellConvexHulls').
COUNT | The number of points that were in the dwell.
DwellID | A unique ID for the dwell that the point belongs to.
DwellDuration | Duration of dwell time in milliseconds. This is calculated as the difference between the first and last record in the dwell.
MeanX | The mean value of the x-coordinates that compose the dwell.
MeanY | The mean value of the y-coordinates that compose the dwell.
MeanDistance | The average distance between consecutive points in a dwell location.
date | The time of the point. Returned when the output type is DwellPoints or AllPoints.
dwell_start | The start time of the dwell. Returned when the output type is DwellConvexHulls or DwellMeanCenters.
dwell_end | The end time of the dwell. Returned when the output type is DwellConvexHulls or DwellMeanCenters.
<statistic>_<fieldname> | Specified statistics of specified fields. These are returned based on addSummaryField() inputs. These are only calculated for DwellConvexHulls and DwellMeanCenters outputs.

If the setOutputType() value is AllPoints, the results that belong to a dwell will have the fields above calculated. The results that do not belong to a dwell will return a value of 0 for the count field, the date field will return the time value of the input row, and all other fields will return a value of null.
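
As a small worked example of the DwellDuration field, the sketch below computes the millisecond difference between the first and last record of a dwell and converts it to minutes. The function name and the timestamps are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


def dwell_duration_ms(first_time, last_time):
    """Milliseconds between the first and last record of a dwell."""
    return (last_time - first_time) // timedelta(milliseconds=1)


# Hypothetical dwell that starts at 08:00:00 and ends at 08:07:00.
start = datetime(2024, 1, 1, 8, 0, 0)
end = datetime(2024, 1, 1, 8, 7, 0)

duration_ms = dwell_duration_ms(start, end)
duration_min = duration_ms / 60000
print(duration_ms, duration_min)  # 420000 7.0
```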

Performance notes

Improve the performance of Find Dwell Locations by doing one or more of the following:

  • Only analyze the records in your area of interest. You can pick the records of interest by using one of the following SQL functions:

    • ST_Intersection - Clip to an area of interest represented by a polygon. This will modify your input records.
    • ST_EnvIntersects - Select records that intersect an envelope.
    • ST_Intersects - Select records that intersect another dataset or area of interest represented by a polygon.
  • Specify a value of DwellPoints or DwellMeanCenters using setOutputType().
  • Subdivide tracks as much as possible by using setTrackFields() inputs.
  • Specify a value of planar using setDistanceMethod() to determine distance calculation method instead of geodesic.
  • Split the tracks using one or more of setTimeSplit(), setTimeBoundarySplit(), or setDistanceSplit(). Using setTimeBoundarySplit() will provide the biggest performance gain.

Similar capabilities

The following tools perform similar capabilities:

How Find Dwell Locations works

  • Dwell locations are determined using both time (setDwellMinDuration()) and distance (setDwellMaxDistance()) values. First, the tool assigns points to a track using a unique identifier (setTrackFields()). Track order is determined by the recorded time for each point. Then the distance between the first observation in a track and the next is calculated. Points are considered to be part of a dwell if two temporally consecutive points stay within the given distance for at least the given duration. When two points are found to be part of a dwell, the first point in the dwell is used as a reference point, and the tool finds consecutive points that are within the specified distance of the reference point in the dwell. Once all points within the specified distance are found, the tool collects the dwell points and calculates their mean center. Points before and after the current dwell are added to the dwell if they are within the given distance of the dwell location's mean center. This process continues until the end of the track.

  • Input DataFrames are summarized into dwell locations using a unique identifier. For all output types, count of points and time duration are calculated for each dwell location.

  • If you select a summarized output type (DwellMeanCenters or DwellConvexHulls), you can optionally use addSummaryField() to calculate numeric statistics (Count, Sum, Minimum, Maximum, Range, Mean, Standard Deviation, Variance, First, and Last) or string statistics (Count, Any, First, and Last) for the rows summarized within a track.

  • The First and Last statistics return the first or last value in a track. For example, with a time-ordered track with the following values: [Toronto,Guelph,Montreal], the first value is Toronto, and the last value is Montreal.

  • The count statistic (for string and numeric fields) counts the number of non-null values. For example, the count of [0, 1, 10, 5, null, 6] is 5, and the count of [Primary, Primary, Secondary, null] is 3.
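
The detection procedure described at the start of this section can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the tool's actual implementation: find_dwells and _dist are hypothetical names, the track is a single time-ordered list of (time, x, y) tuples in planar units, and the mean center is computed once per dwell before neighboring points are tested against it.

```python
import math


def _dist(point, center):
    """Planar distance from a (t, x, y) point to an (x, y) location."""
    return math.hypot(point[1] - center[0], point[2] - center[1])


def find_dwells(points, max_distance, min_duration):
    """Return inclusive (start_index, end_index) ranges of dwells in one track."""
    dwells = []
    n = len(points)
    i = 0
    first_free = 0  # first index not already claimed by an earlier dwell
    while i < n:
        t0, x0, y0 = points[i]  # candidate reference point
        j = i
        # Collect consecutive points that stay within max_distance of the reference.
        while j + 1 < n and _dist(points[j + 1], (x0, y0)) <= max_distance:
            j += 1
        if j > i and points[j][0] - t0 >= min_duration:
            members = points[i:j + 1]
            cx = sum(p[1] for p in members) / len(members)
            cy = sum(p[2] for p in members) / len(members)
            start, end = i, j
            # Add points before and after the dwell that are near its mean center.
            while start > first_free and _dist(points[start - 1], (cx, cy)) <= max_distance:
                start -= 1
            while end + 1 < n and _dist(points[end + 1], (cx, cy)) <= max_distance:
                end += 1
            dwells.append((start, end))
            first_free = end + 1
            i = end + 1
        else:
            i += 1
    return dwells


# Hypothetical track: time in seconds, coordinates in meters.
track = [
    (0, 0, 0), (60, 100, 0), (120, 200, 0), (180, 205, 0), (240, 210, 5),
    (300, 202, -3), (420, 208, 2), (480, 400, 0), (540, 600, 0),
]
for start, end in find_dwells(track, max_distance=30, min_duration=300):
    count = end - start + 1
    duration = track[end][0] - track[start][0]
    print(start, end, count, duration)  # 2 6 5 300
```

In this made-up track, the five points recorded between 120 and 420 seconds stay within 30 meters of the reference point for 300 seconds, so they form a single dwell with a count of 5.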

Syntax

For more details, go to the GeoAnalytics Engine API reference for find dwell locations.

Setter | Description | Required
run(dataframe) | Runs the Find Dwell Locations tool using the provided DataFrame. | Yes
addSummaryField(summary_field, statistic, alias=None) | Adds a summary statistic of a field in the input DataFrame to the result DataFrame. | No
setDistanceMethod(distance_method) | Sets the method used to calculate distances between track observations. There are two methods to choose from: 'Planar' or 'Geodesic' (default). | No
setDwellMaxDistance(max_distance, max_distance_unit) | Sets the maximum distance between points for them to be considered part of a single dwell event. | Yes
setDwellMinDuration(min_duration, min_duration_unit) | Sets the minimum time between points for them to be considered part of a single dwell event. | Yes
setOutputType(output_type) | Sets the result type. Options include 'DwellPoints', 'AllPoints', 'DwellConvexHulls' or 'DwellMeanCenters'. | Yes
setTimeBoundarySplit(time_boundary_split, time_boundary_split_unit, time_boundary_reference=None) | Sets boundaries to limit calculations to defined spans of time. | No
setTrackFields(*track_fields) | Sets one or more fields used to identify distinct tracks. | Yes

Examples

Run Find Dwell Locations

Python
# Log in
import geoanalytics
geoanalytics.auth(username="myusername", password="mypassword")

# Imports
from geoanalytics.tools import FindDwellLocations
from geoanalytics.sql import functions as ST
from pyspark.sql import functions as F

# Path to the Seattle example tracks data
data_path = r"https://services1.arcgis.com/36PP9fe9l4BSnArw/arcgis/rest/" \
            "services/seattle_example_tracks/FeatureServer/0"

# Create a DataFrame from the Seattle example tracks data
df = spark.read.format("feature-service").load(data_path)

# Use Find Dwell Locations to find where users do not move more than 30 meters
# for at least 5 minutes
result = FindDwellLocations() \
            .setTrackFields("user_id") \
            .setDistanceMethod(distance_method="Planar") \
            .setDwellMaxDistance(max_distance=30, max_distance_unit="Meters") \
            .setDwellMinDuration(min_duration=5, min_duration_unit="Minutes") \
            .setOutputType(output_type="DwellPoints") \
            .run(dataframe=df)

# Convert DwellDuration from milliseconds to minutes
result = result.withColumn("DwellDuration_minutes", F.col("DwellDuration") / 60000)

# Show the first 5 dwell result records from the result
result.select("DwellID", "user_id", "COUNT", "DwellDuration_minutes", "MeanDistance").show(5)
Result
+--------------------+-------+-----+---------------------+------------------+
|             DwellID|user_id|COUNT|DwellDuration_minutes|      MeanDistance|
+--------------------+-------+-----+---------------------+------------------+
|505c9cd5-ed40-4d8...|  user3|    8|                  7.0| 7.885048697300235|
|e777d782-176f-41a...|  user3|   12|                 11.0|16.525913142284654|
|125c8161-b3b0-457...|  user4|   16|                 15.0| 7.360461096415931|
|28b6bcb4-0e4e-40d...|  user4|    9|                  8.0| 8.712140471225085|
|f05c9c33-d05c-422...|  user4|    2|               1278.0| 6.748597028826337|
+--------------------+-------+-----+---------------------+------------------+

Plot results

Python
# Plot the dwell locations for user3 to visualize where dwells occur along tracks
seattle_example_tracks_plot = df.where("user_id = 'user3'").st.plot(color="lightgrey",
                                                                    figsize=(16,10))
result_plot = result.where("user_id = 'user3'") \
    .st.plot(cmap_values="DwellDuration_minutes",
             legend=True,
             ax=seattle_example_tracks_plot)
result_plot.set_title("Input locations and dwell locations for user3")
result_plot.set_xlabel("X (US Survey Feet)")
result_plot.set_ylabel("Y (US Survey Feet)");

Plotting example for a Find Dwell Locations result.

Version table

Release | Notes
1.0.0 | Tool introduced
