Trace proximity events

Trace Proximity Events analyzes time-enabled points representing moving entities. The tool follows entities of interest in space (location) and time to identify which other entities they have interacted with. The trace continues from entity to entity, up to a configurable maximum number of degrees of separation from the original entity of interest.

Trace Proximity Events workflow

Usage notes

| Term | Definition | Example |
| --- | --- | --- |
| Entity | A moving object with position periodically recorded. | An animal, person, or vehicle. An entity may be stationary or moving. |
| Entities of interest | The specific entities used to start a trace. | A person infected with COVID-19. |
| Proximity event | When two entities are near each other within a period of time. | Two people that come within 3 meters of each other and within a 1-minute window of each other. |
| Depth | The degree of separation between an entity of interest and an entity further down the trace (downstream). | A proximity event between the entity of interest and someone else is depth 1. |
| Trace event | The first contact for a specified entity downstream from the entities of interest. | |

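
The proximity event definition above can be sketched in plain Python. This is a minimal illustration and not the tool's implementation: the `Observation` class and the `is_proximity_event` function are hypothetical names, coordinates are assumed to be planar and in meters, and treating the thresholds as inclusive is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import hypot


@dataclass(frozen=True)
class Observation:
    """One time-enabled point for an entity (planar coordinates in meters)."""
    entity_id: str
    x: float
    y: float
    time: datetime


def is_proximity_event(a, b, search_distance_m, search_duration):
    """Return True when two observations of different entities are close
    in both space and time (inclusive thresholds are an assumption)."""
    if a.entity_id == b.entity_id:
        return False
    close_in_space = hypot(a.x - b.x, a.y - b.y) <= search_distance_m
    close_in_time = abs(a.time - b.time) <= search_duration
    return close_in_space and close_in_time


t0 = datetime(2020, 3, 30, 12, 0, 0)
alice = Observation("alice", 0.0, 0.0, t0)
bob_near = Observation("bob", 2.0, 1.0, t0 + timedelta(seconds=30))
bob_late = Observation("bob", 2.0, 1.0, t0 + timedelta(minutes=5))

# Within 3 meters and 1 minute: a proximity event
print(is_proximity_event(alice, bob_near, 3.0, timedelta(minutes=1)))  # True
# Same location but 5 minutes apart: not a proximity event
print(is_proximity_event(alice, bob_late, 3.0, timedelta(minutes=1)))  # False
```
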
  • The following are examples of use cases that can be performed with the Trace Proximity Events tool:

    • An organization monitors company-issued devices carried by workers. The company is interested in determining which employees were near an individual known to have coronavirus disease 2019 (COVID-19). Using a DataFrame representing device locations and time, they identify devices that have been within six meters and five minutes of the contagious person and other possibly contagious employees.

    • An NGO is monitoring salmon populations using GPS and is interested in tracking the spread of salmon lice between escaped farmed salmon and wild populations. Some GPS-tagged farmed salmon are tracked to see if they come in close proximity with tagged wild populations, and how those wild populations may further spread the disease. The measurements also include a depth field, which the NGO uses to only find fish at a similar depth.

  • When tracing proximity events, it is your responsibility to understand organizational, local, and national guidelines regarding data sensitivity and privacy.

  • When using proximity tracing to find transmission (such as of a disease), be aware of the following:

    • The presence of a trace event does not guarantee that transmission occurred; it indicates only a potential encounter.
    • The absence of a trace event does not mean that nothing was transmitted. In cases such as a disease, transmission may occur through other vectors.
    • Where possible, use setAttributeMatchCriteria() to constrain proximity events. For example, use attributes to constrain the room, floor, or elevation.
  • Specifying larger values for setSearchDuration() and setSearchDistance() results in more events and a longer processing time. Smaller values result in fewer events and a shorter processing time.

  • Records must meet both the temporal search distance and the spatial search distance criteria to be considered near each other.

  • Use domain-specific knowledge to determine the values used for setSearchDuration() and setSearchDistance(). Consider factors such as the accuracy of the device when setting the distances.

  • The entity of interest is where the proximity tracing begins. If you specify a start time, tracing begins at that time for that entity. If you do not specify a time, tracing begins on January 1, 1970 for that entity.

  • Defining the entities of interest requires entity ID values from the input DataFrame and optionally a start time value from which tracing will begin.

  • By default, entity tracks are created using the planar distance method when the input DataFrame is in a projected coordinate system and the geodesic distance method when the input DataFrame is in a geographic coordinate system. To override this default behavior, use setDistanceMethod(). It is recommended that you use geodesic distance in the following circumstances:

    • Tracks cross the international date line: With the geodesic method, tracks from input DataFrames that cross the international date line correctly cross it. Your input DataFrame or processing spatial reference must be set to a spatial reference that supports wrapping around the international date line, for example, a global projection such as World Cylindrical Equal Area.

    • Your DataFrame is not in a local projection: If your input DataFrame is in a local projection, use the planar distance method instead. For example, use the planar method to examine trace events within a single state. In that case, your input DataFrame or processing spatial reference must be set to a spatial reference local to your dataset.

  • You can set additional requirements for a proximity event:

    • For example, you can trace only individuals in a particular building on a campus, or you can trace only within one level of a building. Use setAttributeMatchCriteria() to specify constraining attributes. For example, to constrain entities on the same floor, specify the Floor field.

    • By default, all traces between an entity of interest and an entity farther down the trace are found. Use setMaxTraceDepth() to limit the depth.

  • Optionally, use includeTracksDataFrame() to create a track DataFrame that contains the first trace event and all subsequent records for that entity. Additionally, the records for the entity of interest are always included in the output track DataFrame. These results are helpful for visualizing where entities travelled and can be used in the Reconstruct Tracks tool.
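
The date line recommendation above can be illustrated with a small sketch. It assumes a spherical Earth and uses the haversine formula, which only approximates a true geodesic distance and is not necessarily how the tool computes it; the function names are hypothetical. The point is that a distance computed directly on longitude values breaks at the date line, while a great-circle distance does not.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; a spherical approximation


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points (degrees)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def naive_planar_degrees(lon1, lat1, lon2, lat2):
    """Euclidean distance computed directly on degree values (not meters)."""
    return sqrt((lon2 - lon1) ** 2 + (lat2 - lat1) ** 2)


# Two points on the equator, on either side of the international date line
east = (179.9, 0.0)
west = (-179.9, 0.0)

print(round(haversine_m(*east, *west)))    # roughly 22 km apart
print(naive_planar_degrees(*east, *west))  # about 359.8 degrees "apart"
```

The two points are only 0.2 degrees of longitude apart on the globe, but the naive planar calculation treats them as being on opposite sides of the map.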

Limitations

The input must be a point DataFrame with a timestamp. Records that do not have a time are not included in the results.

When using proximity tracing to find transmission (such as of a disease), be aware of the following:

  • The presence of a trace event does not guarantee that transmission occurred; it indicates only a potential encounter.

  • The absence of a trace event does not mean that nothing was transmitted. In cases such as a disease, transmission may occur through other vectors.

  • Where possible, use setAttributeMatchCriteria() to constrain proximity events. For example, use attributes to constrain the room, floor, or elevation.

Results

The output proximity events DataFrame contains the first proximity event for the entities in the trace, as well as the following fields:

| Field | Description |
| --- | --- |
| from_id | The upstream entity ID. |
| to_id | The downstream entity ID. |
| depth | The degree of separation between the entity of interest and the to_id field. |
| duration_minutes | The duration of the trace event in minutes, calculated as the difference between the start and end times. A value of 0 means that there is a single proximity event (the start and end times are the same). |
| entity_id | The entity ID. |
| event_start | The date and time of the proximity event. This field is calculated as the first recorded time that meets the criteria of the proximity event. |
| event_geometry | The geometry of the proximity event. |
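
As a rough illustration of how event_start and duration_minutes relate, the sketch below summarizes the timestamps of the consecutive records that make up one proximity event. It assumes, as the field name and the sample output in this document suggest, that the duration is expressed in minutes; `summarize_sustained_event` is a hypothetical helper, not part of the tool.

```python
from datetime import datetime


def summarize_sustained_event(timestamps):
    """Return (event_start, duration_minutes) for the records of one event.

    event_start is the first recorded time; the duration is the difference
    between the last and first times, expressed in minutes (an assumption).
    """
    if not timestamps:
        raise ValueError("at least one timestamp is required")
    start, end = min(timestamps), max(timestamps)
    return start, (end - start).total_seconds() / 60.0


sustained = [
    datetime(2020, 3, 30, 12, 0),
    datetime(2020, 3, 30, 12, 10),
    datetime(2020, 3, 30, 12, 26),
]
single = [datetime(2020, 4, 1, 9, 30)]

print(summarize_sustained_event(sustained))  # starts at 12:00, lasts 26.0 minutes
print(summarize_sustained_event(single))     # a single record lasts 0.0 minutes
```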

The output tracks DataFrame includes the following fields:

| Field | Description |
| --- | --- |
| entity_id | The entity ID. |
| depth | The degree of separation between the entity of interest and the trace track. The depth is the same across a single track. |
| instant_datetime | The timestamp of each record. This is the same as the timestamp of the corresponding input record. |
| track_geometry | Points representing the locations of entities after each proximity event. |
| track_start | The start time of the track. |

Performance notes

Improve the performance of Trace Proximity Events by doing one or more of the following:

  • Only analyze the records in your area of interest. You can pick the records of interest by using one of the following SQL functions:

    • ST_Intersection: Clip to an area of interest represented by a polygon. This will modify your input records.
    • ST_BboxIntersects: Select records that intersect an envelope.
    • ST_EnvIntersects: Select records having an envelope that intersects the envelope of another geometry.
    • ST_Intersects: Select records that intersect another dataset or an area of interest represented by a polygon.
  • Use smaller values for setSearchDistance() and setSearchDuration().
  • Limit the proximity events considered by using setAttributeMatchCriteria().
  • Specify a setMaxTraceDepth() value to limit the number of downstream traces between a given entity and the entity of interest.
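
The idea behind restricting records to an area of interest can be sketched conceptually in plain Python. This is only an illustration of an envelope filter on point coordinates; it does not call the SQL functions listed above, and the record layout and the `in_envelope` helper are hypothetical.

```python
def in_envelope(x, y, xmin, ymin, xmax, ymax):
    """Return True when point (x, y) lies inside or on the envelope."""
    return xmin <= x <= xmax and ymin <= y <= ymax


# Hypothetical records; only the coordinates matter for this filter.
records = [
    {"entity_id": "user1", "x": 10.0, "y": 10.0},
    {"entity_id": "user2", "x": 55.0, "y": 20.0},
    {"entity_id": "user3", "x": 12.0, "y": 48.0},
]
area_of_interest = (0.0, 0.0, 50.0, 50.0)  # xmin, ymin, xmax, ymax

# Keep only the records that fall inside the area of interest
subset = [r for r in records if in_envelope(r["x"], r["y"], *area_of_interest)]
print([r["entity_id"] for r in subset])  # ['user1', 'user3']
```

Fewer input records mean fewer candidate pairs to compare, which is why filtering before running the tool can reduce processing time.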

How Trace Proximity Events works

The diagrams below show how the Trace Proximity Events tool processes data. In these diagrams, time is on the x-axis. In each diagram there are four entities: A, B, C, and D. The highlighted text describes the trace events that occur between two entities (the from and to entities) and the depth of the proximity event. In this example, entity C is the entity of interest that is being traced downstream.

In diagram 1, entity C is the chosen entity of interest. The depth is 0.

Trace Proximity Events diagram 1

In diagram 2, a proximity event occurs between entities C and B. The depth of the trace is 1. When multiple consecutive records meet the proximity criteria, they form a sustained proximity event.

Trace Proximity Events diagram 2

In diagram 3, a proximity event occurs between entities B and A. The depth of the trace is 2.

Trace Proximity Events diagram 3

In diagram 4, a proximity event occurs between entities C and D. The depth of the trace is 1.

Trace Proximity Events diagram 4

In the image below, entity B is the entity of interest and comes in proximity with entity A three times, denoted by the blue circles. Assuming that time is on the x-axis, the first proximity event is 1, followed by a break without contact, and then proximity events 2 and 3. The tool returns event 1 in the proximity events DataFrame. Proximity events 2 and 3 are not returned. If includeTracksDataFrame() is used, all input rows after proximity event 1 are returned in the output tracks DataFrame.

tpe first trace
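
The behavior described in this section can be approximated with a short sketch. It is a simplified model under stated assumptions and not the tool's actual algorithm: it starts from proximity events that have already been detected, processes them in time order, and records only the first contact for each newly reached entity. The names `ProximityEvent`, `TraceEvent`, and `trace` are hypothetical, and times are plain integers for brevity.

```python
from collections import namedtuple

ProximityEvent = namedtuple("ProximityEvent", "time a b")
TraceEvent = namedtuple("TraceEvent", "from_id to_id depth event_start")


def trace(events, entities_of_interest, max_depth=None):
    """Follow proximity events forward in time from the entities of interest.

    events: iterable of ProximityEvent (already-detected pairwise contacts).
    entities_of_interest: dict mapping entity ID -> start time.
    Returns one TraceEvent per newly reached entity (its first contact).
    """
    # reached[entity] = (time from which it can pass the trace on, depth)
    reached = {eid: (start, 0) for eid, start in entities_of_interest.items()}
    results = []
    for ev in sorted(events, key=lambda e: e.time):
        # A contact is undirected, so try both directions
        for src, dst in ((ev.a, ev.b), (ev.b, ev.a)):
            if src not in reached or dst in reached:
                continue
            since, depth = reached[src]
            if ev.time < since:
                continue  # contact happened before src was part of the trace
            if max_depth is not None and depth + 1 > max_depth:
                continue  # would exceed the maximum degrees of separation
            reached[dst] = (ev.time, depth + 1)
            results.append(TraceEvent(src, dst, depth + 1, ev.time))
    return results


# The four-entity example from the diagrams, with C as the entity of interest
events = [
    ProximityEvent(1, "C", "B"),
    ProximityEvent(2, "B", "A"),
    ProximityEvent(3, "C", "D"),
    ProximityEvent(4, "C", "B"),  # repeat contact, not reported again
]
for row in trace(events, {"C": 0}):
    print(row)
```

With these inputs, B and D are reached at depth 1 and A at depth 2, matching diagrams 2 through 4, and the repeat contact between C and B is ignored. The sketch omits the depth 0 rows for the entities of interest that appear in the sample output later in this document.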

Syntax

For more details, go to the GeoAnalytics Engine API reference for trace proximity events.

| Setter | Description | Required |
| --- | --- | --- |
| run(dataframe) | Runs the Trace Proximity Events tool using the provided DataFrame. | Yes |
| includeTracksDataFrame() | Includes a second DataFrame with the points used in the trace. | No |
| setAttributeMatchCriteria(*attribute_match_criteria) | One or more fields used to constrain the proximity events. Entities will only be considered near when the spatial search distance and temporal search distance criteria are met and the two entities have equal values of the fields specified. | No |
| setDistanceMethod(distance_method) | Sets the method used to calculate distances between track observations. There are two methods to choose from: 'Geodesic' or 'Planar'. If this setter is not used, the geodesic distance method will be used when the input DataFrame is in a geographic coordinate system and the planar distance method will be used when the input DataFrame is in a projected coordinate system. | No |
| setEntitiesOfInterestIds(entities_of_interest_ids) | Sets one or more entities that you are interested in tracing from, as well as a time to start tracing from. | Yes |
| setEntityIdField(entity_id_field) | Sets the field used to identify distinct entities. | Yes |
| setMaxTraceDepth(max_trace_depth) | Sets the maximum degrees of separation between an entity of interest and an entity further down the trace. | No |
| setSearchDistance(search_distance, search_distance_unit) | Sets the maximum distance between two points to be considered in proximity. Points closer together in space and that also meet the search duration criteria are considered in proximity of each other. | Yes |
| setSearchDuration(search_duration, search_duration_unit) | Sets the maximum duration between two points that are considered in proximity. Points closer together in time and that also meet the search distance criteria are considered in proximity of each other. | Yes |
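
The example in this document passes entities of interest as dictionaries with "entityId" and "epochTimeStamp" keys, where the timestamp values appear to be milliseconds since January 1, 1970 (UTC). The sketch below shows one way such entries might be built from Python datetimes. The helper names are hypothetical, the millisecond interpretation is inferred from the example values, and omitting the "epochTimeStamp" key when no start time is given is an assumption.

```python
from datetime import datetime, timezone


def to_epoch_millis(moment):
    """Convert a timezone-aware datetime to milliseconds since 1970-01-01 UTC."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid ambiguity")
    return round(moment.timestamp() * 1000)


def entity_of_interest(entity_id, start=None):
    """Build one entry in the shape used by the example in this document."""
    entry = {"entityId": entity_id}
    if start is not None:
        # Assumed to be optional, matching the optional start time
        entry["epochTimeStamp"] = to_epoch_millis(start)
    return entry


entities = [
    entity_of_interest("user2", datetime(2020, 3, 30, 12, tzinfo=timezone.utc)),
    entity_of_interest("user3", datetime(2020, 4, 2, 12, tzinfo=timezone.utc)),
]
print(entities)
```

These two entries reproduce the values 1585569600000 and 1585828800000 used in the example below.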

Examples

Run Trace Proximity Events

Python
# Log in
import geoanalytics
geoanalytics.auth(username="myusername", password="mypassword")

# Imports
from geoanalytics.tools import TraceProximityEvents
from geoanalytics.sql import functions as ST
from pyspark.sql import functions as F

# Path to the Seattle example tracks data
data_path = r"https://services1.arcgis.com/36PP9fe9l4BSnArw/arcgis/rest" \
            "/services/seattle_example_tracks/FeatureServer/0"

# Create a DataFrame from the Seattle example tracks data
df = spark.read.format("feature-service").load(data_path)

# Use Trace Proximity Events to find records that are within proximity of each other
# by 50 feet with a 5 minute range and originating from user2 and user3.
result = TraceProximityEvents() \
            .setEntityIdField(entity_id_field="user_id") \
            .setEntitiesOfInterestIds(entities_of_interest_ids=
                                               [{"entityId": "user2", "epochTimeStamp": 1585569600000},
                                                {"entityId": "user3", "epochTimeStamp": 1585828800000}]) \
            .setDistanceMethod(distance_method="Planar") \
            .setSearchDuration(search_duration=5, search_duration_unit="Minutes") \
            .setSearchDistance(search_distance=50, search_distance_unit="Feet") \
            .includeTracksDataFrame() \
            .run(dataframe=df)

# View the first 5 rows of the output proximity events to show where events occurred
result.output.select("from_id", "to_id", "duration_minutes", "depth", "event_geometry",
                     F.date_format("event_start", "yyyy-MM-dd").alias("event_start")) \
             .sort("from_id", "to_id").show(5)

# View the first 5 rows of the output tracks DataFrame to show travel after events occurred
result.tracks.select("entity_id", "depth", "track_geometry",
                     F.date_format("track_start", "yyyy-MM-dd").alias("track_start")).show(5)
Result
+-------+-----+----------------+-----+--------------------+-----------+
|from_id|to_id|duration_minutes|depth|      event_geometry|event_start|
+-------+-----+----------------+-----+--------------------+-----------+
|   NULL|user2|            NULL|    0|                NULL| 2020-03-30|
|   NULL|user3|            NULL|    0|                NULL| 2020-04-02|
|  user1|user4|             0.0|    2|{"x":1268471.8960...| 2020-04-01|
|  user2|user1|            26.0|    1|{"x":1267840.5744...| 2020-03-30|
|  user4|user5|             0.0|    3|{"x":1265728.4904...| 2020-04-01|
+-------+-----+----------------+-----+--------------------+-----------+

+---------+-----+--------------------+-----------+
|entity_id|depth|      track_geometry|track_start|
+---------+-----+--------------------+-----------+
|    user1|    1|{"x":1267840.4481...| 2020-03-30|
|    user1|    1|{"x":1267744.6826...| 2020-03-30|
|    user1|    1|{"x":1267705.5232...| 2020-03-30|
|    user1|    1|{"x":1267726.8824...| 2020-03-30|
|    user1|    1|{"x":1267836.5154...| 2020-03-30|
+---------+-----+--------------------+-----------+
only showing top 5 rows

Plot results

Python
# Get the tracks for the users that have come into contact with each other
seattle_example_tracks_plot = df.where("user_id = 'user1' or user_id = 'user2' " \
                                "or user_id = 'user4' or user_id = 'user5'") \
                                    .st.plot(color="purple", figsize=(14,8), basemap="light")

# Create a new column containing the "from to" information for the plot legend
output_df = result.output.withColumn("from_to", F.concat(F.lit("from "),"from_id",
                                                         F.lit(" to "), "to_id"))

# Plot the proximity events DataFrame
output_plot = output_df.st.plot(cmap_values="from_to",
                                is_categorical=True,
                                cmap="Paired",
                                s=100, legend=True,
                                legend_kwds={"title": "Proximity Events"},
                                ax=seattle_example_tracks_plot)
output_plot.set_title("Proximity events for users in Seattle example track data")
output_plot.set_xlabel("X (US Survey Feet)")
output_plot.set_ylabel("Y (US Survey Feet)");
Plotting example for a Trace Proximity Events result. Proximity events are shown.
Python
# Plot the Trace Proximity Events tracks DataFrame
tracks_plot = result.tracks.st.plot(cmap_values="entity_id",
                                         is_categorical=True,
                                         cmap="Paired", legend=True,
                                         legend_kwds={"title": "User ID"},
                                         figsize=(14,8), basemap="light")
tracks_plot.set_title("Tracks showing the travel of users following a proximity event")
tracks_plot.set_xlabel("X (US Survey Feet)")
tracks_plot.set_ylabel("Y (US Survey Feet)");
Plotting example for a Trace Proximity Events result. User tracks after an event are shown.

Version table

| Release | Notes |
| --- | --- |
| 1.0.0 | Python tool introduced |
