Find Co-travelers analyzes time-enabled points representing moving entities. The tool finds where two entities are moving together in space (location) and time, also known as co-traveling. You can choose to find all co-traveling entities in your data, or you can find only entities that are co-traveling with specific entities of interest. See Usage Notes below for a glossary of commonly used terms in this topic.
Two entities are considered co-travelers if they are within a search distance of each other (specified with setSearchDistance()) at the same time for at least a specified duration. There are two ways to define the duration criteria:
- setConsecutiveDuration(): The minimum time that two entities must be within the search distance of each other without moving farther apart than the search distance. For example, in the diagram below two entities are near each other from 8:30:45 to 8:33:05. The resulting co-travel event has a consecutive duration of 2 minutes and 20 seconds.
- setCumulativeDuration(): The minimum total time that two entities must be within the search distance of each other, regardless of whether they temporarily move farther apart than the search distance. For example, in the diagram below two entities are within a search distance of each other from 8:32:20 to 8:33:40 (a consecutive duration of 1 minute and 20 seconds) before separating and then coming near each other again from 8:35:50 to 8:37:00 (a consecutive duration of 1 minute and 10 seconds). The resulting co-travel event has a cumulative duration of 2 minutes and 30 seconds.
One of these two duration criteria is required. You can also choose to use both for finer control of co-travel detection.
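The difference between the two duration criteria can be checked with a little arithmetic. The following is a minimal plain-Python sketch, not the tool's implementation; the helper names `duration_summary` and `clock` and the calendar date are assumptions made for illustration. It uses the two intervals from the cumulative duration example above.

```python
from datetime import datetime

def duration_summary(intervals):
    """Return (longest consecutive, cumulative) duration in seconds
    for a list of (start, end) datetime pairs during which two
    entities are within the search distance of each other."""
    lengths = [(end - start).total_seconds() for start, end in intervals]
    return max(lengths), sum(lengths)

def clock(value):
    # The date is arbitrary (an assumption); only the time of day matters here.
    return datetime.strptime("2022-03-01 " + value, "%Y-%m-%d %H:%M:%S")

# The two periods of nearness from the cumulative duration example.
near_intervals = [
    (clock("08:32:20"), clock("08:33:40")),  # 1 minute 20 seconds
    (clock("08:35:50"), clock("08:37:00")),  # 1 minute 10 seconds
]

longest_consecutive, cumulative = duration_summary(near_intervals)
print(longest_consecutive, cumulative)  # 80.0 150.0
```

With a minimum consecutive duration of 2 minutes, this pair would not qualify (80 seconds is shorter than 120 seconds), while a minimum cumulative duration of 2 minutes would be met (150 seconds).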
By default, two entities must be within the search distance of each other at the same instant in time to be candidates for co-travel. You can optionally define a search duration (using setSearchDuration()) to expand the range of time within which two entities can come within the search distance of each other and still be considered co-travelers.
For example, consider the scenario shown in the diagram below. A vehicle is traveling down a highway followed by another vehicle that takes the same route 5 minutes later. With a search distance of 10 feet and no search duration, these vehicles would not be considered co-travelers because they are not within 10 feet of each other at any instant in time. However, with a search distance of 10 feet and a search duration of 5 minutes, these vehicles would likely be considered co-travelers because they were within 10 feet of the same location within the time frame allowed by the search duration.
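The highway scenario can be sketched as a simple nearness test on a pair of observations. This is an illustrative sketch only: the function `is_near`, the `(x, y, t)` tuple layout, planar coordinates in feet, and the inclusive comparisons are all assumptions, not the tool's documented behavior.

```python
import math

def is_near(obs_a, obs_b, search_distance, search_duration=0.0):
    """Return True if two observations (x, y, t) are within the search
    distance in space and within the search duration in time.
    Coordinates are planar (feet here) and t is in seconds."""
    distance = math.hypot(obs_a[0] - obs_b[0], obs_a[1] - obs_b[1])
    time_difference = abs(obs_a[2] - obs_b[2])
    return distance <= search_distance and time_difference <= search_duration

# The lead vehicle passes a point at t = 0; the follower passes 3 feet
# from that same point 5 minutes (300 seconds) later.
lead_vehicle = (0.0, 0.0, 0.0)
following_vehicle = (3.0, 0.0, 300.0)

print(is_near(lead_vehicle, following_vehicle, search_distance=10))
print(is_near(lead_vehicle, following_vehicle, search_distance=10,
              search_duration=300))
```

The first call prints False because the observations are 5 minutes apart in time; the second prints True because the search duration allows that offset.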
By default, all co-travel between two entities is summarized in a single record in the result DataFrame. Optionally, setGapCutoff() can be used to specify the maximum gap duration allowed in a single co-travel incident. A gap occurs when two entities are temporarily farther apart than the search distance and/or search duration. Co-travel incidents with a gap in spatial and temporal nearness longer than the gap cutoff will be split into separate records in the result DataFrame.
For example, consider the diagram below showing co-travel detected between 2 entities. There is a 2 minute and 10 second gap in this co-travel incident. With no gap cutoff, the co-travel incident would be returned as a single record in the result DataFrame. With a gap cutoff of 2 minutes, the same co-travel incident would be returned as two separate records in the result DataFrame, as the 2 minute and 10 second gap exceeds the gap cutoff.
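The splitting rule can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the tool's algorithm: the function `split_by_gap` is hypothetical, intervals are given as sorted `(start, end)` offsets in seconds, and the interval lengths are invented so that the gap is 2 minutes and 10 seconds as in the example above.

```python
def split_by_gap(intervals, gap_cutoff=None):
    """Group sorted (start, end) intervals of nearness into incidents.
    A new incident starts whenever the gap since the previous interval
    is longer than gap_cutoff. With no cutoff, everything stays together."""
    incidents = []
    for interval in intervals:
        if incidents:
            gap = interval[0] - incidents[-1][-1][1]
            if gap_cutoff is None or gap <= gap_cutoff:
                incidents[-1].append(interval)
                continue
        incidents.append([interval])
    return incidents

# Two periods of nearness separated by a 130-second gap (2 min 10 s).
near_intervals = [(0, 80), (210, 280)]

print(len(split_by_gap(near_intervals)))                  # 1
print(len(split_by_gap(near_intervals, gap_cutoff=120)))  # 2
```

A cutoff of 130 seconds or more would keep the two periods in a single incident, because the gap would no longer exceed the cutoff.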
If one DataFrame is provided as input, the result will include all pairs of entities that meet the co-travel criteria. You can optionally use setEntitiesOfInterest() to specify a second DataFrame of tracks that will be used as seeds when searching for co-travelers. In this case, the result will include only pairs of entities that meet the co-travel criteria, and each co-travel incident will include at least one entity from the entities_of_interest DataFrame. If the same track ID exists in both DataFrames, the two tracks will be treated as distinct and separate entities. To exclude co-travel between entities with the same track ID, use setExcludeSameID(). If setEntitiesOfInterest() is not used, setExcludeSameID() will be ignored.
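The effect of seeding the search and excluding matching track IDs can be sketched as follows. This is a conceptual illustration only: the function `candidate_pairs` and the vessel IDs are hypothetical, and the sketch assumes that each candidate pair combines one entity of interest with one entity from the observations.

```python
from itertools import product

def candidate_pairs(ids_of_interest, observation_ids, exclude_same_id=False):
    """List the (entity of interest, observed entity) pairs that would be
    evaluated for co-travel. Tracks with the same ID in both inputs are
    treated as distinct entities unless exclude_same_id is True."""
    pairs = []
    for seed, other in product(ids_of_interest, observation_ids):
        if exclude_same_id and seed == other:
            continue  # skip pairs that share a track ID
        pairs.append((seed, other))
    return pairs

seeds = ["vessel_7"]
observed = ["vessel_7", "vessel_8", "vessel_9"]

print(candidate_pairs(seeds, observed))
print(candidate_pairs(seeds, observed, exclude_same_id=True))
```

The first call includes the pair ("vessel_7", "vessel_7"); the second drops it and keeps only the pairs with vessel_8 and vessel_9.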
Usage notes
-
The following table outlines terminology for Find Co-travelers:
Term | Definition | Example |
---|---|---|
Entity | A moving object with position periodically recorded. | A marine vessel. |
Observation | A point geometry and timestamp representing a single entity's location at an instant in time. | The location of a vessel at an instant in time. |
Entities of interest | Specific entities used to find co-travel incidents. | A specific marine vessel known to engage in illegal fishing. |
Co-travel incident | When two entities are within the search distance of each other over a minimum duration of time. | Two vessels that are within 500 meters of each other for at least 1 hour. |
Gap | When two entities are temporarily farther apart than the search distance and, if defined, the search duration. | Two vessels are within 500 meters of each other for 20 minutes before moving farther apart for 10 minutes and finally moving within 500 meters of each other for 10 more minutes. This co-travel incident has a 10-minute gap. |
Consecutive duration | The duration that two entities co-travel without separating. | In the previous example, the co-travel event has consecutive durations of 20 minutes and 10 minutes. |
Cumulative duration | The total duration that two entities are co-traveling, including gaps. | In the previous example, the co-travel event has a cumulative duration of 40 minutes. |
-
The input DataFrame must contain time-enabled point observations that represent an instant in time. Track observations that do not have a valid timestamp or geometry will be excluded from analysis.
-
Tracks must have more than one observation to be used in analysis.
-
You can specify one or more fields to identify tracks using setTrackFields(). Tracks are represented by the unique combination of one or more track fields. For example, if the flightID and Destination fields are used as track identifiers, the records ID007, Solden and ID007, Tokyo would be in different tracks since they have different values for the Destination field.
-
Geodesic distance calculations will be used if your data is in a geographic spatial reference with no projection. Planar distance calculations will be used if your data is in a projected spatial reference or has no spatial reference. You can override the default behavior by calling setDistanceMethod(). Geodesic distance calculations require a spatial reference to be set on the input data. Planar distance calculations require that your data is in a projected spatial reference. For more information, see Coordinate systems and transformations.
-
When using geodesic distance calculations, co-travel events may include instances where two entities are very slightly farther apart than the search distance.
-
The entities_of_interest DataFrame must have the same track fields as the observations DataFrame.
-
The spatial reference of the observations DataFrame must be the same as the spatial reference of the entities_of_interest DataFrame. If the two DataFrames have different spatial references, the observations will be transformed to the spatial reference of the entities_of_interest. If no such transformation exists, or if one of the DataFrames does not have a spatial reference set, the tool will fail.
-
When finding co-travelers, it is your responsibility to understand organizational, local, and national guidelines regarding data sensitivity and privacy. False positives may occur depending on data quality, tool parameters, and circumstance.
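The usage notes on track fields and on tracks with a single observation can be illustrated in plain Python. This is a sketch under stated assumptions, not how the tool is implemented: the function `group_into_tracks`, the record layout, and the `t` field are hypothetical, while the flightID and Destination values come from the track fields example above.

```python
from collections import defaultdict

def group_into_tracks(records, track_fields):
    """Group records into tracks keyed by the unique combination of
    values in the given track fields."""
    tracks = defaultdict(list)
    for record in records:
        key = tuple(record[field] for field in track_fields)
        tracks[key].append(record)
    return dict(tracks)

records = [
    {"flightID": "ID007", "Destination": "Solden", "t": 0},
    {"flightID": "ID007", "Destination": "Tokyo", "t": 0},
    {"flightID": "ID007", "Destination": "Solden", "t": 60},
]

tracks = group_into_tracks(records, ["flightID", "Destination"])
print(sorted(tracks))  # [('ID007', 'Solden'), ('ID007', 'Tokyo')]

# Tracks need more than one observation to be used in analysis.
usable = {key: obs for key, obs in tracks.items() if len(obs) > 1}
print(sorted(usable))  # [('ID007', 'Solden')]
```

Grouping the same records by flightID alone would produce a single track with three observations.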
Results
Results will include the following fields:
Field | Description |
---|---|
cotravel_id | A unique ID given to every co-travel incident. |
entity1 | A string or array of strings representing the track ID(s) of the first entity in a co-travel incident. |
entity2 | A string or array of strings representing the track ID(s) of the second entity in a co-travel incident. |
cumu_dura | The cumulative duration of the co-travel incident (in seconds), including any gaps in co-travel. |
consec_dura | The longest consecutive duration of the co-travel incident (in seconds). |
cotravel_track | A linestring representing the path of the co-travel incident. If the co-travel incident includes gaps, cotravel_track is a multipart linestring. Each vertex in the linestring has an M-value representing the approximate Unix timestamp of the co-travel event at that location (in seconds). |
cotravel_start | A timestamp representing the date and time that the co-travel incident starts. |
cotravel_end | A timestamp representing the date and time that the co-travel incident ends. |
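The duration fields are expressed in seconds and the M-values are Unix timestamps, so both are easy to relate to ordinary datetimes. The sketch below uses plain Python with the first row of the example output shown later in this topic (user_1 and user_2); the helper `m_value_to_datetime` is hypothetical, and interpreting M-values as UTC is an assumption.

```python
from datetime import datetime, timezone

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

# cotravel_start and cotravel_end of the user_1 / user_2 incident.
start = datetime.strptime("2022-03-01 01:09:24.34", TIMESTAMP_FORMAT)
end = datetime.strptime("2022-03-01 01:15:13.894", TIMESTAMP_FORMAT)

# For this incident the elapsed time matches the reported cumu_dura
# and consec_dura values of 349.554 seconds.
elapsed_seconds = (end - start).total_seconds()
print(round(elapsed_seconds, 3))  # 349.554

def m_value_to_datetime(m_value):
    """Convert an M-value (Unix timestamp in seconds) to a datetime.
    UTC is assumed here for illustration."""
    return datetime.fromtimestamp(m_value, tz=timezone.utc)
```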
Performance notes
Improve the performance of Find Co-travelers by doing one or more of the following:
- Only analyze the records in your area of interest. You can pick the records of interest by using one of the following SQL functions:
  - ST_Intersection: Clip to an area of interest represented by a polygon. This will modify your input records.
  - ST_BboxIntersects: Select records that intersect an envelope.
  - ST_EnvIntersects: Select records having an envelope that intersects the envelope of another geometry.
  - ST_Intersects: Select records that intersect another dataset or area of interest represented by a polygon.
- Specifying a larger value for setSearchDistance() results in more events and a longer processing time. Smaller distances result in fewer events and a shorter processing time.
- Specify a value of planar using setDistanceMethod() to use planar distance calculations instead of geodesic.
Limitations
When using this tool to find co-travel, be aware of the following:
-
Co-travel detection depends on user-defined parameters and data quality. Even if a pair of entities are returned in the result, there is no guarantee that the two entities willingly or knowingly traveled together.
-
Z-values and M-values from the input observations and entities_of_interest DataFrames are not used to calculate spatial or temporal nearness when searching for co-travelers. These values are not guaranteed to be included in the result linestring (cotravel_track).
-
This tool does not consider directionality when discovering co-travel. Relatively large search distance and/or search duration values can result in detection of co-travel that may be implausible (for example, the two entities are traveling in opposite directions).
-
This tool does not consider reachability when discovering co-travel. For example, two entities traveling at the same time on separate levels of a multilevel roadway may be returned as co-travelers when there is no practical way that the entities could be together.
-
It is unrealistic to calculate co-travel of a duration shorter than the observation period. For example, on data with observations every 2 minutes, it is meaningless to look for co-travel lasting less than 2 minutes.
-
Finding co-travel events of more than 2 entities is not supported.
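One way to respect the limitation on observation frequency is to compare the intended minimum duration with the typical spacing of observations before running the tool. The following is a minimal sketch in plain Python; the function `median_interval` and the sample timestamps are hypothetical, and using the median as the measure of typical spacing is an assumption.

```python
from statistics import median

def median_interval(timestamps):
    """Median spacing in seconds between consecutive observation times."""
    ordered = sorted(timestamps)
    return median(later - earlier for earlier, later in zip(ordered, ordered[1:]))

# One track observed every 2 minutes (timestamps in seconds).
observation_times = [0, 120, 240, 360, 480]
minimum_duration = 60  # an intended 1-minute co-travel duration

interval = median_interval(observation_times)
if minimum_duration < interval:
    print(f"Minimum duration of {minimum_duration} s is shorter than the "
          f"typical observation interval of {interval} s.")
```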
Syntax
For more details, go to the GeoAnalytics Engine API reference for Find Co-travelers.
Setter | Description | Required |
---|---|---|
run(observations) | Runs the Find Co-travelers tool using the provided DataFrame. | Yes |
setConsecutiveDuration() | To be considered co-traveling, two entities must be within the search distance and search duration of each other without separating for at least this length of time. | Required if setCumulativeDuration() is not used. |
setCumulativeDuration() | To be considered co-traveling, two entities must be within the search distance and search duration of each other for at least this length of time out of the entire duration of the input data. This criterion allows for gaps in co-travel where two entities separate and come back together. The cumulative time that the two entities are within the search distance and search duration of each other is used to determine if they are co-traveling. | Required if setConsecutiveDuration() is not used. |
setDistanceMethod() | Sets the method used to calculate distances between track observations. There are two methods to choose from: planar or geodesic. See Usage notes for more details. | No |
setEntitiesOfInterest() | Sets a DataFrame of point observations to use as seeds when finding co-travelers in the observations DataFrame. If not used, co-travelers will be found around all entities in the observations DataFrame. | No |
setExcludeSameID() | If exclude is True, entities with the same track ID will not be considered co-travelers even if they meet co-travel criteria. If this method is not used or if exclude is False, entities with the same ID will be evaluated for co-travel. | No |
setGapCutoff() | Sets the maximum duration for which co-travel criteria can go unmet within a single co-travel incident. If two entities co-travel before separating for longer than the gap cutoff duration, any future co-travel will be considered a separate co-travel incident and will be returned as a separate record in the result. | No |
setSearchDistance() | Sets a distance bound within which to search for co-traveling entities. | Yes |
setSearchDuration() | Sets a duration bound within which to search for co-traveling entities. If no search duration is set, two entities must be within the search distance at the same time to be considered co-travelers. | No |
setTrackFields() | Sets one or more fields used to identify distinct entities. | Yes |
Examples
Run Find Co-travelers
```python
# Imports
import geoanalytics
from geoanalytics.tools import FindCotravelers
from pyspark.sql import functions as F

# Authorize GeoAnalytics Engine (replace with your own credentials)
geoanalytics.auth(username="myusername", password="mypassword")

# Load time-enabled point observations from a feature service.
# Assumes an active SparkSession is available as `spark`.
data_url = "https://services1.arcgis.com/36PP9fe9l4BSnArw/ArcGIS/rest/services/walking_points_UofR/FeatureServer/0"
df = spark.read.format("feature-service").load(data_url)

# Find entities that stay within 15 meters and 30 seconds of each other
# for at least 2 consecutive minutes
result = FindCotravelers() \
    .setTrackFields("user_id") \
    .setSearchDistance(distance=15, distance_unit="meters") \
    .setSearchDuration(duration=30, duration_unit="seconds") \
    .setConsecutiveDuration(duration=2, duration_unit="minutes") \
    .run(observations=df)

# Show the result without the geometry and ID columns, with rounded durations
result.drop("cotravel_track", "cotravel_id") \
    .withColumn("cumu_dura", F.round("cumu_dura", 3)) \
    .withColumn("consec_dura", F.round("consec_dura", 3)) \
    .sort("cumu_dura").show(truncate=False)
```

```text
+-------+-------+---------+-----------+----------------------+-----------------------+
|entity1|entity2|cumu_dura|consec_dura|cotravel_start        |cotravel_end           |
+-------+-------+---------+-----------+----------------------+-----------------------+
|user_1 |user_2 |349.554  |349.554    |2022-03-01 01:09:24.34|2022-03-01 01:15:13.894|
|user_1 |user_3 |773.968  |411.231    |2022-03-01 01:03:00   |2022-03-01 01:28:00    |
+-------+-------+---------+-----------+----------------------+-----------------------+
```
Plot results
```python
# Plot the input observations, colored by user
ax = df.st.plot(figsize=(10, 10), is_categorical=True, cmap_values="user_id", cmap="tab20c", s=43)

# Plot the co-travel tracks on the same axes, colored by entity pair
plot_config = {'ax': ax, 'zorder': 4, 'basemap': 'dark', 'linewidth': 3, 'is_categorical': True,
               'cmap_values': 'cotravel_pair', 'cmap': 'spring', 'legend': True,
               'legend_kwds': {"labels": ["User 1 (blue) & User 2 (orange)",
                                          "User 1 (blue) & User 3 (green)"],
                               "bbox_to_anchor": (0.4, 0.2), "fontsize": "medium",
                               "markerscale": 2, "title": "Co-travel incidents"}}
result_plot = result.withColumn("cotravel_pair", F.concat(F.col("entity1"), F.lit("-"), F.col("entity2"))) \
    .st.plot(**plot_config)
result_plot.set_title("Incidents of co-travel found in point observations of 5 moving entities", {'fontsize': 12})
result_plot.set_xlabel("X (Meters)")
result_plot.set_ylabel("Y (Meters)");
```
Version table
Release | Notes |
---|---|
1.5.0 | Python tool introduced |