geoanalytics.tracks.functions
after
- geoanalytics.tracks.functions.after(track, offset)
Returns a linestring column representing the subset of the input track that comes after the offset distance or offset duration from the start of the track. An offset column can be created with ST_CreateDistance or ST_CreateDuration. You can also define an offset with a tuple containing a number and a unit (e.g., (10, “kilometers”) or (5, “minutes”)).
Returns null if a track is invalid.
Refer to the GeoAnalytics guide for examples and usage notes: TRK_After
- Parameters
track – Linestring column.
offset (pyspark.sql.Column) – The offset distance or offset duration. The offset must be greater than zero.
- Returns
Linestring column representing the subset of the input track that comes after the offset distance or offset duration from the start of the track.
- Return type
pyspark.sql.Column
aggr_create_track
- geoanalytics.tracks.functions.aggr_create_track(point, timestamp)
Operates on a grouped DataFrame and creates tracks using the points in each group, where each point represents an entity’s observed location at an instant. The output tracks are linestrings that represent the shortest path between consecutive observations. Each vertex in the linestring has a timestamp (stored as the M-value) and the vertices are ordered sequentially. You can group your DataFrame using DataFrame.groupBy() or with a GROUP BY clause in a SQL statement.
Refer to the GeoAnalytics guide for examples and usage notes: TRK_Aggr_CreateTrack
- Parameters
point (pyspark.sql.Column) – Point geometry column.
timestamp – Timestamp column to order points by.
- Returns
Linestring column representing the result tracks.
- Return type
pyspark.sql.Column
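The grouping and ordering described above can be illustrated with a minimal pure-Python sketch. It does not call GeoAnalytics Engine or Spark; the build_tracks helper, the (entity_id, x, y, timestamp) observation layout, and the use of epoch seconds as a stand-in for the M-value are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime, timezone


def build_tracks(observations):
    """Group (entity_id, x, y, timestamp) observations into per-entity tracks.

    Each track is a list of (x, y, m) vertices ordered by time, where m is the
    timestamp in seconds since the Unix epoch (a stand-in for the M-value).
    """
    grouped = defaultdict(list)
    for entity_id, x, y, ts in observations:
        grouped[entity_id].append((x, y, ts.timestamp()))
    # Order the vertices of each group by their M-value (time).
    return {eid: sorted(pts, key=lambda v: v[2]) for eid, pts in grouped.items()}


def _utc(hour, minute):
    # Hypothetical helper that builds a timezone-aware timestamp for the demo.
    return datetime(2024, 1, 1, hour, minute, tzinfo=timezone.utc)


observations = [
    ("a", 1.0, 1.0, _utc(0, 10)),
    ("b", 5.0, 5.0, _utc(0, 0)),
    ("a", 0.0, 0.0, _utc(0, 0)),
    ("a", 2.0, 1.0, _utc(0, 20)),
]
tracks = build_tracks(observations)
```

In this sketch the observations for entity "a" arrive out of order and are sorted by timestamp before the vertices are assembled.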
before
- geoanalytics.tracks.functions.before(track, offset)
Returns a linestring column representing the subset of the input track that is between the track start and the offset distance or offset duration. An offset column can be created with ST_CreateDistance or ST_CreateDuration. You can also define an offset with a tuple containing a number and a unit (e.g., (10, “kilometers”) or (5, “minutes”)).
Returns null if a track is invalid.
Refer to the GeoAnalytics guide for examples and usage notes: TRK_Before
- Parameters
track – Linestring column.
offset (pyspark.sql.Column) – The offset distance or offset duration. The offset must be greater than zero.
- Returns
Linestring column representing the subset of the input track that is between the track start and the offset distance or offset duration.
- Return type
pyspark.sql.Column
between
- geoanalytics.tracks.functions.between(track, start_offset, end_offset)
Returns a linestring column representing the subset of the input track that comes between the two offset distances or offset durations. An offset column can be created with ST_CreateDistance or ST_CreateDuration. You can also define an offset with a tuple containing a number and a unit (e.g., (10, “kilometers”) or (5, “minutes”)).
Returns null if a track is invalid.
Refer to the GeoAnalytics guide for examples and usage notes: TRK_Between
- Parameters
track – Linestring column.
start_offset (pyspark.sql.Column) – The start offset distance or start offset duration. The offset must be greater than zero.
end_offset (pyspark.sql.Column) – The end offset distance or end offset duration. The offset must be greater than zero.
- Returns
Linestring column representing the subset of the input track that comes between the two offset distances or offset durations on the track.
- Return type
pyspark.sql.Column
collapse_dwells
- geoanalytics.tracks.functions.collapse_dwells(track, distance_threshold, duration_threshold)
Returns a linestring column representing the input track with the dwell segments removed.
TRK_CollapseDwells removes dwell segments from the input track and connects the remaining points, preserving the start and end points of each dwell.
Returns null if a track is invalid.
The ST_CreateDistance and ST_CreateDuration functions can be used to define the distance and duration thresholds. You can also define them with a tuple containing a number and a unit (e.g., (10, “kilometers”) or (5, “minutes”)).
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_CollapseDwells
- Parameters
track (pyspark.sql.Column) – Linestring column.
distance_threshold (pyspark.sql.Column/struct/tuple) – The distance threshold used to define a dwell.
duration_threshold (pyspark.sql.Column/struct/tuple) – The duration threshold used to define a dwell.
- Returns
Linestring column representing the input track with the dwell segments collapsed.
- Return type
pyspark.sql.Column
distance_along
- geoanalytics.tracks.functions.distance_along(track, point, max_deviation=0.0, output_unit=None)
Returns a double column representing the length of the track between the track start and where the point intersects the track. You can optionally specify a max_deviation, which is the maximum distance a point can be from the track while still being considered on the track. The value is in the units of the track’s spatial reference.
If the input track and point do not have the same spatial reference, the point will be transformed to the spatial reference of the track.
The result is returned in the units specified by output_unit. When output_unit is None, the result is in the units of the input track’s spatial reference if it is projected; otherwise, the result is in meters.
Returns null if a track is invalid.
Refer to the GeoAnalytics guide for examples and usage notes: TRK_DistanceAlong
- Parameters
track – Linestring column.
point (pyspark.sql.Column) – Point column.
max_deviation (float/int, optional) – Numeric value representing the maximum distance a point can be from the track while still being considered on the track.
output_unit (str, optional) – The units of the result. Choose from Meters, Kilometers, Feet, Yards, Miles, or NauticalMiles.
- Returns
DoubleType column representing the length of the track between the track start and where the point intersects the track.
- Return type
pyspark.sql.Column
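To make the role of max_deviation concrete, here is a minimal planar sketch in plain Python. It is not the library's implementation: the distance_along helper below is hypothetical, it works on bare (x, y) tuples, and returning None when the point lies farther than max_deviation from the track is an assumption of this example.

```python
import math


def distance_along(track, point, max_deviation=0.0):
    """Planar length from the track start to the location nearest to point.

    track is a list of (x, y) vertices. Returns None when the point is
    farther than max_deviation from the track (an assumption of this sketch).
    """
    px, py = point
    best = None  # (deviation from track, distance along track)
    travelled = 0.0
    for (x1, y1), (x2, y2) in zip(track, track[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg_len = math.hypot(dx, dy)
        if seg_len == 0:
            t = 0.0
        else:
            # Fraction along the segment of the closest point, clamped to [0, 1].
            t = ((px - x1) * dx + (py - y1) * dy) / seg_len ** 2
            t = max(0.0, min(1.0, t))
        deviation = math.hypot(px - (x1 + t * dx), py - (y1 + t * dy))
        if best is None or deviation < best[0]:
            best = (deviation, travelled + t * seg_len)
        travelled += seg_len
    if best is None or best[0] > max_deviation:
        return None
    return best[1]


track = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
on_track = distance_along(track, (10.0, 5.0))
off_track = distance_along(track, (4.0, 1.0))
tolerated = distance_along(track, (4.0, 1.0), max_deviation=1.5)
```

With the default max_deviation of 0.0 only points lying on the track are measured; raising it lets a nearby point be snapped to the track first.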
distance_within
- geoanalytics.tracks.functions.distance_within(track, geometry, output_unit=None)
Returns a double column representing the distance traveled within a geometry. The geometry type can be linestring or polygon. The result is returned in the units specified by output_unit. When output_unit is None, the result is in the units of the input track’s spatial reference if it is projected; otherwise, the result is in meters.
If the track and geometry columns are in different spatial references, the function automatically transforms the geometry into the spatial reference of the track.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_DistanceWithin
- Parameters
track (pyspark.sql.Column) – Linestring column.
geometry (pyspark.sql.Column) – Geometry column. The geometry type can be linestring or polygon.
output_unit (str, optional) – The units of the result. Choose from Meters, Kilometers, Feet, Yards, Miles, or NauticalMiles.
- Returns
DoubleType column representing the distance traveled within the geometry.
- Return type
pyspark.sql.Column
duration
- geoanalytics.tracks.functions.duration(track, output_unit='seconds')
Returns a double column representing the duration of the input track. The duration is the difference between the first and last timestamps in the track. The result is returned in the units specified by output_unit. Returns null for invalid tracks.
Refer to the GeoAnalytics guide for examples and usage notes: TRK_Duration
- Parameters
track – Linestring column.
output_unit (str, optional) – The units of the result. Choose from Milliseconds, Seconds, Minutes, Hours, or Days.
- Returns
DoubleType column representing the track duration.
- Return type
pyspark.sql.Column
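The duration calculation amounts to subtracting the first timestamp from the last and converting units. A minimal sketch, assuming timestamps are already available as seconds and using a hypothetical track_duration helper (not part of the library):

```python
SECONDS_PER_UNIT = {
    "milliseconds": 0.001,
    "seconds": 1.0,
    "minutes": 60.0,
    "hours": 3600.0,
    "days": 86400.0,
}


def track_duration(m_values, output_unit="seconds"):
    """Difference between the last and first timestamps, in output_unit.

    m_values are per-vertex timestamps in seconds. Returns None for an
    empty track, loosely mirroring the null result for invalid tracks.
    """
    if not m_values:
        return None
    return (m_values[-1] - m_values[0]) / SECONDS_PER_UNIT[output_unit.lower()]


minutes = track_duration([0, 600, 5400], "Minutes")
hours = track_duration([0, 600, 5400], "Hours")
```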
duration_along
- geoanalytics.tracks.functions.duration_along(track, point, max_deviation=0.0, output_unit='seconds')
Returns a double column representing the duration of the track between the track start and where the point intersects the track. You can optionally specify a max_deviation, which is the maximum distance a point can be from the track while still being considered on the track. The value is in the units of the track’s spatial reference.
The result is returned in the units specified by output_unit. The default is seconds.
If the input track and point do not have the same spatial reference, the point will be transformed to the spatial reference of the track.
Returns null if a track is invalid.
Refer to the GeoAnalytics guide for examples and usage notes: TRK_DurationAlong
- Parameters
track (pyspark.sql.Column) – Linestring column.
point (pyspark.sql.Column) – Point column.
max_deviation (float/int, optional) – Numeric value representing the maximum distance a point can be from the track while still being considered on the track.
output_unit (str, optional) – The units of the result. Choose from Milliseconds, Seconds, Minutes, Hours, or Days.
- Returns
DoubleType column representing the duration of the track between the track start and where the point intersects the track.
- Return type
pyspark.sql.Column
duration_within
- geoanalytics.tracks.functions.duration_within(track, geometry, output_unit='seconds')
Returns a double column representing the duration of the track that intersects a linestring or polygon. The result is returned in the units specified by output_unit. The default is seconds.
If the track and geometry columns are in different spatial references, the function automatically transforms the geometry into the spatial reference of the track.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_DurationWithin
- Parameters
track (pyspark.sql.Column) – Linestring column.
geometry (pyspark.sql.Column) – Geometry column. The geometry type can be linestring or polygon.
output_unit (str, optional) – The units of the result. Choose from Milliseconds, Seconds, Minutes, Hours, or Days.
- Returns
DoubleType column representing the duration of the track that intersects the geometry.
- Return type
pyspark.sql.Column
end_timestamp
- geoanalytics.tracks.functions.end_timestamp(track)
Returns a timestamp column containing the last timestamp of each input track. Returns null for invalid tracks.
Refer to the GeoAnalytics guide for examples and usage notes: TRK_EndTimestamp
- Parameters
track – Linestring column.
- Returns
Timestamp column with the end timestamp of each track.
- Return type
pyspark.sql.Column
entry_exit_points
- geoanalytics.tracks.functions.entry_exit_points(track, geometry)
Returns an array of structs representing the points at which a track enters or exits a linestring or polygon. The entry and exit point structs contain the following fields:
point: the geometry of the entry or exit point.
time: the timestamp of the entry or exit point, formatted as YYYY-MM-DD hh:mm:ss.s.
track_endpoint: a boolean value. True if the entry or exit point is the starting or ending point of the track.
If the track and geometry columns are in different spatial references, the function automatically transforms the geometry into the spatial reference of the track. The spatial reference of the point geometry in the output is the same as the track.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_EntryExitPoints
- Parameters
track (pyspark.sql.Column) – Linestring column.
geometry (pyspark.sql.Column) – Geometry column. The geometry type can be linestring or polygon.
- Returns
Array column representing the entry and exit points that the track intersects with the linestring or polygon.
- Return type
pyspark.sql.Column
find_dwells
- geoanalytics.tracks.functions.find_dwells(track, distance_threshold, duration_threshold)
Returns an array of tracks, each track representing the points of the input track where the track is dwelling.
A track is considered to be dwelling if the points on the track have traveled a distance less than the distance threshold for a duration that exceeds the duration threshold. A dwell is defined on segments of the track where this condition is met.
TRK_FindDwells returns an array of tracks, each representing a dwelling portion of the input track.
Returns null if a track is invalid.
The ST_CreateDistance and ST_CreateDuration functions can be used to define the distance and duration thresholds. You can also define them with a tuple containing a number and a unit (e.g., (10, “kilometers”) or (5, “minutes”)).
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_FindDwells
- Parameters
track (pyspark.sql.Column) – Linestring column.
distance_threshold (pyspark.sql.Column/struct/tuple) – The distance threshold used to define a dwell.
duration_threshold (pyspark.sql.Column/struct/tuple) – The duration threshold used to define a dwell.
- Returns
Array column representing the tracks created from the dwell segments of the input track.
- Return type
pyspark.sql.Column
is_valid
- geoanalytics.tracks.functions.is_valid(track)
Returns a boolean column where the result is True if the input linestring is a valid track; otherwise, it returns False. A linestring is a valid track if it is non-null, non-empty, and has M-values that are distinct and strictly increasing.
Refer to the GeoAnalytics guide for examples and usage notes: TRK_IsValid
- Parameters
track – Linestring column.
- Returns
Boolean column indicating whether each input linestring is a valid track.
- Return type
pyspark.sql.Column
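The validity rule stated above (non-null, non-empty, strictly increasing M-values) can be sketched in a few lines of plain Python. The is_valid_track helper and the (x, y, m) tuple layout are assumptions for illustration; how the library treats edge cases such as a single-vertex input is not covered by this sketch.

```python
def is_valid_track(vertices):
    """True if vertices is non-null, non-empty, and its M-values strictly increase.

    vertices is a list of (x, y, m) tuples, or None.
    """
    if not vertices:
        return False
    m_values = [m for _, _, m in vertices]
    # Strictly increasing also rules out duplicate M-values.
    return all(a < b for a, b in zip(m_values, m_values[1:]))


ok = is_valid_track([(0, 0, 1), (1, 1, 2), (2, 2, 3)])
duplicate_m = is_valid_track([(0, 0, 1), (1, 1, 1)])
out_of_order = is_valid_track([(0, 0, 2), (1, 1, 1)])
```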
lcss
- geoanalytics.tracks.functions.lcss(track1, track2, search_distance, search_duration=None)
Returns a double column representing the size of the longest common subsequence between the two input tracks.
Returns null if a track is invalid.
The longest common subsequence is a count of all pairs of observations, one from each track, that fall within the search distance and search duration thresholds of each other.
The ST_CreateDistance and ST_CreateDuration functions can be used to define the search distance and search duration parameters. You can also define them with a tuple containing a number and a unit (e.g., (10, “kilometers”) or (5, “minutes”)).
TRK_LCSS uses planar distance calculations when the tracks are in a projected coordinate system and geodesic distance calculations when the tracks are in a geographic coordinate system. If one of the tracks has an unknown spatial reference, the function will use planar distance calculations.
Refer to the GeoAnalytics guide for examples and usage notes: TRK_LCSS
- Parameters
track1 (pyspark.sql.Column) – Linestring column.
track2 (pyspark.sql.Column) – Linestring column.
search_distance (pyspark.sql.Column/struct/tuple) – Distance used to calculate the longest common subsequence. It can be set using ST_CreateDistance.
search_duration (pyspark.sql.Column/struct/tuple) – Duration used to calculate the longest common subsequence. It can be set using ST_CreateDuration.
- Returns
DoubleType column representing the size of the longest common subsequence between the two tracks.
- Return type
pyspark.sql.Column
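For readers unfamiliar with the measure, the following is a minimal sketch of the textbook longest-common-subsequence recurrence applied to tracks. It is not TRK_LCSS itself and may differ from it in detail: the lcss helper here is hypothetical, uses planar distance only, and expects plain numbers (coordinate units and seconds) for the thresholds.

```python
import math


def lcss(track1, track2, search_distance, search_duration=None):
    """Size of the longest common subsequence of two tracks.

    Tracks are lists of (x, y, m) vertices. Two vertices match when they are
    within search_distance (planar) and, if given, within search_duration.
    """
    def matches(a, b):
        close = math.hypot(a[0] - b[0], a[1] - b[1]) <= search_distance
        timely = search_duration is None or abs(a[2] - b[2]) <= search_duration
        return close and timely

    n, m = len(track1), len(track2)
    # table[i][j] holds the LCSS size of the first i and first j vertices.
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if matches(track1[i - 1], track2[j - 1]):
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]


t1 = [(0.0, 0.0, 0), (1.0, 0.0, 10), (2.0, 0.0, 20)]
t2 = [(0.0, 0.5, 0), (5.0, 5.0, 10), (2.0, 0.5, 20)]
t2_late = [(x, y, m + 100) for x, y, m in t2]

spatial_only = lcss(t1, t2, search_distance=1.0)
with_time = lcss(t1, t2_late, search_distance=1.0, search_duration=5)
```

Adding search_duration tightens the match: the same geometry shifted by 100 seconds no longer shares any observations with the first track.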
length
- geoanalytics.tracks.functions.length(track, output_unit=None)
Returns a double column representing the length of the input track. Returns null for invalid tracks.
The result is returned in the units specified by output_unit. When output_unit is None, the result is in the units of the input track’s spatial reference if it is projected; otherwise, the result is in meters.
Planar distance calculations are used if the input tracks have a projected spatial reference or no spatial reference. Chordal distance calculations are used if the input tracks have a geographic spatial reference. For more information, see Coordinate systems and transformations.
Refer to the GeoAnalytics guide for examples and usage notes: TRK_Length
- Parameters
track – Linestring column.
output_unit (str, optional) – The units of the result. Choose from Meters, Kilometers, Feet, Yards, Miles, or NauticalMiles.
- Returns
DoubleType column representing the track length.
- Return type
pyspark.sql.Column
query
- geoanalytics.tracks.functions.query(track, offset)
Returns a point column representing the location that is the offset distance or offset duration along the input track, measured from the track start. An offset column can be created with ST_CreateDistance or ST_CreateDuration. You can also define an offset with a tuple containing a number and a unit (e.g., (10, “kilometers”) or (5, “minutes”)).
Returns null if a track is invalid.
Refer to the GeoAnalytics guide for examples and usage notes: TRK_Query
- Parameters
track – Linestring column.
offset (pyspark.sql.Column) – The offset distance or offset duration. The offset must be greater than zero.
- Returns
Point column representing the location that is the offset distance or offset duration along the input track.
- Return type
pyspark.sql.Column
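One way to picture a duration offset is as a lookup of the position at a given time after the track start. The sketch below assumes straight-line interpolation between the surrounding vertices, which may not match the library's behavior; query_by_duration is a hypothetical helper, and returning None for an offset beyond the track end is an assumption.

```python
def query_by_duration(track, offset_seconds):
    """Location on the track offset_seconds after its first timestamp.

    track is a list of (x, y, m) vertices with strictly increasing m (seconds).
    Returns None when the offset falls outside the track (an assumption here).
    """
    target = track[0][2] + offset_seconds
    for (x1, y1, m1), (x2, y2, m2) in zip(track, track[1:]):
        if m1 <= target <= m2:
            # Linear interpolation between the two surrounding vertices.
            t = (target - m1) / (m2 - m1)
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
    return None


track = [(0.0, 0.0, 0.0), (10.0, 0.0, 100.0), (10.0, 10.0, 200.0)]
first_leg = query_by_duration(track, 50)
second_leg = query_by_duration(track, 150)
beyond = query_by_duration(track, 500)
```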
speed
- geoanalytics.tracks.functions.speed(track, output_unit='meterspersecond')
Returns a double column representing the speed of the input track. The speed is the length of the track (see TRK_Length) divided by the duration of the track (see TRK_Duration). The result is returned in the units specified by output_unit. Returns null for invalid tracks.
Refer to the GeoAnalytics guide for examples and usage notes: TRK_Speed
- Parameters
track – Linestring column.
output_unit (str, optional) – The units of the result. Choose from MetersPerSecond, MilesPerHour, NauticalMilesPerHour, FeetPerSecond, or KilometersPerHour.
- Returns
DoubleType column representing the track speed.
- Return type
pyspark.sql.Column
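The length-over-duration definition can be checked by hand with a small planar sketch. The track_speed helper is hypothetical and assumes coordinates in a projected system and M-values in seconds, so the result is in coordinate units per second; it does not perform the unit conversion that output_unit provides.

```python
import math


def track_speed(track):
    """Planar length divided by duration, in coordinate units per second.

    track is a list of (x, y, m) vertices with m in seconds.
    """
    length = sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1, _), (x2, y2, _) in zip(track, track[1:])
    )
    duration = track[-1][2] - track[0][2]
    return length / duration


# Two segments of length 50 and 60 covered in 20 seconds.
speed = track_speed([(0.0, 0.0, 0.0), (30.0, 40.0, 10.0), (30.0, 100.0, 20.0)])
```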
split_by_distance
- geoanalytics.tracks.functions.split_by_distance(track, distance)
Returns an array of tracks created by splitting the input track into segments with each segment no longer than the specified distance. The distance can be created with ST_CreateDistance or with a tuple containing a number and a unit (e.g., (10, “kilometers”)).
Returns null if a track is invalid.
Refer to the GeoAnalytics guide for examples and usage notes: TRK_SplitByDistance
- Parameters
track – Linestring column.
distance (pyspark.sql.Column) – The maximum length of result tracks. The distance must be greater than zero.
- Returns
Array column representing the tracks created by splitting the input track.
- Return type
pyspark.sql.Column
split_by_distance_gap
- geoanalytics.tracks.functions.split_by_distance_gap(track, gap_distance)
Returns an array of tracks created by splitting the input track wherever two vertices are farther apart than the specified gap distance. The track is split by removing the segment between the two vertices. The distance can be created with ST_CreateDistance or with a tuple containing a number and a unit (e.g., (10, “kilometers”)).
Returns null if a track is invalid.
Refer to the GeoAnalytics guide for examples and usage notes: TRK_SplitByDistanceGap
- Parameters
track – Linestring column.
gap_distance (pyspark.sql.Column) – The maximum distance allowed between two track vertices. The distance must be greater than zero.
- Returns
Array column representing the tracks created by splitting the input track.
- Return type
pyspark.sql.Column
split_by_duration
- geoanalytics.tracks.functions.split_by_duration(track, duration)
Returns an array of tracks created by splitting the input track into segments with each segment no longer than the specified duration. The duration can be created with ST_CreateDuration or with a tuple containing a number and a unit (e.g., (5, “minutes”)).
Returns null if a track is invalid.
Refer to the GeoAnalytics guide for examples and usage notes: TRK_SplitByDuration
- Parameters
track – Linestring column.
duration (pyspark.sql.Column) – The maximum duration of result tracks. The duration must be greater than zero.
- Returns
Array column representing the tracks created by splitting the input track.
- Return type
pyspark.sql.Column
split_by_dwells
- geoanalytics.tracks.functions.split_by_dwells(track, distance_threshold, duration_threshold)
Returns an array of tracks, each track representing a part of the input track where a dwell condition is not met.
TRK_SplitByDwells splits the input track by removing the dwell segments. The output is an array of tracks, each representing a portion of the input track where it is in motion.
Returns null if a track is invalid.
The ST_CreateDistance and ST_CreateDuration functions can be used to define the distance and duration thresholds. You can also define them with a tuple containing a number and a unit (e.g., (10, “kilometers”) or (5, “minutes”)).
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_SplitByDwells
- Parameters
track (pyspark.sql.Column) – Linestring column.
distance_threshold (pyspark.sql.Column/struct/tuple) – The distance threshold used to define a dwell.
duration_threshold (pyspark.sql.Column/struct/tuple) – The duration threshold used to define a dwell.
- Returns
Array column representing the tracks created by splitting the input track where it dwells. The dwell segments are removed, and the portions of the track before and after each dwell are returned as separate tracks.
- Return type
pyspark.sql.Column
split_by_time_gap
- geoanalytics.tracks.functions.split_by_time_gap(track, gap_duration)
Returns an array of tracks created by splitting the input track wherever the time between two consecutive vertices exceeds the specified gap duration. The track is split by removing the segment between the two vertices. The duration can be created with ST_CreateDuration or with a tuple containing a number and a unit (e.g., (5, “minutes”)).
Returns null if a track is invalid.
Refer to the GeoAnalytics guide for examples and usage notes: TRK_SplitByTimeGap
- Parameters
track – Linestring column.
gap_duration (pyspark.sql.Column) – The maximum duration allowed between two track vertices. The duration must be greater than zero.
- Returns
Array column representing the tracks created by splitting the input track.
- Return type
pyspark.sql.Column
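The splitting rule can be sketched in plain Python as follows. This is an illustration only: split_by_time_gap below is a hypothetical stand-alone helper working on (x, y, m) tuples with M-values in seconds, and it keeps single-vertex parts, which the library may handle differently.

```python
def split_by_time_gap(track, gap_seconds):
    """Split a track wherever consecutive vertices are more than gap_seconds apart.

    track is a non-empty list of (x, y, m) vertices with m in seconds. The
    segment that spans the gap is dropped, so each part keeps only its own
    vertices.
    """
    parts = [[track[0]]]
    for prev, cur in zip(track, track[1:]):
        if cur[2] - prev[2] > gap_seconds:
            parts.append([cur])  # start a new part after the gap
        else:
            parts[-1].append(cur)
    return parts


track = [(0, 0, 0), (1, 0, 60), (2, 0, 120), (9, 0, 1000), (10, 0, 1060)]
parts = split_by_time_gap(track, gap_seconds=300)
```

Here the 880-second jump between the third and fourth vertices exceeds the 300-second gap, so the track is cut there and the connecting segment is discarded.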
start_timestamp
- geoanalytics.tracks.functions.start_timestamp(track)
Returns a timestamp column containing the first timestamp of each input track. Returns null for invalid tracks.
Refer to the GeoAnalytics guide for examples and usage notes: TRK_StartTimestamp
- Parameters
track – Linestring column.
- Returns
Timestamp column with start timestamp of each track.
- Return type
pyspark.sql.Column