Spatiotemporal Join transfers fields from one DataFrame to another based on spatial, temporal, and field attribute relationships, or some combination of the three. Optionally, statistics can be calculated for the joined records.
A spatial join matches records from the input DataFrames (
join_) based on their spatial relationships. A temporal join matches records from the target DataFrames based on their temporal relationships. An attribute join matches records based on field values.
You can join records based on a spatial relationship, a temporal relationship, an attribute relationship, or a combination of the three.
|The spatial relationship that will determine whether records are joined to each other. The available relationships depend on the geometry type (point, linestring, or polygon) of the DataFrames being joined. Available spatial relationships are as follows: |
|The temporal relationship that will determine whether records are joined to each other. The available relationships depend on the time type (instant or interval) of the DataFrames being joined. Available temporal relationships are as follows:|
|The attribute relationship that will determine whether records are joined to each other. Records are matched when the field values in the join DataFrame are equal to the field values in the target DataFrame.|
If the target DataFrame and join DataFrame are in different coordinate systems, the coordinate system of the target DataFrame will be used.
'Nearis specified with
set, it is required that the target DataFrame's geometry has a projected coordinate system or the tool will fail. You can transform your data to a projected coordinate system by using ST_Transform.
If multiple join records match the same target record, you can decide whether all the matching records will be joined (
set) or all the matching records will be summarized together (
Join O n e To Many()
set) as follows:
Join O n e To One()
One-to-many—Joins all the matching records in the join DataFrame to the target DataFrame. The resulting DataFrame will contain multiple records of the target record.
One-to-one—Summarizes all of the matching join records with each record in the target DataFrame. Only the records that have a match will be included in the summary and output. The count of joined records will be added, in addition to other statistics such as sum, minimum, maximum, range, mean, variance, and standard deviation. By default, only those join records that have the specified relationships will be maintained in the output DataFrame (inner join). If
Trueis specified for
set, all input records are written to the output DataFrame (this is known as a left outer join).
Examples of a one-to-many and one-to-one attribute join are shown below. In this example, the one-to-one join only includes the count; additional statistics that can be calculated using
You can optionally build an expression to join records using
set. If you specify an expression, only records that meet the condition will be used. For example, using the expression
$target['Magnitude'] > $join['Explosion'], you only join target records if the
Magnitudefield is greater than the
Explosionfield in the join record. See Arcade expressions for more information.
- You cannot apply a left join when using a one-to-many relationship.
The fields in the output DataFrame differ depending on the join operation.
The following fields are included in the output DataFrame when running a one-to-many join:
- All fields from the target DataFrame
- All fields from the joined DataFrame
If the same non-geometry field name exists in the target and join DataFrame, the joined field will start with the word join (for example join_fieldname).
If the same geometry field name exists in the target and join DataFrame, the result will have multiple geometry fields with the same name, where the first geometry field in the result schema is from the target DataFrame and the second geometry field is from the joined DataFrame. You can rename the geometry fields of your input DataFrames to be unique before joining to avoid confusion.
In addition to all fields from the target DataFrame, the following fields are included in the output records when running a one-to-one join:
|The count of records that have been joined.|
|Specified statistics will each create an attribute field, named in the following format: statistic_fieldname. For example, the maximum and standard deviation of the id field is MAX_id and SD_id.|
Improve the performance of Spatiotemporal Join by doing one or more of the following:
Only analyze the records in your area of interest. You can pick the records of interest by using one of the following SQL functions:
- Make the join conditions as selective as possible. For example, a near join of 1 kilometer will perform better than a near join of 5 kilometers on the same DataFrame.
- Add more than one join condition. Additional join conditions make the join more selective. For example, a near join of 1 kilometer and 1 hour will perform better than a near join of 1 kilometer on the same data.
- Apply a join expression.
For more details, go to the GeoAnalytics Engine API reference for spatiotemporal join.
|Adds a summary statistic of a field in the join DataFrame to the result DataFrame.||No|
|Runs the Spatiotemporal Join tool using the provided DataFrames.||Yes|
|Sets a target field, relationship, and join field used to join equal attributes.||One of |
|Sets a condition to specified fields using an Arcade expression. Only records with fields that meet this conditions will be joined.||No|
|Sets the join operation to one to many. If multiple join records are found that have the same relationships with a single target record, the fields associated with the multiple join records will be aggregated using the specified summary statistics.||One of |
|Sets the join operation to one to one. If multiple join records are found that have the same relationships with a single target record, there will be multiple copies (records) of the target record joined to each matching join record.||One of |
|Specifies whether all target records will be returned in the result DataFrame (known as a left outer join) or only those that have the specified relationships with the join records (inner join). Left outer join can be used only with one-to-one join and is not supported for one-to-many join.||No|
|Sets the spatial relationship used to spatially join records.||One of |
|Sets the temporal relationship used to temporally join records.||One of |
# Log in import geoanalytics geoanalytics.auth(username="myusername", password="mypassword") # Imports from geoanalytics.tools import ReconstructTracks from geoanalytics.tools import SpatiotemporalJoin from geoanalytics.sql import functions as ST # Path to the hurricane track points data and USA counties data hurricanes_data_path = r"https://services2.arcgis.com/FiaPA4ga0iQKduv3/arcgis/rest/" \ "services/IBTrACS_ALL_list_v04r00_points_1/FeatureServer/0" counties_data_path = "https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/"\ "USA_Counties_Generalized/FeatureServer/0" # Create DataFrames for hurricane track points and USA counties hurricanes_df = spark.read.format("feature-service").load(hurricanes_data_path) hurricanes_df = hurricanes_df.withColumn("usa", ST.env_intersects("shape", xmin=-14165303, xmax=-7384568, ymin=1947768, ymax=7426938)).where("usa=true") counties_df = spark.read.format("feature-service").load(counties_data_path) # Reconstruct hurricane points into tracks using Reconstruct Tracks rt_result = ReconstructTracks() \ .setTrackFields("NAME") \ .setDistanceMethod(distance_method="Geodesic") \ .run(dataframe=hurricanes_df) # Use Spatiotemporal Join to join the hurricane tracks to each county join_result = SpatiotemporalJoin() \ .setLeftJoin(left_join=True) \ .setJoinOneToOne()\ .setSpatialRelationship(spatial_relationship="Intersects") \ .run(target_dataframe=counties_df, join_dataframe=rt_result)
# Plot the join result and visualize the frequency of hurricanes for each county result_plot = join_result.st.plot(cmap_values="COUNT", cmap="coolwarm", legend=True, figsize=(16,10), geometry="shape") result_plot.set_title("USA counties visualized by count of intersecting hurricanes") result_plot.set_xlabel("X (Meters)") result_plot.set_ylabel("Y (Meters)") result_plot.set_xlim(left=-14165303, right=-7384568) result_plot.set_ylim(bottom=1947768, top=7426938);