geoanalytics.sql.functions

aggr_convex_hull

geoanalytics.sql.functions.aggr_convex_hull(geometry)

Operates on a grouped DataFrame and returns the convex hull of geometries in each group. You can group your DataFrame using DataFrame.groupBy() or with a GROUP BY clause in a SQL statement.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Aggr_ConvexHull

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

Polygon column representing the convex hull of all of the geometries.

Return type

pyspark.sql.Column

aggr_intersection

geoanalytics.sql.functions.aggr_intersection(geometry)

Operates on a grouped DataFrame and returns the intersection of geometries in each group. You can group your DataFrame using DataFrame.groupBy() or with a GROUP BY clause in a SQL statement. An empty geometry is returned when no geometries intersect.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Aggr_Intersection

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

Geometry column representing the intersection of all of the geometries. The returned geometry column type will be the same as the input geometry column type.

Return type

pyspark.sql.Column

aggr_linestring

geoanalytics.sql.functions.aggr_linestring(point, order_by)

Operates on a grouped DataFrame and returns a linestring containing the points ordered by the order_by column. You can group your DataFrame using DataFrame.groupBy() or with a GROUP BY clause in a SQL statement.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Aggr_Linestring

Parameters
  • point (pyspark.sql.Column) – Point geometry column.

  • order_by (pyspark.sql.Column) – Column to sort by.

Returns

Linestring column representing the sorted input points.

Return type

pyspark.sql.Column
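
The grouping and ordering described above can be sketched in plain Python, without Spark. This is only an illustration of the idea, not the library's implementation; the helper name aggregate_linestrings and the (group, point, order) row layout are assumptions made for this example.

```python
from collections import defaultdict


def aggregate_linestrings(rows):
    """Build one ordered coordinate list per group.

    rows is an iterable of (group_key, (x, y), order_value) tuples.
    This layout is hypothetical and chosen only for the illustration.
    """
    groups = defaultdict(list)
    for key, point, order_value in rows:
        groups[key].append((order_value, point))
    # Sort each group's points by the order value, then keep the coordinates.
    return {
        key: [point for _, point in sorted(items, key=lambda item: item[0])]
        for key, items in groups.items()
    }


rows = [
    ("a", (2.0, 2.0), 3),
    ("a", (0.0, 0.0), 1),
    ("b", (5.0, 5.0), 1),
    ("a", (1.0, 0.5), 2),
    ("b", (6.0, 4.0), 2),
]
lines = aggregate_linestrings(rows)
print(lines["a"])
```

Each dictionary value plays the role of one linestring: the points of a group, visited in order_by order.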

aggr_mean_center

geoanalytics.sql.functions.aggr_mean_center(geometry, weight=None)

Operates on a grouped DataFrame and returns the weighted aggregated centroid (mean center) of geometries in each group. You can optionally specify a numeric weight column which is used to weight locations according to their relative importance. The default is unweighted. You can group your DataFrame using DataFrame.groupBy() or with a GROUP BY clause in a SQL statement.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Aggr_MeanCenter

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • weight (pyspark.sql.Column, optional) – Used to weight locations according to their relative importance, defaults to None. Can be a LongType, DoubleType or StringType column with numeric values.

Returns

Point column representing the weighted aggregate centroid.

Return type

pyspark.sql.Column
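
As a rough illustration of what a weighted mean center is, the sketch below averages point coordinates in plain Python. It covers point input only and is not the library's implementation; the function name mean_center is an assumption made for this example.

```python
def mean_center(points, weights=None):
    """Return the (optionally weighted) mean of a list of (x, y) points."""
    if not points:
        raise ValueError("at least one point is required")
    if weights is None:
        # Unweighted case: every location counts equally.
        weights = [1.0] * len(points)
    total = float(sum(weights))
    if total == 0:
        raise ValueError("weights must not sum to zero")
    x = sum(w * px for (px, _), w in zip(points, weights)) / total
    y = sum(w * py for (_, py), w in zip(points, weights)) / total
    return (x, y)


unweighted = mean_center([(0, 0), (2, 0), (2, 2), (0, 2)])
weighted = mean_center([(0, 0), (10, 0)], weights=[1, 3])
print(unweighted, weighted)
```

In the weighted call the second location carries three times the weight of the first, so the center moves three quarters of the way toward it.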

aggr_stdev_ellipse

geoanalytics.sql.functions.aggr_stdev_ellipse(geometry, num_stdev=1.0, weight=None, min_records=2)

Operates on a grouped DataFrame and returns the weighted aggregate standard deviational ellipse of geometries in each group. You can group your DataFrame using DataFrame.groupBy() or with a GROUP BY clause in a SQL statement.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Aggr_StdevEllipse

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • num_stdev (double, optional) – Size of the returned ellipse in standard deviations, defaults to 1.

  • weight (pyspark.sql.Column, optional) – Weights locations according to their relative importance, defaults to None. Can be a LongType, DoubleType or StringType column with numeric values.

  • min_records (int, optional) – Number of geometries that must be considered for a standard deviation to be calculated, defaults to 2.

Returns

Polygon column representing the weighted aggregate Standard Deviational Ellipse.

Return type

pyspark.sql.Column

aggr_union

geoanalytics.sql.functions.aggr_union(geometry)

Operates on a grouped DataFrame and returns the union of geometries in each group. All geometries are required to have the same type. For example, a column that mixes point, linestring, and polygon geometries is not supported. You can group your DataFrame using DataFrame.groupBy() or with a GROUP BY clause in a SQL statement.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Aggr_Union

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

Geometry column representing the union of all of the geometries. The returned geometry column type will be the same as the type of the input geometries, except for point input, which returns multipoint.

Return type

pyspark.sql.Column

area

geoanalytics.sql.functions.area(geometry)

Returns a double column with the planar area of each geometry in the input column. The unit of the area calculation is the same as the units of the input geometries. For example, if you have polygons in a spatial reference that uses meters, the result will be in square meters. If your input geometries are in a geographic coordinate system, it is recommended that you use ST_GeodesicArea. Geometry types other than polygon will return 0.0 for the area.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Area

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

DoubleType column representing the area.

Return type

pyspark.sql.Column
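
To make "planar area in the units of the input" concrete, here is a minimal sketch of the shoelace formula for a single polygon ring. It is an illustration under simplifying assumptions (one ring, no holes) and not the library's implementation; the name planar_area is hypothetical.

```python
def planar_area(ring):
    """Planar area of a simple polygon ring given as (x, y) vertices.

    The result is in squared coordinate units, for example square meters
    when the coordinates are in meters.
    """
    n = len(ring)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


rectangle = [(0, 0), (4, 0), (4, 3), (0, 3)]
rectangle_area = planar_area(rectangle)
print(rectangle_area)
```

If the coordinates were longitude and latitude in degrees, the same arithmetic would produce "square degrees", which is why a geodesic calculation is recommended for geographic coordinate systems.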

as_binary

geoanalytics.sql.functions.as_binary(geometry)

Returns the well-known binary (WKB) representation of the geometry as a binary column.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_AsBinary

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

BinaryType column with the Well-Known Binary (WKB) representation of the geometry.

Return type

pyspark.sql.Column
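
For readers unfamiliar with WKB, the sketch below encodes and decodes a 2D point using only the Python standard library. It follows the standard WKB layout (one byte-order byte, a 32-bit geometry type, then two doubles). The byte order shown is little endian, which is an assumption; the library's output may use a different byte order. The helper names are hypothetical.

```python
import struct


def point_to_wkb(x, y):
    """Encode a 2D point as little-endian WKB (21 bytes)."""
    # 1 = little-endian byte order flag, 1 = WKB geometry type for Point.
    return struct.pack("<BIdd", 1, 1, x, y)


def point_from_wkb(wkb):
    """Decode a 2D point from WKB, honoring the byte-order flag."""
    prefix = "<" if wkb[0] == 1 else ">"
    geometry_type, x, y = struct.unpack(prefix + "Idd", wkb[1:21])
    if geometry_type != 1:
        raise ValueError("not a WKB point")
    return (x, y)


wkb = point_to_wkb(1.0, 2.0)
print(wkb.hex())
print(point_from_wkb(wkb))
```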

as_esri_json

geoanalytics.sql.functions.as_esri_json(geometry)

Returns the Esri JSON representation of the geometry as a string column.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_AsEsriJSON

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

StringType column with the Esri JSON representation of the geometry.

Return type

pyspark.sql.Column

as_geojson

geoanalytics.sql.functions.as_geojson(geometry)

Returns the GeoJSON representation of the geometry as a string column. Geometries that have an m-value and no z-coordinate will only return x,y coordinates.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_AsGeoJSON

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

StringType column with the GeoJSON representation of the geometry.

Return type

pyspark.sql.Column

as_shape

geoanalytics.sql.functions.as_shape(geometry)

Returns the shapefile representation of the geometry as a binary column.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_AsShape

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

BinaryType column with the shapefile representation of the geometry.

Return type

pyspark.sql.Column

as_text

geoanalytics.sql.functions.as_text(geometry)

Returns the well-known text (WKT) representation of the geometry as a string column.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_AsText

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

StringType column with the well-known text (WKT) representation of the geometry.

Return type

pyspark.sql.Column

azimuth

geoanalytics.sql.functions.azimuth(geometry1, geometry2)

Returns a double column representing the normalized azimuth in degrees. The azimuth is the heading from the first geometry to the second geometry, measured clockwise from north. This function requires that the first geometry column have a spatial reference set.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Azimuth

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

Returns

DoubleType column representing the normalized azimuth in degrees.

Return type

pyspark.sql.Column
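
The "clockwise from north, normalized" convention can be illustrated with a small planar calculation between two points. This is a sketch of the convention only: it assumes normalization to the range 0 to 360, and the library may compute the angle differently (for example geodesically, given the spatial reference requirement). The name planar_azimuth_degrees is hypothetical.

```python
import math


def planar_azimuth_degrees(x1, y1, x2, y2):
    """Heading from (x1, y1) to (x2, y2), clockwise from north, in [0, 360)."""
    dx = x2 - x1  # east-west offset
    dy = y2 - y1  # north-south offset
    # atan2(dx, dy) measures the angle from the +y axis (north), clockwise.
    return math.degrees(math.atan2(dx, dy)) % 360.0


for target in [(0, 1), (1, 0), (0, -1), (-1, 0)]:
    print(target, planar_azimuth_degrees(0, 0, *target))
```

Swapping the arguments of atan2 relative to the usual mathematical convention is what turns a counterclockwise angle from east into a clockwise angle from north.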

bbox_intersects

geoanalytics.sql.functions.bbox_intersects(geometry, xmin, ymin, xmax, ymax)

Returns a boolean column where the result is True if the geometry intersects the defined envelope; otherwise, it returns False. The four numeric values represent the minimum and maximum x,y coordinates of an axis-aligned rectangle, also known as an envelope. The x,y coordinates should be specified in the same units as the input geometry column. For example, if the input geometry is in a spatial reference that uses degrees, xmin, ymin, xmax, and ymax should all be in degrees.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_BboxIntersects

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • xmin (int/float) – The minimum x-coordinate point for the Envelope.

  • ymin (int/float) – The minimum y-coordinate point for the Envelope.

  • xmax (int/float) – The maximum x-coordinate point for the Envelope.

  • ymax (int/float) – The maximum y-coordinate point for the Envelope.

Returns

BooleanType column. True if geometry intersects the Envelope, False otherwise.

Return type

pyspark.sql.Column

bin_center

geoanalytics.sql.functions.bin_center(bin)

Returns a point column representing the center point of each bin.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_BinCenter

Parameters

bin (pyspark.sql.Column) – Spatial Bin (bin2d) column with the key for the bin.

Returns

Point column representing the center point for the bin associated with the given key.

Return type

pyspark.sql.Column

bin_geometry

geoanalytics.sql.functions.bin_geometry(bin)

Returns a polygon column representing the geometry of each bin.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_BinGeometry

Parameters

bin (pyspark.sql.Column) – Spatial Bin (bin2d) column with the key for the bin.

Returns

Polygon column representing the polygon for the bin associated with the given key.

Return type

pyspark.sql.Column

bin_id

geoanalytics.sql.functions.bin_id(bin)

Returns a long column representing the unique identifier of each bin.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_BinId

Parameters

bin (pyspark.sql.Column) – Spatial Bin (bin2d) column with the key for the bin.

Returns

LongType column representing the ID of the given key.

Return type

pyspark.sql.Column

boundary

geoanalytics.sql.functions.boundary(geometry)

Returns a geometry column representing the topological boundary of the given geometry. The function will return a linestring if the input is a polygon and a multipoint if the input is a linestring. Point and multipoint inputs are not supported.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Boundary

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

Geometry column representing the boundary. The returned geometry column type will be multipoint for linestring and multilinestring input and linestring for polygon and multipolygon input.

Return type

pyspark.sql.Column

buffer

geoanalytics.sql.functions.buffer(geometry, distance)

Returns a polygon column with buffer polygons that represent the area that is less than or equal to the specified planar distance from each input geometry. The distance can be specified as a single value or a numeric column. The distance value should be in the same units as the input geometry. For example, if your input geometry is in a spatial reference that uses meters, you should specify the distance in meters. To create a buffer polygon using geodesic distance calculations use ST_GeodesicBuffer.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Buffer

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • distance (pyspark.sql.Column/int/float) – Distance used to create the buffer. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value.

Returns

Polygon column representing the buffer around the input geometry.

Return type

pyspark.sql.Column

cast

geoanalytics.sql.functions.cast(geometry, geometry_type)

Returns a geometry column that contains the input geometries cast to the geometry type specified by the string value. The string value can be ‘point’, ‘multipoint’, ‘linestring’, ‘polygon’, or ‘geometry’. The function will return null when the input geometries cannot be cast to the specified type.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Cast

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • geometry_type (str) – Geometry type to cast the input geometry to.

Returns

Geometry column representing the cast geometry type.

Return type

pyspark.sql.Column

centerline

geoanalytics.sql.functions.centerline(geometry)

Creates a centerline of a polygon geometry.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Centerline

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

Geometry column with the centerline of the polygon feature.

Return type

pyspark.sql.Column

centroid

geoanalytics.sql.functions.centroid(geometry)

Returns a point column that represents the centroid of each input geometry. The result point is not guaranteed to be on the surface of the geometry.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Centroid

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

Point column representing the centroid.

Return type

pyspark.sql.Column
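
The remark that the centroid is not guaranteed to lie on the geometry is easy to see with a U-shaped polygon. The sketch below uses the standard area-weighted centroid formula for a single ring; it is an illustration, not the library's implementation, and the name polygon_centroid is hypothetical.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon ring of (x, y) vertices."""
    n = len(ring)
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        cross = x1 * y2 - x2 * y1
        twice_area += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    if twice_area == 0:
        raise ValueError("degenerate ring with zero area")
    # Dividing by 6 * area is the same as dividing by 3 * twice_area.
    return (cx / (3.0 * twice_area), cy / (3.0 * twice_area))


# A 3 x 3 square with a notch cut out of the top (x from 1 to 2, y from 1 to 3).
u_shape = [(0, 0), (3, 0), (3, 3), (2, 3), (2, 1), (1, 1), (1, 3), (0, 3)]
u_centroid = polygon_centroid(u_shape)
print(u_centroid)
```

The computed centroid has x = 1.5 and y between 1 and 3, so it falls inside the notch, outside the polygon itself.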

contains

geoanalytics.sql.functions.contains(geometry1, geometry2)

Returns a boolean column where the result is True if the geometry in the first column completely contains the second; otherwise, it is False.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Contains

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

Returns

BooleanType column. True if geometry1 contains geometry2, False otherwise.

Return type

pyspark.sql.Column

closest_point

geoanalytics.sql.functions.closest_point(geometry1, geometry2)

Returns a point column representing the point on the first geometry that is closest to the second geometry. This function calculates the planar distance between the two geometries to identify the closest point on the first geometry in relation to the second geometry. To learn more about the difference between planar and geodesic calculations, see Coordinate systems and transformations. If the two input geometry columns are in different spatial references, the output geometry uses the spatial reference of the first geometry.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_ClosestPoint

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

Returns

Point column representing the closest point.

Return type

pyspark.sql.Column
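
A minimal planar sketch of the closest-point idea, for the case where the first geometry is a linestring and the second is a point. It projects the point onto each segment and keeps the nearest result. This is an illustration only, not the library's implementation, and the helper names are hypothetical.

```python
import math


def closest_point_on_segment(p, a, b):
    """Closest point to p on the segment from a to b (planar)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return (float(ax), float(ay))  # degenerate segment
    # Projection parameter, clamped so the result stays on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def closest_point_on_linestring(line, p):
    """Return (closest point on line, planar distance to p)."""
    best, best_distance = None, math.inf
    for a, b in zip(line, line[1:]):
        q = closest_point_on_segment(p, a, b)
        d = math.hypot(q[0] - p[0], q[1] - p[1])
        if d < best_distance:
            best, best_distance = q, d
    return best, best_distance


line = [(0, 0), (10, 0), (10, 10)]
nearest, nearest_distance = closest_point_on_linestring(line, (4, 3))
print(nearest, nearest_distance)
```

The returned point lies on the first geometry (the linestring), which matches the argument order described above.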

convex_hull

geoanalytics.sql.functions.convex_hull(geometry)

Returns a geometry column that represents the convex hull of the input geometries in each record. A convex hull is the smallest geometry having only interior angles measuring less than 180° that encloses each input geometry. For multipoint, linestring, and polygon geometries the result will be a polygon. For point geometries, the result is a point. The result column will always have the generic geometry type.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_ConvexHull

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

Generic geometry column representing the convex hull of the given geometry.

Return type

pyspark.sql.Column
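
To illustrate what a convex hull is, the sketch below computes the hull of a set of points with Andrew's monotone chain algorithm. The choice of algorithm is an assumption made for this example; the documentation does not state which algorithm the library uses, and the name convex_hull_points is hypothetical.

```python
def convex_hull_points(points):
    """Convex hull of (x, y) points, counterclockwise, without repeating the start."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when the turn o -> a -> b is counterclockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


hull = convex_hull_points([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)])
print(hull)
```

The interior point (1, 1) and the collinear edge point (1, 0) are dropped, leaving only the four corners.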

coord_dim

geoanalytics.sql.functions.coord_dim(geometry)

Returns an integer column representing the dimensionality of the coordinates in the input geometry. For example, an input geometry with x,y coordinates only will return 2, while a geometry with x,y,z coordinates will return 3.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_CoordDim

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

IntegerType column representing the dimensionality.

Return type

pyspark.sql.Column

crosses

geoanalytics.sql.functions.crosses(geometry1, geometry2)

Returns a boolean column where the result is True if the two geometries cross; otherwise, it returns False. Two geometries cross when their intersection is not empty and is not equal to either of the geometries. The intersection must also have a dimensionality less than the maximum dimension of the two input geometries.

This function is only relevant for the following combinations of geometries:

  • multipoint/linestring

  • multipoint/polygon

  • linestring/polygon

  • linestring/multipoint

  • linestring/linestring

  • polygon/multipoint

  • polygon/linestring

For all other combinations the function will always return False.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Crosses

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

Returns

BooleanType column. True if geometry1 crosses geometry2, False otherwise.

Return type

pyspark.sql.Column

densify

geoanalytics.sql.functions.densify(geometry, max_segment_length)

Returns a geometry column of densified geometries. This function adds vertices along linestrings and polygons so that no segment within the geometry is longer than max_segment_length, measured as planar distance. The max_segment_length is in the same units as the input geometry and should be greater than zero. To densify geometry using geodesic distance calculation, use ST_GeodesicDensify.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Densify

Parameters
  • geometry (pyspark.sql.Column) – Polygon or linestring geometry column.

  • max_segment_length (pyspark.sql.Column/int/float) – Maximum length of all planar segments in the resulting polygon or linestring.

Returns

Geometry column representing the densified linestrings or polygons with planar distance calculation.

Return type

pyspark.sql.Column
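
Planar densification can be sketched as follows: each segment is split into the smallest number of equal pieces that keeps every piece within max_segment_length. Equal spacing is an assumption made for this example; the library may place the added vertices differently. The name densify_line is hypothetical.

```python
import math


def densify_line(coords, max_segment_length):
    """Insert vertices so that no planar segment exceeds max_segment_length."""
    if max_segment_length <= 0:
        raise ValueError("max_segment_length must be greater than zero")
    result = [coords[0]]
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        length = math.hypot(x2 - x1, y2 - y1)
        # Smallest number of equal pieces that satisfies the length limit.
        pieces = max(1, math.ceil(length / max_segment_length))
        for i in range(1, pieces + 1):
            t = i / pieces
            result.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return result


dense = densify_line([(0, 0), (10, 0)], 4)
print(dense)
```

A 10-unit segment with a limit of 4 is split into three pieces of about 3.33 units each, so two vertices are added.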

difference

geoanalytics.sql.functions.difference(geometry1, geometry2)

Returns a geometry column representing the parts of the first geometry that do not intersect the second geometry. The result column geometry type will be that of the first geometry. If the first geometry is completely contained in the second geometry, a null geometry is returned.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Difference

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

Returns

Geometry column representing the part of geometry1 that does not intersect geometry2.

Return type

pyspark.sql.Column

dimension

geoanalytics.sql.functions.dimension(geometry)

Returns an integer column representing the dimensionality of the input geometry. Points and multipoints have a dimension of 0, lines 1, and polygons 2.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Dimension

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

IntegerType column representing the dimensionality.

Return type

pyspark.sql.Column

disjoint

geoanalytics.sql.functions.disjoint(geometry1, geometry2)

Returns a boolean column where the result is True if the first and second geometry are disjoint; otherwise, it returns False. Two geometries are disjoint if they are not overlapping, touching or intersecting each other.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Disjoint

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

Returns

BooleanType column. True if geometry1 and geometry2 are disjoint, False otherwise.

Return type

pyspark.sql.Column

distance

geoanalytics.sql.functions.distance(geometry1, geometry2)

Returns a double column representing the planar distance between the two input geometries. For multipoints, lines, and polygons, the distance is measured between the nearest points of the two geometries. The result will be in the same units as the input geometry data. For example, if your input geometries are in a spatial reference that uses meters, the result values will be in meters. If your input geometries are in a geographic coordinate system, use ST_GeodesicDistance to calculate distance.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Distance

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

Returns

DoubleType column representing the planar distance.

Return type

pyspark.sql.Column

dwithin

geoanalytics.sql.functions.dwithin(geometry1, geometry2, distance, geodesic=False)

Returns a boolean column where the result is True if the two geometries are spatially within the given distance; otherwise, it returns False. For multipoints, lines, and polygons, the distance is calculated from the nearest point between the geometries. You can optionally provide a boolean value that determines if geodesic distances will be used by the function. Planar distances will be used by default.

When using planar calculations, the distance value is specified in the same units as the input geometry. When geodesic is set to True, the distance value is specified in meters.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_DWithin

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

  • distance (pyspark.sql.Column/int/float) – Distance value to use. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value.

  • geodesic (bool, optional) – Geodesic distance will be used between geometries instead of planar distance, defaults to False.

Returns

BooleanType column. True if geometry1 and geometry2 are spatially within a given distance, False otherwise.

Return type

pyspark.sql.Column

end_point

geoanalytics.sql.functions.end_point(linestring)

Returns a point column representing the last point of the input linestring.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_EndPoint

Parameters

linestring (pyspark.sql.Column) – Linestring geometry column.

Returns

Point column representing the ending point.

Return type

pyspark.sql.Column

env_intersects

geoanalytics.sql.functions.env_intersects(geometry, *args, **kwargs)

Returns a boolean column where the result is True if the envelopes of two geometries spatially intersect; otherwise, it returns False.

Note

The behavior of ST_EnvIntersects was changed at version 1.2.0. ST_EnvIntersects will not support defining envelopes with minimum and maximum x,y coordinates in the future. Use ST_BboxIntersects to do this instead.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_EnvIntersects

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

Returns

BooleanType column. True if the envelopes of two geometries spatially intersect in 2D, False otherwise.

Return type

pyspark.sql.Column
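
An envelope test compares bounding rectangles rather than the geometries themselves, so it can return True for geometries that never touch. The sketch below shows this in plain Python. It treats envelopes that only share an edge as intersecting, which is an assumption, and the helper names are hypothetical.

```python
def envelope_of(coords):
    """Axis-aligned envelope (xmin, ymin, xmax, ymax) of (x, y) vertices."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def envelopes_intersect(env_a, env_b):
    """True if two envelopes overlap or touch in 2D."""
    axmin, aymin, axmax, aymax = env_a
    bxmin, bymin, bxmax, bymax = env_b
    # The rectangles intersect only if they overlap on both axes.
    return axmin <= bxmax and bxmin <= axmax and aymin <= bymax and bymin <= aymax


diagonal = [(0, 0), (4, 4)]
short_line = [(3, 0), (4, 0.5)]  # lies below the diagonal and never touches it
far_line = [(5, 5), (6, 6)]

overlapping = envelopes_intersect(envelope_of(diagonal), envelope_of(short_line))
separate = envelopes_intersect(envelope_of(diagonal), envelope_of(far_line))
print(overlapping, separate)
```

The first pair reports an intersection even though the two lines are disjoint, which is why an envelope test is typically used as a cheap pre-filter before an exact spatial predicate.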

envelope

geoanalytics.sql.functions.envelope(geometry)

Returns a polygon column representing an envelope for each geometry in the input column, where an envelope is the smallest rectangle that encompasses the input geometry and aligns to the x-axis and y-axis. To find the smallest rectangle that encompasses a geometry but is not axis-aligned, use ST_MinBoundingBox. If the input geometry is a single point, the function will create a degenerate polygon at the location of the point.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Envelope

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

Polygon column representing the envelope.

Return type

pyspark.sql.Column

equals

geoanalytics.sql.functions.equals(geometry1, geometry2)

Returns a boolean column where the result is True if the first geometry and the second geometry are spatially equal; otherwise, it returns False.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Equals

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

Returns

BooleanType column. True if geometry1 and geometry2 are spatially equal, False otherwise.

Return type

pyspark.sql.Column

exterior_ring

geoanalytics.sql.functions.exterior_ring(polygon)

Returns a linestring column representing the exterior ring of the polygon. Multipart polygons will return null.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_ExteriorRing

Parameters

polygon (pyspark.sql.Column) – Polygon geometry column.

Returns

Linestring column representing the exterior ring.

Return type

pyspark.sql.Column

flip

geoanalytics.sql.functions.flip(geometry, mode)

Flips the input geometry around an axis. There are three options:

  • X_AXIS - flips a geometry vertically around the horizontal axis.

  • Y_AXIS - flips a geometry horizontally around the vertical axis.

  • BOTH_AXES - flips a geometry horizontally around the vertical axis and vertically around the horizontal axis.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Flip

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • mode (str) – Choose from X_AXIS, Y_AXIS, or BOTH_AXES.

Returns

Geometry column with the flipped geometries. The returned geometry column type will be the same as the input geometry column type.

Return type

pyspark.sql.Column

generalize

geoanalytics.sql.functions.generalize(geometry, tolerance)

Returns a geometry column that generalizes the input linestring or polygon geometry using the Douglas-Peucker algorithm with the specified tolerance. The result is the input geometry generalized to include only a subset of the original geometry’s vertices. Point and multipoint geometry types are not supported as input.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Generalize

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • tolerance (pyspark.sql.Column/int/float) – Numeric value that limits the distance the output geometry can differ from the input geometry. Can be a LongType, DoubleType or StringType column or a numeric value.

Returns

Geometry column with the generalized geometries. The returned geometry column type will be the same as the input geometry column type.

Return type

pyspark.sql.Column
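
The Douglas-Peucker algorithm named above can be sketched for a single linestring as follows. This is a textbook recursive version for illustration, not the library's implementation, and the function names are hypothetical.

```python
import math


def _distance_to_chord(p, a, b):
    """Perpendicular distance from p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / math.hypot(dx, dy)


def douglas_peucker(coords, tolerance):
    """Keep only the vertices needed to stay within tolerance of the input."""
    if len(coords) < 3:
        return list(coords)
    # Find the interior vertex farthest from the chord between the endpoints.
    index, max_distance = 0, 0.0
    for i in range(1, len(coords) - 1):
        d = _distance_to_chord(coords[i], coords[0], coords[-1])
        if d > max_distance:
            index, max_distance = i, d
    if max_distance > tolerance:
        # Keep that vertex and simplify both halves recursively.
        left = douglas_peucker(coords[: index + 1], tolerance)
        right = douglas_peucker(coords[index:], tolerance)
        return left[:-1] + right
    return [coords[0], coords[-1]]


line = [(0, 0), (1, 0.1), (2, 0), (3, 3), (4, 0)]
simplified = douglas_peucker(line, 0.5)
print(simplified)
```

With a tolerance of 0.5 the small bump at (1, 0.1) is removed while the spike at (3, 3) is kept; a larger tolerance removes more vertices.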

geodesic_area

geoanalytics.sql.functions.geodesic_area(geometry)

Returns a double column containing the geodesic area of the input geometry in square meters. For point, multipoint, and linestring geometries this function will always return 0. This function is more accurate but less performant than ST_Area and requires that a spatial reference is set on the input geometry column. To learn more about the difference between planar and geodesic calculations see Coordinate systems and transformations.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_GeodesicArea

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

DoubleType column representing the geodesic area.

Return type

pyspark.sql.Column

geodesic_buffer

geoanalytics.sql.functions.geodesic_buffer(geometry, distance)

Returns a polygon column with buffer polygons representing the geodesic area that is less than or equal to the specified distance from each input geometry. The distance can be specified as a single value or a numeric column and should be specified in meters. The result will also be in meters. This function is more accurate but less performant than ST_Buffer and requires that a spatial reference is set on the input geometry column. To learn more about the difference between planar and geodesic calculations see Coordinate systems and transformations.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_GeodesicBuffer

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • distance (pyspark.sql.Column/int/float) – Distance used to create the buffer. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value.

Returns

Polygon column representing the geodesic buffer around the input geometry.

Return type

pyspark.sql.Column

geodesic_closest_point

geoanalytics.sql.functions.geodesic_closest_point(geometry1, geometry2)

Returns a point column representing the point on the first geometry that is closest to the second geometry. This function calculates the geodesic distance between the two geometries to identify the closest point on the first geometry in relation to the second geometry. This function requires that a spatial reference is set on the input geometry columns. If the two geometry columns are in different spatial references, the function automatically transforms the second geometry into the spatial reference of the first. To learn more about the difference between planar and geodesic calculations, see Coordinate systems and transformations.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_GeodesicClosestPoint

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

Returns

Point column representing the geodesic closest point.

Return type

pyspark.sql.Column

geodesic_densify

geoanalytics.sql.functions.geodesic_densify(geometry, max_segment_length)

Returns a geometry column of densified geometries. This function adds vertices along polygons or linestrings to create densified approximations of geodesic segments with each segment being no longer than max_segment_length. The max_segment_length should be specified in meters and greater than zero. This function is more accurate but less performant than ST_Densify and requires that a spatial reference is set on the input geometry column. To learn more about the difference between planar and geodesic calculations see Coordinate systems and transformations.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_GeodesicDensify

Parameters
  • geometry (pyspark.sql.Column) – Polygon or linestring geometry column.

  • max_segment_length (pyspark.sql.Column/int/float) – Maximum length in meters of all geodesic segments in the resulting polygon or linestring.

Returns

Geometry column representing the densified linestrings or polygons with geodesic distance calculation.

Return type

pyspark.sql.Column

geodesic_distance

geoanalytics.sql.functions.geodesic_distance(geometry1, geometry2)

Returns a double column representing the geodesic distance between the two input geometries in meters. For multipoints, lines, and polygons, the distance is calculated from the nearest point between the geometries. This function is more accurate but less performant than ST_Distance and requires that a spatial reference is set on at least the first input geometry column. To learn more about the difference between planar and geodesic calculations see Coordinate systems and transformations. If the two geometry columns are in different spatial references, the function will automatically transform the second geometry into the spatial reference of the first.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_GeodesicDistance

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

Returns

DoubleType column representing the geodesic distance.

Return type

pyspark.sql.Column
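
To give a feel for why geodesic distance differs from planar distance on longitude and latitude values, the sketch below uses the haversine formula on a sphere. This is a spherical approximation chosen for brevity and is an assumption of this example; it is not the library's method, which may use an ellipsoidal model and give slightly different values. The function name and the radius constant are hypothetical choices.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in meters (spherical model)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    # min() guards against rounding pushing the value slightly above 1.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


one_degree_north = haversine_m(0.0, 0.0, 0.0, 1.0)
one_degree_east_at_60 = haversine_m(0.0, 60.0, 1.0, 60.0)
print(one_degree_north, one_degree_east_at_60)
```

A planar calculation on the raw coordinates would report a distance of 1 (degree) in both cases, while on the sphere one degree of longitude at 60° latitude is only about half as long as one degree of latitude.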

geodesic_length

geoanalytics.sql.functions.geodesic_length(geometry)

Returns a double column that represents the geodesic length of the input geometry. The length is calculated in meters. For point and multipoint geometries this function will always return 0. For polygon geometries this function will return the geodesic length of the perimeter of the polygon. This function is more accurate but less performant than ST_Length and requires that a spatial reference is set on the input geometry column. To learn more about the difference between planar and geodesic calculations see Coordinate systems and transformations.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_GeodesicLength

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

DoubleType column representing the geodesic length.

Return type

pyspark.sql.Column

geodesic_shortest_line

geoanalytics.sql.functions.geodesic_shortest_line(geometry1, geometry2)

Returns a linestring column representing the shortest line that touches two geometries, using geodesic distance calculation. If more than one shortest line exists, only one is returned. If the two input geometries intersect, an empty line geometry is returned. This function requires that a spatial reference ID is set on the input geometry columns. If the two geometry columns are in different spatial references, the function automatically transforms the second geometry into the spatial reference of the first. To learn more about the difference between planar and geodesic calculations, see Coordinate systems and transformations.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_GeodesicShortestLine

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

Returns

Geometry column representing the geodesic shortest line.

Return type

pyspark.sql.Column

geom_from_binary

geoanalytics.sql.functions.geom_from_binary(wkb, sr=None)

Returns a geometry column. The input binary column must contain the well-known binary (WKB) representation of geometries. You can optionally specify a spatial reference for the result geometry column. The sr parameter value must be a valid SRID or WKT string. This function should only be used when you don’t know the geometry type represented by the input column or when the input column contains more than one geometry type. In other cases, use the function specific to the geometry type of your input data (i.e. ST_PointFromBinary, ST_LineFromBinary, ST_MPointFromBinary, or ST_PolyFromBinary).

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_GeomFromBinary

Parameters
  • wkb (pyspark.sql.Column) – BinaryType column with the Well-Known Binary (WKB) representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned generic geometry, defaults to None.

Returns

Generic geometry column from the Well-Known Binary (WKB) representation.

Return type

pyspark.sql.Column

geom_from_esri_json

geoanalytics.sql.functions.geom_from_esri_json(json_str, sr=None)

Returns a geometry column. The input string column must contain the Esri JSON representation of geometries. You can optionally specify a spatial reference for the result geometry column. The sr parameter value must be a valid SRID or WKT string. Any spatial reference defined in the input strings will not be used. This function should only be used when you don’t know the geometry type represented by the input column or when the input column contains more than one geometry type. In other cases, use the function specific to the geometry type of your input data (i.e. ST_PointFromEsriJSON, ST_LineFromEsriJSON, ST_MPointFromEsriJSON, or ST_PolyFromEsriJSON).

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_GeomFromEsriJSON

Parameters
  • json_str (pyspark.sql.Column) – StringType column with the Esri JSON representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned generic geometry, defaults to None.

Returns

Generic geometry column from the Esri JSON representation.

Return type

pyspark.sql.Column

geom_from_geojson

geoanalytics.sql.functions.geom_from_geojson(json_str, sr=None)

Returns a geometry column. The input string column must contain the GeoJSON representation of geometries. You can optionally specify a spatial reference for the result geometry column. The sr parameter value must be a valid SRID or WKT string. Any spatial reference defined in the input strings will not be used. This function should only be used when you don’t know the geometry type represented by the input column or when the input column contains more than one geometry type. In other cases, use the function specific to the geometry type of your input data (i.e. ST_PointFromGeoJSON, ST_LineFromGeoJSON, ST_MPointFromGeoJSON, or ST_PolyFromGeoJSON).

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_GeomFromGeoJSON

Parameters
  • json_str (pyspark.sql.Column) – StringType column with the GeoJSON representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned generic geometry, defaults to None.

Returns

Generic geometry column from the GeoJSON representation.

Return type

pyspark.sql.Column

geom_from_shape

geoanalytics.sql.functions.geom_from_shape(shp, sr=None)

Returns a geometry column. The input binary column must contain the shapefile representation of geometries. You can optionally specify a spatial reference for the result geometry column. The sr parameter value must be a valid SRID or WKT string. This function should only be used when you don’t know the geometry type represented by the input column or when the input column contains more than one geometry type. In other cases, use the function specific to the geometry type of your input data (i.e. ST_PointFromShape, ST_LineFromShape, ST_MPointFromShape, or ST_PolyFromShape).

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_GeomFromShape

Parameters
  • shp (pyspark.sql.Column) – BinaryType column with the shapefile representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned generic geometry, defaults to None.

Returns

Generic geometry column from the shapefile representation.

Return type

pyspark.sql.Column

geom_from_text

geoanalytics.sql.functions.geom_from_text(wkt, sr=None)

Returns a geometry column. The string column must contain the well-known text (WKT) representation of geometries. You can optionally specify a spatial reference for the result geometry column. The sr parameter value must be a valid SRID or WKT string. This function should only be used when you don’t know the geometry type represented by the input column or when the input column contains more than one geometry type. In other cases, use the function specific to the geometry type of your input data (i.e. ST_PointFromText, ST_LineFromText, ST_MPointFromText, or ST_PolyFromText).

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_GeomFromText

Parameters
  • wkt (pyspark.sql.Column) – StringType column with the well-known text (WKT) representation of geometries.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned generic geometry, defaults to None.

Returns

Generic geometry column from the well-known text (WKT) representation.

Return type

pyspark.sql.Column
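
The advice above, to use the generic function only for unknown or mixed geometry types, can be applied by checking a sample of the WKT strings first. The following is a minimal sketch in plain Python and is not part of the library: wkt_geometry_types, suggest_constructor, and the SPECIFIC mapping are hypothetical helpers. The mapping covers only the four type-specific functions named above and assumes every other keyword should fall back to ST_GeomFromText.

```python
# Hypothetical mapping from a WKT keyword to the type-specific function named in the docs.
SPECIFIC = {
    "POINT": "ST_PointFromText",
    "LINESTRING": "ST_LineFromText",
    "MULTIPOINT": "ST_MPointFromText",
    "POLYGON": "ST_PolyFromText",
}


def wkt_geometry_types(wkt_values):
    """Return the set of upper-cased leading keywords found in the WKT strings."""
    types = set()
    for wkt in wkt_values:
        if wkt is None or not wkt.strip():
            continue  # skip nulls and blank strings
        keyword = ""
        for ch in wkt.strip():
            if not ch.isalpha():
                break  # the keyword ends at the first space or parenthesis
            keyword += ch
        types.add(keyword.upper())
    return types


def suggest_constructor(wkt_values):
    """Suggest a type-specific function for a single known type, else the generic one."""
    types = wkt_geometry_types(wkt_values)
    if len(types) == 1:
        return SPECIFIC.get(next(iter(types)), "ST_GeomFromText")
    return "ST_GeomFromText"
```

For example, suggest_constructor(["POINT (1 2)", "point (3 4)"]) yields the point-specific function, while a sample that mixes points and polygons yields ST_GeomFromText.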

geometries

geoanalytics.sql.functions.geometries(geometry)

Returns an array column. Multipoint geometries return an array of points, multipart linestrings return an array of single-path linestrings, and multipart polygons return an array of single-ring polygons.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Geometries

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

Array column representing an array of single part geometries.

Return type

pyspark.sql.Column

geometry_n

geoanalytics.sql.functions.geometry_n(geometry, n)

Returns a geometry column. The output column contains the nth single-part geometry from a multipart geometry. When n=0, the first single-part geometry is returned. If the nth geometry doesn’t exist, null is returned.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_GeometryN

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • n (pyspark.sql.Column/int) – Index of the geometry to return. Can be an IntegerType column or an integer value.

Returns

Geometry column representing the nth geometry. The returned geometry column type is the same as the input geometry type, except for multipoint input, which returns a point column.

Return type

pyspark.sql.Column
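
The zero-based indexing and null-on-miss behavior described above can be modeled in plain Python. This is only an illustration of the documented semantics, not the library's code: nth_part is a hypothetical helper, a multipart geometry is represented as a list of parts, and returning None for a negative index is an assumption that the documentation does not confirm.

```python
def nth_part(parts, n):
    """Return the nth single-part geometry (zero-based), or None if it does not exist."""
    # Assumption: negative indexes are treated as "does not exist".
    if n < 0 or n >= len(parts):
        return None
    return parts[n]


# A multipoint modeled as a list of (x, y) tuples; each part is a single point.
multipoint = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
first = nth_part(multipoint, 0)
missing = nth_part(multipoint, 3)
```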

geometry_type

geoanalytics.sql.functions.geometry_type(geometry)

Returns a string column. The string indicates the type of each input geometry (i.e. ‘Point’, ‘MultiPoint’, ‘Linestring’, or ‘Polygon’).

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_GeometryType

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

StringType column representing the geometry type.

Return type

pyspark.sql.Column

h3_bin

geoanalytics.sql.functions.h3_bin(geometry, h3_resolution)

Returns a bin column containing a single H3 bin at the specified resolution for each record in the input column. The centroid of the input geometry is guaranteed to intersect with the bin returned but is not necessarily coincident with the bin center. Use ST_BinGeometry to obtain the geometry of each result bin.

This function can also be called with a long column representing the ID of the bin (see ST_BinId). The bin ID will be cast to a bin column.

ST_H3Bin requires the spatial reference of the geometry column to be GCS_WGS_1984 (EPSG:4326). If the input geometry is in a different spatial reference, this function automatically transforms the geometry into GCS_WGS_1984. To learn more about spatial references, see Coordinate systems and transformations.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_H3Bin

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • h3_resolution (int) – H3 cell resolution, see H3 documentation for more information.

Returns

Spatial bin (bin2d) column representing a single H3 bin for each geometry.

Return type

pyspark.sql.Column

h3_bins

geoanalytics.sql.functions.h3_bins(geometry, h3_resolution, padding=0.0)

Returns an array column containing H3 bins at the specified resolution that cover the spatial extent of each record in the input column. You can optionally specify a numeric value for padding, which conceptually applies a buffer of the specified distance to the input geometry before creating the H3 bins. Use ST_BinGeometry to obtain the geometry of each result bin.

ST_H3Bins requires the spatial reference to be set to GCS_WGS_1984 (EPSG:4326). If the input geometry is in a different spatial reference, this function automatically transforms the geometry into GCS_WGS_1984. To learn more about spatial references, see Coordinate systems and transformations.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_H3Bins

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • h3_resolution (int) – H3 cell resolution, see H3 documentation for more information.

  • padding (float, optional) – Numerical buffer value applied to the geometry before finding the intersecting bins, defaults to 0.0.

Returns

Array column representing an array of spatial bin (bin2d) H3 bins.

Return type

pyspark.sql.Column

hex_bin

geoanalytics.sql.functions.hex_bin(geometry, bin_size)

Returns a bin column containing a single hexagonal bin for each record in the input column. The specified bin size determines the height of each bin and is in the same units as the input geometry. The centroid of the input geometry is guaranteed to intersect with the bin returned but is not necessarily coincident with the bin center. Use ST_BinGeometry to obtain the geometry of each result bin.

This function can also be called with a long column representing the ID of the bin (see ST_BinId). The bin ID will be cast to a bin column.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_HexBin

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • bin_size (int/float) – Numerical value representing the height of the hexagonal bin.

Returns

Spatial bin (bin2d) column representing a single hexagonal bin for each geometry.

Return type

pyspark.sql.Column

hex_bins

geoanalytics.sql.functions.hex_bins(geometry, bin_size, padding=0.0)

Returns an array column containing hexagonal bins that cover the spatial extent of each record in the input column. The specified bin size determines the height of each bin and is in the same units as the input geometry. You can optionally specify a numeric value for padding, which conceptually applies a buffer of the specified distance to the input geometry before creating the hexagonal bins.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_HexBins

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • bin_size (int/float) – Numerical value representing the height of the hexagonal bin.

  • padding (int/float/str, optional) – Numerical buffer value applied to the geometry before finding the intersecting bins, defaults to 0.0.

Returns

Array column representing an array of spatial bin (bin2d) hexagonal bins.

Return type

pyspark.sql.Column

interior_ring_n

geoanalytics.sql.functions.interior_ring_n(polygon, n)

Returns a linestring column. The output is the nth interior ring of the input polygon as a linestring. If there is more than one interior ring, the order of the interior rings is defined by the order in the input polygon. When n=0, the first interior ring is returned. If the index exceeds the number of interior rings in the polygon, null is returned. If the input is a multipolygon, null is returned.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_InteriorRingN

Parameters
  • polygon (pyspark.sql.Column) – Polygon geometry column.

  • n (pyspark.sql.Column/int) – Index of the interior ring to return. Can be an IntegerType column or an integer value.

Returns

Linestring column representing the nth interior ring.

Return type

pyspark.sql.Column
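
The three null cases above can be summarized in a small model. The sketch below is illustrative only: the Polygon dataclass and interior_ring_n_sketch are hypothetical, a multipolygon is assumed to be a polygon with more than one exterior ring, and negative indexes are assumed to return None.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Ring = List[Tuple[float, float]]


@dataclass
class Polygon:
    """Hypothetical polygon model: exterior rings plus interior rings (holes)."""
    exteriors: List[Ring]
    interiors: List[Ring] = field(default_factory=list)


def interior_ring_n_sketch(polygon: Polygon, n: int) -> Optional[Ring]:
    """Return the nth interior ring (zero-based), or None when it is not available."""
    if len(polygon.exteriors) > 1:
        return None  # multipolygon input
    if n < 0 or n >= len(polygon.interiors):
        return None  # index outside the available interior rings
    return polygon.interiors[n]


outer = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0), (0.0, 0.0)]
hole = [(2.0, 2.0), (2.0, 4.0), (4.0, 4.0), (4.0, 2.0), (2.0, 2.0)]
donut = Polygon(exteriors=[outer], interiors=[hole])
multi = Polygon(exteriors=[outer, hole])
```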

intersection

geoanalytics.sql.functions.intersection(geometry1, geometry2, intersect_type=None)

Returns a geometry column containing the intersection of two input geometry records. You can optionally specify a string value that determines the geometry type of the result. The string can be one of: ‘multipoint’, ‘linestring’ or ‘polygon’. If no intersection type is specified, the function will return the same geometry type as the input geometry with the lowest dimension. For example, if you calculate the intersection of a polygon and a linestring the function will return a linestring.

The function will return an empty geometry if the two input geometries do not intersect or there is no intersection that matches the specified intersect type. If the intersection is a single point, the geometry type of the result column will be a multipoint.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Intersection

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

  • intersect_type (str, optional) – Sets the output geometry type, defaults to None.

Returns

Geometry column.

Return type

pyspark.sql.Column
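
The rule for the default result type can be written down as a lookup on geometry dimension. The sketch below only models the rule as described above and does not compute any geometry: result_type, DIMENSION, and TYPE_FOR_DIMENSION are hypothetical names, and mapping every zero-dimensional result to 'multipoint' is an assumption based on the single-point note above.

```python
# Topological dimension of each geometry type.
DIMENSION = {"point": 0, "multipoint": 0, "linestring": 1, "polygon": 2}

# Assumption: zero-dimensional results are reported as multipoint.
TYPE_FOR_DIMENSION = {0: "multipoint", 1: "linestring", 2: "polygon"}

ALLOWED_INTERSECT_TYPES = ("multipoint", "linestring", "polygon")


def result_type(type1, type2, intersect_type=None):
    """Return the geometry type expected for the intersection result."""
    if intersect_type is not None:
        if intersect_type not in ALLOWED_INTERSECT_TYPES:
            raise ValueError(f"unsupported intersect_type: {intersect_type!r}")
        return intersect_type
    # Default: the type of the input with the lowest dimension.
    lowest = min(DIMENSION[type1], DIMENSION[type2])
    return TYPE_FOR_DIMENSION[lowest]
```

For a polygon and a linestring, the default result type is 'linestring'. Passing intersect_type='multipoint' instead requests only the zero-dimensional part of the intersection.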

intersects

geoanalytics.sql.functions.intersects(geometry1, geometry2)

Returns a boolean column where the result is True if the first geometry and the second geometry spatially intersect; otherwise, it returns False.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Intersects

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

Returns

BooleanType column. True if geometry1 and geometry2 spatially intersect in 2D, False otherwise.

Return type

pyspark.sql.Column

is_3d

geoanalytics.sql.functions.is_3d(geometry)

Returns a boolean column where the result is True if the geometry is three-dimensional; otherwise, it returns False. The geometry is considered three-dimensional if it has x,y,z coordinates.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Is3D

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

BooleanType column. True if the geometry is three-dimensional, False otherwise.

Return type

pyspark.sql.Column

is_closed

geoanalytics.sql.functions.is_closed(linestring)

Returns a boolean column where the result is True if the start and end point of a given linestring are coincident; otherwise, it returns False.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_IsClosed

Parameters

linestring (pyspark.sql.Column) – Linestring geometry column.

Returns

BooleanType column. True if the linestring is closed, False otherwise.

Return type

pyspark.sql.Column
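
As an illustration of the closed test, the sketch below compares the first and last vertices of a single-path linestring held as a list of coordinate tuples. It is not the library's code: is_closed_sketch is a hypothetical helper, it uses exact coordinate equality, and it assumes that a path with fewer than two vertices is not closed.

```python
def is_closed_sketch(coords):
    """True if a single-path linestring starts and ends at the same coordinate."""
    # Assumption: fewer than two vertices is treated as not closed.
    return len(coords) >= 2 and coords[0] == coords[-1]


open_path = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
closed_path = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 0.0)]
```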

is_empty

geoanalytics.sql.functions.is_empty(geometry)

Returns a boolean column where the result is True if the geometry is empty; otherwise, it returns False.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_IsEmpty

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

BooleanType column. True if the geometry is empty, False otherwise.

Return type

pyspark.sql.Column

is_measured

geoanalytics.sql.functions.is_measured(geometry)

Returns a boolean column where the result is True if the geometry has an m-value; otherwise, it returns False.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_IsMeasured

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

BooleanType column. True if the geometry has m-values, False otherwise.

Return type

pyspark.sql.Column

is_ring

geoanalytics.sql.functions.is_ring(geometry)

Returns a boolean column where the result is True if the input linestring is closed and simple; otherwise, it returns False.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_IsRing

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

BooleanType column. True if the linestring is both closed and simple, False otherwise.

Return type

pyspark.sql.Column

is_simple

geoanalytics.sql.functions.is_simple(geometry)

Returns a boolean column where the result is True if the given geometry is simple; otherwise, it returns False.

The criteria for simplicity vary for each geometry type and are as follows:

  • A point is always simple.

  • A multipoint is considered simple if no two points are coincident.

  • A linestring is considered simple if it does not cross the same point twice, except for start and end points. A multipart linestring is only considered simple if the parts do not intersect except at the start or end points of the parts.

  • A polygon or multipart polygon is considered simple if each ring does not cross the same point twice and no two rings cross or intersect except at a single point (i.e. they are tangent at a point but not a line).

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_IsSimple

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

BooleanType column. True if the geometry is simple, False otherwise.

Return type

pyspark.sql.Column
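
The point and multipoint criteria are simple enough to sketch directly. The example below covers only those two cases and leaves the linestring and polygon rules, which need segment intersection tests, to the library. The helper multipoint_is_simple is hypothetical and compares coordinates exactly, whereas the library may apply a tolerance.

```python
def multipoint_is_simple(points):
    """True if no two points in the multipoint are coincident."""
    # A set drops duplicates, so equal sizes mean every point is unique.
    return len(set(points)) == len(points)


distinct = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
repeated = [(0.0, 0.0), (1.0, 1.0), (0.0, 0.0)]
single_point = [(5.0, 5.0)]  # a lone point is always simple
```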

length

geoanalytics.sql.functions.length(geometry)

Returns a double column representing the planar length of the input geometry. The length is calculated in the same units as the input geometry. For point and multipoint geometries the function will always return 0. For polygon geometries this function will return the length of the perimeter of the polygon.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Length

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

DoubleType column representing the planar length.

Return type

pyspark.sql.Column
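
The planar length described above is the sum of straight-line segment lengths in coordinate units. The following sketch shows that calculation in plain Python. The helpers path_length and planar_length are hypothetical, geometries are modeled as lists of coordinate paths, and summing every ring passed in for a polygon is an assumption, since the text above does not say how interior rings are counted.

```python
import math


def path_length(coords):
    """Sum of Euclidean segment lengths along one path or ring."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def planar_length(geom_type, parts):
    """Planar length in coordinate units for a geometry given as a list of paths."""
    if geom_type in ("point", "multipoint"):
        return 0.0  # points have no length
    # Linestring: parts are paths. Polygon: parts are closed rings (the perimeter).
    return float(sum(path_length(part) for part in parts))


line = [[(0.0, 0.0), (3.0, 4.0)]]
square = [[(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0), (0.0, 0.0)]]
```

With these inputs, the line has a length of 5 units and the square has a perimeter of 8 units, both expressed in the units of the coordinates rather than in meters.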

line_from_binary

geoanalytics.sql.functions.line_from_binary(wkb, sr=None)

Returns a linestring column. The input binary column must contain the well-known binary (WKB) representation of linestring geometries. You can optionally specify a spatial reference for the result linestring column. The sr parameter value must be a valid SRID or WKT string. If a linestring cannot be created from the input binary the function will return null.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_LineFromBinary

Parameters
  • wkb (pyspark.sql.Column) – BinaryType column with the Well-Known Binary (WKB) representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned linestring geometry, defaults to None.

Returns

Linestring column from the Well-Known Binary (WKB) representation.

Return type

pyspark.sql.Column

line_from_esri_json

geoanalytics.sql.functions.line_from_esri_json(json_str, sr=None)

Returns a linestring column. The input string column must contain the Esri JSON representation of linestring geometries. You can optionally specify a spatial reference for the result linestring column. The sr parameter value must be a valid SRID or WKT string. Any SRID defined in the input strings will not be used. If a linestring cannot be created from the input string the function will return null.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_LineFromEsriJSON

Parameters
  • json_str (pyspark.sql.Column) – StringType column with the Esri JSON representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned linestring geometry, defaults to None.

Returns

Linestring column from the Esri JSON representation.

Return type

pyspark.sql.Column

line_from_geojson

geoanalytics.sql.functions.line_from_geojson(json_str, sr=None)

Returns a linestring column. The input string column must contain the GeoJSON representation of linestring geometries. You can optionally specify a spatial reference for the result linestring column. The sr parameter value must be a valid SRID or WKT string. If a linestring cannot be created from the input string the function will return null.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_LineFromGeoJSON

Parameters
  • json_str (pyspark.sql.Column) – StringType column with the GeoJSON representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned linestring geometry, defaults to None.

Returns

Linestring column from the GeoJSON representation.

Return type

pyspark.sql.Column

line_from_shape

geoanalytics.sql.functions.line_from_shape(shp, sr=None)

Returns a linestring column. The input binary column must contain the shapefile representation of linestring geometries. You can optionally specify a spatial reference for the result linestring column. The sr parameter value must be a valid SRID or WKT string. If a linestring cannot be created from the input binary the function will return null.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_LineFromShape

Parameters
  • shp (pyspark.sql.Column) – BinaryType column with the shapefile representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned linestring geometry, defaults to None.

Returns

Linestring column from the shapefile representation.

Return type

pyspark.sql.Column

line_from_text

geoanalytics.sql.functions.line_from_text(wkt, sr=None)

Returns a linestring column. The string column must contain the well-known text (WKT) representation of linestring geometries. You can optionally specify a spatial reference for the result linestring column. The sr parameter value must be a valid SRID or WKT string.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_LineFromText

Parameters
  • wkt (pyspark.sql.Column) – StringType column with the well-known text (WKT) representation of linestring geometries.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned linestring geometry, defaults to None.

Returns

Linestring column from the well-known text (WKT) representation.

Return type

pyspark.sql.Column

linestring

geoanalytics.sql.functions.linestring(points)

Returns a linestring column. The input arrays must be arrays of point geometries. The function creates a linestring geometry by connecting the point geometries in the same order that they are stored in the input array.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Linestring

Parameters

points (pyspark.sql.Column) – Array of point geometries.

Returns

Linestring column representing the array of points.

Return type

pyspark.sql.Column

m

geoanalytics.sql.functions.m(point, new_value=None)

Can work as a getter or a setter, depending on the inputs.

Getter: Takes a point column and returns a double column containing the m-values of the input points. If a point does not have an m-value the function returns NaN.

Setter: Takes a point column and a numeric value and returns a point column containing the input points with the m-values set to the numeric value.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_M

Parameters
  • point (pyspark.sql.Column) – Point geometry column.

  • new_value (pyspark.sql.Column/int/float, optional) – M-value to set. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value, defaults to None.

Returns

  • Getter: DoubleType column representing the m-value.

  • Setter: Point column representing the updated m-value.

Return type

pyspark.sql.Column
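
The getter and setter behavior hinges on whether new_value is supplied. The sketch below models that dispatch on a single point in plain Python rather than on a Spark column. The Point type and m_sketch function are hypothetical, and converting the new value with float() is an assumption meant to mirror the numeric and numeric-string inputs listed above.

```python
import math
from typing import NamedTuple, Optional


class Point(NamedTuple):
    """Hypothetical point with an optional m-value."""
    x: float
    y: float
    m: Optional[float] = None


def m_sketch(point, new_value=None):
    """Getter when new_value is None, setter otherwise."""
    if new_value is None:
        # Getter: NaN signals that the point has no m-value.
        return math.nan if point.m is None else float(point.m)
    # Setter: return a copy of the point with the m-value replaced.
    return point._replace(m=float(new_value))


p = Point(1.0, 2.0)
p_with_m = m_sketch(p, 5)
```

Because None selects the getter, a value of 0 is still treated as a setter call and stores an m-value of 0.0.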

make_point

geoanalytics.sql.functions.make_point(x, y, z=None, m=None)

Returns a point column. The two required inputs must contain the x- and y-coordinates of the points, respectively. You can optionally specify two additional inputs with z-coordinates and m-values. The spatial reference of the result column will always be 0 and should be set to a valid ID using ST_SRID.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_MakePoint

Parameters
  • x (pyspark.sql.Column/int/float) – X-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (x-coordinates) or a numeric value.

  • y (pyspark.sql.Column/int/float) – Y-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (y-coordinates) or a numeric value.

  • z (pyspark.sql.Column/int/float, optional) – Z-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value, defaults to None.

  • m (pyspark.sql.Column/int/float, optional) – M-value for the point. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value, defaults to None.

Returns

Point column representing the point geometries.

Return type

pyspark.sql.Column

max_m

geoanalytics.sql.functions.max_m(geometry)

Returns a double column containing the maximum m-value of each input geometry. If the input geometry does not have m-values the function will return NaN.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_MaxM

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

DoubleType column representing the maximum m-value for the envelope.

Return type

pyspark.sql.Column

max_x

geoanalytics.sql.functions.max_x(geometry)

Returns a double column containing the maximum x-coordinate of each input geometry.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_MaxX

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

DoubleType column representing the maximum x-coordinate for the envelope.

Return type

pyspark.sql.Column

max_y

geoanalytics.sql.functions.max_y(geometry)

Returns a double column containing the maximum y-coordinate of each input geometry.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_MaxY

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

DoubleType column representing the maximum y-coordinate for the envelope.

Return type

pyspark.sql.Column

max_z

geoanalytics.sql.functions.max_z(geometry)

Returns a double column containing the maximum z-coordinate of each input geometry. If the input geometry does not have z-coordinates the function will return NaN.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_MaxZ

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

DoubleType column representing the maximum z-coordinate for the envelope.

Return type

pyspark.sql.Column

min_bounding_box

geoanalytics.sql.functions.min_bounding_box(geometry, by_area=True)

Returns a polygon column containing a polygon for each geometry in the input column. The polygon is the smallest rectangle of arbitrary alignment that encompasses the input geometry. You can optionally provide a boolean value that determines how the rectangle will be created. There are two options:

  • True: Creates a rectangle with the minimum possible area. This is the default.

  • False: Creates a rectangle with the minimum possible width.

For point geometries, this function will return a degenerate polygon at the location of the point.

To find the minimum bounding box aligned to the x- and y-axes, use ST_Envelope.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_MinBoundingBox

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • by_area (pyspark.sql.Column/bool, optional) – BooleanType column or a boolean value. True minimizes the bounding box area, False minimizes the width. Defaults to True.

Returns

Polygon column representing the bounding box.

Return type

pyspark.sql.Column

min_m

geoanalytics.sql.functions.min_m(geometry)

Returns a double column containing the minimum m-value of each input geometry. If the input geometry does not have m-values the function will return NaN.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_MinM

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

DoubleType column representing the minimum m-value for the envelope.

Return type

pyspark.sql.Column

min_x

geoanalytics.sql.functions.min_x(geometry)

Returns a double column containing the minimum x-coordinate of each input geometry.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_MinX

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

DoubleType column representing the minimum x-coordinate for the envelope.

Return type

pyspark.sql.Column

min_y

geoanalytics.sql.functions.min_y(geometry)

Returns a double column containing the minimum y-coordinate of each input geometry.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_MinY

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

DoubleType column representing the minimum y-coordinate for the envelope.

Return type

pyspark.sql.Column

min_z

geoanalytics.sql.functions.min_z(geometry)

Returns a double column containing the minimum z-coordinate of each input geometry. If the input geometry does not have z-coordinates the function will return NaN.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_MinZ

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

DoubleType column representing the minimum z-coordinate for the envelope.

Return type

pyspark.sql.Column
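
The min_* and max_* functions above each reduce the vertices of a geometry to one extreme value and return NaN when the requested dimension is absent. The sketch below models that behavior for a geometry held as a list of coordinate tuples. It is illustrative only: extent is a hypothetical helper, and it returns both extremes at once rather than one value per function.

```python
import math


def extent(coords, index):
    """Return (min, max) of the coordinate at position index, or (NaN, NaN) if absent."""
    values = [c[index] for c in coords if len(c) > index and c[index] is not None]
    if not values:
        return (math.nan, math.nan)  # the geometry has no such dimension
    return (min(values), max(values))


flat = [(0.0, 5.0), (2.0, 1.0)]                # x, y only
raised = [(0.0, 0.0, 10.0), (1.0, 1.0, -3.0)]  # x, y, z

x_range = extent(flat, 0)
y_range = extent(flat, 1)
z_range_flat = extent(flat, 2)
z_range_raised = extent(raised, 2)
```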

mpoint_from_binary

geoanalytics.sql.functions.mpoint_from_binary(wkb, sr=None)

Returns a multipoint column. The input binary column must contain the well-known binary (WKB) representation of multipoint geometries. You can optionally specify a spatial reference for the result multipoint column. The sr parameter value must be a valid SRID or WKT string. If a multipoint cannot be created from the input binary, the function will return null.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_MPointFromBinary

Parameters
  • wkb (pyspark.sql.Column) – BinaryType column with the Well-Known Binary (WKB) representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned multipoint geometry, defaults to None.

Returns

Multipoint column from the Well-Known Binary (WKB) representation.

Return type

pyspark.sql.Column

mpoint_from_esri_json

geoanalytics.sql.functions.mpoint_from_esri_json(json_str, sr=None)

Returns a multipoint column. The input string column must contain the Esri JSON representation of multipoint geometries. You can optionally specify a spatial reference for the result multipoint column. The sr parameter value must be a valid SRID or WKT string. Any SRID defined in the input strings will not be used. If a multipoint cannot be created from the input string the function will return null.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_MPointFromEsriJSON

Parameters
  • json_str (pyspark.sql.Column) – StringType column with the Esri JSON representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned multipoint geometry, defaults to None.

Returns

Multipoint column from the Esri JSON representation.

Return type

pyspark.sql.Column

mpoint_from_geojson

geoanalytics.sql.functions.mpoint_from_geojson(json_str, sr=None)

Returns a multipoint column. The input string column must contain the GeoJSON representation of multipoint geometries. You can optionally specify a spatial reference for the result multipoint column. The sr parameter value must be a valid SRID or WKT string. If a multipoint cannot be created from the input string the function will return null.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_MPointFromGeoJSON

Parameters
  • json_str (pyspark.sql.Column) – StringType column with the GeoJSON representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned multipoint geometry, defaults to None.

Returns

Multipoint column from the GeoJSON representation.

Return type

pyspark.sql.Column

mpoint_from_shape

geoanalytics.sql.functions.mpoint_from_shape(shp, sr=None)

Returns a multipoint column. The input binary column must contain the shapefile representation of multipoint geometries. You can optionally specify a spatial reference for the result multipoint column. The sr parameter value must be a valid SRID or WKT string. If a multipoint cannot be created from the input binary the function will return null.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_MPointFromShape

Parameters
  • shp (pyspark.sql.Column) – BinaryType column with the shapefile representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned multipoint geometry, defaults to None.

Returns

Multipoint column from the shapefile representation.

Return type

pyspark.sql.Column

mpoint_from_text

geoanalytics.sql.functions.mpoint_from_text(wkt, sr=None)

Returns a multipoint column. The string column must contain the well-known text (WKT) representation of multipoint geometries. You can optionally specify a spatial reference for the result multipoint column. The sr parameter value must be a valid SRID or WKT string.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_MPointFromText

Parameters
  • wkt (pyspark.sql.Column) – StringType column with the well-known text (WKT) representation of multipoint geometries.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned multipoint geometry, defaults to None.

Returns

Multipoint column from the well-known text (WKT) representation.

Return type

pyspark.sql.Column

multilinestring

geoanalytics.sql.functions.multilinestring(*arrayOfPoints)

Returns a linestring column. The input array column must contain an array of arrays of point geometries.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_MultiLinestring

Parameters

arrayOfPoints (pyspark.sql.Column) – Array of point geometry arrays.

Returns

Linestring column representing the array of points.

Return type

pyspark.sql.Column

multipoint

geoanalytics.sql.functions.multipoint(points)

Returns a multipoint column. The input array column must contain an array of point geometries.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_MultiPoint

Parameters

points (pyspark.sql.Column) – Array of point geometries.

Returns

Multipoint column representing the array of points.

Return type

pyspark.sql.Column

multipolygon

geoanalytics.sql.functions.multipolygon(*arrayOfPoints)

Returns a polygon column. The input column must contain an array of arrays of point geometries. The output polygon column represents the one or more rings created from the point arrays.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_MultiPolygon

Parameters

arrayOfPoints (pyspark.sql.Column) – Array of point geometry arrays.

Returns

Polygon column representing the array of points.

Return type

pyspark.sql.Column

num_geometries

geoanalytics.sql.functions.num_geometries(geometry)

Returns an integer column representing the number of geometries in each record.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_NumGeometries

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

IntegerType column representing the number of geometries.

Return type

pyspark.sql.Column

num_interior_ring

geoanalytics.sql.functions.num_interior_ring(polygon)

OGC alias for ST_NumInteriorRings.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_NumInteriorRings

Parameters

polygon (pyspark.sql.Column) – Polygon geometry column.

Returns

IntegerType column representing the number of interior rings.

Return type

pyspark.sql.Column

num_interior_rings

geoanalytics.sql.functions.num_interior_rings(polygon)

Returns an integer column representing the number of interior rings in the input polygon. The function will return null when the input is a multipart polygon.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_NumInteriorRings

Parameters

polygon (pyspark.sql.Column) – Polygon geometry column.

Returns

IntegerType column representing the number of interior rings.

Return type

pyspark.sql.Column

num_points

geoanalytics.sql.functions.num_points(geometry)

Returns an integer column representing the number of points in the input geometry.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_NumPoints

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

IntegerType column representing the number of points.

Return type

pyspark.sql.Column

overlaps

geoanalytics.sql.functions.overlaps(geometry1, geometry2)

Returns a boolean column where the result is True if the first geometry and the second geometry spatially overlap; otherwise, it returns False. Two geometries overlap when their intersection is the same geometry type as either of the inputs but not equal to either of the inputs.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Overlaps

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

Returns

BooleanType column. True if geometry1 and geometry2 spatially overlap, False otherwise.

Return type

pyspark.sql.Column

point

geoanalytics.sql.functions.point(x, y, sr=None)

Returns a point column. The two numeric columns or values must contain the x,y coordinates of the point geometries. You can optionally specify a spatial reference for the result point column. The sr parameter value must be a valid SRID or WKT string.

To create point geometries with a z-coordinate and/or m-value, use ST_PointZ, ST_PointZM, ST_PointM, or ST_MakePoint.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Point

Parameters
  • x (pyspark.sql.Column/int/float) – X-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (x-coordinates) or a numeric value.

  • y (pyspark.sql.Column/int/float) – Y-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (y-coordinates) or a numeric value.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned point geometry, defaults to None.

Returns

Point column representing the point geometries.

Return type

pyspark.sql.Column

point_from_binary

geoanalytics.sql.functions.point_from_binary(wkb, sr=None)

Returns a point column. The input binary column must contain the well-known binary (WKB) representation of point geometries. You can optionally specify a spatial reference for the result point column. The sr parameter value must be a valid SRID or WKT string. If a point cannot be created from the input binary the function will return null.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_PointFromBinary

Parameters
  • wkb (pyspark.sql.Column) – BinaryType column with the well-known binary (WKB) representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned point geometry, defaults to None.

Returns

Point column from the well-known binary (WKB) representation.

Return type

pyspark.sql.Column

point_from_esri_json

geoanalytics.sql.functions.point_from_esri_json(json_str, sr=None)

Returns a point column. The input string column must contain the Esri JSON representation of point geometries. You can optionally specify a spatial reference for the result point column. The sr parameter value must be a valid SRID or WKT string. Any SRID defined in the input strings will not be used. If a point cannot be created from the input string the function will return null.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_PointFromEsriJSON

Parameters
  • json_str (pyspark.sql.Column) – StringType column with the Esri JSON representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned point geometry, defaults to None.

Returns

Point column from the Esri JSON representation.

Return type

pyspark.sql.Column

point_from_geojson

geoanalytics.sql.functions.point_from_geojson(json_str, sr=None)

Returns a point column. The input string column must contain the GeoJSON representation of point geometries. You can optionally specify a spatial reference for the result point column. The sr parameter value must be a valid SRID or WKT string. If a point cannot be created from the input string the function will return null.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_PointFromGeoJSON

Parameters
  • json_str (pyspark.sql.Column) – StringType column with the GeoJSON representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned point geometry, defaults to None.

Returns

Point column from the GeoJSON representation.

Return type

pyspark.sql.Column

point_from_shape

geoanalytics.sql.functions.point_from_shape(shp, sr=None)

Returns a point column. The input binary column must contain the shapefile representation of point geometries. You can optionally specify a spatial reference for the result point column. The sr parameter value must be a valid SRID or WKT string. If a point cannot be created from the input binary the function will return null.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_PointFromShape

Parameters
  • shp (pyspark.sql.Column) – BinaryType column with the shapefile representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned point geometry, defaults to None.

Returns

Point column from the shapefile representation.

Return type

pyspark.sql.Column

point_from_text

geoanalytics.sql.functions.point_from_text(wkt, sr=None)

Returns a point column. The string column must contain the well-known text (WKT) representation of point geometries. You can optionally specify a spatial reference for the result point column. The sr parameter value must be a valid SRID or WKT string.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_PointFromText

Parameters
  • wkt (pyspark.sql.Column) – StringType column with the well-known text (WKT) representation of point geometries.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned point geometry, defaults to None.

Returns

Point column from the well-known text (WKT) representation.

Return type

pyspark.sql.Column

point_m

geoanalytics.sql.functions.point_m(x, y, m, sr=None)

Returns a point column. The three numeric columns or values must contain the x,y coordinates and m-values of the point geometries. You can optionally specify a spatial reference for the result point column. The sr parameter value must be a valid SRID or WKT string.

To create point geometries without m-values or z-coordinates use ST_Point. To create point geometries with z-coordinates use ST_PointZ, ST_PointZM, or ST_MakePoint.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_PointM

Parameters
  • x (pyspark.sql.Column/int/float) – X-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (x-coordinates) or a numeric value.

  • y (pyspark.sql.Column/int/float) – Y-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (y-coordinates) or a numeric value.

  • m (pyspark.sql.Column/int/float) – M-value for the point. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned point geometry, defaults to None.

Returns

Point column representing the point geometries.

Return type

pyspark.sql.Column

point_n

geoanalytics.sql.functions.point_n(geometry, n)

Returns a point column. The output column represents the nth point in the input geometry, where 0 is the first point. If the nth point does not exist, the function returns null. This function always returns null for multipart linestrings and multipart polygons.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_PointN

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • n (pyspark.sql.Column/int) – Index of the point to return. Can be an IntegerType column or an integer value.

Returns

Point column representing the nth point.

Return type

pyspark.sql.Column

point_on_surface

geoanalytics.sql.functions.point_on_surface(geometry)

Returns a point column. The function returns a point that lies on the surface of linestring or polygon geometries.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_PointOnSurface

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

Point column representing a point that lies on the surface.

Return type

pyspark.sql.Column

point_z

geoanalytics.sql.functions.point_z(x, y, z, sr=None)

Returns a point column. The three numeric columns or values must contain the x,y,z coordinates of the point geometries. You can optionally specify a spatial reference for the result point column. The sr parameter value must be a valid SRID or WKT string.

To create point geometries without m-values or z-coordinates use ST_Point. To create point geometries with m-values use ST_PointM, ST_PointZM, or ST_MakePoint.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_PointZ

Parameters
  • x (pyspark.sql.Column/int/float) – X-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (x-coordinates) or a numeric value.

  • y (pyspark.sql.Column/int/float) – Y-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (y-coordinates) or a numeric value.

  • z (pyspark.sql.Column/int/float) – Z-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned point geometry, defaults to None.

Returns

Point column representing the point geometries.

Return type

pyspark.sql.Column

point_zm

geoanalytics.sql.functions.point_zm(x, y, z, m, sr=None)

Returns a point column. The four numeric columns or values must contain the x,y,z coordinates and m-values of the point geometries. You can optionally specify a spatial reference for the result point column. The sr parameter value must be a valid SRID or WKT string.

To create point geometries without m-values or z-coordinates use ST_Point. To create point geometries with only m-values or only z-coordinates use ST_PointM, ST_PointZ, or ST_MakePoint.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_PointZM

Parameters
  • x (pyspark.sql.Column/int/float) – X-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (x-coordinates) or a numeric value.

  • y (pyspark.sql.Column/int/float) – Y-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (y-coordinates) or a numeric value.

  • z (pyspark.sql.Column/int/float) – Z-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value.

  • m (pyspark.sql.Column/int/float) – M-value for the point. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned point geometry, defaults to None.

Returns

Point column representing the point geometries.

Return type

pyspark.sql.Column

points

geoanalytics.sql.functions.points(geometry)

Returns an array column. For linestring and polygon geometries the function returns the vertices of the input geometry as an array of points. For multipoint and point geometries the function returns an array of all points in the input geometry.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Points

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

Array column representing an array of point geometries.

Return type

pyspark.sql.Column

poly_from_binary

geoanalytics.sql.functions.poly_from_binary(wkb, sr=None)

Returns a polygon column. The input binary column must contain the well-known binary (WKB) representation of polygon geometries. You can optionally specify a spatial reference for the result polygon column. The sr parameter value must be a valid SRID or WKT string. If a polygon cannot be created from the input binary the function will return null.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_PolyFromBinary

Parameters
  • wkb (pyspark.sql.Column) – BinaryType column with the well-known binary (WKB) representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned polygon geometry, defaults to None.

Returns

Polygon column from the well-known binary (WKB) representation.

Return type

pyspark.sql.Column

poly_from_esri_json

geoanalytics.sql.functions.poly_from_esri_json(json_str, sr=None)

Returns a polygon column. The input string column must contain the Esri JSON representation of polygon geometries. You can optionally specify a spatial reference for the result polygon column. The sr parameter value must be a valid SRID or WKT string. Any SRID defined in the input strings will not be used. If a polygon cannot be created from the input string the function will return null.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_PolyFromEsriJSON

Parameters
  • json_str (pyspark.sql.Column) – StringType column with the Esri JSON representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned polygon geometry, defaults to None.

Returns

Polygon column from the Esri JSON representation.

Return type

pyspark.sql.Column

poly_from_geojson

geoanalytics.sql.functions.poly_from_geojson(json_str, sr=None)

Returns a polygon column. The input string column must contain the GeoJSON representation of polygon geometries. You can optionally specify a spatial reference for the result polygon column. The sr parameter value must be a valid SRID or WKT string. If a polygon cannot be created from the input string the function will return null.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_PolyFromGeoJSON

Parameters
  • json_str (pyspark.sql.Column) – StringType column with the GeoJSON representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned polygon geometry, defaults to None.

Returns

Polygon column from the GeoJSON representation.

Return type

pyspark.sql.Column

poly_from_shape

geoanalytics.sql.functions.poly_from_shape(shp, sr=None)

Returns a polygon column. The input binary column must contain the shapefile representation of polygon geometries. You can optionally specify a spatial reference for the result polygon column. The sr parameter value must be a valid SRID or WKT string. If a polygon cannot be created from the input binary the function will return null.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_PolyFromShape

Parameters
  • shp (pyspark.sql.Column) – BinaryType column with the shapefile representation.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned polygon geometry, defaults to None.

Returns

Polygon column from the shapefile representation.

Return type

pyspark.sql.Column

poly_from_text

geoanalytics.sql.functions.poly_from_text(wkt, sr=None)

Returns a polygon column. The string column must contain the well-known text (WKT) representation of polygon geometries. You can optionally specify a spatial reference for the result polygon column. The sr parameter value must be a valid SRID or WKT string.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_PolyFromText

Parameters
  • wkt (pyspark.sql.Column) – StringType column with the well-known text (WKT) representation of polygon geometries.

  • sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned polygon geometry, defaults to None.

Returns

Polygon column from the well-known text (WKT) representation.

Return type

pyspark.sql.Column

polygon

geoanalytics.sql.functions.polygon(points)

Returns a polygon column. The input column must contain an array of point geometries.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Polygon

Parameters

points (pyspark.sql.Column) – Array of point geometries.

Returns

Polygon column representing the array of points.

Return type

pyspark.sql.Column

relate

geoanalytics.sql.functions.relate(geometry1, geometry2, relation)

Returns a boolean column where the result is True if the first geometry and the second geometry satisfy the spatial relationship defined by the specified DE-9IM string code; otherwise, it returns False. The string code contains nine characters that represent the nine spatial relations of the dimensionally extended 9-intersection model (DE-9IM). The character values indicate the dimensionality of the relationship: 0 for points, 1 for linestrings, 2 for polygons, and ‘F’ to indicate an empty set.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Relate

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

  • relation (pyspark.sql.Column/str) – The DE-9IM matrix value that will be used to compare the spatial relationship. Can be a StringType column or string value.

Returns

BooleanType column. True if the spatial relationship of geometry1 and geometry2 match the DE-9IM matrix value, False otherwise.

Return type

pyspark.sql.Column
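
The comparison of a DE-9IM matrix against a relation string can be illustrated in plain Python. This is a minimal sketch of the general DE-9IM matching convention, not the GeoAnalytics Engine implementation. The helper name de9im_matches is hypothetical, and support for the wildcard characters 'T' (any non-empty intersection) and '*' (anything) is an assumption based on common DE-9IM usage rather than on the text above.

```python
def de9im_matches(matrix: str, pattern: str) -> bool:
    """Return True if a 9-character DE-9IM matrix satisfies a pattern.

    Assumed pattern characters (general DE-9IM convention):
    '0', '1', '2' require exactly that dimension, 'F' requires an empty
    intersection, 'T' accepts any non-empty intersection, '*' accepts anything.
    """
    if len(matrix) != 9 or len(pattern) != 9:
        raise ValueError("DE-9IM strings must contain exactly nine characters")
    for actual, wanted in zip(matrix.upper(), pattern.upper()):
        if wanted == "*":
            continue  # position is not constrained
        if wanted == "T":
            if actual == "F":
                return False  # 'T' needs a non-empty intersection
        elif actual != wanted:
            return False  # '0', '1', '2' and 'F' must match exactly
    return True


# Matrix commonly produced by two partially overlapping polygons.
print(de9im_matches("212101212", "T*T***T**"))
# Matrix commonly produced by two disjoint polygons.
print(de9im_matches("FF2FF1212", "T*T***T**"))
```

Here "T*T***T**" is the pattern usually associated with overlapping areas, so the first call reports a match and the second does not.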

rotate

geoanalytics.sql.functions.rotate(geometry, angle_in_radians, rotation_center=None)

Rotates a geometry counterclockwise by an angle specified in radians.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Rotate

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • angle_in_radians (pyspark.sql.Column/int/float) – Rotation angle in radians. Can be a LongType, DoubleType or StringType column or a numeric value.

  • rotation_center (pyspark.sql.Column, optional) – Center of rotation. If specified, the geometry is rotated about rotation_center; otherwise, it is rotated about the origin (0,0). Defaults to None.

Returns

Geometry column with the rotated geometries. The returned geometry column type will be the same as the input geometry column type.

Return type

pyspark.sql.Column
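
The effect of a counterclockwise rotation on a single coordinate can be sketched with the standard 2D rotation formula. The helper below is a hypothetical plain-Python illustration that works on one (x, y) pair; it is not the library's code and does not touch Spark columns.

```python
import math


def rotate_point(x, y, angle_in_radians, center=(0.0, 0.0)):
    """Rotate (x, y) counterclockwise about center by angle_in_radians."""
    cx, cy = center
    dx, dy = x - cx, y - cy  # coordinates relative to the rotation center
    cos_a = math.cos(angle_in_radians)
    sin_a = math.sin(angle_in_radians)
    return (cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)


# A quarter turn about the origin moves (1, 0) to approximately (0, 1).
print(rotate_point(1.0, 0.0, math.pi / 2))
# The same turn about (1, 1) moves (2, 1) to approximately (1, 2).
print(rotate_point(2.0, 1.0, math.pi / 2, center=(1.0, 1.0)))
```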

scale

geoanalytics.sql.functions.scale(geometry, x_scale_factor, y_scale_factor)

Scales the geometry to a new size by multiplying the coordinates with the corresponding factor parameters.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Scale

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • x_scale_factor (pyspark.sql.Column/int/float) – Numeric value that specifies the scale factor in the X direction. Can be a LongType, DoubleType or StringType column or a numeric value.

  • y_scale_factor (pyspark.sql.Column/int/float) – Numeric value that specifies the scale factor in the Y direction. Can be a LongType, DoubleType or StringType column or a numeric value.

Returns

Geometry column with the scaled geometries. The returned geometry column type will be the same as the input geometry column type.

Return type

pyspark.sql.Column
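
As a rough illustration of multiplying coordinates by the corresponding factors, the following plain-Python sketch scales a list of (x, y) vertices. The helper name scale_coords is hypothetical, and the sketch operates on tuples rather than on geometry columns.

```python
def scale_coords(coords, x_scale_factor, y_scale_factor):
    """Multiply every x by x_scale_factor and every y by y_scale_factor."""
    return [(x * x_scale_factor, y * y_scale_factor) for x, y in coords]


# Doubling x and halving y.
print(scale_coords([(1, 2), (3, 4)], 2, 0.5))
```

Because each coordinate is multiplied directly, a geometry that does not start at the origin also moves when it is scaled.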

segmentize

geoanalytics.sql.functions.segmentize(linestring, max_segment_length=2)

Returns an array column. This function creates an array of linestrings from the input linestring by breaking the input linestring into segments that are shorter than or equal to the maximum length specified. The maximum segment length is in the same units as the input geometry.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Segmentize

Parameters
  • linestring (pyspark.sql.Column) – Linestring geometry column.

  • max_segment_length (pyspark.sql.Column/int/float, optional) – Maximum length for any segment created, defaults to 2. Can be a LongType, DoubleType or StringType column or an integer or float value.

Returns

Array column representing an array of linestring segments.

Return type

pyspark.sql.Column

segments

geoanalytics.sql.functions.segments(linestring, num_points=2, step_size=1)

Returns an array column. The function creates an array of linestrings by splitting the input linestring at a certain number of vertices using a moving window. By default the function will create segments with two points and the moving window will move one point at a time (step size of 1). You can optionally include a larger number of points in each linestring by specifying a numeric value greater than 2. You can also increase the step size by setting it to a value greater than 1. Setting the step size to one less than the number of points will always result in segments that touch but do not overlap.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Segments

Parameters
  • linestring (pyspark.sql.Column) – Linestring geometry column.

  • num_points (pyspark.sql.Column/int, optional) – Numeric value representing the number of points in each segment, defaults to 2. Can be a LongType or StringType column or an integer value.

  • step_size (pyspark.sql.Column/int, optional) – Numeric value representing the number of points between the start of each new segment, defaults to 1. Can be a LongType or StringType column or an integer value.

Returns

Array column representing an array of linestring segments.

Return type

pyspark.sql.Column
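
The moving-window behavior of num_points and step_size can be sketched on a plain list of vertices. The helper window_segments is hypothetical, and dropping a trailing window that has fewer than num_points vertices is an assumption; the library may treat the end of the linestring differently.

```python
def window_segments(points, num_points=2, step_size=1):
    """Split a vertex list into windows of num_points vertices.

    Each window starts step_size vertices after the previous one.
    Assumption: incomplete windows at the end are dropped.
    """
    if num_points < 2 or step_size < 1:
        raise ValueError("num_points must be >= 2 and step_size must be >= 1")
    last_start = len(points) - num_points
    return [points[start:start + num_points]
            for start in range(0, last_start + 1, step_size)]


line = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
# Defaults: four two-point segments, each sharing a vertex with the next.
print(window_segments(line))
# step_size = num_points - 1: segments touch at (2, 0) but do not overlap.
print(window_segments(line, num_points=3, step_size=2))
```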

shear

geoanalytics.sql.functions.shear(geometry, proportion_x, proportion_y)

Returns a geometry column containing the input geometries sheared by the specified proportions in the X and Y directions.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Shear

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • proportion_x (pyspark.sql.Column/int/float) – Numeric value that specifies the proportion of shearing in the X direction. Can be a LongType, DoubleType or StringType column or a numeric value.

  • proportion_y (pyspark.sql.Column/int/float) – Numeric value that specifies the proportion of shearing in the Y direction. Can be a LongType, DoubleType or StringType column or a numeric value.

Returns

Geometry column with the sheared geometries. The returned geometry column type will be the same as the input geometry column type.

Return type

pyspark.sql.Column
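
The proportion_x and proportion_y parameters suggest the usual planar shear mapping, in which each coordinate is shifted in proportion to the other coordinate. The sketch below assumes that mapping (x' = x + proportion_x * y, y' = y + proportion_y * x); the helper shear_coords is hypothetical, and the library's exact formula may differ.

```python
def shear_coords(coords, proportion_x, proportion_y):
    """Apply an assumed planar shear to a list of (x, y) vertices.

    x is shifted by proportion_x * y and y is shifted by proportion_y * x.
    """
    return [(x + proportion_x * y, y + proportion_y * x) for x, y in coords]


unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
# Shearing only in the X direction turns the square into a parallelogram.
print(shear_coords(unit_square, 1, 0))
```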

shortest_line

geoanalytics.sql.functions.shortest_line(geometry1, geometry2)

Returns a linestring column representing the shortest line that touches two geometries, using planar distance calculation. If more than one shortest line exists, only one is returned. If the two input geometries intersect, an empty line geometry is returned. If the two geometry columns are in different spatial references, the function automatically transforms the second geometry into the spatial reference of the first. To create a shortest line using geodesic distance calculation, use ST_GeodesicShortestLine.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_ShortestLine

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

Returns

Geometry column representing the shortest line.

Return type

pyspark.sql.Column

simplify

geoanalytics.sql.functions.simplify(geometry)

Returns a geometry column containing the simplified geometries. This function simplifies the input geometry according to the OpenGIS Simple Features Implementation Specification for SQL 1.2.1 (06-103r4).

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Simplify

Parameters

geometry (pyspark.sql.Column) – Geometry column.

Returns

Geometry column representing the simplified geometry. The returned geometry column type will be the same as the input geometry column type.

Return type

pyspark.sql.Column

split

geoanalytics.sql.functions.split(geometry, splitter)

Returns an array column from an input linestring or polygon column. This function splits the geometry with the splitter linestring and returns the resulting parts as an array of geometries. If the input geometry is a linestring, the output will be an array of linestrings. If the input geometry type is a polygon, the output will be an array of polygons. If a linestring is split by an equal linestring, an empty linestring along with the input linestring and the splitter linestring are returned.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Split

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • splitter (pyspark.sql.Column) – Linestring geometry column.

Returns

Array column representing an array of geometries resulting from the split. The returned geometry type will be the same as the geometry column type.

Return type

pyspark.sql.Column

square_bin

geoanalytics.sql.functions.square_bin(geometry, bin_size)

Returns a bin column containing a single square bin for each record in the input column. The specified bin size determines the height of each bin and is in the same units as the input geometry. The centroid of the input geometry is guaranteed to intersect with the bin returned but is not necessarily coincident with the bin center. Use ST_BinGeometry to obtain the geometry of each result bin.

This function can also be called with a long column representing the ID of the bin (see ST_BinId). The bin ID will be cast to a bin column.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_SquareBin

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • bin_size (int/float) – Numeric value representing the size of the side of the square bin.

Returns

Spatial bin (bin2d) column representing a single square bin for each geometry.

Return type

pyspark.sql.Column
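
As a rough illustration of why the centroid falls inside the returned bin without necessarily sitting at its center, the following pure-Python sketch assigns a coordinate to a square bin. It assumes the grid is anchored at the origin (0, 0), which the description above does not state, and the helper names square_bin_index and bin_center are hypothetical; they are not part of geoanalytics.

```python
import math


def square_bin_index(x, y, bin_size):
    """Return the (column, row) index of the square bin containing (x, y).

    Assumption: the grid is anchored at the origin; the real function may
    anchor or identify bins differently.
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    return (math.floor(x / bin_size), math.floor(y / bin_size))


def bin_center(index, bin_size):
    """Return the center coordinate of the bin with the given index."""
    col, row = index
    return ((col + 0.5) * bin_size, (row + 0.5) * bin_size)


idx = square_bin_index(12.0, 7.5, 5.0)
center = bin_center(idx, 5.0)
print(idx, center)
```

Here the centroid (12.0, 7.5) lies in the bin whose center is (12.5, 7.5): it is inside the bin but not at its center.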

square_bins

geoanalytics.sql.functions.square_bins(geometry, bin_size, padding=0.0)

Returns an array column containing square bins that cover the spatial extent of each record in the input column. The specified bin size determines the height of each bin and is in the same units as the input geometry. You can optionally specify a numeric value for padding, which conceptually applies a buffer of the specified distance to the input geometry before creating the square bins.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_SquareBins

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • bin_size (int/float) – Numeric value representing the size of the side of the square bin.

  • padding (int/float/str, optional) – Numerical buffer value applied to the geometry before finding the intersecting bins, defaults to 0.0.

Returns

Array column representing an array of spatial bin (bin2d) square bins.

Return type

pyspark.sql.Column
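
The effect of padding on the number of bins can be sketched in plain Python. The sketch below works on a bounding box rather than a real geometry, approximates the buffer by growing the box on every side, and assumes a grid anchored at the origin; the helper name covering_square_bins is hypothetical and not part of geoanalytics.

```python
import math


def covering_square_bins(xmin, ymin, xmax, ymax, bin_size, padding=0.0):
    """List (column, row) indices of square bins covering a padded extent."""
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    # Approximate the buffer by expanding the extent by `padding` on each side.
    xmin, ymin = xmin - padding, ymin - padding
    xmax, ymax = xmax + padding, ymax + padding
    cols = range(math.floor(xmin / bin_size), math.floor(xmax / bin_size) + 1)
    rows = range(math.floor(ymin / bin_size), math.floor(ymax / bin_size) + 1)
    return [(c, r) for r in rows for c in cols]


no_padding = covering_square_bins(1, 1, 9, 4, bin_size=5)
with_padding = covering_square_bins(1, 1, 9, 4, bin_size=5, padding=2)
print(len(no_padding), len(with_padding))
```

With these inputs, a padding of 2 pushes the extent across bin boundaries on every side, so the array of bins grows from 2 to 12.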

sr_text

geoanalytics.sql.functions.sr_text(geometry, wkt=None)

Can work as a getter or a setter, depending on the inputs.

Getter: Takes a geometry column and returns the spatial reference (WKT) of the column as a string column. If the spatial reference of the input geometry column has not been set, the function returns an empty string.

Setter: Takes a geometry column and a string value and returns the input geometry column with its spatial reference WKT set to the string value. This does not affect the geometry data in the column. To transform your geometry data from one spatial reference to another, use ST_Transform.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_SRText

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • wkt (str, optional) – Spatial reference (WKT) to set on the geometry, defaults to None.

Returns

  • Getter: StringType column representing the spatial reference (WKT) for the geometry.

  • Setter: Geometry column representing the geometry with the updated spatial reference. The returned geometry column type will be the same as the input geometry column type.

Return type

pyspark.sql.Column

srid

geoanalytics.sql.functions.srid(geometry, srid=None)

Can work as a getter or a setter, depending on the inputs.

Getter: Takes a geometry column and returns the spatial reference (SRID) of the column as an integer column. If the SRID of the input geometry column has not been set, the function returns 0.

Setter: Takes a geometry column and a numeric value and returns the input geometry column with its SRID set to the numeric value. This does not affect the geometry data in the column. To transform your geometry data from one spatial reference to another, use ST_Transform.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_SRID

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • srid (int, optional) – Spatial reference (SRID) to set on the geometry, defaults to None.

Returns

  • Getter: IntegerType column representing the spatial reference (SRID) for the geometry.

  • Setter: Geometry column representing the geometry with the updated spatial reference. The returned geometry column type will be the same as the input geometry column type.

Return type

pyspark.sql.Column
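
The getter/setter behavior described above can be modeled with a small pure-Python sketch. The Geom class and the srid_of helper are hypothetical stand-ins, not geoanalytics objects; the point is only that the setter changes the spatial reference label and leaves the coordinates untouched.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Geom:
    """Toy geometry: coordinates plus an SRID label (0 means "not set")."""
    coords: Tuple[Tuple[float, float], ...]
    srid: int = 0


def srid_of(geom: Geom, new_srid: Optional[int] = None) -> Union[int, Geom]:
    """Getter when new_srid is None, setter otherwise."""
    if new_srid is None:
        return geom.srid
    # Setter: only the label changes; coordinates are copied unchanged.
    return replace(geom, srid=new_srid)


g = Geom(coords=((-117.2, 34.05),))
g_labeled = srid_of(g, 4326)
print(srid_of(g), srid_of(g_labeled), g_labeled.coords == g.coords)
```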

start_point

geoanalytics.sql.functions.start_point(linestring)

Returns a point column representing the first point of the input linestring.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_StartPoint

Parameters

linestring (pyspark.sql.Column) – Linestring geometry column.

Returns

Point column representing the starting point.

Return type

pyspark.sql.Column

sym_difference

geoanalytics.sql.functions.sym_difference(geometry1, geometry2)

Returns a geometry column containing the geometries that represent the portions of the input geometries that do not intersect. If either input column has the generic geometry type, the output column is also of the generic geometry type. In all other cases, the result geometry type is the same as the input geometry type with the highest dimension.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_SymDifference

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

Returns

Geometry column representing the portions of geometry1 and geometry2 that do not intersect. The returned geometry column type will be that of the input with the highest dimension; for example, linestring and polygon input will return a polygon geometry type. If one or both of the input geometries is a generic geometry type, then a generic geometry column type will be returned.

Return type

pyspark.sql.Column
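
Two ideas from this entry can be sketched in plain Python: the symmetric difference itself, and the rule for the output type. Areas are modeled here as sets of unit grid cells, which is a simplification, and the helper sym_difference_type is hypothetical; it merely restates the rule given above and is not part of geoanalytics.

```python
# Planar dimension of the concrete geometry types used in this sketch.
DIMENSION = {"point": 0, "linestring": 1, "polygon": 2}


def sym_difference_type(type1, type2):
    """Predict the output type from the rule stated above (sketch only)."""
    if "geometry" in (type1, type2):
        return "geometry"  # generic input gives generic output
    return max(type1, type2, key=lambda t: DIMENSION[t])


# Two overlapping 3x2 blocks of unit cells standing in for polygons.
a = {(x, y) for x in range(0, 3) for y in range(0, 2)}
b = {(x, y) for x in range(2, 5) for y in range(0, 2)}

# Cells covered by exactly one of the two inputs.
sym = a ^ b
print(len(sym), sym_difference_type("linestring", "polygon"))
```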

symmetric_diff

geoanalytics.sql.functions.symmetric_diff(geometry1, geometry2)

Esri alias for OGC ST_SymDifference.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_SymDifference

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

Returns

Geometry column representing the portions of geometry1 and geometry2 that do not intersect. The returned geometry column type will be that of the input with the highest dimension; for example, linestring and polygon input will return a polygon geometry type. If one or both of the input geometries is a generic geometry type, then a generic geometry column type will be returned.

Return type

pyspark.sql.Column

touches

geoanalytics.sql.functions.touches(geometry1, geometry2)

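Returns a boolean column where the result is True if the first geometry and the second geometry spatially touch on their boundaries (that is, they share at least one boundary point but their interiors do not intersect); otherwise, it returns False. The shared boundary can be a single point or a longer portion such as a common edge.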

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Touches

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

Returns

BooleanType column. True if geometry1 and geometry2 spatially touch on their boundaries, False otherwise.

Return type

pyspark.sql.Column
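
Under the usual OGC definition of touches, two geometries touch when they share boundary points while their interiors stay disjoint. A minimal pure-Python sketch for axis-aligned rectangles with positive area follows; Rect and rects_touch are hypothetical names used for illustration and are not part of geoanalytics.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def rects_touch(a: Rect, b: Rect) -> bool:
    """True if the rectangles share boundary points but no interior points."""
    # Closed rectangles (boundary included) have at least one common point.
    closed_overlap = (a.xmin <= b.xmax and b.xmin <= a.xmax
                      and a.ymin <= b.ymax and b.ymin <= a.ymax)
    # Open interiors (boundary excluded) have at least one common point.
    interior_overlap = (a.xmin < b.xmax and b.xmin < a.xmax
                        and a.ymin < b.ymax and b.ymin < a.ymax)
    return closed_overlap and not interior_overlap


unit = Rect(0, 0, 1, 1)
print(rects_touch(unit, Rect(1, 0, 2, 1)))      # shared edge
print(rects_touch(unit, Rect(1, 1, 2, 2)))      # shared corner
print(rects_touch(unit, Rect(0.5, 0.5, 2, 2)))  # overlapping interiors
```

Both the shared-edge and the shared-corner cases count as touching in this sketch, while overlapping or disjoint rectangles do not.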

transform

geoanalytics.sql.functions.transform(geometry, sr, *, extent=None, datum_transform=None)

Returns a geometry column containing the input geometries transformed into the specified spatial reference; the spatial reference of the result column is set accordingly. The input geometry column must have a spatial reference set, and the sr parameter value must be a valid SRID or WKT string. To learn more about what it means to transform your geometry data, see Coordinate systems and transformations. To set the spatial reference of a geometry column without transforming the geometries, use ST_SRID or ST_SRText.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Transform

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • sr (int/str) – The spatial reference (SRID or WKT) that the geometry will be projected into.

  • extent (BoundingBox, optional) – Extent of the area of analysis to use when determining the best transformation to use.

  • datum_transform (str, optional) – Transformation path to use when transforming between geographic spatial references. This parameter overrides the session-level transformation settings as well as the extent parameter.

Returns

Geometry column representing the projected geometry. The returned geometry column type will be the same as the input geometry column type.

Return type

pyspark.sql.Column
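
To make the difference between transforming coordinates and merely relabeling a spatial reference concrete, here is a pure-Python sketch of one specific projection: WGS 84 longitude/latitude (SRID 4326) to spherical Web Mercator (SRID 3857). It is not how geoanalytics implements transform, which handles arbitrary spatial references and datum transformations; the function name wgs84_to_web_mercator is hypothetical.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator


def wgs84_to_web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Web Mercator meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return (x, y)


x, y = wgs84_to_web_mercator(180.0, 0.0)
print(x, y)
```

The coordinate values change completely (degrees become meters), which is exactly what setting an SRID alone does not do.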

translate

geoanalytics.sql.functions.translate(geometry, x_offset, y_offset)

Translates a geometry by given offsets.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Translate

Parameters
  • geometry (pyspark.sql.Column) – Geometry column.

  • x_offset (pyspark.sql.Column/int/float) – Numeric value that specifies the offset in the X direction. Can be a LongType, DoubleType or StringType column or a numeric value.

  • y_offset (pyspark.sql.Column/int/float) – Numeric value that specifies the offset in the Y direction. Can be a LongType, DoubleType or StringType column or a numeric value.

Returns

Geometry column with the translated geometries. The returned geometry column type will be the same as the input geometry column type.

Return type

pyspark.sql.Column
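
Conceptually, translation adds the same offset to every vertex. The pure-Python sketch below shows this on a list of (x, y) tuples; the helper name translate_coords is hypothetical, and the float() conversion is only a loose nod to the fact that numeric strings are accepted above.

```python
def translate_coords(coords, x_offset, y_offset):
    """Shift every (x, y) vertex by the given offsets."""
    dx, dy = float(x_offset), float(y_offset)
    return [(x + dx, y + dy) for x, y in coords]


moved = translate_coords([(0, 0), (1, 2)], 10, "-2.5")
print(moved)
```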

union

geoanalytics.sql.functions.union(*geometries)

Returns a geometry column containing the geometries that represent the spatial union of the geometries in each row of the input columns. The geometry types of the input columns must be the same. To find the union of all geometries in a group or column use ST_Aggr_Union.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Union

Parameters

geometries (pyspark.sql.Column) – Multiple geometry columns.

Returns

Geometry column representing the spatial union of the geometries. The returned geometry column type will be the same type as the input geometries, except in the case of point input, which will return multipoint. If any of the input geometries is a generic geometry type, then a generic geometry column type will be returned.

Return type

pyspark.sql.Column

within

geoanalytics.sql.functions.within(geometry1, geometry2)

Returns a boolean column where the result is True if the first geometry is completely inside the second geometry; otherwise, it returns False.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Within

Parameters
  • geometry1 (pyspark.sql.Column) – Geometry column.

  • geometry2 (pyspark.sql.Column) – Geometry column.

Returns

BooleanType column. True if geometry1 is within geometry2, False otherwise.

Return type

pyspark.sql.Column
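
The argument order matters for this predicate: the first geometry must lie inside the second. A minimal pure-Python sketch for axis-aligned rectangles with positive area, given as (xmin, ymin, xmax, ymax) tuples, illustrates the asymmetry; rect_within is a hypothetical helper, not part of geoanalytics.

```python
def rect_within(a, b):
    """True if rectangle a lies completely inside rectangle b."""
    axmin, aymin, axmax, aymax = a
    bxmin, bymin, bxmax, bymax = b
    return (bxmin <= axmin and axmax <= bxmax
            and bymin <= aymin and aymax <= bymax)


small = (1, 1, 2, 2)
large = (0, 0, 5, 5)
print(rect_within(small, large), rect_within(large, small))
```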

wkb_to_sql

geoanalytics.sql.functions.wkb_to_sql(wkb)

OGC alias for ST_GeomFromBinary without SRID.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_GeomFromBinary

Parameters

wkb (pyspark.sql.Column) – BinaryType column with the Well-Known Binary (WKB) representation.

Returns

Generic geometry column from the Well-Known Binary (WKB) representation.

Return type

pyspark.sql.Column
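
For readers unfamiliar with the input format, the sketch below decodes the simplest WKB value, a 2D point, using only the standard library. It follows the standard WKB layout (one byte-order byte, a 4-byte geometry type, then two 8-byte doubles) and is not the parser used by geoanalytics; parse_wkb_point is a hypothetical name.

```python
import struct


def parse_wkb_point(wkb: bytes):
    """Decode a 2D WKB Point into an (x, y) tuple."""
    byte_order = wkb[0]
    if byte_order not in (0, 1):
        raise ValueError("invalid WKB byte-order flag")
    endian = "<" if byte_order == 1 else ">"  # 1 = little-endian, 0 = big-endian
    (geom_type,) = struct.unpack_from(endian + "I", wkb, 1)
    if geom_type != 1:
        raise ValueError("only WKB Point (type 1) is handled in this sketch")
    x, y = struct.unpack_from(endian + "dd", wkb, 5)
    return (x, y)


little = struct.pack("<BIdd", 1, 1, 2.0, 3.0)
big = struct.pack(">BIdd", 0, 1, 2.0, 3.0)
print(parse_wkb_point(little), parse_wkb_point(big))
```

Note that the payload carries no SRID, which is consistent with the result having no spatial reference set until one is assigned.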

wkt_to_sql

geoanalytics.sql.functions.wkt_to_sql(wkt)

OGC alias for ST_GeomFromText without SRID.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_GeomFromText

Parameters

wkt (pyspark.sql.Column) – StringType column with the well-known text (WKT) representation of geometries.

Returns

Generic geometry column from the well-known text (WKT) representation.

Return type

pyspark.sql.Column
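
As a small illustration of the input format, the sketch below parses the simplest WKT value, a 2D point, with a regular expression. It is deliberately limited to POINT and is not the parser used by geoanalytics; parse_wkt_point is a hypothetical name.

```python
import re

# Matches e.g. "POINT (1 2)" or "point(-117.2 34.05)".
_POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(\S+)\s+(\S+)\s*\)\s*$", re.IGNORECASE
)


def parse_wkt_point(wkt: str):
    """Parse a 2D WKT POINT into an (x, y) tuple of floats."""
    match = _POINT_RE.match(wkt)
    if match is None:
        raise ValueError("not a 2D WKT POINT: %r" % wkt)
    return (float(match.group(1)), float(match.group(2)))


print(parse_wkt_point("POINT (1 2)"))
print(parse_wkt_point("point(-117.2 34.05)"))
```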

x

geoanalytics.sql.functions.x(point, new_value=None)

Can work as a getter or a setter, depending on the inputs.

Getter: Takes a point column and returns a double column containing the x-coordinate of the input points.

Setter: Takes a point column and a numeric value or column and returns a point column containing the input points with the x-coordinates set to the numeric value.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_X

Parameters
  • point (pyspark.sql.Column) – Point geometry column.

  • new_value (pyspark.sql.Column/int/float, optional) – X-coordinate to set. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value, defaults to None.

Returns

  • Getter: DoubleType column representing the x-coordinate.

  • Setter: Point column containing the input points with the updated x-coordinate.

Return type

pyspark.sql.Column

y

geoanalytics.sql.functions.y(point, new_value=None)

Can work as a getter or a setter, depending on the inputs.

Getter: Takes a point column and returns a double column containing the y-coordinate of the input points.

Setter: Takes a point column and a numeric value or column and returns a point column containing the input points with the y-coordinates set to the numeric value.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Y

Parameters
  • point (pyspark.sql.Column) – Point geometry column.

  • new_value (pyspark.sql.Column/int/float, optional) – Y-coordinate to set. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value, defaults to None.

Returns

  • Getter: DoubleType column representing the y-coordinate.

  • Setter: Point column containing the input points with the updated y-coordinate.

Return type

pyspark.sql.Column

z

geoanalytics.sql.functions.z(point, new_value=None)

Can work as a getter or a setter, depending on the inputs.

Getter: Takes a point column and returns a double column containing the z-coordinates of the input points. If a point does not have a z-coordinate the function returns NaN.

Setter: Takes a point column and a numeric value or column and returns a point column containing the input points with the z-coordinates set to the numeric value.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_Z

Parameters
  • point (pyspark.sql.Column) – Point geometry column.

  • new_value (pyspark.sql.Column/int/float, optional) – Z-coordinate to set. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value, defaults to None.

Returns

  • Getter: DoubleType column representing the z-coordinate.

  • Setter: Point column containing the input points with the updated z-coordinate.

Return type

pyspark.sql.Column
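
The getter/setter pattern shared by x, y, and z, including the NaN result for a missing z-coordinate, can be modeled in a few lines of plain Python. Point and z_of are hypothetical stand-ins used only for illustration; they are not geoanalytics objects.

```python
import math
from typing import NamedTuple, Optional


class Point(NamedTuple):
    x: float
    y: float
    z: Optional[float] = None  # None models a point without a z-coordinate


def z_of(point: Point, new_value=None):
    """Getter when new_value is None, setter otherwise."""
    if new_value is None:
        # Getter: a missing z-coordinate is reported as NaN.
        return math.nan if point.z is None else point.z
    # Setter: return a new point; x and y are left unchanged.
    return point._replace(z=float(new_value))


flat = Point(1.0, 2.0)
raised = z_of(flat, 5)
print(z_of(flat), z_of(raised), raised)
```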