geoanalytics.sql.functions¶

aggr_convex_hull¶

geoanalytics.sql.functions.aggr_convex_hull(geometry)¶

Operates on a grouped DataFrame and returns the convex hull of geometries in each group. You can group your DataFrame using DataFrame.groupBy() or with a GROUP BY clause in a SQL statement.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Aggr_ConvexHull

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: Polygon column representing the convex hull of all of the geometries.
Return type:: pyspark.sql.Column

aggr_intersection¶

geoanalytics.sql.functions.aggr_intersection(geometry)¶

Operates on a grouped DataFrame and returns the intersection of geometries in each group. You can group your DataFrame using DataFrame.groupBy() or with a GROUP BY clause in a SQL statement. An empty geometry is returned when no geometries intersect.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Aggr_Intersection

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: Geometry column representing the intersection of all of the geometries. The returned geometry column type will be the same as the input geometry column type.
Return type:: pyspark.sql.Column

aggr_linestring¶

geoanalytics.sql.functions.aggr_linestring(point, order_by)¶

Operates on a grouped DataFrame and returns a linestring containing the points ordered by the order_by column. You can group your DataFrame using DataFrame.groupBy() or with a GROUP BY clause in a SQL statement.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Aggr_Linestring

Parameters:

point (pyspark.sql.Column) – Point geometry column.
order_by (pyspark.sql.Column) – Column to sort by.

Returns:

Linestring column representing the sorted input points.

Return type:

pyspark.sql.Column
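
The grouping and ordering behavior described above can be pictured with a small pure-Python sketch. It does not use the geoanalytics library or Spark; the `linestrings_by_group` helper and the tuple layout of `rows` are hypothetical, chosen only to illustrate the idea of sorting each group's points by an ordering value.

```python
def linestrings_by_group(rows):
    """Build one ordered vertex list per group.

    rows: iterable of (group_key, order_value, (x, y)) tuples.
    Returns a dict mapping group_key -> list of (x, y) vertices,
    sorted by order_value within each group.
    """
    result = {}
    # Sort by group first, then by the ordering value inside each group.
    for key, _order_value, point in sorted(rows, key=lambda r: (r[0], r[1])):
        result.setdefault(key, []).append(point)
    return result


# Hypothetical input: two tracks whose points arrive out of order.
rows = [
    ("a", 2, (1.0, 1.0)),
    ("b", 1, (5.0, 5.0)),
    ("a", 1, (0.0, 0.0)),
    ("a", 3, (2.0, 0.0)),
    ("b", 2, (6.0, 5.0)),
]
tracks = linestrings_by_group(rows)
```

In this sketch, `tracks["a"]` holds the three points of group "a" in the order given by their ordering value, which is the vertex order a linestring for that group would follow.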

aggr_mean_center¶

geoanalytics.sql.functions.aggr_mean_center(geometry, weight=None)¶

Operates on a grouped DataFrame and returns the weighted aggregated centroid (mean center) of geometries in each group. You can optionally specify a numeric weight column which is used to weight locations according to their relative importance. The default is unweighted. You can group your DataFrame using DataFrame.groupBy() or with a GROUP BY clause in a SQL statement.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Aggr_MeanCenter

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
weight (pyspark.sql.Column, optional) – Used to weight locations according to their relative importance, defaults to None. Can be a LongType, DoubleType or StringType column with numeric values.

Returns:

Point column representing the weighted aggregate centroid.

Return type:

pyspark.sql.Column
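
As a rough illustration of what a weighted mean center is, the following pure-Python sketch averages point coordinates, optionally weighting each location. It is not the library's implementation; the `mean_center` helper is hypothetical and works on plain (x, y) tuples rather than geometry columns.

```python
def mean_center(points, weights=None):
    """Return the (optionally weighted) mean center of (x, y) points."""
    if weights is None:
        # Unweighted case: every location counts equally.
        weights = [1.0] * len(points)
    total = sum(weights)
    x = sum(w * px for w, (px, _) in zip(weights, points)) / total
    y = sum(w * py for w, (_, py) in zip(weights, points)) / total
    return (x, y)


# Unweighted: the center of the four corners of a 2 x 2 square.
unweighted = mean_center([(0, 0), (2, 0), (2, 2), (0, 2)])

# Weighted: the second location is three times as important as the first,
# so the center is pulled toward it.
weighted = mean_center([(0, 0), (10, 0)], weights=[1, 3])
```

With these inputs, `unweighted` is (1.0, 1.0) and `weighted` is (7.5, 0.0).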

aggr_stdev_ellipse¶

geoanalytics.sql.functions.aggr_stdev_ellipse(geometry, num_stdev=1.0, weight=None, min_records=2)¶

Operates on a grouped DataFrame and returns the weighted aggregate standard deviational ellipse of geometries in each group. You can group your DataFrame using DataFrame.groupBy() or with a GROUP BY clause in a SQL statement.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Aggr_StdevEllipse

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
num_stdev (double, optional) – Size of the returned ellipse in standard deviations, defaults to 1.
weight (pyspark.sql.Column, optional) – Weights locations according to their relative importance, defaults to None. Can be a LongType, DoubleType or StringType column with numeric values.
min_records (int, optional) – Number of geometries that must be considered for a standard deviation to be calculated, defaults to 2.

Returns:

Polygon column representing the weighted aggregate Standard Deviational Ellipse.

Return type:

pyspark.sql.Column

aggr_union¶

geoanalytics.sql.functions.aggr_union(geometry)¶

Operates on a grouped DataFrame and returns the union of geometries in each group. All geometries are required to have the same type. For example, mixing point, linestring, and polygon geometry types in the same column is not supported. You can group your DataFrame using DataFrame.groupBy() or with a GROUP BY clause in a SQL statement.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Aggr_Union

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: Geometry column representing the union of all of the geometries. The returned geometry column type will be the same as the type of the input geometries, except for point input, which returns multipoint.
Return type:: pyspark.sql.Column

area¶

geoanalytics.sql.functions.area(geometry)¶

Returns a double column with the planar area of each geometry in the input column. The unit of the area calculation is the same as the units of the input geometries. For example, if you have polygons in a spatial reference that uses meters, the result will be in square meters. If your input geometries are in a geographic coordinate system, it is recommended that you use ST_GeodesicArea. Geometry types other than polygon will return 0.0 for the area.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Area

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: DoubleType column representing the area.
Return type:: pyspark.sql.Column
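
To make the notion of planar area concrete, here is a minimal pure-Python sketch using the shoelace formula on a single polygon ring. It is an illustration only, not the library's implementation; `planar_area` is a hypothetical helper, and holes and multipart polygons are ignored.

```python
def planar_area(ring):
    """Planar area of a simple polygon ring given as (x, y) vertices.

    The result is in the square of whatever unit the coordinates use.
    """
    n = len(ring)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# A 4 x 3 rectangle; if the coordinates are in meters, the area is in
# square meters.
rectangle_area = planar_area([(0, 0), (4, 0), (4, 3), (0, 3)])
```

Here `rectangle_area` is 12.0. The same arithmetic applied to coordinates in degrees would give "square degrees", which is why a geodesic calculation is recommended for geographic coordinate systems.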

as_binary¶

geoanalytics.sql.functions.as_binary(geometry)¶

Returns the well-known binary (WKB) representation of the geometry as a binary column.

Refer to the GeoAnalytics guide for examples and usage notes: ST_AsBinary

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: BinaryType column with the Well-Known Binary (WKB) representation of the geometry.
Return type:: pyspark.sql.Column
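
For readers unfamiliar with WKB, the sketch below packs a 2D point into the standard well-known binary layout using only the Python standard library. It is illustrative: `point_to_wkb` is a hypothetical helper, and the choice of little-endian byte order is an assumption, not a statement about what the library emits.

```python
import struct


def point_to_wkb(x, y):
    """Encode a 2D point as WKB.

    Layout: 1 byte for byte order (1 = little-endian), a 4-byte unsigned
    integer geometry type (1 = Point), then x and y as 8-byte doubles.
    """
    return struct.pack("<BIdd", 1, 1, x, y)


wkb = point_to_wkb(1.0, 2.0)
wkb_hex = wkb.hex()
```

The resulting value is 21 bytes long: one byte-order byte, four type bytes, and two 8-byte coordinates.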

as_esri_json¶

geoanalytics.sql.functions.as_esri_json(geometry)¶

Returns the Esri JSON representation of the geometry as a string column.

Refer to the GeoAnalytics guide for examples and usage notes: ST_AsEsriJSON

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: StringType column with the Esri JSON representation of the geometry.
Return type:: pyspark.sql.Column

as_geojson¶

geoanalytics.sql.functions.as_geojson(geometry)¶

Returns the GeoJSON representation of the geometry as a string column. Geometries that have an m-value and no z-coordinate will only return x,y coordinates.

Refer to the GeoAnalytics guide for examples and usage notes: ST_AsGeoJSON

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: StringType column with the GeoJSON representation of the geometry.
Return type:: pyspark.sql.Column

as_shape¶

geoanalytics.sql.functions.as_shape(geometry)¶

Returns the shapefile representation of the geometry as a binary column.

Refer to the GeoAnalytics guide for examples and usage notes: ST_AsShape

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: BinaryType column with the shapefile representation of the geometry.
Return type:: pyspark.sql.Column

as_text¶

geoanalytics.sql.functions.as_text(geometry)¶

Returns the well-known text (WKT) representation of the geometry as a string column.

Refer to the GeoAnalytics guide for examples and usage notes: ST_AsText

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: StringType column with the well-known text (WKT) representation of the geometry.
Return type:: pyspark.sql.Column

azimuth¶

geoanalytics.sql.functions.azimuth(geometry1, geometry2)¶

Returns a double column representing the normalized azimuth in degrees. The output azimuth angle is heading from the first geometry to the second geometry. The angle is referenced from the north and is positive clockwise. This function requires that the first geometry column have a spatial reference set.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Azimuth

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

DoubleType column representing the normalized azimuth in degrees.

Return type:

pyspark.sql.Column
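
The angle convention described above (measured from north, positive clockwise, normalized) can be sketched for planar coordinates as follows. This is only an illustration of the convention; `planar_azimuth` is a hypothetical helper, and the library's own calculation may differ, for example by accounting for the spatial reference.

```python
import math


def planar_azimuth(p1, p2):
    """Heading from p1 to p2 in degrees, clockwise from north, in [0, 360)."""
    dx = p2[0] - p1[0]
    dy = p2[1] - p1[1]
    # atan2(dx, dy) measures the angle from the +y axis (north), clockwise.
    return math.degrees(math.atan2(dx, dy)) % 360.0


origin = (0.0, 0.0)
north = planar_azimuth(origin, (0.0, 1.0))
east = planar_azimuth(origin, (1.0, 0.0))
south = planar_azimuth(origin, (0.0, -1.0))
west = planar_azimuth(origin, (-1.0, 0.0))
```

Under this convention, headings due north, east, south, and west correspond to approximately 0, 90, 180, and 270 degrees.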

bbox_intersects¶

geoanalytics.sql.functions.bbox_intersects(geometry, xmin, ymin, xmax, ymax)¶

Returns a boolean column where the result is True if the geometry intersects the defined envelope; otherwise, it returns False. The four numeric values represent the minimum and maximum x,y coordinates of an axis-aligned rectangle, also known as an envelope. The x,y coordinates should be specified in the same units as the input geometry column. For example, if the input geometry is in a spatial reference that uses degrees, xmin, ymin, xmax, and ymax should all be in degrees.

Refer to the GeoAnalytics guide for examples and usage notes: ST_BboxIntersects

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
xmin (int/float) – The minimum x-coordinate point for the Envelope.
ymin (int/float) – The minimum y-coordinate point for the Envelope.
xmax (int/float) – The maximum x-coordinate point for the Envelope.
ymax (int/float) – The maximum y-coordinate point for the Envelope.

Returns:

BooleanType column. True if geometry intersects the Envelope, False otherwise.

Return type:

pyspark.sql.Column

bin_center¶

geoanalytics.sql.functions.bin_center(bin)¶

Returns a point column representing the center point of each bin.

Refer to the GeoAnalytics guide for examples and usage notes: ST_BinCenter

Parameters:: bin (pyspark.sql.Column) – Spatial Bin (bin2d) column with the key for the bin.
Returns:: Point column representing the center point for the bin associated with the given key.
Return type:: pyspark.sql.Column

bin_geometry¶

geoanalytics.sql.functions.bin_geometry(bin)¶

Returns a polygon column representing the geometry of each bin.

Refer to the GeoAnalytics guide for examples and usage notes: ST_BinGeometry

Parameters:: bin (pyspark.sql.Column) – Spatial Bin (bin2d) column with the key for the bin.
Returns:: Polygon column representing the polygon for the bin associated with the given key.
Return type:: pyspark.sql.Column

bin_id¶

geoanalytics.sql.functions.bin_id(bin)¶

Returns a long column representing the unique identifier of each bin.

Refer to the GeoAnalytics guide for examples and usage notes: ST_BinId

Parameters:: bin (pyspark.sql.Column) – Spatial Bin (bin2d) column with the key for the bin.
Returns:: LongType column representing the ID of the given key.
Return type:: pyspark.sql.Column

boundary¶

geoanalytics.sql.functions.boundary(geometry)¶

Returns a geometry column representing the topological boundary of the given geometry. The function will return a linestring if the input is a polygon and a multipoint if the input is a linestring. Point and multipoint inputs are not supported.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Boundary

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: Geometry column representing the boundary. The returned geometry column type will be multipoint for linestring and multilinestring input and linestring for polygon and multipolygon input.
Return type:: pyspark.sql.Column

buffer¶

geoanalytics.sql.functions.buffer(geometry, distance)¶

Returns a polygon column with buffer polygons that represent the area that is less than or equal to the specified planar distance from each input geometry. The distance can be specified with or without a unit. When specified with a unit, the distance can be created with ST_CreateDistance or with a tuple containing a number and a unit (e.g., (10, “kilometers”)). When specified without a unit, the distance can be a single value or a numeric column, interpreted as distance in the same units as the input geometry. To create a buffer polygon using geodesic distance calculations use ST_GeodesicBuffer.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Buffer

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
distance (pyspark.sql.Column/int/float/tuple) – Distance used to create the buffer. When specified without a unit, it can be a LongType, DoubleType or StringType column with numeric values or a numeric value. When specified with a unit, it can be a StructType column or tuple containing a number and a unit.

Returns:

Polygon column representing the buffer around the input geometry.

Return type:

pyspark.sql.Column

cast¶

geoanalytics.sql.functions.cast(geometry, geometry_type)¶

Returns a geometry column that contains the input geometries cast to the geometry type specified by the string value. The string value can be ‘point’, ‘multipoint’, ‘linestring’, ‘polygon’, or ‘geometry’. The function will return null when the input geometries cannot be cast to the specified type.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Cast

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
geometry_type (str) – Geometry type to cast the input geometry to.

Returns:

Geometry column representing the cast geometry type.

Return type:

pyspark.sql.Column

centerline¶

geoanalytics.sql.functions.centerline(geometry)¶

Creates a centerline of a polygon geometry.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Centerline

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: Geometry column with the centerline of the polygon feature.
Return type:: pyspark.sql.Column

centroid¶

geoanalytics.sql.functions.centroid(geometry)¶

Returns a point column that represents the centroid of each input geometry. The result point is not guaranteed to be on the surface of the geometry.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Centroid

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: Point column representing the centroid.
Return type:: pyspark.sql.Column
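
The note that a centroid may fall outside its geometry can be seen with a small pure-Python sketch. It uses the standard area-weighted centroid formula for a simple polygon ring; `polygon_centroid` is a hypothetical helper, and whether the library computes centroids in exactly this way is an assumption.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon ring of (x, y) vertices."""
    n = len(ring)
    twice_area = cx = cy = 0.0
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        cross = x1 * y2 - x2 * y1
        twice_area += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    # 6 * area == 3 * twice_area
    return (cx / (3.0 * twice_area), cy / (3.0 * twice_area))


# A U-shaped polygon: a 3 x 3 square with a notch cut out of the top
# (the notch covers 1 < x < 2, 1 < y < 3).
u_shape = [(0, 0), (3, 0), (3, 3), (2, 3), (2, 1), (1, 1), (1, 3), (0, 3)]
centroid_x, centroid_y = polygon_centroid(u_shape)
```

For this shape the centroid lies at roughly (1.5, 1.36), which is inside the notch and therefore not on the polygon itself.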

closest_point¶

geoanalytics.sql.functions.closest_point(geometry1, geometry2)¶

Returns a point column representing the point on the first geometry that is closest to the second geometry. This function calculates the planar distance between the two geometries to identify the closest point on the first geometry in relation to the second geometry. To learn more about the difference between planar and geodesic calculations, see Coordinate systems and transformations. If the two input geometry columns are in different spatial references, the output geometry will use the spatial reference of the first geometry.

Refer to the GeoAnalytics guide for examples and usage notes: ST_ClosestPoint

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

Point column representing the closest point.

Return type:

pyspark.sql.Column
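
To show what "the point on the first geometry that is closest to the second geometry" means in the planar case, the sketch below handles the simplest configuration: the first geometry is a single line segment and the second is a point. It is a pure-Python illustration with a hypothetical helper name, not the library's implementation.

```python
def closest_point_on_segment(a, b, p):
    """Return the point on segment a-b that is closest to point p (planar)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return (float(ax), float(ay))
    # Project p onto the line through a and b, then clamp to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


segment_start, segment_end = (0.0, 0.0), (10.0, 0.0)
inside = closest_point_on_segment(segment_start, segment_end, (3.0, 4.0))
before = closest_point_on_segment(segment_start, segment_end, (-5.0, 2.0))
after = closest_point_on_segment(segment_start, segment_end, (20.0, 1.0))
```

When the projection falls within the segment, the result is the foot of the perpendicular, here (3.0, 0.0); otherwise it is the nearer endpoint.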

contains¶

geoanalytics.sql.functions.contains(geometry1, geometry2)¶

Returns a boolean column where the result is True if the geometry in the first column completely contains the second; otherwise, it is False.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Contains

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

BooleanType column. True if geometry1 contains geometry2, False otherwise.

Return type:

pyspark.sql.Column

convex_hull¶

geoanalytics.sql.functions.convex_hull(geometry)¶

Returns a geometry column that represents the convex hull of the input geometries in each record. A convex hull is the smallest geometry having only interior angles measuring less than 180° that encloses each input geometry. For multipoint, linestring, and polygon geometries the result will be a polygon. For point geometries, the result is a point. The result column will always have the generic geometry type.

Refer to the GeoAnalytics guide for examples and usage notes: ST_ConvexHull

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: Generic geometry column representing the convex hull of the given geometry.
Return type:: pyspark.sql.Column
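
The definition of a convex hull given above can be illustrated with Andrew's monotone chain algorithm on a set of points. This is a pure-Python sketch with a hypothetical `convex_hull_points` helper; it is not necessarily the algorithm the library uses, and it returns the hull vertices rather than a geometry value.

```python
def convex_hull_points(points):
    """Convex hull of (x, y) points, as vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b makes a counter-clockwise (left) turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        # Drop vertices that would create a non-left turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


# Four corners of a square plus one interior point.
hull = convex_hull_points([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)])
```

The interior point (1, 1) is discarded and only the four corners remain, since every interior angle of the hull must be less than 180°.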

coord_dim¶

geoanalytics.sql.functions.coord_dim(geometry)¶

Returns an integer column representing the dimensionality of the coordinates in the input geometry. For example, an input geometry with x,y coordinates only will return 2, while a geometry with x,y,z coordinates will return 3.

Refer to the GeoAnalytics guide for examples and usage notes: ST_CoordDim

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: IntegerType column representing the dimensionality.
Return type:: pyspark.sql.Column

create_distance¶

geoanalytics.sql.functions.create_distance(value, unit)¶

Returns a struct column representing a distance.

Refer to the GeoAnalytics guide for examples and usage notes: ST_CreateDistance

Parameters:

value (pyspark.sql.Column/float/int) – A numeric value or column of numeric values.
unit (pyspark.sql.Column/str) – The distance unit. Choose from Meters, Kilometers, Feet, Yards, Miles, or NauticalMiles.

Returns:

StructType column representing the distance.

Return type:

pyspark.sql.Column

create_duration¶

geoanalytics.sql.functions.create_duration(value, unit)¶

Returns a struct column representing a duration.

Refer to the GeoAnalytics guide for examples and usage notes: ST_CreateDuration

Parameters:

value (pyspark.sql.Column/int) – An integer value or column of integer values.
unit (pyspark.sql.Column/str) – The duration unit. Choose from Milliseconds, Seconds, Minutes, Hours, or Days.

Returns:

StructType column representing the duration.

Return type:

pyspark.sql.Column

crosses¶

geoanalytics.sql.functions.crosses(geometry1, geometry2)¶

Returns a boolean column where the result is True if the two geometries cross; otherwise, it returns False. Two geometries cross when their intersection is not empty and is not equal to either of the geometries. The intersection must also have a dimensionality less than the maximum dimension of the two input geometries.

This function is only relevant for the following combinations of geometries:

multipoint/linestring

multipoint/polygon

linestring/polygon

linestring/multipoint

linestring/linestring

polygon/multipoint

polygon/linestring

For all other combinations the function will always return False.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Crosses

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

BooleanType column. True if geometry1 crosses geometry2, False otherwise.

Return type:

pyspark.sql.Column

densify¶

geoanalytics.sql.functions.densify(geometry, max_segment_length)¶

Returns a geometry column of densified geometries. This function adds vertices along linestrings and polygons such that every segment within the geometry is no longer than max_segment_length, using planar distance calculations. The max_segment_length can be specified with or without a unit. When specified with a unit, it can be created with ST_CreateDistance or with a tuple containing a number and a unit (e.g., (10, “meters”)). When specified without a unit, it is interpreted in the same units as the input geometry and should be greater than zero. To densify geometry using geodesic distance calculations, use ST_GeodesicDensify.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Densify

Parameters:

geometry (pyspark.sql.Column) – Polygon or linestring geometry column.
max_segment_length (pyspark.sql.Column/int/float/tuple) – Maximum length of all planar segments in the resulting polygon or linestring.

Returns:

Geometry column representing the densified linestrings or polygons with planar distance calculation.

Return type:

pyspark.sql.Column
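
The effect of planar densification can be sketched in pure Python as follows. The `densify_coords` helper is hypothetical, and splitting each segment into equal pieces is an assumption made for the illustration; the library may place the added vertices differently.

```python
import math


def densify_coords(coords, max_segment_length):
    """Insert vertices so that no segment exceeds max_segment_length."""
    if max_segment_length <= 0:
        raise ValueError("max_segment_length must be greater than zero")
    out = [coords[0]]
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        length = math.hypot(x2 - x1, y2 - y1)
        # Number of equal pieces needed to stay within the limit.
        pieces = max(1, math.ceil(length / max_segment_length))
        for i in range(1, pieces + 1):
            t = i / pieces
            out.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return out


# A single 10-unit segment with a 4-unit limit is split into 3 pieces.
dense = densify_coords([(0.0, 0.0), (10.0, 0.0)], 4)
longest = max(
    math.hypot(bx - ax, by - ay) for (ax, ay), (bx, by) in zip(dense, dense[1:])
)
```

The original endpoints are preserved, two vertices are added between them, and `longest` stays within the 4-unit limit.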

difference¶

geoanalytics.sql.functions.difference(geometry1, geometry2)¶

Returns a geometry column representing the parts of the first geometry that do not intersect the second geometry. The result column geometry type will be that of the first geometry. If the first geometry is completely contained in the second geometry, a null geometry is returned.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Difference

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

Geometry column representing the part of geometry1 that does not intersect geometry2.

Return type:

pyspark.sql.Column

dimension¶

geoanalytics.sql.functions.dimension(geometry)¶

Returns an integer column representing the dimensionality of the input geometry. Points and multipoints have a dimension of 0, lines 1, and polygons 2.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Dimension

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: IntegerType column representing the dimensionality.
Return type:: pyspark.sql.Column

disjoint¶

geoanalytics.sql.functions.disjoint(geometry1, geometry2)¶

Returns a boolean column where the result is True if the first and second geometry are disjoint; otherwise, it returns False. Two geometries are disjoint if they are not overlapping, touching or intersecting each other.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Disjoint

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

BooleanType column. True if geometry1 and geometry2 are disjoint, False otherwise.

Return type:

pyspark.sql.Column

distance¶

geoanalytics.sql.functions.distance(geometry1, geometry2)¶

Returns a double column representing the planar distance between the two input geometries. For multipoints, lines, and polygons, the distance is measured between the nearest points of the two geometries. The result will be in the same units as the input geometry data. For example, if your input geometries are in a spatial reference that uses meters, the result values will be in meters. If your input geometries are in a geographic coordinate system, use ST_GeodesicDistance to calculate distance.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Distance

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

DoubleType column representing the planar distance.

Return type:

pyspark.sql.Column

dwithin¶

geoanalytics.sql.functions.dwithin(geometry1, geometry2, distance, geodesic=False)¶

Returns a boolean column where the result is True if the two geometries are spatially within the given distance; otherwise, it returns False. For multipoints, lines, and polygons, the distance is measured between the nearest points of the two geometries. You can optionally provide a boolean value that determines if geodesic distances will be used by the function. Planar distances will be used by default.

The distance can be specified with or without a unit. When specified with a unit, the distance can be created with ST_CreateDistance or with a tuple containing a number and a unit (e.g., (10, “kilometers”)). When geodesic distance is used, and distance is specified without a unit, the function interprets the value as distance in meters. When planar distance is used, and distance is specified without a unit, the function interprets the value as distance in the same units as the input geometry.

Refer to the GeoAnalytics guide for examples and usage notes: ST_DWithin

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.
distance (pyspark.sql.Column/int/float/tuple) – Distance value to use. When specified without a unit, it can be a LongType, DoubleType or StringType column with numeric values or a numeric value. When specified with a unit, it can be a StructType column or tuple containing a number and a unit.
geodesic (bool, optional) – Geodesic distance will be used between geometries instead of planar distance, defaults to False.

Returns:

BooleanType column. True if geometry1 and geometry2 are spatially within a given distance, False otherwise.

Return type:

pyspark.sql.Column

end_point¶

geoanalytics.sql.functions.end_point(linestring)¶

Returns a point column representing the last point of the input linestring.

Refer to the GeoAnalytics guide for examples and usage notes: ST_EndPoint

Parameters:: linestring (pyspark.sql.Column) – Linestring geometry column.
Returns:: Point column representing the ending point.
Return type:: pyspark.sql.Column

envelope¶

geoanalytics.sql.functions.envelope(geometry)¶

Returns a polygon column representing an envelope for each geometry in the input column, where an envelope is the smallest rectangle that encompasses the input geometry and aligns to the x-axis and y-axis. To find the smallest rectangle that encompasses a geometry but is not axis-aligned, use ST_MinBoundingBox. If the input geometry is a single point, the function will create a degenerate polygon at the location of the point.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Envelope

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: Polygon column representing the envelope.
Return type:: pyspark.sql.Column

env_intersects¶

geoanalytics.sql.functions.env_intersects(geometry, *args, **kwargs)¶

Returns a boolean column where the result is True if the envelopes of two geometries spatially intersect; otherwise, it returns False.

Note

The behavior of ST_EnvIntersects was changed at version 1.2.0. ST_EnvIntersects will not support defining envelopes with minimum and maximum x,y coordinates in the future. Use ST_BboxIntersects to do this instead.

Refer to the GeoAnalytics guide for examples and usage notes: ST_EnvIntersects

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

BooleanType column. True if the envelopes of two geometries spatially intersect in 2D, False otherwise.

Return type:

pyspark.sql.Column
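
An envelope test compares bounding rectangles rather than the geometries themselves, so it can return True for geometries that never touch. The pure-Python sketch below illustrates this; `envelope_of` and `envelopes_intersect` are hypothetical helpers, and treating envelopes that merely touch as intersecting is an assumption of the sketch.

```python
def envelope_of(coords):
    """Axis-aligned bounding rectangle as (xmin, ymin, xmax, ymax)."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def envelopes_intersect(coords1, coords2):
    """True if the 2D envelopes of the two coordinate lists overlap or touch."""
    axmin, aymin, axmax, aymax = envelope_of(coords1)
    bxmin, bymin, bxmax, bymax = envelope_of(coords2)
    return axmin <= bxmax and bxmin <= axmax and aymin <= bymax and bymin <= aymax


diagonal = [(0, 0), (4, 4)]
parallel = [(3, 0), (4, 1)]  # parallel to the diagonal, never touches it
far_away = [(10, 10), (11, 11)]

overlapping = envelopes_intersect(diagonal, parallel)
separate = envelopes_intersect(diagonal, far_away)
```

Here `overlapping` is True even though the two lines are parallel and do not meet, because the envelope of `parallel` sits inside the envelope of `diagonal`; `separate` is False.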

equals¶

geoanalytics.sql.functions.equals(geometry1, geometry2)¶

Returns a boolean column where the result is True if the first geometry and the second geometry are spatially equal; otherwise, it returns False.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Equals

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

BooleanType column. True if geometry1 and geometry2 are spatially equal, False otherwise.

Return type:

pyspark.sql.Column

euclidean_distance¶

geoanalytics.sql.functions.euclidean_distance(linestring1, linestring2)¶

Returns a double column representing the Euclidean distance between the two input linestring geometries. The result will be in the same units as the input geometry data. For example, if the input geometries are in a spatial reference that uses meters, the result values will be in meters.

Refer to the GeoAnalytics guide for examples and usage notes: ST_EuclideanDistance

Parameters:

linestring1 (pyspark.sql.Column) – Linestring geometry column.
linestring2 (pyspark.sql.Column) – Linestring geometry column.

Returns:

DoubleType column representing the Euclidean distance between the two input linestrings.

Return type:

pyspark.sql.Column

exterior_ring¶

geoanalytics.sql.functions.exterior_ring(polygon)¶

Returns a linestring column representing the exterior ring of the polygon. Multipart polygons will return null.

Refer to the GeoAnalytics guide for examples and usage notes: ST_ExteriorRing

Parameters:: polygon (pyspark.sql.Column) – Polygon geometry column.
Returns:: Linestring column representing the exterior ring.
Return type:: pyspark.sql.Column

flip¶

geoanalytics.sql.functions.flip(geometry, mode)¶

Flips the input geometry around an axis. There are three options:

X_AXIS - flips a geometry vertically around the horizontal axis.
Y_AXIS - flips a geometry horizontally around the vertical axis.
BOTH_AXES - flips a geometry horizontally around the vertical axis and vertically around the horizontal axis.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Flip

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
mode (str) – Choose from X_AXIS, Y_AXIS, or BOTH_AXES.

Returns:

Geometry column with the flipped geometries. The returned geometry column type will be the same as the input geometry column type.

Return type:

pyspark.sql.Column

fréchet_distance¶

geoanalytics.sql.functions.frechet_distance(linestring1, linestring2)¶

Returns a double column representing the discrete Fréchet distance between the two input linestring geometries.

The function can be used with or without the diacritic.

Refer to the GeoAnalytics guide for examples and usage notes: ST_FréchetDistance

Parameters:

linestring1 (pyspark.sql.Column) – Linestring geometry column.
linestring2 (pyspark.sql.Column) – Linestring geometry column.

Returns:

DoubleType column representing the discrete Fréchet distance between the two input linestrings.

Return type:

pyspark.sql.Column
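
The discrete Fréchet distance considers only the vertices of the two linestrings. A minimal dynamic-programming sketch in pure Python is shown below; `discrete_frechet` is a hypothetical helper that uses planar distances between vertices, and it is not presented as the library's implementation.

```python
import math


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two vertex lists of (x, y) tuples."""
    n, m = len(p), len(q)
    # ca[i][j] is the coupling distance for the prefixes p[:i+1] and q[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_previous = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_previous, d)
    return ca[n - 1][m - 1]


# Two parallel horizontal lines one unit apart.
line_a = [(0, 0), (1, 0), (2, 0)]
line_b = [(0, 1), (1, 1), (2, 1)]
frechet = discrete_frechet(line_a, line_b)
```

For these two parallel lines the result is 1.0, and comparing a line with itself gives 0.0.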

generalize¶

geoanalytics.sql.functions.generalize(geometry, tolerance)¶

Returns a geometry column that generalizes the input linestring or polygon geometry using the Douglas-Peucker algorithm with the specified tolerance. The tolerance can be specified with or without a unit. When specified with a unit, the tolerance can be created with ST_CreateDistance or with a tuple containing a number and a unit (e.g., (10, “meters”)). When specified without a unit, the tolerance can be a single value or a numeric column, and it is interpreted as distance in the same units as the input geometry. The result is the input geometry generalized to include only a subset of the original geometry’s vertices. Point and multipoint geometry types are not supported as input.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Generalize

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
tolerance (pyspark.sql.Column/int/float/tuple) – Numeric value that limits the distance the output geometry can differ from the input geometry. When specified without a unit, it can be a LongType, DoubleType or StringType column or a numeric value. When specified with a unit, it can be a StructType column or tuple containing a number and a unit.

Returns:

Geometry column with the generalized geometries. The returned geometry column type will be the same as the input geometry column type.

Return type:

pyspark.sql.Column
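
The Douglas-Peucker algorithm named above keeps a vertex only if it lies farther than the tolerance from the line joining the endpoints of the current span. The following pure-Python sketch works on a list of linestring vertices; the helper names are hypothetical and the code is an illustration of the algorithm, not the library's implementation.

```python
import math


def _distance_to_line(p, a, b):
    """Perpendicular distance from p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / math.hypot(dx, dy)


def douglas_peucker(coords, tolerance):
    """Return a subset of coords that stays within tolerance of the input."""
    if len(coords) < 3:
        return list(coords)
    first, last = coords[0], coords[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(coords) - 1):
        d = _distance_to_line(coords[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax > tolerance:
        # Keep the farthest vertex and simplify both halves recursively.
        left = douglas_peucker(coords[: index + 1], tolerance)
        right = douglas_peucker(coords[index:], tolerance)
        return left[:-1] + right
    # Every intermediate vertex is within tolerance: keep only the endpoints.
    return [first, last]


wobbly = [(0, 0), (1, 0.1), (2, 0), (3, 0.1), (4, 0)]
simplified = douglas_peucker(wobbly, 0.5)

peak = [(0, 0), (2, 3), (4, 0)]
unchanged = douglas_peucker(peak, 1.0)
```

The small wobbles in `wobbly` are within the tolerance, so only the two endpoints remain, while the vertex at (2, 3) in `peak` is too far from the baseline to drop. In both cases the output contains only vertices that were present in the input.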

geodesic_area¶

geoanalytics.sql.functions.geodesic_area(geometry)¶

Returns a double column containing the geodesic area of the input geometry in square meters. For point, multipoint, and linestring geometries this function will always return 0. This function is more accurate but less performant than ST_Area and requires that a spatial reference is set on the input geometry column. To learn more about the difference between planar and geodesic calculations, see Coordinate systems and transformations.

Refer to the GeoAnalytics guide for examples and usage notes: ST_GeodesicArea

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: DoubleType column representing the geodesic area.
Return type:: pyspark.sql.Column

geodesic_buffer¶

geoanalytics.sql.functions.geodesic_buffer(geometry, distance)¶

Returns a polygon column with buffer polygons representing the geodesic area that is less than or equal to the specified distance from each input geometry. The distance can be specified with or without a unit. When specified with a unit, the distance can be created with ST_CreateDistance or with a tuple containing a number and a unit (e.g., (10, “kilometers”)). When specified without a unit, the distance can be a single value or a numeric column, interpreted as distance in meters. The result will also be in meters. This function is more accurate but less performant than ST_Buffer and requires that a spatial reference is set on the input geometry column. To learn more about the difference between planar and geodesic calculations, see Coordinate systems and transformations.

Refer to the GeoAnalytics guide for examples and usage notes: ST_GeodesicBuffer

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
distance (pyspark.sql.Column/int/float/tuple) – Distance used to create the buffer. When specified without a unit, it can be a LongType, DoubleType or StringType column with numeric values or a numeric value. When specified with a unit, it can be a StructType column or tuple containing a number and a unit.

Returns:

Polygon column representing the geodesic buffer around the input geometry.

Return type:

pyspark.sql.Column

geodesic_closest_point¶

geoanalytics.sql.functions.geodesic_closest_point(geometry1, geometry2)¶

Returns a point column representing the point on the first geometry that is closest to the second geometry. This function calculates the geodesic distance between the two geometries to identify the closest point on the first geometry in relation to the second geometry. This function requires that a spatial reference is set on the input geometry columns. If the two geometry columns are in different spatial references, the function automatically transforms the second geometry into the spatial reference of the first. To learn more about the difference between planar and geodesic calculations, see Coordinate systems and transformations.

Refer to the GeoAnalytics guide for examples and usage notes: ST_GeodesicClosestPoint

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

Point column representing the geodesic closest point.

Return type:

pyspark.sql.Column

geodesic_densify¶

geoanalytics.sql.functions.geodesic_densify(geometry, max_segment_length)¶

Returns a geometry column of densified geometries. This function adds vertices along linestrings or polygons to create densified approximations of geodesic segments with each segment being no longer than max_segment_length. The max_segment_length can be specified with or without a unit. When specified with a unit, it can be created with ST_CreateDistance or with a tuple containing a number and a unit (e.g., (10, “meters”)). When specified without a unit, max_segment_length should be specified in meters and greater than zero. This function is more accurate but less performant than ST_Densify and requires that a spatial reference is set on the input geometry column. To learn more about the difference between planar and geodesic calculations see Coordinate systems and transformations.

Refer to the GeoAnalytics guide for examples and usage notes: ST_GeodesicDensify

Parameters:

geometry (pyspark.sql.Column) – Polygon or linestring geometry column.
max_segment_length (pyspark.sql.Column/int/float/tuple) – Maximum length of each geodesic segment in the resulting polygon or linestring. Interpreted as meters when specified without a unit.

Returns:

Geometry column representing the densified linestrings or polygons with geodesic distance calculation.

Return type:

pyspark.sql.Column
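
The densification idea described above can be illustrated without Spark. The following is a minimal sketch in plain Python, not the library's implementation: it assumes a spherical Earth with a mean radius of 6,371,008.8 m (the function itself presumably works on the ellipsoid of the input spatial reference), takes coordinates as (longitude, latitude) in degrees, and does not handle antipodal endpoints. The name densify_great_circle is hypothetical.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumption: mean Earth radius, spherical model


def densify_great_circle(start, end, max_segment_length):
    """Insert vertices along the great circle from start to end so that no
    piece is longer than max_segment_length (meters).

    start and end are (lon, lat) tuples in degrees.
    """
    if max_segment_length <= 0:
        raise ValueError("max_segment_length must be greater than zero")

    lmb1, phi1 = math.radians(start[0]), math.radians(start[1])
    lmb2, phi2 = math.radians(end[0]), math.radians(end[1])

    # Angular distance between the endpoints (haversine formula).
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin((lmb2 - lmb1) / 2) ** 2)
    delta = 2 * math.asin(min(1.0, math.sqrt(a)))
    total = delta * EARTH_RADIUS_M

    pieces = max(1, math.ceil(total / max_segment_length))
    if pieces == 1:
        return [start, end]

    points = [start]
    for i in range(1, pieces):
        f = i / pieces
        # Spherical linear interpolation between the two endpoints.
        w1 = math.sin((1 - f) * delta) / math.sin(delta)
        w2 = math.sin(f * delta) / math.sin(delta)
        x = w1 * math.cos(phi1) * math.cos(lmb1) + w2 * math.cos(phi2) * math.cos(lmb2)
        y = w1 * math.cos(phi1) * math.sin(lmb1) + w2 * math.cos(phi2) * math.sin(lmb2)
        z = w1 * math.sin(phi1) + w2 * math.sin(phi2)
        lat = math.atan2(z, math.hypot(x, y))
        lon = math.atan2(y, x)
        points.append((math.degrees(lon), math.degrees(lat)))
    points.append(end)
    return points


# A quarter of the equator split into pieces of at most 1,000 km.
vertices = densify_great_circle((0.0, 0.0), (90.0, 0.0), 1_000_000.0)
print(len(vertices))
```

The number of pieces is the total length divided by max_segment_length, rounded up, so every piece has the same length and none exceeds the limit.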

geodesic_distance¶

geoanalytics.sql.functions.geodesic_distance(geometry1, geometry2)¶

Returns a double column representing the geodesic distance between the two input geometries in meters. For multipoints, lines, and polygons, the distance is calculated between the nearest points of the two geometries. This function is more accurate but less performant than ST_Distance and requires that a spatial reference is set on at least the first input geometry column. If the two geometry columns are in different spatial references, the function will automatically transform the second geometry into the spatial reference of the first. To learn more about the difference between planar and geodesic calculations, see Coordinate systems and transformations.

Refer to the GeoAnalytics guide for examples and usage notes: ST_GeodesicDistance

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

DoubleType column representing the geodesic distance.

Return type:

pyspark.sql.Column
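
To give a feel for what a geodesic distance in meters means for two points, here is a minimal sketch in plain Python. It is an approximation under stated assumptions, not the function's actual algorithm: it uses the haversine formula on a sphere with a mean radius of 6,371,008.8 m, whereas a true geodesic calculation uses the ellipsoid of the spatial reference. The name haversine_m is hypothetical, and inputs are longitude and latitude in degrees.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumption: mean Earth radius, spherical model


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to guard against rounding pushing the value just above 1.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


# Equator to the North Pole along a meridian: a quarter of the circumference.
print(haversine_m(0.0, 0.0, 0.0, 90.0))
```

Unlike a planar distance computed on raw longitude and latitude values, the result here is in meters regardless of where on the globe the points lie.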

geodesic_length¶

geoanalytics.sql.functions.geodesic_length(geometry)¶

Returns a double column that represents the geodesic length of the input geometry. The length is calculated in meters. For point and multipoint geometries this function will always return 0. For polygon geometries this function will return the geodesic length of the perimeter of the polygon. This function is more accurate but less performant than ST_Length and requires that a spatial reference is set on the input geometry column. To learn more about the difference between planar and geodesic calculations see Coordinate systems and transformations.

Refer to the GeoAnalytics guide for examples and usage notes: ST_GeodesicLength

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: DoubleType column representing the geodesic length.
Return type:: pyspark.sql.Column

geodesic_shortest_line¶

geoanalytics.sql.functions.geodesic_shortest_line(geometry1, geometry2)¶

Returns a linestring column representing the shortest line that touches two geometries, using geodesic distance calculation. If more than one shortest line exists, this function returns only one of them. If the two input geometries intersect, an empty line geometry is returned. This function requires that a spatial reference is set on the input geometry columns. If the two geometry columns are in different spatial references, the function automatically transforms the second geometry into the spatial reference of the first. To learn more about the difference between planar and geodesic calculations, see Coordinate systems and transformations.

Refer to the GeoAnalytics guide for examples and usage notes: ST_GeodesicShortestLine

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

Linestring column representing the geodesic shortest line.

Return type:

pyspark.sql.Column

geohash_bin¶

geoanalytics.sql.functions.geohash_bin(geometry, precision)¶

Returns a bin column containing a single Geohash bin at the specified precision for each record in the input column. Use ST_BinGeometry to obtain the geometry of each result bin.

This function can also be called with a string column representing the ID of the bin (see ST_BinId). The bin ID will be cast to a bin column.

ST_GeohashBin requires the spatial reference of the geometry column to be GCS_WGS_1984 (EPSG:4326). If the input geometry is in a different spatial reference, this function automatically transforms the geometry into GCS_WGS_1984. To learn more about spatial references, see Coordinate systems and transformations.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_GeohashBin

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
precision (int) – Numerical value representing the size of the Geohash bin.

Returns:

Spatial bin (bin2d) column representing a single Geohash bin for each geometry.

Return type:

pyspark.sql.Column
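
For readers unfamiliar with how precision relates to bin size, the following plain-Python sketch shows the standard public Geohash encoding for a single longitude/latitude pair in WGS 1984. It is an illustration of the Geohash scheme only; whether the library's bin IDs are these exact strings is an assumption, and the name geohash_encode is hypothetical.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision):
    """Encode a WGS 1984 latitude/longitude as a Geohash string.

    Each character adds 5 bits, alternately halving the longitude and
    latitude ranges, so a higher precision yields a smaller bin.
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    value = 0
    bits = 0
    use_lon = True  # the first bit always refines longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1
                lon_lo = mid
            else:
                value <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:
            chars.append(_BASE32[value])
            value = 0
            bits = 0
    return "".join(chars)


print(geohash_encode(57.64911, 10.40744, 5))
```

A bin at a lower precision always contains the bins whose IDs start with its ID, which is why truncating a Geohash string gives the enclosing coarser bin.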

geohash_bins¶

geoanalytics.sql.functions.geohash_bins(geometry, precision, padding=0.0)¶

Returns an array column containing Geohash bins at the specified precision that cover the spatial extent of each record in the input column. You can optionally specify a numeric value for padding, which conceptually applies a buffer of the specified distance to the input geometry before creating the Geohash bins. The padding value is in meters. Use ST_BinGeometry to obtain the geometry of each result bin.

ST_GeohashBins requires the spatial reference to be set to GCS_WGS_1984 (EPSG:4326). If the input geometry is in a different spatial reference, this function automatically transforms the geometry into GCS_WGS_1984. To learn more about spatial references, see Coordinate systems and transformations.

Refer to the GeoAnalytics Engine guide for examples and usage notes: ST_GeohashBins

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
precision (int) – Numerical value representing the size of the Geohash bin.
padding (float, optional) – Numerical buffer value applied to the geometry before finding the intersecting bins, defaults to 0.0.

Returns:

Array column representing an array of spatial bin (bin2d) Geohash bins.

Return type:

pyspark.sql.Column

geom_from_binary¶

geoanalytics.sql.functions.geom_from_binary(wkb, sr=None)¶

Returns a geometry column. The input binary column must contain the well-known binary (WKB) representation of geometries. You can optionally specify a spatial reference for the result geometry column. The sr parameter value must be a valid SRID or WKT string. This function should only be used when you don’t know the geometry type represented by the input column or when the input column contains more than one geometry type. In other cases, use the function specific to the geometry type of your input data (i.e. ST_PointFromBinary, ST_LineFromBinary, ST_MPointFromBinary, or ST_PolyFromBinary).

Refer to the GeoAnalytics guide for examples and usage notes: ST_GeomFromBinary

Parameters:

wkb (pyspark.sql.Column) – BinaryType column with the Well-Known Binary (WKB) representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned generic geometry, defaults to None.

Returns:

Generic geometry column from the Well-Known Binary (WKB) representation.

Return type:

pyspark.sql.Column
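
When deciding between this function and a type-specific one, it can help to check which geometry types a sample of WKB values actually contains. The sketch below reads the WKB header in plain Python. It assumes OGC/ISO WKB (a byte-order byte followed by a 32-bit type code, with ISO Z/M variants offset by multiples of 1000); EWKB flag bits are not handled, and the name wkb_geometry_type is hypothetical.

```python
import struct

_WKB_TYPES = {
    1: "Point",
    2: "LineString",
    3: "Polygon",
    4: "MultiPoint",
    5: "MultiLineString",
    6: "MultiPolygon",
    7: "GeometryCollection",
}


def wkb_geometry_type(wkb):
    """Return the geometry type name stored in a WKB header."""
    if len(wkb) < 5:
        raise ValueError("WKB value is too short to contain a header")
    byte_order = wkb[0]
    if byte_order not in (0, 1):
        raise ValueError("invalid WKB byte-order flag")
    fmt = "<I" if byte_order == 1 else ">I"  # 1 = little endian, 0 = big endian
    (code,) = struct.unpack_from(fmt, wkb, 1)
    # ISO WKB encodes Z, M, and ZM variants as 1000, 2000, and 3000 offsets.
    return _WKB_TYPES.get(code % 1000, "Unknown")


# A little-endian WKB point at (30, 10).
sample = struct.pack("<BIdd", 1, 1, 30.0, 10.0)
print(wkb_geometry_type(sample))
```

If every value in the sample reports the same type, the type-specific function for that type is the better fit.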

geom_from_esri_json¶

geoanalytics.sql.functions.geom_from_esri_json(json_str, sr=None)¶

Returns a geometry column. The input string column must contain the Esri JSON representation of geometries. You can optionally specify a spatial reference for the result geometry column. The sr parameter value must be a valid SRID or WKT string. Any spatial reference defined in the input strings will not be used. This function should only be used when you don’t know the geometry type represented by the input column or when the input column contains more than one geometry type. In other cases, use the function specific to the geometry type of your input data (i.e. ST_PointFromEsriJSON, ST_LineFromEsriJSON, ST_MPointFromEsriJSON, or ST_PolyFromEsriJSON).

Refer to the GeoAnalytics guide for examples and usage notes: ST_GeomFromEsriJSON

Parameters:

json_str (pyspark.sql.Column) – StringType column with the Esri JSON representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned generic geometry, defaults to None.

Returns:

Generic geometry column from the Esri JSON representation.

Return type:

pyspark.sql.Column

geom_from_geojson¶

geoanalytics.sql.functions.geom_from_geojson(json_str, sr=None)¶

Returns a geometry column. The input string column must contain the GeoJSON representation of geometries. You can optionally specify a spatial reference for the result geometry column. The sr parameter value must be a valid SRID or WKT string. Any spatial reference defined in the input strings will not be used. This function should only be used when you don’t know the geometry type represented by the input column or when the input column contains more than one geometry type. In other cases, use the function specific to the geometry type of your input data (i.e. ST_PointFromGeoJSON, ST_LineFromGeoJSON, ST_MPointFromGeoJSON, or ST_PolyFromGeoJSON).

Refer to the GeoAnalytics guide for examples and usage notes: ST_GeomFromGeoJSON

Parameters:

json_str (pyspark.sql.Column) – StringType column with the GeoJSON representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned generic geometry, defaults to None.

Returns:

Generic geometry column from the GeoJSON representation.

Return type:

pyspark.sql.Column

geom_from_shape¶

geoanalytics.sql.functions.geom_from_shape(shp, sr=None)¶

Returns a geometry column. The input binary column must contain the shapefile representation of geometries. You can optionally specify a spatial reference for the result geometry column. The sr parameter value must be a valid SRID or WKT string. This function should only be used when you don’t know the geometry type represented by the input column or when the input column contains more than one geometry type. In other cases, use the function specific to the geometry type of your input data (i.e. ST_PointFromShape, ST_LineFromShape, ST_MPointFromShape, or ST_PolyFromShape).

Refer to the GeoAnalytics guide for examples and usage notes: ST_GeomFromShape

Parameters:

shp (pyspark.sql.Column) – BinaryType column with the shapefile representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned generic geometry, defaults to None.

Returns:

Generic geometry column from the shapefile representation.

Return type:

pyspark.sql.Column

geom_from_text¶

geoanalytics.sql.functions.geom_from_text(wkt, sr=None)¶

Returns a geometry column. The string column must contain the well-known text (WKT) representation of geometries. You can optionally specify a spatial reference for the result geometry column. The sr parameter value must be a valid SRID or WKT string. This function should only be used when you don’t know the geometry type represented by the input column or when the input column contains more than one geometry type. In other cases, use the function specific to the geometry type of your input data (i.e. ST_PointFromText, ST_LineFromText, ST_MPointFromText, or ST_PolyFromText).

Refer to the GeoAnalytics guide for examples and usage notes: ST_GeomFromText

Parameters:

wkt (pyspark.sql.Column) – StringType column with the well-known text (WKT) representation of geometries.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned generic geometry, defaults to None.

Returns:

Generic geometry column from the well-known text (WKT) representation.

Return type:

pyspark.sql.Column

geometries¶

geoanalytics.sql.functions.geometries(geometry)¶

Returns an array column. Multipoint geometries return an array of points, multipart linestrings return an array of single-path linestrings, and multipart polygons return an array of single-ring polygons.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Geometries

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: Array column representing an array of single part geometries.
Return type:: pyspark.sql.Column

geometry_n¶

geoanalytics.sql.functions.geometry_n(geometry, n)¶

Returns a geometry column. The output column contains the nth single-part geometry from a multipart geometry. When n=0, the first single-part geometry is returned. If the nth geometry doesn’t exist, null is returned.

Refer to the GeoAnalytics guide for examples and usage notes: ST_GeometryN

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
n (pyspark.sql.Column/int) – Index of the geometry to return. Can be an IntegerType column or an integer value.

Returns:

Geometry column representing the nth geometry. The returned geometry column type will be the same type as the input geometries except in the case of multipoint input which will return point.

Return type:

pyspark.sql.Column

geometry_type¶

geoanalytics.sql.functions.geometry_type(geometry)¶

Returns a string column. The string indicates the type of each input geometry (i.e. ‘Point’, ‘MultiPoint’, ‘Linestring’, or ‘Polygon’).

Refer to the GeoAnalytics guide for examples and usage notes: ST_GeometryType

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: StringType column representing the geometry type.
Return type:: pyspark.sql.Column

h3_bin¶

geoanalytics.sql.functions.h3_bin(geometry, h3_resolution)¶

Returns a bin column containing a single H3 bin at the specified resolution for each record in the input column. The centroid of the input geometry is guaranteed to intersect with the bin returned but is not necessarily coincident with the bin center. Use ST_BinGeometry to obtain the geometry of each result bin.

This function can also be called with a long column representing the ID of the bin (see ST_BinId). The bin ID will be cast to a bin column.

ST_H3Bin requires the spatial reference of the geometry column to be GCS_WGS_1984 (EPSG:4326). If the input geometry is in a different spatial reference, this function automatically transforms the geometry into GCS_WGS_1984. To learn more about spatial references, see Coordinate systems and transformations.

Refer to the GeoAnalytics guide for examples and usage notes: ST_H3Bin

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
h3_resolution (int) – H3 cell resolution, see H3 documentation for more information.

Returns:

Spatial bin (bin2d) column representing a single H3 bin for each geometry.

Return type:

pyspark.sql.Column

h3_bins¶

geoanalytics.sql.functions.h3_bins(geometry, h3_resolution, padding=0.0)¶

Returns an array column containing H3 bins at the specified resolution that cover the spatial extent of each record in the input column. You can optionally specify a numeric value for padding, which conceptually applies a buffer of the specified distance to the input geometry before creating the H3 bins. The padding value is in meters. Use ST_BinGeometry to obtain the geometry of each result bin.

ST_H3Bins requires the spatial reference to be set to GCS_WGS_1984 (EPSG:4326). If the input geometry is in a different spatial reference, this function automatically transforms the geometry into GCS_WGS_1984. To learn more about spatial references, see Coordinate systems and transformations.

Refer to the GeoAnalytics guide for examples and usage notes: ST_H3Bins

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
h3_resolution (int) – H3 cell resolution, see H3 documentation for more information.
padding (float, optional) – Numerical buffer value applied to the geometry before finding the intersecting bins, defaults to 0.0.

Returns:

Array column representing an array of spatial bin (bin2d) H3 bins.

Return type:

pyspark.sql.Column

hausdorff_distance¶

geoanalytics.sql.functions.hausdorff_distance(geometry1, geometry2)¶

Returns a double column representing the Hausdorff distance between the two input geometries.

Refer to the GeoAnalytics guide for examples and usage notes: ST_HausdorffDistance

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

DoubleType column representing the Hausdorff distance between the two input geometries.

Return type:

pyspark.sql.Column
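
The entry above does not define the Hausdorff distance, so a small illustration may help. The sketch below computes the discrete Hausdorff distance between two sets of vertices in plain Python: the largest distance from any vertex in one set to its nearest vertex in the other. This is a simplification under stated assumptions; the function itself may consider whole geometries rather than vertices only, and the names used here are hypothetical.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a vertex in a to its nearest vertex in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def discrete_hausdorff(a, b):
    """Symmetric discrete Hausdorff distance between two vertex lists."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


line_a = [(0.0, 0.0), (10.0, 0.0)]
line_b = [(0.0, 0.0)]

# The directed distances differ, which is why the maximum of both is taken.
print(directed_hausdorff(line_a, line_b))  # 10.0
print(directed_hausdorff(line_b, line_a))  # 0.0
print(discrete_hausdorff(line_a, line_b))  # 10.0
```

A small Hausdorff distance means every part of each geometry is close to some part of the other, which makes it a common similarity measure for shapes.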

hex_bin¶

geoanalytics.sql.functions.hex_bin(geometry, bin_size)¶

Returns a bin column containing a single hexagonal bin for each record in the input column. The specified bin size determines the height of each bin and is in the same units as the input geometry. The centroid of the input geometry is guaranteed to intersect with the bin returned but is not necessarily coincident with the bin center. Use ST_BinGeometry to obtain the geometry of each result bin.

This function can also be called with a long column representing the ID of the bin (see ST_BinId). The bin ID will be cast to a bin column.

Refer to the GeoAnalytics guide for examples and usage notes: ST_HexBin

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
bin_size (int/float) – Numerical value representing the height of the hexagonal bin.

Returns:

Spatial bin (bin2d) column representing a single hexagonal bin for each geometry.

Return type:

pyspark.sql.Column

hex_bins¶

geoanalytics.sql.functions.hex_bins(geometry, bin_size, padding=0.0)¶

Returns an array column containing hexagonal bins that cover the spatial extent of each record in the input column. The specified bin size determines the height of each bin and is in the same units as the input geometry. You can optionally specify a numeric value for padding, which conceptually applies a buffer of the specified distance to the input geometry before creating the hexagonal bins.

Refer to the GeoAnalytics guide for examples and usage notes: ST_HexBins

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
bin_size (int/float) – Numerical value representing the height of the hexagonal bin.
padding (int/float/str, optional) – Numerical buffer value applied to the geometry before finding the intersecting bins, defaults to 0.0.

Returns:

Array column representing an array of spatial bin (bin2d) hexagonal bins.

Return type:

pyspark.sql.Column

interior_ring_n¶

geoanalytics.sql.functions.interior_ring_n(polygon, n)¶

Returns a linestring column. The output is the nth interior ring of the input polygon as a linestring. If there is more than one interior ring, the order of the interior rings is defined by the order in the input polygon. When n=0, the first interior ring is returned. If the index exceeds the number of interior rings in the polygon, null is returned. If the input is a multipolygon, null is returned.

Refer to the GeoAnalytics guide for examples and usage notes: ST_InteriorRingN

Parameters:

polygon (pyspark.sql.Column) – Polygon geometry column.
n (pyspark.sql.Column/int) – Index of the interior ring to return. Can be an IntegerType column or an integer value.

Returns:

Linestring column representing the nth interior ring.

Return type:

pyspark.sql.Column

intersection¶

geoanalytics.sql.functions.intersection(geometry1, geometry2, intersect_type=None)¶

Returns a geometry column containing the intersection of two input geometry records. You can optionally specify a string value that determines the geometry type of the result. The string can be one of: ‘multipoint’, ‘linestring’ or ‘polygon’. If no intersection type is specified, the function will return the same geometry type as the input geometry with the lowest dimension. For example, if you calculate the intersection of a polygon and a linestring the function will return a linestring.

The function will return an empty geometry if the two input geometries do not intersect or there is no intersection that matches the specified intersect type. If the intersection is a single point, the geometry type of the result column will be a multipoint.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Intersection

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.
intersect_type (str, optional) – Sets the output geometry type, defaults to None.

Returns:

Geometry column.

Return type:

pyspark.sql.Column
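
The default result type described above (the input type with the lowest dimension) can be expressed as a small lookup. This is only a sketch of the documented rule in plain Python, covering the three values that intersect_type accepts; it does not call the library, and the name default_intersect_type is hypothetical.

```python
# Topological dimension of each geometry type named in the description.
_DIMENSION = {"multipoint": 0, "linestring": 1, "polygon": 2}


def default_intersect_type(type1, type2):
    """Return the lower-dimension geometry type of the two inputs."""
    candidates = (type1.lower(), type2.lower())
    return min(candidates, key=_DIMENSION.__getitem__)


print(default_intersect_type("polygon", "linestring"))  # linestring
print(default_intersect_type("Polygon", "MultiPoint"))  # multipoint
```

Passing intersect_type explicitly overrides this default, which is useful when, for example, only the point crossings between two linestrings are of interest.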

intersects¶

geoanalytics.sql.functions.intersects(geometry1, geometry2)¶

Returns a boolean column where the result is True if the first geometry and the second geometry spatially intersect; otherwise, it returns False.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Intersects

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

BooleanType column. True if geometry1 and geometry2 spatially intersect in 2D, False otherwise.

Return type:

pyspark.sql.Column

is_3d¶

geoanalytics.sql.functions.is_3d(geometry)¶

Returns a boolean column where the result is True if the geometry is three-dimensional; otherwise, it returns False. The geometry is considered three-dimensional if it has x,y,z coordinates.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Is3D

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: BooleanType column. True if the geometry is three-dimensional, False otherwise.
Return type:: pyspark.sql.Column

is_closed¶

geoanalytics.sql.functions.is_closed(linestring)¶

Returns a boolean column where the result is True if the start and end point of a given linestring are coincident; otherwise, it returns False.

Refer to the GeoAnalytics guide for examples and usage notes: ST_IsClosed

Parameters:: linestring (pyspark.sql.Column) – Linestring geometry column.
Returns:: BooleanType column. True if the linestring is closed, False otherwise.
Return type:: pyspark.sql.Column

is_empty¶

geoanalytics.sql.functions.is_empty(geometry)¶

Returns a boolean column where the result is True if the geometry is empty; otherwise, it returns False.

Refer to the GeoAnalytics guide for examples and usage notes: ST_IsEmpty

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: BooleanType column. True if the geometry is empty, False otherwise.
Return type:: pyspark.sql.Column

is_measured¶

geoanalytics.sql.functions.is_measured(geometry)¶

Returns a boolean column where the result is True if the geometry has an m-value; otherwise, it returns False.

Refer to the GeoAnalytics guide for examples and usage notes: ST_IsMeasured

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: BooleanType column. True if the geometry has m-values, False otherwise.
Return type:: pyspark.sql.Column

is_ring¶

geoanalytics.sql.functions.is_ring(geometry)¶

Returns a boolean column where the result is True if the input linestring is closed and simple; otherwise, it returns False.

Refer to the GeoAnalytics guide for examples and usage notes: ST_IsRing

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: BooleanType column. True if the linestring is both closed and simple, False otherwise.
Return type:: pyspark.sql.Column

is_simple¶

geoanalytics.sql.functions.is_simple(geometry)¶

Returns a boolean column where the result is True if the given geometry is simple; otherwise, it returns False.

The criteria for simplicity vary for each geometry type and are as follows:

A point is always simple.

A multipoint is considered simple if no two points are coincident.

A linestring is considered simple if it does not cross the same point twice, except for start and end points. A multipart linestring is only considered simple if the parts do not intersect except at the start or end points of the parts.

A polygon or multipart polygon is considered simple if each ring does not cross the same point twice and no two rings cross or intersect except at a single point (i.e. they are tangent at a point but not a line).

Refer to the GeoAnalytics guide for examples and usage notes: ST_IsSimple

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: BooleanType column. True if the geometry is simple (it has no self-intersection or self-tangency), False otherwise.
Return type:: pyspark.sql.Column
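
The multipoint and linestring criteria above can be illustrated in plain Python. The sketch below is a simplified check under stated assumptions, not the library's implementation: it handles only multipoints and single-part linestrings, compares only non-adjacent segments (so a segment that doubles back over its neighbor is not detected), and exempts the first and last segments of a closed linestring because they share the start/end point. All names are hypothetical.

```python
def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1, -1, or 0 if collinear."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def _within_box(a, b, c):
    """True if c lies inside the bounding box of segment a-b."""
    return (min(a[0], b[0]) <= c[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))


def _segments_intersect(p1, p2, p3, p4):
    o1, o2 = _orient(p1, p2, p3), _orient(p1, p2, p4)
    o3, o4 = _orient(p3, p4, p1), _orient(p3, p4, p2)
    if o1 != o2 and o3 != o4:
        return True
    # Collinear cases: an endpoint of one segment lies on the other.
    return ((o1 == 0 and _within_box(p1, p2, p3))
            or (o2 == 0 and _within_box(p1, p2, p4))
            or (o3 == 0 and _within_box(p3, p4, p1))
            or (o4 == 0 and _within_box(p3, p4, p2)))


def multipoint_is_simple(points):
    """A multipoint is simple if no two points are coincident."""
    return len(set(points)) == len(points)


def linestring_is_simple(coords):
    """A linestring is simple if non-adjacent segments never meet."""
    segments = list(zip(coords, coords[1:]))
    closed = coords[0] == coords[-1]
    for i in range(len(segments)):
        for j in range(i + 2, len(segments)):
            if closed and i == 0 and j == len(segments) - 1:
                continue  # these two share the start/end point
            if _segments_intersect(*segments[i], *segments[j]):
                return False
    return True


print(linestring_is_simple([(0, 0), (2, 2), (2, 0), (0, 2)]))  # False: bowtie
print(linestring_is_simple([(0, 0), (1, 0), (0, 1), (0, 0)]))  # True: ring
```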

length¶

geoanalytics.sql.functions.length(geometry)¶

Returns a double column representing the planar length of the input geometry. The length is calculated in the same units as the input geometry. For point and multipoint geometries the function will always return 0. For polygon geometries this function will return the length of the perimeter of the polygon.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Length

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: DoubleType column representing the planar length.
Return type:: pyspark.sql.Column

line_from_binary¶

geoanalytics.sql.functions.line_from_binary(wkb, sr=None)¶

Returns a linestring column. The input binary column must contain the well-known binary (WKB) representation of linestring geometries. You can optionally specify a spatial reference for the result linestring column. The sr parameter value must be a valid SRID or WKT string. If a linestring cannot be created from the input binary the function will return null.

Refer to the GeoAnalytics guide for examples and usage notes: ST_LineFromBinary

Parameters:

wkb (pyspark.sql.Column) – BinaryType column with the Well-Known Binary (WKB) representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned linestring geometry, defaults to None.

Returns:

Linestring column from the Well-Known Binary (WKB) representation.

Return type:

pyspark.sql.Column

line_from_esri_json¶

geoanalytics.sql.functions.line_from_esri_json(json_str, sr=None)¶

Returns a linestring column. The input string column must contain the Esri JSON representation of linestring geometries. You can optionally specify a spatial reference for the result linestring column. The sr parameter value must be a valid SRID or WKT string. Any SRID defined in the input strings will not be used. If a linestring cannot be created from the input string the function will return null.

Refer to the GeoAnalytics guide for examples and usage notes: ST_LineFromEsriJSON

Parameters:

json_str (pyspark.sql.Column) – StringType column with the Esri JSON representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned linestring geometry, defaults to None.

Returns:

Linestring column from the Esri JSON representation.

Return type:

pyspark.sql.Column

line_from_geojson¶

geoanalytics.sql.functions.line_from_geojson(json_str, sr=None)¶

Returns a linestring column. The input string column must contain the GeoJSON representation of linestring geometries. You can optionally specify a spatial reference for the result linestring column. The sr parameter value must be a valid SRID or WKT string. If a linestring cannot be created from the input string the function will return null.

Refer to the GeoAnalytics guide for examples and usage notes: ST_LineFromGeoJSON

Parameters:

json_str (pyspark.sql.Column) – StringType column with the GeoJSON representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned linestring geometry, defaults to None.

Returns:

Linestring column from the GeoJSON representation.

Return type:

pyspark.sql.Column

line_from_shape¶

geoanalytics.sql.functions.line_from_shape(shp, sr=None)¶

Returns a linestring column. The input binary column must contain the shapefile representation of linestring geometries. You can optionally specify a spatial reference for the result linestring column. The sr parameter value must be a valid SRID or WKT string. If a linestring cannot be created from the input binary the function will return null.

Refer to the GeoAnalytics guide for examples and usage notes: ST_LineFromShape

Parameters:

shp (pyspark.sql.Column) – BinaryType column with the shapefile representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned linestring geometry, defaults to None.

Returns:

Linestring column from the shapefile representation.

Return type:

pyspark.sql.Column

line_from_text¶

geoanalytics.sql.functions.line_from_text(wkt, sr=None)¶

Returns a linestring column. The string column must contain the well-known text (WKT) representation of linestring geometries. You can optionally specify a spatial reference for the result linestring column. The sr parameter value must be a valid SRID or WKT string.

Refer to the GeoAnalytics guide for examples and usage notes: ST_LineFromText

Parameters:

wkt (pyspark.sql.Column) – StringType column with the well-known text (WKT) representation of linestring geometries.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned linestring geometry, defaults to None.

Returns:

Linestring column from the well-known text (WKT) representation.

Return type:

pyspark.sql.Column

line_interpolate_point¶

geoanalytics.sql.functions.line_interpolate_point(linestring, fraction)¶

Returns a point geometry at the specified fraction along the input linestring.

Refer to the GeoAnalytics guide for examples and usage notes: ST_LineInterpolatePoint

Parameters:

linestring (pyspark.sql.Column) – Linestring geometry column.
fraction (pyspark.sql.Column/int/float) – Numeric value between 0 and 1 representing the fraction of the line length.

Returns:

Point geometry column at the specified fraction along the input linestring.

Return type:

pyspark.sql.Column
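
As an illustration of what a fraction along a linestring means, the following plain-Python sketch walks the segments of a planar polyline until it reaches the requested share of the total length. It is a conceptual sketch, not the library's implementation; the name interpolate_point is hypothetical and coordinates are plain (x, y) tuples.

```python
import math


def interpolate_point(coords, fraction):
    """Return the (x, y) point at the given fraction of the line's length."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    segments = list(zip(coords, coords[1:]))
    lengths = [math.dist(a, b) for a, b in segments]
    remaining = fraction * sum(lengths)
    for (a, b), length in zip(segments, lengths):
        if length > 0 and remaining <= length:
            t = remaining / length
            return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        remaining -= length
    return coords[-1]  # reached only through rounding or zero-length input


line = [(0, 0), (10, 0), (10, 10)]
print(interpolate_point(line, 0.75))  # (10.0, 5.0)
```

A fraction of 0 returns the start point and a fraction of 1 returns the end point.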

line_locate_point¶

geoanalytics.sql.functions.line_locate_point(linestring, point)¶

Returns a double column representing the fraction along the input linestring to the specified point.

Refer to the GeoAnalytics guide for examples and usage notes: ST_LineLocatePoint

Parameters:

linestring (pyspark.sql.Column) – Linestring geometry column.
point (pyspark.sql.Column) – Point geometry column.

Returns:

DoubleType column representing the fraction along the input linestring to the specified point.

Return type:

pyspark.sql.Column
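
As a rough illustration of what such a fraction means, the sketch below projects a point onto a polyline and reports how far along the line the closest location falls. It assumes that the fraction refers to the closest location on the line, which the description above does not state explicitly. The helper name locate_along and the tuple-based geometry are hypothetical, and distances are planar.

```python
import math


def locate_along(vertices, point):
    """Return the fraction (0 to 1) along the polyline closest to `point`."""
    travelled = 0.0          # length of the segments processed so far
    best_distance = math.inf
    best_position = 0.0      # distance along the line of the closest location
    for a, b in zip(vertices, vertices[1:]):
        dx, dy = b[0] - a[0], b[1] - a[1]
        length_sq = dx * dx + dy * dy
        if length_sq == 0:
            continue  # skip zero-length segments
        length = math.sqrt(length_sq)
        # Project the point onto the segment and clamp to its end points.
        t = ((point[0] - a[0]) * dx + (point[1] - a[1]) * dy) / length_sq
        t = max(0.0, min(1.0, t))
        closest = (a[0] + t * dx, a[1] + t * dy)
        distance = math.dist(point, closest)
        if distance < best_distance:
            best_distance = distance
            best_position = travelled + t * length
        travelled += length
    return best_position / travelled if travelled > 0 else 0.0


path = [(0, 0), (10, 0), (10, 10)]
fraction = locate_along(path, (10, 5))  # 0.75
```

Under these assumptions the result is the inverse of interpolating by fraction: a point that lies on the line maps back to the fraction that would produce it.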

linestring¶

geoanalytics.sql.functions.linestring(points)¶

Returns a linestring column. The input arrays must be arrays of point geometries. The function creates a linestring geometry by connecting the point geometries in the same order that they are stored in the input array.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Linestring

Parameters:: points (pyspark.sql.Column) – Array of point geometries.
Returns:: Linestring column representing the array of points.
Return type:: pyspark.sql.Column

m¶

geoanalytics.sql.functions.m(point, new_value=None)¶

Can work as a getter or a setter, depending on the inputs.

Getter: Takes a point column and returns a double column containing the m-values of the input points. If a point does not have an m-value the function returns NaN.

Setter: Takes a point column and a numeric value and returns a point column containing the input points with the m-values set to the numeric value.

Refer to the GeoAnalytics guide for examples and usage notes: ST_M

Parameters:

point (pyspark.sql.Column) – Point geometry column.
new_value (pyspark.sql.Column/int/float, optional) – M-value to set. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value, defaults to None.

Returns:

Getter: DoubleType column representing the m-value.
Setter: Point column representing the updated m-value.

Return type:

pyspark.sql.Column

make_point¶

geoanalytics.sql.functions.make_point(x, y, z=None, m=None)¶

Returns a point column. The two input columns must contain the x,y coordinates of the points respectively. You can optionally specify two additional input columns with z-coordinates and m-values. The spatial reference of the result column will always be 0 and should be set to a valid ID using ST_SRID.

Refer to the GeoAnalytics guide for examples and usage notes: ST_MakePoint

Parameters:

x (pyspark.sql.Column/int/float) – X-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (x-coordinates) or a numeric value.
y (pyspark.sql.Column/int/float) – Y-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (y-coordinates) or a numeric value.
z (pyspark.sql.Column/int/float, optional) – Z-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value, defaults to None.
m (pyspark.sql.Column/int/float, optional) – M-value for the point. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value, defaults to None.

Returns:

Point column representing the point geometries.

Return type:

pyspark.sql.Column

max_m¶

geoanalytics.sql.functions.max_m(geometry)¶

Returns a double column containing the maximum m-value of each input geometry. If the input geometry does not have m-values the function will return NaN.

Refer to the GeoAnalytics guide for examples and usage notes: ST_MaxM

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: DoubleType column representing the maximum m-value for the envelope.
Return type:: pyspark.sql.Column

max_x¶

geoanalytics.sql.functions.max_x(geometry)¶

Returns a double column containing the maximum x-coordinate of each input geometry.

Refer to the GeoAnalytics guide for examples and usage notes: ST_MaxX

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: DoubleType column representing the maximum x-coordinate for the envelope.
Return type:: pyspark.sql.Column

max_y¶

geoanalytics.sql.functions.max_y(geometry)¶

Returns a double column containing the maximum y-coordinate of each input geometry.

Refer to the GeoAnalytics guide for examples and usage notes: ST_MaxY

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: DoubleType column representing the maximum y-coordinate for the envelope.
Return type:: pyspark.sql.Column

max_z¶

geoanalytics.sql.functions.max_z(geometry)¶

Returns a double column containing the maximum z-coordinate of each input geometry. If the input geometry does not have z-coordinates the function will return NaN.

Refer to the GeoAnalytics guide for examples and usage notes: ST_MaxZ

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: DoubleType column representing the maximum z-coordinate for the envelope.
Return type:: pyspark.sql.Column
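
Because NaN never compares equal to itself, a result of NaN has to be detected with an explicit NaN check rather than with an equality test. The pure-Python sketch below mimics the documented behavior on a list of coordinate tuples; the helper name max_z_of and the tuple representation are assumptions for illustration, not part of the library.

```python
import math


def max_z_of(vertices):
    """Return the largest z among (x, y, z) tuples, or NaN if none has a z."""
    z_values = [vertex[2] for vertex in vertices if len(vertex) > 2]
    return max(z_values) if z_values else math.nan


with_z = max_z_of([(0.0, 0.0, 5.0), (1.0, 1.0, 9.0)])
without_z = max_z_of([(0.0, 0.0), (1.0, 1.0)])

# Equality cannot detect NaN; use math.isnan instead.
has_no_z = math.isnan(without_z)
```

The same caution applies to the m-value functions, which also report NaN when the geometry carries no m-values.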

min_bounding_box¶

geoanalytics.sql.functions.min_bounding_box(geometry, by_area=True)¶

Returns a polygon column containing a polygon for each geometry in the input column. The polygon is the smallest rectangle of arbitrary alignment that encompasses the input geometry. You can optionally provide a boolean value that determines how the rectangle will be created. There are two options:

True: Creates a rectangle with the minimum possible area. This is the default.

False: Creates a rectangle with the minimum possible width.

For point geometries, this function will return a degenerate polygon at the location of the point.

To find the minimum bounding box that aligns to the x and y axis, use ST_Envelope.

Refer to the GeoAnalytics guide for examples and usage notes: ST_MinBoundingBox

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
by_area (pyspark.sql.Column/bool, optional) – BooleanType column or a boolean value. True minimizes the bounding box area, False minimizes the width. Defaults to True.

Returns:

Polygon column representing the bounding box.

Return type:

pyspark.sql.Column

min_m¶

geoanalytics.sql.functions.min_m(geometry)¶

Returns a double column containing the minimum m-value of each input geometry. If the input geometry does not have m-values the function will return NaN.

Refer to the GeoAnalytics guide for examples and usage notes: ST_MinM

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: DoubleType column representing the minimum m-value for the envelope.
Return type:: pyspark.sql.Column

min_x¶

geoanalytics.sql.functions.min_x(geometry)¶

Returns a double column containing the minimum x-coordinate of each input geometry.

Refer to the GeoAnalytics guide for examples and usage notes: ST_MinX

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: DoubleType column representing the minimum x-coordinate for the envelope.
Return type:: pyspark.sql.Column

min_y¶

geoanalytics.sql.functions.min_y(geometry)¶

Returns a double column containing the minimum y-coordinate of each input geometry.

Refer to the GeoAnalytics guide for examples and usage notes: ST_MinY

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: DoubleType column representing the minimum y-coordinate for the envelope.
Return type:: pyspark.sql.Column

min_z¶

geoanalytics.sql.functions.min_z(geometry)¶

Returns a double column containing the minimum z-coordinate of each input geometry. If the input geometry does not have z-coordinates the function will return NaN.

Refer to the GeoAnalytics guide for examples and usage notes: ST_MinZ

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: DoubleType column representing the minimum z-coordinate for the envelope.
Return type:: pyspark.sql.Column

mpoint_from_binary¶

geoanalytics.sql.functions.mpoint_from_binary(wkb, sr=None)¶

Returns a multipoint column. The input binary column must contain the well-known binary (WKB) representation of multipoint geometries. You can optionally specify a spatial reference for the result multipoint column. The sr parameter value must be a valid SRID or WKT string. If a multipoint cannot be created from the input binary the function will return null.

Refer to the GeoAnalytics guide for examples and usage notes: ST_MPointFromBinary

Parameters:

wkb (pyspark.sql.Column) – BinaryType column with the Well-Known Binary (WKB) representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned multipoint geometry, defaults to None.

Returns:

Multipoint column from the Well-Known Binary (WKB) representation.

Return type:

pyspark.sql.Column

mpoint_from_esri_json¶

geoanalytics.sql.functions.mpoint_from_esri_json(json_str, sr=None)¶

Returns a multipoint column. The input string column must contain the Esri JSON representation of multipoint geometries. You can optionally specify a spatial reference for the result multipoint column. The sr parameter value must be a valid SRID or WKT string. Any SRID defined in the input strings will not be used. If a multipoint cannot be created from the input string the function will return null.

Refer to the GeoAnalytics guide for examples and usage notes: ST_MPointFromEsriJSON

Parameters:

json_str (pyspark.sql.Column) – StringType column with the Esri JSON representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned multipoint geometry, defaults to None.

Returns:

Multipoint column from the Esri JSON representation.

Return type:

pyspark.sql.Column

mpoint_from_geojson¶

geoanalytics.sql.functions.mpoint_from_geojson(json_str, sr=None)¶

Returns a multipoint column. The input string column must contain the GeoJSON representation of multipoint geometries. You can optionally specify a spatial reference for the result multipoint column. The sr parameter value must be a valid SRID or WKT string. If a multipoint cannot be created from the input string the function will return null.

Refer to the GeoAnalytics guide for examples and usage notes: ST_MPointFromGeoJSON

Parameters:

json_str (pyspark.sql.Column) – StringType column with the GeoJSON representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned multipoint geometry, defaults to None.

Returns:

Multipoint column from the GeoJSON representation.

Return type:

pyspark.sql.Column

mpoint_from_shape¶

geoanalytics.sql.functions.mpoint_from_shape(shp, sr=None)¶

Returns a multipoint column. The input binary column must contain the shapefile representation of multipoint geometries. You can optionally specify a spatial reference for the result multipoint column. The sr parameter value must be a valid SRID or WKT string. If a multipoint cannot be created from the input binary the function will return null.

Refer to the GeoAnalytics guide for examples and usage notes: ST_MPointFromShape

Parameters:

shp (pyspark.sql.Column) – BinaryType column with the shapefile representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned multipoint geometry, defaults to None.

Returns:

Multipoint column from the shapefile representation.

Return type:

pyspark.sql.Column

mpoint_from_text¶

geoanalytics.sql.functions.mpoint_from_text(wkt, sr=None)¶

Returns a multipoint column. The string column must contain the well-known text (WKT) representation of multipoint geometries. You can optionally specify a spatial reference for the result multipoint column. The sr parameter value must be a valid SRID or WKT string.

Refer to the GeoAnalytics guide for examples and usage notes: ST_MPointFromText

Parameters:

wkt (pyspark.sql.Column) – StringType column with the well-known text (WKT) representation of multipoint geometries.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned multipoint geometry, defaults to None.

Returns:

Multipoint column from the well-known text (WKT) representation.

Return type:

pyspark.sql.Column

multilinestring¶

geoanalytics.sql.functions.multilinestring(*point_arrays)¶

Returns a linestring column. The input array column must contain an array of arrays of point geometries.

Refer to the GeoAnalytics guide for examples and usage notes: ST_MultiLinestring

Parameters:: point_arrays (pyspark.sql.Column) – Array of point geometry arrays.
Returns:: Linestring column representing the array of points.
Return type:: pyspark.sql.Column

multipoint¶

geoanalytics.sql.functions.multipoint(points)¶

Returns a multipoint column. The input array column must contain an array of point geometries.

Refer to the GeoAnalytics guide for examples and usage notes: ST_MultiPoint

Parameters:: points (pyspark.sql.Column) – Array of point geometries.
Returns:: Multipoint column representing the array of points.
Return type:: pyspark.sql.Column

multipolygon¶

geoanalytics.sql.functions.multipolygon(*point_arrays)¶

Returns a polygon column. The input column must contain an array of arrays of point geometries. The output polygon column represents the one or more rings created from the point arrays.

Refer to the GeoAnalytics guide for examples and usage notes: ST_MultiPolygon

Parameters:: point_arrays (pyspark.sql.Column) – Array of point geometry arrays.
Returns:: Polygon column representing the array of points.
Return type:: pyspark.sql.Column

num_geometries¶

geoanalytics.sql.functions.num_geometries(geometry)¶

Returns an integer column representing the number of geometries in each record.

Refer to the GeoAnalytics guide for examples and usage notes: ST_NumGeometries

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: IntegerType column representing the number of geometries.
Return type:: pyspark.sql.Column

num_interior_ring¶

geoanalytics.sql.functions.num_interior_ring(polygon)¶

OGC alias for ST_NumInteriorRings.

Refer to the GeoAnalytics guide for examples and usage notes: ST_NumInteriorRings

Parameters:: polygon (pyspark.sql.Column) – Polygon geometry column.
Returns:: IntegerType column representing the number of interior rings.
Return type:: pyspark.sql.Column

num_interior_rings¶

geoanalytics.sql.functions.num_interior_rings(polygon)¶

Returns an integer column representing the number of interior rings in the input polygon. The function will return null when the input is a multipart polygon.

Refer to the GeoAnalytics guide for examples and usage notes: ST_NumInteriorRings

Parameters:: polygon (pyspark.sql.Column) – Polygon geometry column.
Returns:: IntegerType column representing the number of interior rings.
Return type:: pyspark.sql.Column

num_points¶

geoanalytics.sql.functions.num_points(geometry)¶

Returns an integer column representing the number of points in the input geometry.

Refer to the GeoAnalytics guide for examples and usage notes: ST_NumPoints

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: IntegerType column representing the number of points.
Return type:: pyspark.sql.Column

overlaps¶

geoanalytics.sql.functions.overlaps(geometry1, geometry2)¶

Returns a boolean column where the result is True if the first geometry and the second geometry spatially overlap; otherwise, it returns False. Two geometries overlap when their intersection is the same geometry type as either of the inputs but not equal to either of the inputs.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Overlaps

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

BooleanType column. True if geometry1 and geometry2 spatially overlap, False otherwise.

Return type:

pyspark.sql.Column

point¶

geoanalytics.sql.functions.point(x, y, sr=None)¶

Returns a point column. The two numeric columns or values must contain the x,y coordinates of the point geometries. You can optionally specify a spatial reference for the result point column. The sr parameter value must be a valid SRID or WKT string.

To create point geometries with a z-coordinate and/or m-value, use ST_PointZ, ST_PointZM, ST_PointM, or ST_MakePoint.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Point

Parameters:

x (pyspark.sql.Column/int/float) – X-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (x-coordinates) or a numeric value.
y (pyspark.sql.Column/int/float) – Y-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (y-coordinates) or a numeric value.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned point geometry, defaults to None.

Returns:

Point column representing the point geometries.

Return type:

pyspark.sql.Column

point_from_binary¶

geoanalytics.sql.functions.point_from_binary(wkb, sr=None)¶

Returns a point column. The input binary column must contain the well-known binary (WKB) representation of point geometries. You can optionally specify a spatial reference for the result point column. The sr parameter value must be a valid SRID or WKT string. If a point cannot be created from the input binary the function will return null.

Refer to the GeoAnalytics guide for examples and usage notes: ST_PointFromBinary

Parameters:

wkb (pyspark.sql.Column) – BinaryType column with the Well-Known Binary (WKB) representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned point geometry, defaults to None.

Returns:

Point column from the Well-Known Binary (WKB) representation.

Return type:

pyspark.sql.Column

point_from_esri_json¶

geoanalytics.sql.functions.point_from_esri_json(json_str, sr=None)¶

Returns a point column. The input string column must contain the Esri JSON representation of point geometries. You can optionally specify a spatial reference for the result point column. The sr parameter value must be a valid SRID or WKT string. Any SRID defined in the input strings will not be used. If a point cannot be created from the input string the function will return null.

Refer to the GeoAnalytics guide for examples and usage notes: ST_PointFromEsriJSON

Parameters:

json_str (pyspark.sql.Column) – StringType column with the Esri JSON representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned point geometry, defaults to None.

Returns:

Point column from the Esri JSON representation.

Return type:

pyspark.sql.Column

point_from_geojson¶

geoanalytics.sql.functions.point_from_geojson(json_str, sr=None)¶

Returns a point column. The input string column must contain the GeoJSON representation of point geometries. You can optionally specify a spatial reference for the result point column. The sr parameter value must be a valid SRID or WKT string. If a point cannot be created from the input string the function will return null.

Refer to the GeoAnalytics guide for examples and usage notes: ST_PointFromGeoJSON

Parameters:

json_str (pyspark.sql.Column) – StringType column with the GeoJSON representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned point geometry, defaults to None.

Returns:

Point column from the GeoJSON representation.

Return type:

pyspark.sql.Column

point_from_shape¶

geoanalytics.sql.functions.point_from_shape(shp, sr=None)¶

Returns a point column. The input binary column must contain the shapefile representation of point geometries. You can optionally specify a spatial reference for the result point column. The sr parameter value must be a valid SRID or WKT string. If a point cannot be created from the input binary the function will return null.

Refer to the GeoAnalytics guide for examples and usage notes: ST_PointFromShape

Parameters:

shp (pyspark.sql.Column) – BinaryType column with the shapefile representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned point geometry, defaults to None.

Returns:

Point column from the shapefile representation.

Return type:

pyspark.sql.Column

point_from_text¶

geoanalytics.sql.functions.point_from_text(wkt, sr=None)¶

Returns a point column. The string column must contain the well-known text (WKT) representation of point geometries. You can optionally specify a spatial reference for the result point column. The sr parameter value must be a valid SRID or WKT string.

Refer to the GeoAnalytics guide for examples and usage notes: ST_PointFromText

Parameters:

wkt (pyspark.sql.Column) – StringType column with the well-known text (WKT) representation of point geometries.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned point geometry, defaults to None.

Returns:

Point column from the well-known text (WKT) representation.

Return type:

pyspark.sql.Column

point_m¶

geoanalytics.sql.functions.point_m(x, y, m, sr=None)¶

Returns a point column. The three numeric columns or values must contain the x,y coordinates and m-values of the point geometries. You can optionally specify a spatial reference for the result point column. The sr parameter value must be a valid SRID or WKT string.

To create point geometries without m-values or z-coordinates use ST_Point. To create point geometries with z-coordinates use ST_PointZ, ST_PointZM, or ST_MakePoint.

Refer to the GeoAnalytics guide for examples and usage notes: ST_PointM

Parameters:

x (pyspark.sql.Column/int/float) – X-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (x-coordinates) or a numeric value.
y (pyspark.sql.Column/int/float) – Y-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (y-coordinates) or a numeric value.
m (pyspark.sql.Column/int/float) – M-value for the point. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned point geometry, defaults to None.

Returns:

Point column representing the point geometries.

Return type:

pyspark.sql.Column

point_n¶

geoanalytics.sql.functions.point_n(geometry, n)¶

Returns a point column. The output column represents the nth point in the input geometry, where 0 is the first point. If the nth point does not exist the function returns null. This function always returns null for multipart linestrings and multipart polygons.

Refer to the GeoAnalytics guide for examples and usage notes: ST_PointN

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
n (pyspark.sql.Column/int) – Index of the point to return. Can be an IntegerType column or an integer value.

Returns:

Point column representing the nth point.

Return type:

pyspark.sql.Column

point_on_surface¶

geoanalytics.sql.functions.point_on_surface(geometry)¶

Returns a point column. The function returns a point that lies on the surface of linestring or polygon geometries.

Refer to the GeoAnalytics guide for examples and usage notes: ST_PointOnSurface

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: Point column representing a point that lies on the surface.
Return type:: pyspark.sql.Column

point_z¶

geoanalytics.sql.functions.point_z(x, y, z, sr=None)¶

Returns a point column. The three numeric columns or values must contain the x,y,z coordinates of the point geometries. You can optionally specify a spatial reference for the result point column. The sr parameter value must be a valid SRID or WKT string.

To create point geometries without m-values or z-coordinates use ST_Point. To create point geometries with m-values use ST_PointM, ST_PointZM, or ST_MakePoint.

Refer to the GeoAnalytics guide for examples and usage notes: ST_PointZ

Parameters:

x (pyspark.sql.Column/int/float) – X-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (x-coordinates) or a numeric value.
y (pyspark.sql.Column/int/float) – Y-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (y-coordinates) or a numeric value.
z (pyspark.sql.Column/int/float) – Z-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned point geometry, defaults to None.

Returns:

Point column representing the point geometries.

Return type:

pyspark.sql.Column

point_zm¶

geoanalytics.sql.functions.point_zm(x, y, z, m, sr=None)¶

Returns a point column. The four numeric columns or values must contain the x,y,z coordinates and m-values of the point geometries. You can optionally specify a spatial reference for the result point column. The sr parameter value must be a valid SRID or WKT string.

To create point geometries without m-values or z-coordinates use ST_Point. To create point geometries with only m-values or only z-coordinates use ST_PointM, ST_PointZ, or ST_MakePoint.

Refer to the GeoAnalytics guide for examples and usage notes: ST_PointZM

Parameters:

x (pyspark.sql.Column/int/float) – X-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (x-coordinates) or a numeric value.
y (pyspark.sql.Column/int/float) – Y-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values (y-coordinates) or a numeric value.
z (pyspark.sql.Column/int/float) – Z-coordinate for the point. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value.
m (pyspark.sql.Column/int/float) – M-value for the point. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned point geometry, defaults to None.

Returns:

Point column representing the point geometries.

Return type:

pyspark.sql.Column

points¶

geoanalytics.sql.functions.points(geometry)¶

Returns an array column. For linestring and polygon geometries the function returns the vertices of the input geometry as an array of points. For multipoint and point geometries the function returns an array of all points in the input geometry.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Points

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: Array column representing an array of point geometries.
Return type:: pyspark.sql.Column

poly_from_binary¶

geoanalytics.sql.functions.poly_from_binary(wkb, sr=None)¶

Returns a polygon column. The input binary column must contain the well-known binary (WKB) representation of polygon geometries. You can optionally specify a spatial reference for the result polygon column. The sr parameter value must be a valid SRID or WKT string. If a polygon cannot be created from the input binary the function will return null.

Refer to the GeoAnalytics guide for examples and usage notes: ST_PolyFromBinary

Parameters:

wkb (pyspark.sql.Column) – BinaryType column with the Well-Known Binary (WKB) representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned polygon geometry, defaults to None.

Returns:

Polygon column from the Well-Known Binary (WKB) representation.

Return type:

pyspark.sql.Column

poly_from_esri_json¶

geoanalytics.sql.functions.poly_from_esri_json(json_str, sr=None)¶

Returns a polygon column. The input string column must contain the Esri JSON representation of polygon geometries. You can optionally specify a spatial reference for the result polygon column. The sr parameter value must be a valid SRID or WKT string. Any SRID defined in the input strings will not be used. If a polygon cannot be created from the input string the function will return null.

Refer to the GeoAnalytics guide for examples and usage notes: ST_PolyFromEsriJSON

Parameters:

json_str (pyspark.sql.Column) – StringType column with the Esri JSON representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned polygon geometry, defaults to None.

Returns:

Polygon column from the Esri JSON representation.

Return type:

pyspark.sql.Column

poly_from_geojson¶

geoanalytics.sql.functions.poly_from_geojson(json_str, sr=None)¶

Returns a polygon column. The input string column must contain the GeoJSON representation of polygon geometries. You can optionally specify a spatial reference for the result polygon column. The sr parameter value must be a valid SRID or WKT string. If a polygon cannot be created from the input string the function will return null.

Refer to the GeoAnalytics guide for examples and usage notes: ST_PolyFromGeoJSON

Parameters:

json_str (pyspark.sql.Column) – StringType column with the GeoJSON representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned polygon geometry, defaults to None.

Returns:

Polygon column from the GeoJSON representation.

Return type:

pyspark.sql.Column

poly_from_shape¶

geoanalytics.sql.functions.poly_from_shape(shp, sr=None)¶

Returns a polygon column. The input binary column must contain the shapefile representation of polygon geometries. You can optionally specify a spatial reference for the result polygon column. The sr parameter value must be a valid SRID or WKT string. If a polygon cannot be created from the input binary the function will return null.

Refer to the GeoAnalytics guide for examples and usage notes: ST_PolyFromShape

Parameters:

shp (pyspark.sql.Column) – BinaryType column with the shapefile representation.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned polygon geometry, defaults to None.

Returns:

Polygon column from the shapefile representation.

Return type:

pyspark.sql.Column

poly_from_text¶

geoanalytics.sql.functions.poly_from_text(wkt, sr=None)¶

Returns a polygon column. The string column must contain the well-known text (WKT) representation of polygon geometries. You can optionally specify a spatial reference for the result polygon column. The sr parameter value must be a valid SRID or WKT string.

Refer to the GeoAnalytics guide for examples and usage notes: ST_PolyFromText

Parameters:

wkt (pyspark.sql.Column) – StringType column with the well-known text (WKT) representation of polygon geometries.
sr (int/str, optional) – Spatial reference (SRID or WKT) to set on the returned polygon geometry, defaults to None.

Returns:

Polygon column from the well-known text (WKT) representation.

Return type:

pyspark.sql.Column

polygon¶

geoanalytics.sql.functions.polygon(points)¶

Returns a polygon column. The input column must contain an array of point geometries.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Polygon

Parameters:: points (pyspark.sql.Column) – Array of point geometries.
Returns:: Polygon column representing the array of points.
Return type:: pyspark.sql.Column

relate¶

geoanalytics.sql.functions.relate(geometry1, geometry2, relation)¶

Returns a boolean column where the result is True if the first geometry and the second geometry satisfy the spatial relationship defined by the specified DE-9IM string code; otherwise, it returns False. The string code contains nine characters that represent the nine spatial relations of the dimensionally extended 9-intersection model (DE-9IM). The character values indicate the dimensionality of the relationship: 0 for points, 1 for linestrings, 2 for polygons, and ‘F’ to indicate an empty set.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Relate

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.
relation (pyspark.sql.Column/str) – The DE-9IM matrix value that will be used to compare the spatial relationship. Can be a StringType column or string value.

Returns:

BooleanType column. True if the spatial relationship of geometry1 and geometry2 matches the DE-9IM matrix value, False otherwise.

Return type:

pyspark.sql.Column
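
The comparison of a nine-character DE-9IM matrix against a code can be sketched in pure Python as shown below. This is an illustration of the general DE-9IM convention rather than the library's implementation. In addition to the 0, 1, 2 and F values described above, it accepts the conventional pattern characters T (any non-empty intersection) and * (any value); whether ST_Relate accepts those two characters is an assumption here, so check the guide before relying on them. The helper name matches_de9im is hypothetical.

```python
def matches_de9im(matrix, pattern):
    """Return True if a computed DE-9IM matrix satisfies a pattern string."""
    if len(matrix) != 9 or len(pattern) != 9:
        raise ValueError("DE-9IM strings must contain exactly nine characters")
    for actual, wanted in zip(matrix.upper(), pattern.upper()):
        if wanted == "*":
            continue                      # any value is acceptable
        if wanted == "T":
            if actual not in "012":       # any non-empty intersection
                return False
        elif actual != wanted:            # 0, 1, 2 and F must match exactly
            return False
    return True


# Matrix of two partially overlapping polygons.
overlapping = "212101212"
# Matrix of two polygons that do not touch at all.
disjoint = "FF2FF1212"

polygons_overlap = matches_de9im(overlapping, "T*T***T**")   # True
polygons_disjoint = matches_de9im(disjoint, "FF*FF****")     # True
```

Each position in the string corresponds to one pairing of the interior, boundary, and exterior of the first geometry with those of the second, read row by row.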

rotate¶

geoanalytics.sql.functions.rotate(geometry, angle_in_radians, rotation_center=None)¶

Rotates a geometry counterclockwise by an angle specified in radians.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Rotate

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
angle_in_radians (pyspark.sql.Column/int/float) – Rotation angle in radians. Can be a LongType, DoubleType or StringType column or a numeric value.
rotation_center (pyspark.sql.Column, optional) – Point about which the geometry is rotated. If not specified, the geometry is rotated about the origin (0,0). Defaults to None.

Returns:

Geometry column with the rotated geometries. The returned geometry column type will be the same as the input geometry column type.

Return type:

pyspark.sql.Column
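The effect of angle_in_radians and rotation_center on a single vertex can be sketched in plain Python using the standard counterclockwise rotation formula. The helper rotate_point is hypothetical and only illustrates the arithmetic; it is not the library's implementation.

```python
import math


def rotate_point(x, y, angle_in_radians, center=(0.0, 0.0)):
    """Rotate one (x, y) vertex counterclockwise about `center`.

    Hypothetical helper for illustration only; not part of geoanalytics.
    """
    cx, cy = center
    dx, dy = x - cx, y - cy                      # move the center to the origin
    cos_a = math.cos(angle_in_radians)
    sin_a = math.sin(angle_in_radians)
    # Standard counterclockwise rotation, then move back to the center.
    return (cx + dx * cos_a - dy * sin_a,
            cy + dx * sin_a + dy * cos_a)


# A quarter turn about the origin moves (1, 0) to approximately (0, 1).
print(rotate_point(1.0, 0.0, math.pi / 2))
# The same turn about (1, 1) moves (2, 1) to approximately (1, 2).
print(rotate_point(2.0, 1.0, math.pi / 2, center=(1.0, 1.0)))
```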

scale¶

geoanalytics.sql.functions.scale(geometry, x_scale_factor, y_scale_factor)¶

Scales the geometry to a new size by multiplying the coordinates with the corresponding factor parameters.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Scale

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
x_scale_factor (pyspark.sql.Column/int/float) – Numeric value that specifies the scale factor in the X direction. Can be a LongType, DoubleType or StringType column or a numeric value.
y_scale_factor (pyspark.sql.Column/int/float) – Numeric value that specifies the scale factor in the Y direction. Can be a LongType, DoubleType or StringType column or a numeric value.

Returns:

Geometry column with the scaled geometries. The returned geometry column type will be the same as the input geometry column type.

Return type:

pyspark.sql.Column
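Because scaling multiplies the coordinates themselves, a geometry that does not sit at the origin also changes position. The plain-Python sketch below shows this on a list of vertices; scale_coords is a hypothetical helper, not a geoanalytics function.

```python
def scale_coords(coords, x_scale_factor, y_scale_factor):
    """Multiply every x by x_scale_factor and every y by y_scale_factor.

    Hypothetical helper for illustration only; not part of geoanalytics.
    """
    return [(x * x_scale_factor, y * y_scale_factor) for x, y in coords]


# A unit square whose lower-left corner is at (1, 1).
square = [(1, 1), (2, 1), (2, 2), (1, 2)]

# The square becomes 2 wide and 3 tall, and its lower-left corner moves to (2, 3).
print(scale_coords(square, 2, 3))
```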

segmentize¶

geoanalytics.sql.functions.segmentize(linestring, max_segment_length=2)¶

Returns an array column. This function creates an array of linestrings by breaking the input linestring into segments that are shorter than or equal to the specified maximum length. The max_segment_length can be specified with or without a unit. When specified with a unit, the max_segment_length can be created with ST_CreateDistance or with a tuple containing a number and a unit (e.g., (10, "kilometers")). When specified without a unit, the max_segment_length can be a single value or a numeric column and is interpreted as a distance in the same units as the input geometry.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Segmentize

Parameters:

linestring (pyspark.sql.Column) – Linestring geometry column.
max_segment_length (pyspark.sql.Column/int/float/tuple) – Maximum length for any segment created. When specified without a unit, it can be a LongType, DoubleType or StringType column or an integer or float value. When specified with a unit, it can be a StructType column or tuple representing the distance with unit.

Returns:

Array column representing an array of linestring segments.

Return type:

pyspark.sql.Column
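One way to satisfy the maximum-length constraint is sketched below in plain Python. It assumes planar coordinates and splits each straight span into equal pieces. Where the library actually places its cut points is not described here, so treat segmentize_coords as a hypothetical illustration of the constraint only.

```python
import math


def segmentize_coords(points, max_segment_length):
    """Break a polyline into two-point segments no longer than the maximum.

    Hypothetical helper for illustration only. Assumption: each straight span
    is divided into equal pieces; the library may place cuts differently.
    """
    if max_segment_length <= 0:
        raise ValueError("max_segment_length must be positive")
    segments = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        # Smallest number of equal pieces that keeps each piece within the limit.
        pieces = max(1, math.ceil(length / max_segment_length))
        for i in range(pieces):
            t0, t1 = i / pieces, (i + 1) / pieces
            start = (x0 + (x1 - x0) * t0, y0 + (y1 - y0) * t0)
            end = (x0 + (x1 - x0) * t1, y0 + (y1 - y0) * t1)
            segments.append((start, end))
    return segments


# A 10-unit line with a maximum of 4 units is cut into 3 equal segments.
for segment in segmentize_coords([(0, 0), (10, 0)], 4):
    print(segment)
```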

segments¶

geoanalytics.sql.functions.segments(linestring, num_points=2, step_size=1)¶

Returns an array column. The function creates an array of linestrings by splitting the input linestring at a certain number of vertices using a moving window. By default the function will create segments with two points and the moving window will move one point at a time (step size of 1). You can optionally include a larger number of points in each linestring by specifying a numeric value greater than 2. You can also increase the step size by setting it to a value greater than 1. Setting the step size to one less than the number of points will always result in segments that touch but do not overlap.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Segments

Parameters:

linestring (pyspark.sql.Column) – Linestring geometry column.
num_points (pyspark.sql.Column/int, optional) – Numeric value representing the number of points in each segment, defaults to 2. Can be a LongType or StringType column or an integer value.
step_size (pyspark.sql.Column/int, optional) – Numeric value representing the number of points between the start of each new segment, defaults to 1. Can be a LongType or StringType column or an integer value.

Returns:

Array column representing an array of linestring segments.

Return type:

pyspark.sql.Column
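The moving window described above can be sketched in plain Python over a list of vertices. The helper window_segments is hypothetical, and it assumes that a trailing window with fewer than num_points vertices is dropped; how the library handles that case is not stated here.

```python
def window_segments(points, num_points=2, step_size=1):
    """Slice a vertex list into windows of num_points, advancing by step_size.

    Hypothetical helper for illustration only. Assumption: incomplete
    trailing windows are dropped.
    """
    if num_points < 2 or step_size < 1:
        raise ValueError("num_points must be >= 2 and step_size must be >= 1")
    last_start = len(points) - num_points
    return [points[i:i + num_points] for i in range(0, last_start + 1, step_size)]


vertices = ["a", "b", "c", "d", "e"]

# Defaults: two-point segments that each share one vertex with the next.
print(window_segments(vertices))
# step_size = num_points - 1: segments touch at one vertex but do not overlap.
print(window_segments(vertices, num_points=3, step_size=2))
```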

shear¶

geoanalytics.sql.functions.shear(geometry, proportion_x, proportion_y)¶

Returns a geometry column containing the input geometries sheared by the specified proportions in the X and Y directions.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Shear

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
proportion_x (pyspark.sql.Column/int/float) – Numeric value that specifies the proportion of shearing in the X direction. Can be a LongType, DoubleType or StringType column or a numeric value.
proportion_y (pyspark.sql.Column/int/float) – Numeric value that specifies the proportion of shearing in the Y direction. Can be a LongType, DoubleType or StringType column or a numeric value.

Returns:

Geometry column with the sheared geometries. The returned geometry column type will be the same as the input geometry column type.

Return type:

pyspark.sql.Column
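As a rough illustration of what the two proportions do, the plain-Python sketch below applies the conventional shear mapping x' = x + proportion_x * y and y' = y + proportion_y * x to a list of vertices. That the library uses exactly this mapping is an assumption, and shear_coords is a hypothetical helper.

```python
def shear_coords(coords, proportion_x, proportion_y):
    """Apply a conventional 2D shear to each (x, y) vertex.

    Hypothetical helper for illustration only. Assumption: the mapping is
    x' = x + proportion_x * y and y' = y + proportion_y * x.
    """
    return [(x + proportion_x * y, y + proportion_y * x) for x, y in coords]


unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]

# Shearing only in X slides the top edge to the right and leaves the base fixed.
print(shear_coords(unit_square, 0.5, 0))
```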

shortest_line¶

geoanalytics.sql.functions.shortest_line(geometry1, geometry2)¶

Returns a linestring column representing the shortest line that touches two geometries, using planar distance calculation. If more than one shortest line exists, only one is returned. If the two input geometries intersect, an empty line geometry is returned. If the two geometry columns are in different spatial references, the function automatically transforms the second geometry into the spatial reference of the first. To create a shortest line using geodesic distance calculation, use ST_GeodesicShortestLine.

Refer to the GeoAnalytics guide for examples and usage notes: ST_ShortestLine

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

Geometry column representing the shortest line.

Return type:

pyspark.sql.Column

simplify¶

geoanalytics.sql.functions.simplify(geometry)¶

Returns a geometry column containing the simplified geometries. This function simplifies the input geometry according to the OpenGIS Simple Features Implementation Specification for SQL 1.2.1 (06-103r4).

Refer to the GeoAnalytics guide for examples and usage notes: ST_Simplify

Parameters:: geometry (pyspark.sql.Column) – Geometry column.
Returns:: Geometry column representing the simplified geometry. The returned geometry column type will be the same as the input geometry column type.
Return type:: pyspark.sql.Column

split¶

geoanalytics.sql.functions.split(geometry, splitter)¶

Returns an array column from an input linestring or polygon column. This function splits the geometry with the splitter linestring and returns the resulting parts as an array of geometries. If the input geometry is a linestring, the output will be an array of linestrings. If the input geometry type is a polygon, the output will be an array of polygons. If a linestring is split by an equal linestring, an empty linestring along with the input linestring and the splitter linestring are returned.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Split

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
splitter (pyspark.sql.Column) – Linestring geometry column.

Returns:

Array column representing an array of geometries resulting from the split. The returned geometry type will be the same as the geometry column type.

Return type:

pyspark.sql.Column

square_bin¶

geoanalytics.sql.functions.square_bin(geometry, bin_size)¶

Returns a bin column containing a single square bin for each record in the input column. The specified bin size determines the height of each bin and is in the same units as the input geometry. The centroid of the input geometry is guaranteed to intersect with the bin returned but is not necessarily coincident with the bin center. Use ST_BinGeometry to obtain the geometry of each result bin.

This function can also be called with a long column representing the ID of the bin (see ST_BinId). The bin ID will be cast to a bin column.

Refer to the GeoAnalytics guide for examples and usage notes: ST_SquareBin

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
bin_size (int/float) – Numeric value representing the size of the side of the square bin.

Returns:

Spatial bin (bin2d) column representing a single square bin for each geometry.

Return type:

pyspark.sql.Column
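The statement that the centroid falls inside the returned bin without necessarily matching the bin center can be illustrated with a plain-Python grid lookup. This sketch assumes a grid anchored at (0, 0); the library's actual grid origin and bin ID scheme are not described here, and square_bin_for is a hypothetical helper.

```python
import math


def square_bin_for(x, y, bin_size):
    """Return the (column, row) index and bounds of the bin containing (x, y).

    Hypothetical helper for illustration only. Assumption: the grid is
    anchored at (0, 0).
    """
    col = math.floor(x / bin_size)
    row = math.floor(y / bin_size)
    bounds = (col * bin_size, row * bin_size,
              (col + 1) * bin_size, (row + 1) * bin_size)
    return (col, row), bounds


# A centroid at (12.5, 7.0) with bin_size 5 falls in the bin spanning
# x 10..15 and y 5..10, whose center is (12.5, 7.5), not the centroid itself.
print(square_bin_for(12.5, 7.0, 5))
```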

square_bins¶

geoanalytics.sql.functions.square_bins(geometry, bin_size, padding=0.0)¶

Returns an array column containing square bins that cover the spatial extent of each record in the input column. The specified bin size determines the height of each bin and is in the same units as the input geometry. You can optionally specify a numeric value for padding, which conceptually applies a buffer of the specified distance to the input geometry before creating the square bins.

Refer to the GeoAnalytics guide for examples and usage notes: ST_SquareBins

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
bin_size (int/float) – Numerical value representing the size of the side of the square bin.
padding (int/float/str, optional) – Numerical buffer value applied to the geometry before finding the intersecting bins, defaults to 0.0.

Returns:

Array column representing an array of spatial bin (bin2d) square bins.

Return type:

pyspark.sql.Column
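The interaction between bin_size and padding can be sketched in plain Python by listing the grid cells that cover a bounding box. The sketch assumes a grid anchored at (0, 0) and approximates padding by expanding the bounding box rather than buffering the geometry. Both are simplifications, and square_bins_for_extent is a hypothetical helper.

```python
import math


def square_bins_for_extent(xmin, ymin, xmax, ymax, bin_size, padding=0.0):
    """List the (column, row) indexes of bins covering a padded extent.

    Hypothetical helper for illustration only. Assumptions: the grid is
    anchored at (0, 0) and padding simply expands the bounding box.
    """
    xmin, ymin = xmin - padding, ymin - padding
    xmax, ymax = xmax + padding, ymax + padding
    cols = range(math.floor(xmin / bin_size), math.floor(xmax / bin_size) + 1)
    rows = range(math.floor(ymin / bin_size), math.floor(ymax / bin_size) + 1)
    return [(col, row) for row in rows for col in cols]


extent = (1, 1, 11, 6)
print(len(square_bins_for_extent(*extent, bin_size=5)))             # no padding
print(len(square_bins_for_extent(*extent, bin_size=5, padding=2)))  # padded extent
```

With padding, the extent reaches into neighbouring cells, so more bins are returned.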

sr_text¶

geoanalytics.sql.functions.sr_text(geometry, wkt=None)¶

Can work as a getter or a setter, depending on the inputs.

Getter: Takes a geometry column and returns the spatial reference (WKT) of the column as a string column. If the spatial reference of the input geometry column has not been set, the function returns an empty string.

Setter: Takes a geometry column and a string value and returns the input geometry column with its spatial reference WKT set to the string value. This does not affect the geometry data in the column. To transform your geometry data from one spatial reference to another, use ST_Transform.

Refer to the GeoAnalytics guide for examples and usage notes: ST_SRText

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
wkt (str, optional) – Spatial reference (WKT) to set on the geometry, defaults to None.

Returns:

Getter: StringType column representing the spatial reference (WKT) for the geometry.
Setter: Geometry column representing the geometry with the updated spatial reference. The returned geometry column type will be the same as the input geometry column type.

Return type:

pyspark.sql.Column

srid¶

geoanalytics.sql.functions.srid(geometry, srid=None)¶

Can work as a getter or a setter, depending on the inputs.

Getter: Takes a geometry column and returns the spatial reference (SRID) of the column as an integer column. If the SRID of the input geometry column has not been set, the function returns 0.

Setter: Takes a geometry column and a numeric value and returns the input geometry column with its SRID set to the numeric value. This does not affect the geometry data in the column. To transform your geometry data from one spatial reference to another, use ST_Transform.

Refer to the GeoAnalytics guide for examples and usage notes: ST_SRID

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
srid (int, optional) – Spatial reference (SRID) to set on the geometry, defaults to None.

Returns:

Getter: IntegerType column representing the spatial reference (SRID) for the geometry.
Setter: Geometry column representing the geometry with the updated spatial reference. The returned geometry column type will be the same as the input geometry column type.

Return type:

pyspark.sql.Column
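The getter/setter behaviour, including the point that the setter leaves coordinates untouched, can be modelled with a small plain-Python sketch. PointGeom and get_or_set_srid are hypothetical stand-ins for a geometry value and for the function above; they do not use Spark or geoanalytics.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class PointGeom:
    """Hypothetical stand-in for a geometry value with an SRID label."""
    coords: Tuple[float, float]
    srid: int = 0  # 0 models "no spatial reference set"


def get_or_set_srid(geometry: PointGeom, new_srid: Optional[int] = None):
    """Getter when new_srid is None, setter otherwise (illustration only)."""
    if new_srid is None:
        return geometry.srid
    # Only the label changes; the coordinates are copied as-is, not transformed.
    return replace(geometry, srid=new_srid)


point = PointGeom((-117.19, 34.05))
print(get_or_set_srid(point))           # getter on a geometry with no SRID set
labelled = get_or_set_srid(point, 4326)
print(labelled.srid, labelled.coords)   # new label, same coordinates
```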

start_point¶

geoanalytics.sql.functions.start_point(linestring)¶

Returns a point column representing the first point of the input linestring.

Refer to the GeoAnalytics guide for examples and usage notes: ST_StartPoint

Parameters:: linestring (pyspark.sql.Column) – Linestring geometry column.
Returns:: Point column representing the starting point.
Return type:: pyspark.sql.Column

sym_difference¶

geoanalytics.sql.functions.sym_difference(geometry1, geometry2)¶

Returns a geometry column containing the geometries that represent the portions of the input geometries that do not intersect. If either input is a generic geometry type, the output will be a generic geometry type. In all other cases, the result geometry type will be the same as the input geometry type with the highest dimension.

Refer to the GeoAnalytics guide for examples and usage notes: ST_SymDifference

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

Geometry column representing the portions of geometry1 and geometry2 that do not intersect. The returned geometry column type will be that of the input geometry with the highest dimension; for example, linestring and polygon input will return a polygon geometry type. If one or both of the input geometries is a generic geometry type, then a generic geometry column type will be returned.

Return type:

pyspark.sql.Column

symmetric_diff¶

geoanalytics.sql.functions.symmetric_diff(geometry1, geometry2)¶

Esri alias for OGC ST_SymDifference.

Refer to the GeoAnalytics guide for examples and usage notes: ST_SymDifference

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

Geometry column representing the portions of geometry1 and geometry2 that do not intersect. The returned geometry column type will be that of the input geometry with the highest dimension; for example, linestring and polygon input will return a polygon geometry type. If one or both of the input geometries is a generic geometry type, then a generic geometry column type will be returned.

Return type:

pyspark.sql.Column

touches¶

geoanalytics.sql.functions.touches(geometry1, geometry2)¶

Returns a boolean column where the result is True if the first geometry and the second geometry spatially touch on their boundaries but their interiors do not intersect; otherwise, it returns False.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Touches

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

BooleanType column. True if geometry1 and geometry2 spatially touch on their boundaries, False otherwise.

Return type:

pyspark.sql.Column

transform¶

geoanalytics.sql.functions.transform(geometry, sr, *, extent=None, datum_transform=None)¶

Returns a geometry column. The input geometry column must have a spatial reference set. The sr parameter value must be a valid SRID or WKT string. The function returns the input geometries transformed into the specified spatial reference. It will also set the spatial reference of the result column. To learn more about what it means to transform your geometry data, see Coordinate systems and transformations. To set the spatial reference of a geometry column without transforming the geometries, use ST_SRID or ST_SRText.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Transform

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
sr (int/str) – The spatial reference (SRID or WKT) that the geometry will be projected into.
extent (BoundingBox, optional) – Extent of the area of analysis to use when determining the best transformation to use.
datum_transform (str, optional) – Transformation path to use when transforming between geographic spatial references. This parameter overrides the session level transform settings as well as the extent param.

Returns:

Geometry column representing the projected geometry. The returned geometry column type will be the same as the input geometry column type.

Return type:

pyspark.sql.Column

translate¶

geoanalytics.sql.functions.translate(geometry, x_offset, y_offset)¶

Translates a geometry by given offsets.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Translate

Parameters:

geometry (pyspark.sql.Column) – Geometry column.
x_offset (pyspark.sql.Column/int/float) – Numeric value that specifies the offset in the X direction. Can be a LongType, DoubleType or StringType column or a numeric value.
y_offset (pyspark.sql.Column/int/float) – Numeric value that specifies the offset in the Y direction. Can be a LongType, DoubleType or StringType column or a numeric value.

Returns:

Geometry column with the translated geometries. The returned geometry column type will be the same as the input geometry column type.

Return type:

pyspark.sql.Column

union¶

geoanalytics.sql.functions.union(*geometries)¶

Returns a geometry column containing the geometries that represent the spatial union of the geometries in each row of the input columns. The geometry types of the input columns must be the same. To find the union of all geometries in a group or column use ST_Aggr_Union.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Union

Parameters:: geometries (pyspark.sql.Column) – Multiple geometry columns.
Returns:: Geometry column representing the spatial union of the geometries. The returned geometry column type will be the same type as the input geometries except in the case of point input which will return multipoint. If one or both of the input geometries is a generic geometry type, then a generic geometry column type will be returned.
Return type:: pyspark.sql.Column

within¶

geoanalytics.sql.functions.within(geometry1, geometry2)¶

Returns a boolean column where the result is True if the first geometry is completely inside the second geometry; otherwise, it returns False.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Within

Parameters:

geometry1 (pyspark.sql.Column) – Geometry column.
geometry2 (pyspark.sql.Column) – Geometry column.

Returns:

BooleanType column. True if geometry1 is within geometry2, False otherwise.

Return type:

pyspark.sql.Column

wkb_to_sql¶

geoanalytics.sql.functions.wkb_to_sql(wkb)¶

OGC alias for ST_GeomFromBinary without SRID.

Refer to the GeoAnalytics guide for examples and usage notes: ST_GeomFromBinary

Parameters:: wkb (pyspark.sql.Column) – BinaryType column with the Well-Known Binary (WKB) representation.
Returns:: Generic geometry column from the Well-Known Binary (WKB) representation.
Return type:: pyspark.sql.Column

wkt_to_sql¶

geoanalytics.sql.functions.wkt_to_sql(wkt)¶

OGC alias for ST_GeomFromText without SRID.

Refer to the GeoAnalytics guide for examples and usage notes: ST_GeomFromText

Parameters:: wkt (pyspark.sql.Column) – StringType column with the well-known text (WKT) representation of geometries.
Returns:: Generic geometry column from the well-known text (WKT) representation.
Return type:: pyspark.sql.Column

x¶

geoanalytics.sql.functions.x(point, new_value=None)¶

Can work as a getter or a setter, depending on the inputs.

Getter: Takes a point column and returns a double column containing the x-coordinate of the input points.

Setter: Takes a point column and a numeric value or column and returns a point column containing the input points with the x-coordinates set to the numeric value.

Refer to the GeoAnalytics guide for examples and usage notes: ST_X

Parameters:

point (pyspark.sql.Column) – Point geometry column.
new_value (pyspark.sql.Column/int/float, optional) – X-coordinate to set. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value, defaults to None.

Returns:

Getter: DoubleType column representing the x-coordinate.
Setter: Point column representing the updated x-coordinate.

Return type:

pyspark.sql.Column

y¶

geoanalytics.sql.functions.y(point, new_value=None)¶

Can work as a getter or a setter, depending on the inputs.

Getter: Takes a point column and returns a double column containing the y-coordinate of the input points.

Setter: Takes a point column and a numeric value or column and returns a point column containing the input points with the y-coordinates set to the numeric value.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Y

Parameters:

point (pyspark.sql.Column) – Point geometry column.
new_value (pyspark.sql.Column/int/float, optional) – Y-coordinate to set. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value, defaults to None.

Returns:

Getter: DoubleType column representing the y-coordinate.
Setter: Point column representing the updated y-coordinate.

Return type:

pyspark.sql.Column

z¶

geoanalytics.sql.functions.z(point, new_value=None)¶

Can work as a getter or a setter, depending on the inputs.

Getter: Takes a point column and returns a double column containing the z-coordinates of the input points. If a point does not have a z-coordinate the function returns NaN.

Setter: Takes a point column and a numeric value or column and returns a point column containing the input points with the z-coordinates set to the numeric value.

Refer to the GeoAnalytics guide for examples and usage notes: ST_Z

Parameters:

point (pyspark.sql.Column) – Point geometry column.
new_value (pyspark.sql.Column/int/float, optional) – Z-coordinate to set. Can be a LongType, DoubleType or StringType column with numeric values or a numeric value, defaults to None.

Returns:

Getter: DoubleType column representing the z-coordinate.
Setter: Point column representing the updated z-coordinate.

Return type:

pyspark.sql.Column