geoanalytics_fabric.STDataFrameAccessor

create_optimal_sr

geoanalytics_fabric.extensions.STDataFrameAccessor.create_optimal_sr(self, property, custom_name=None, geometry=None)

Creates a spatial reference with a custom projected coordinate system optimal for the data extent and intended purpose of your analysis.

Supported Properties:

  • EQUAL_AREA - Preserves the relative area of regions everywhere on earth. Shapes and distances will be distorted.

  • CONFORMAL - Preserves angles, and therefore shapes, in small areas. Sizes, distances, and the shapes of large regions will be distorted.

  • EQUIDISTANT_ONE_POINT - Preserves distances when measured through the center of the projection. Areas, shapes, and other distances will be distorted.

  • EQUIDISTANT_MERIDIANS - Preserves distances when measured along meridians. Area, shape, and other distances will be distorted.

  • COMPROMISE_WORLD - Does not preserve areas, shapes, or distances specifically, but creates a balance between these geometric properties. Compromise projections are only suggested for very large areas.

Parameters:
  • property (str) – A property that represents the purpose of the projection. Choose from EQUAL_AREA, CONFORMAL, EQUIDISTANT_ONE_POINT, EQUIDISTANT_MERIDIANS, COMPROMISE_WORLD.

  • custom_name (str, optional) – The name of the custom projected coordinate system. If unspecified, the name will be Custom_Projection.

  • geometry (str, optional) – Geometry field name. Required if there is more than one geometry field and the default is not set.

Returns:

A spatial reference object

Return type:

SpatialReference
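
A minimal sketch of how this method might be called. It assumes the accessor is reachable as `df.st` on a Spark DataFrame once `geoanalytics_fabric` has been imported. The helper name `optimal_sr_for` and its exact-match validation are illustrative only and are not part of the library.

```python
# Property names listed under "Supported Properties" above.
SUPPORTED_PROPERTIES = {
    "EQUAL_AREA",
    "CONFORMAL",
    "EQUIDISTANT_ONE_POINT",
    "EQUIDISTANT_MERIDIANS",
    "COMPROMISE_WORLD",
}


def optimal_sr_for(df, purpose, name=None, geometry=None):
    """Validate `purpose`, then request an optimal spatial reference.

    Assumption: `df` is a Spark DataFrame and `import geoanalytics_fabric`
    has registered the accessor as `df.st`.
    """
    if purpose not in SUPPORTED_PROPERTIES:
        raise ValueError(
            f"Unsupported property {purpose!r}; "
            f"choose from {sorted(SUPPORTED_PROPERTIES)}"
        )
    return df.st.create_optimal_sr(purpose, custom_name=name, geometry=geometry)
```

The check is deliberately stricter than the library may be (for example, it rejects lowercase names), so that a typo fails before any Spark work starts.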

get_extent

geoanalytics_fabric.extensions.STDataFrameAccessor.get_extent(self, geometry=None)

Computes the spatial extent of a geometry column in the dataframe and returns it as a BoundingBox.

Parameters:

geometry (str, optional) – Geometry field name. Required if there is more than one geometry field and the default is not set.

Returns:

A bounding box representing the extent.

Return type:

BoundingBox
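
A short sketch of reading the returned extent. It assumes the accessor is available as `df.st` and that the BoundingBox exposes `xmin`, `ymin`, `xmax`, and `ymax` attributes. Those attribute names are an assumption and are not stated in this reference. `extent_size` is a hypothetical helper.

```python
def extent_size(df, geometry=None):
    """Return (width, height) of the extent of a geometry column.

    Assumptions: `df.st` is the GeoAnalytics accessor, and the returned
    BoundingBox has xmin, ymin, xmax and ymax attributes.
    The result is in the units of the column's spatial reference.
    """
    bbox = df.st.get_extent(geometry=geometry)
    width = bbox.xmax - bbox.xmin
    height = bbox.ymax - bbox.ymin
    return width, height
```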

get_geometry_field

geoanalytics_fabric.extensions.STDataFrameAccessor.get_geometry_field(self, *, infer=True)

Returns the geometry field that is currently set on the Spark DataFrame.

Parameters:

infer (bool, optional, keyword-only) – If True (the default) and there is exactly one geometry column, infer that it is the geometry field.

Returns:

The geometry field name, if set.

Return type:

str
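
A sketch of guarding downstream code against a missing geometry field. It assumes the accessor is available as `df.st` and that the method returns `None` (or another falsy value) when no field is set and none can be inferred. This reference only says the name is returned "if set". `require_geometry_field` is a hypothetical helper.

```python
def require_geometry_field(df):
    """Return the geometry field name or raise if none is set or inferable.

    Assumptions: `df.st` is the GeoAnalytics accessor, and a falsy result
    means that no geometry field is set.
    """
    name = df.st.get_geometry_field(infer=True)  # `infer` is keyword-only
    if not name:
        raise ValueError(
            "No geometry field is set; call df.st.set_geometry_field(...) first"
        )
    return name
```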

get_spatial_reference

geoanalytics_fabric.extensions.STDataFrameAccessor.get_spatial_reference(self, geometry_field=None)

Returns the spatial reference for the geometry field.

Parameters:

geometry_field (str, optional) – Geometry column name.

Returns:

A named tuple containing the SRID, whether the coordinate system is projected (PCS), and the spatial reference unit.

Return type:

geoanalytics.sql.SpatialReference
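
A sketch of inspecting the returned tuple. It assumes the accessor is available as `df.st` and that the fields are named `srid`, `is_projected`, and `unit`. Those field names are an assumption based on the description above and are not confirmed by this reference. `describe_sr` is a hypothetical helper.

```python
def describe_sr(df, geometry_field=None):
    """Summarise the spatial reference of a geometry column as text.

    Assumptions: `df.st` is the GeoAnalytics accessor, and the returned
    named tuple has `srid`, `is_projected` and `unit` fields.
    """
    sr = df.st.get_spatial_reference(geometry_field)
    kind = "projected" if sr.is_projected else "geographic"
    return f"SRID {sr.srid} ({kind}, unit: {sr.unit})"
```

A check like this can be useful before measuring distances or areas, where a geographic coordinate system in degrees would give misleading planar results.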

get_time_fields

geoanalytics_fabric.extensions.STDataFrameAccessor.get_time_fields(self, *, infer=True)

Returns the time field(s) currently set on the Spark DataFrame.

Parameters:

infer (bool, optional, keyword-only) – If True (the default) and there is exactly one timestamp column, infer that it is the start time field.

Returns:

A list of time field names, if set.

Return type:

list
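
A sketch of interpreting the returned list. It assumes the accessor is available as `df.st`, that an empty or `None` result means no time is set, and that one name denotes a start time only while two names denote a start and end time (mirroring the `set_time_fields` signature). `time_kind` is a hypothetical helper.

```python
def time_kind(df):
    """Classify a DataFrame's time configuration.

    Assumptions: `df.st` is the GeoAnalytics accessor; one returned field
    means instants, two mean intervals (start and end).
    """
    fields = df.st.get_time_fields(infer=True) or []
    if len(fields) == 0:
        return "none"
    if len(fields) == 1:
        return "instant"
    return "interval"
```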

plot

geoanalytics_fabric.extensions.STDataFrameAccessor.plot(self, geometry=None, cmap_values=None, is_categorical=None, vmin=None, vmax=None, ax=None, cmap=None, figsize=None, dpi=None, aspect='equal', max_geoms=1000000, legend=False, legend_kwds=None, classification_method=None, classification_kwds=None, basemap=None, xmargin=None, ymargin=None, sr=None, extent=None, quantize=False, **style_kwds)

Plot a geometry column from a PySpark DataFrame.

Parameters:
  • geometry (str, optional) – Name of the geometry column to plot. Required if the DataFrame has more than one geometry column.

  • cmap_values (str, optional) – Name of the column to use for color mapping.

  • classification_method (str, optional) – The name of the mapclassify classification method to use.

  • classification_kwds (dict, optional) – Keyword arguments to pass to mapclassify.classify, such as ‘k’.

  • is_categorical (bool, optional) – Set to True when the cmap_values column is categorical. The default is False.

  • vmin (float, optional) – Cmap minimum value.

  • vmax (float, optional) – Cmap maximum value.

  • ax (matplotlib.axes.Axes, optional) – The axes on which to plot. By default new axes are created.

  • cmap (str, optional) – Name of the matplotlib colormap to use.

  • figsize ((float, float), optional) – Tuple representing the width and height of the resulting matplotlib.figure.Figure in inches. This parameter is ignored when the ax parameter is set.

  • dpi (float, optional) – The resolution of the figure in dots-per-inch.

  • aspect (str or float, optional) – Aspect of the axes. Choose from “equal” (default), “auto”, or set a number representing the ratio of the height to the width.

  • max_geoms (int, optional) – Maximum number of geometries to plot. The default is 1,000,000.

  • legend (bool, optional) – Adds a legend to the plot for the cmap_values values if set to True. The default is False.

  • legend_kwds (dict, optional) – A dictionary of legend keyword arguments. For categorical legends, any argument accepted by matplotlib.axes.Axes.legend is supported. For continuous legends, see the arguments for matplotlib.pyplot.colorbar.

  • basemap (str, optional) – Adds a basemap to the plot. Choose from “light” (Light Gray Canvas), “dark” (Dark Gray Canvas), “streets” (Esri Streets Basemap) or “osm” (OpenStreetMap Vector Basemap). Basemap labels are not supported.

  • xmargin (float, optional) – Sets padding of X data. For more information see matplotlib.axes.Axes.set_xmargin.

  • ymargin (float, optional) – Sets padding of Y data. For more information see matplotlib.axes.Axes.set_ymargin.

  • sr (SpatialReference, optional) – Spatial reference (SRID or WKT) to set or transform to on the resulting plot.

  • extent (BoundingBox, optional) – Sets the extent for plotting geometries. Only geometries that intersect the extent will be visible in the plot.

  • quantize (bool, optional) – If True, geometries will be quantized to reduce the number of points plotted and may decrease plotting time. The default is False.

  • **style_kwds

Returns:

Matplotlib axes

Return type:

matplotlib.axes.Axes
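
A sketch of a classified choropleth built only from the parameters listed above. It assumes the accessor is available as `df.st`, and that `"quantiles"` is accepted as a classification method because it is a mapclassify scheme name. `plot_choropleth` and the chosen defaults are illustrative.

```python
def plot_choropleth(df, value_column, k=5, geometry=None):
    """Plot a geometry column coloured by `value_column` in `k` classes.

    Assumptions: `df.st` is the GeoAnalytics accessor; "quantiles" is a
    valid classification method (it is a mapclassify scheme name).
    Returns whatever `plot` returns, documented as matplotlib Axes.
    """
    return df.st.plot(
        geometry=geometry,
        cmap_values=value_column,          # column that drives the colours
        cmap="viridis",                    # a standard matplotlib colormap
        classification_method="quantiles",
        classification_kwds={"k": k},      # number of classes
        legend=True,
        figsize=(10, 6),                   # inches; ignored if `ax` is given
        basemap="light",                   # Light Gray Canvas
    )
```

Because the return value is a matplotlib Axes, further styling such as `ax.set_title(...)` can be applied to the result.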

set_geometry_field

geoanalytics_fabric.extensions.STDataFrameAccessor.set_geometry_field(self, geometry_field)

Returns a Spark DataFrame with the geometry field set.

Parameters:

geometry_field (str) – Geometry column name.

Returns:

Spark DataFrame with the geometry field set.

Return type:

pyspark.sql.dataframe.DataFrame
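
A sketch showing that the result is a new DataFrame that should be captured. It assumes the accessor is available as `df.st`. `with_geometry` is a hypothetical helper; `df.columns` is the standard PySpark column list.

```python
def with_geometry(df, column):
    """Return a DataFrame whose geometry field is `column`.

    Assumption: `df.st` is the GeoAnalytics accessor. The column check uses
    the standard PySpark `df.columns` list and is not part of the library.
    """
    if column not in df.columns:
        raise ValueError(f"Column {column!r} not found in {df.columns}")
    # The method returns a DataFrame, so keep the result rather than
    # relying on `df` being changed in place.
    return df.st.set_geometry_field(column)
```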

set_spatial_reference

geoanalytics_fabric.extensions.STDataFrameAccessor.set_spatial_reference(self, sr, geometry_field=None)

Sets the spatial reference on the geometry field.

Parameters:
  • sr (int, str, SpatialReference) – Spatial reference WKID or WKT.

  • geometry_field (str, optional) – Geometry column name.

Returns:

Spark DataFrame with the spatial reference set on the geometry field.

Return type:

pyspark.sql.dataframe.DataFrame
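
A sketch of setting a WKID and reading it back. It assumes the accessor is available as `df.st` and that the object returned by `get_spatial_reference` has an `srid` attribute. That attribute name is an assumption. `set_and_check_sr` is a hypothetical helper.

```python
def set_and_check_sr(df, wkid, geometry_field=None):
    """Set a spatial reference by WKID and confirm it was applied.

    Assumptions: `df.st` is the GeoAnalytics accessor, and the spatial
    reference returned by `get_spatial_reference` exposes `srid`.
    """
    out = df.st.set_spatial_reference(wkid, geometry_field=geometry_field)
    applied = out.st.get_spatial_reference(geometry_field)
    if applied.srid != wkid:
        raise RuntimeError(f"Expected SRID {wkid}, found {applied.srid}")
    return out
```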

set_time_fields

geoanalytics_fabric.extensions.STDataFrameAccessor.set_time_fields(self, start_time_field, end_time_field=None)

Returns a Spark DataFrame with the time field(s) set.

Parameters:
  • start_time_field (str) – TimestampType column name or StringType column name that will be cast to TimestampType.

  • end_time_field (str, optional) – TimestampType column name or StringType column name that will be cast to TimestampType.

Returns:

Spark DataFrame with the time field(s) set.

Return type:

pyspark.sql.dataframe.DataFrame
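
A sketch that checks column types before setting the time fields, following the note above that columns must be TimestampType or StringType. It assumes the accessor is available as `df.st`. `with_time_fields` is a hypothetical helper; `df.dtypes` is the standard PySpark list of (name, type name) pairs.

```python
# Spark type names for TimestampType and StringType as reported by df.dtypes.
ALLOWED_TYPES = {"timestamp", "string"}


def with_time_fields(df, start, end=None):
    """Return a DataFrame with time fields set, after a type check.

    Assumption: `df.st` is the GeoAnalytics accessor. The type check is
    illustrative and not part of the library.
    """
    types = dict(df.dtypes)  # e.g. {"start": "timestamp", "id": "bigint"}
    for name in (start, end):
        if name is None:
            continue  # the end time field is optional
        if types.get(name) not in ALLOWED_TYPES:
            raise TypeError(
                f"Column {name!r} has type {types.get(name)!r}; "
                f"expected one of {sorted(ALLOWED_TYPES)}"
            )
    return df.st.set_time_fields(start, end_time_field=end)
```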