geoanalytics.STDataFrameAccessor

get_geometry_field

geoanalytics.extensions.STDataFrameAccessor.get_geometry_field(self)

Returns the geometry field currently set on the Spark DataFrame.

get_spatial_reference

geoanalytics.extensions.STDataFrameAccessor.get_spatial_reference(self, geometry_field=None)

Returns the spatial reference for the geometry field.

Parameters

geometry_field (pyspark.sql.Column, optional) – Geometry type column.

Returns

NamedTuple containing the SRID, whether the spatial reference is projected (PCS), and the spatial reference unit.

Return type

geoanalytics.sql.SpatialReference
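The sketch below illustrates how a named tuple of this shape can be consumed. It does not import geoanalytics: SpatialReferenceSketch is a hypothetical stand-in, and its field names (srid, is_projected, unit) are assumptions based on the description above, not the confirmed attributes of geoanalytics.sql.SpatialReference. It also assumes the accessor is reached through an attribute such as df.st.

```python
from typing import NamedTuple


class SpatialReferenceSketch(NamedTuple):
    """Hypothetical stand-in for geoanalytics.sql.SpatialReference."""

    srid: int           # spatial reference ID, e.g. 4326
    is_projected: bool  # True for a projected coordinate system (PCS)
    unit: str           # unit of the spatial reference


def describe(sr):
    """Format a spatial-reference named tuple as a short label."""
    kind = "projected" if sr.is_projected else "geographic"
    return f"SRID {sr.srid} ({kind}, unit: {sr.unit})"


# With a real DataFrame this value would come from a call such as
# df.st.get_spatial_reference() (accessor name assumed); here it is
# built by hand so the example runs without Spark.
sr = SpatialReferenceSketch(srid=4326, is_projected=False, unit="Degree")

# A named tuple supports both attribute access and tuple unpacking.
srid, is_projected, unit = sr
print(describe(sr))
```

Because the return value is a tuple, it can be unpacked positionally or read by field name, whichever suits the calling code.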

get_time_fields

geoanalytics.extensions.STDataFrameAccessor.get_time_fields(self)

Returns the time field(s) currently set on the Spark DataFrame.

plot

geoanalytics.extensions.STDataFrameAccessor.plot(self, geometry=None, cmap_values=None, is_categorical=False, vmin=None, vmax=None, ax=None, cmap=None, figsize=None, dpi=None, aspect='auto', max_geoms=1000000, legend=False, legend_kwds=None, **style_kwds)

Plot a geometry column from a PySpark DataFrame.

Parameters
  • geometry (str, optional) – Name of the geometry column to plot. Required if the DataFrame has more than one geometry column.

  • cmap_values (str, optional) – Name of the column to use for color mapping.

  • is_categorical (bool, optional) – Set to True when the cmap_values column is categorical. The default is False.

  • vmin (float, optional) – Minimum value of the colormap range.

  • vmax (float, optional) – Maximum value of the colormap range.

  • ax (matplotlib.axes.Axes, optional) – The axes on which to plot. By default new axes are created.

  • cmap (str, optional) – Name of the matplotlib colormap to use.

  • figsize ((float, float), optional) – Tuple representing the width and height of the resulting matplotlib.figure.Figure in inches. This parameter is ignored when the ax parameter is set.

  • dpi (float, optional) – The resolution of the figure in dots-per-inch.

  • aspect (str or float, optional) – Aspect of the axes. Choose from “auto” (default), “equal”, or set a number representing the ratio of the height to the width.

  • max_geoms (int, optional) – Maximum number of geometries to plot. The default is 1,000,000.

  • legend (bool, optional) – Adds a legend to the plot for the cmap_values values if set to True. The default is False.

  • legend_kwds (dict, optional) – A dictionary of legend keyword arguments. For categorical legends, any argument accepted by matplotlib.axes.Axes.legend is supported. For continuous legends, see the arguments for matplotlib.pyplot.colorbar.

  • style_kwds (kwargs, optional) – Style keyword arguments. If plotting points and multipoints, any argument accepted by matplotlib.pyplot.scatter is supported. For linestrings and polygons, see the arguments for matplotlib.collections.LineCollection and matplotlib.collections.PatchCollection respectively.

Style Keyword Arguments
  • zorder (float) – Sets the drawing order when multiple geometry columns are plotted on the same axes.

Returns

Matplotlib axes

Return type

matplotlib.axes.Axes
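As a rough illustration of how the plot parameters combine, the helper below wraps a single call using only keyword names from the signature above. The helper name plot_by_column, the df.st accessor attribute, and the chosen values ("viridis", an 8 by 6 inch figure) are assumptions made for this sketch.

```python
def plot_by_column(df, value_column, categorical=False, ax=None):
    """Plot the DataFrame's geometry column, colored by value_column.

    Assumes df is a Spark DataFrame that exposes STDataFrameAccessor
    as df.st (the accessor name is an assumption of this sketch).
    """
    return df.st.plot(
        cmap_values=value_column,    # column that drives the color mapping
        is_categorical=categorical,  # True for discrete classes
        cmap="viridis",              # a standard matplotlib colormap name
        figsize=(8, 6),              # ignored when ax is supplied
        aspect="equal",
        legend=True,                 # legend for the cmap_values values
        ax=ax,                       # None lets plot create new axes
    )
```

The call returns the matplotlib axes, so passing them back in through ax on a later call is one way to draw a second geometry column on the same figure; a zorder style keyword can then control which layer is drawn on top.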

set_geometry_field

geoanalytics.extensions.STDataFrameAccessor.set_geometry_field(self, geometry_field)

Returns a Spark DataFrame with the geometry field set to the given column.

Parameters

geometry_field (pyspark.sql.Column) – Geometry type column.

Returns

Spark DataFrame with the set geometry field.

Return type

pyspark.sql.dataframe.DataFrame

set_time_fields

geoanalytics.extensions.STDataFrameAccessor.set_time_fields(self, start_time_field, end_time_field=None)

Returns a Spark DataFrame with the set time field(s).

Parameters
  • start_time_field (pyspark.sql.Column) – TimestampType column or StringType column that will be cast to TimestampType.

  • end_time_field (pyspark.sql.Column, optional) – TimestampType column or StringType column that will be cast to TimestampType.

Returns

Spark DataFrame with the set time field(s).

Return type

pyspark.sql.dataframe.DataFrame
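Since set_geometry_field and set_time_fields each return a Spark DataFrame, the two calls can be chained. The helper below is a minimal sketch of that pattern; prepare_spatiotemporal is a hypothetical name, and it assumes the accessor is available as the st attribute on every returned DataFrame.

```python
def prepare_spatiotemporal(df, geometry_col, start_col, end_col=None):
    """Return a DataFrame with geometry and time fields set.

    geometry_col, start_col and end_col are pyspark.sql.Column objects,
    as described in the parameter lists above. The df.st accessor name
    is an assumption of this sketch.
    """
    # Each setter returns a DataFrame, so the result of the first call
    # is used as the input of the second.
    with_geometry = df.st.set_geometry_field(geometry_col)
    return with_geometry.st.set_time_fields(
        start_col,
        end_time_field=end_col,  # None when only a start time is set
    )


# Usage with hypothetical column names:
#   result = prepare_spatiotemporal(df, df["shape"], df["start"], df["end"])
#   result.st.get_geometry_field(); result.st.get_time_fields()
```

Keeping the returned DataFrame, rather than relying on the original, avoids depending on whether the setters modify their input.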

to_pandas_sdf

geoanalytics.extensions.STDataFrameAccessor.to_pandas_sdf(self, geometry=None)

Converts a Spark DataFrame to a Pandas Spatially Enabled DataFrame.

Note

The map viewer widget is only supported in Jupyter Notebooks.

Parameters

geometry (pyspark.sql.Column, optional) – Geometry type column to use as the geometry column of the Pandas Spatially Enabled DataFrame. Defaults to None, in which case the first valid geometry type column is used.

Returns

A Pandas Spatially Enabled DataFrame.

Return type

pandas.core.frame.DataFrame