
RT_ZonalStatistics takes a raster column and a polygon geometry column, and returns a struct column containing statistics of the raster's pixel values within each polygon zone. Each polygon in the geometry column defines a distinct zone, and the function calculates statistics for pixels that fall inside each zone. Statistics include count, minimum, maximum, range, mean, standard deviation, sum, median, the 90th percentile, variety, majority, majority count, majority percent, minority, minority count, and minority percent.

You can optionally specify the band index to analyze from the input raster. By default, the first band of the raster is used for statistical calculations.

The cell_assignment parameter determines how pixels are included in a zone. It has two options:

Center—A pixel is included if its center point lies within the polygon zone. This is the default.

Extent—A pixel is included if any part of the pixel overlaps the polygon zone. Use this option when you want to ensure all partially overlapping pixels are counted, especially for coarse raster resolutions relative to polygon size.
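The difference between the two options can be illustrated with a small pure-Python sketch. It does not call the GeoAnalytics Engine API. The helper `pixels_in_zone`, the unit-square pixel layout, the rectangular zone, and the boundary handling are all assumptions made for illustration and may not match how the library treats edge cases.

```python
def pixels_in_zone(n_rows, n_cols, zone, cell_assignment="center"):
    """Return the (row, col) pairs selected for an axis-aligned rectangular zone.

    Assumption: pixel (row, col) is the unit square [col, col + 1] x [row, row + 1],
    and zone is (xmin, ymin, xmax, ymax). Both are hypothetical simplifications.
    """
    xmin, ymin, xmax, ymax = zone
    selected = []
    for row in range(n_rows):
        for col in range(n_cols):
            if cell_assignment == "center":
                # Include the pixel only if its center point lies in the zone.
                cx, cy = col + 0.5, row + 0.5
                inside = xmin <= cx <= xmax and ymin <= cy <= ymax
            elif cell_assignment == "extent":
                # Include the pixel if any part of it overlaps the zone.
                inside = (col < xmax and col + 1 > xmin
                          and row < ymax and row + 1 > ymin)
            else:
                raise ValueError(f"unknown cell_assignment: {cell_assignment}")
            if inside:
                selected.append((row, col))
    return selected


# A zone that fully contains only the middle pixel of a 3 x 3 grid
# but partially overlaps all nine pixels.
zone = (0.6, 0.6, 2.4, 2.4)
center_pixels = pixels_in_zone(3, 3, zone, "center")
extent_pixels = pixels_in_zone(3, 3, zone, "extent")
print(len(center_pixels), len(extent_pixels))  # prints "1 9"
```

With a coarse grid like this one, the center rule keeps a single pixel while the extent rule keeps all nine, which is why the choice matters most when pixels are large relative to the polygon.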

If both circular_wrap_low and circular_wrap_high are provided, the function computes circular statistics, which treat values at the wrap boundary as adjacent. Circular statistics are appropriate for angular or cyclic data such as aspect, wind direction, or time-of-day values. The circular calculation includes count, mean, standard deviation, variety, majority, majority count, majority percent, minority, minority count, and minority percent.
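To see why wrapping matters, the following sketch compares an ordinary mean with a textbook circular mean computed by averaging unit vectors. The helper `circular_mean` is hypothetical and is not part of the GeoAnalytics Engine API; the source does not state which formula the function uses internally, so treat this as an illustration of the concept only.

```python
import math


def circular_mean(values, wrap_low, wrap_high):
    """Mean of cyclic values on [wrap_low, wrap_high) via unit-vector averaging.

    Hypothetical helper for illustration; not the library's implementation.
    """
    span = wrap_high - wrap_low
    # Map each value to an angle on the unit circle.
    angles = [2 * math.pi * (v - wrap_low) / span for v in values]
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    # Direction of the summed vectors, normalized to [0, 2*pi).
    mean_angle = math.atan2(s, c) % (2 * math.pi)
    # Map the angle back to the original value range.
    return wrap_low + span * mean_angle / (2 * math.pi)


# Two wind directions on either side of north (0/360 degrees).
directions = [350.0, 20.0]
linear = sum(directions) / len(directions)
circular = circular_mean(directions, 0.0, 360.0)
print(linear, round(circular, 6))  # linear is 185.0; circular is approximately 5.0
```

The ordinary mean points almost due south, while the circular mean stays close to north, between the two input directions.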

The function also supports column-based input for the optional parameters including band_id, cell_assignment, circular_wrap_low, and circular_wrap_high, allowing dynamic per-row configuration.

If the band ID is out of range, the function will return null.

| Function | Syntax |
| --- | --- |
| Python | zonal_statistics(raster_col, zone_col, band_id=1, cell_assignment="center", circular_wrap_low=None, circular_wrap_high=None) |
| SQL | RT_ZonalStatistics(raster_col, zone_col, band_id, cell_assignment, circular_wrap_low, circular_wrap_high) |
| Scala | zonalStatistics(raster, zone, bandIndex, cellAssignment, circularWrapLow, circularWrapHigh) |

For more details, go to the GeoAnalytics Engine API reference for zonal_statistics.

Examples

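from geoanalytics.raster import functions as RT
from geoanalytics.sql import functions as ST

# One row: 100 pixel values (0 to 99) and an L-shaped polygon zone as WKT.
data = [(list(range(100)), "POLYGON ((1.5 -1.5, 7.5 -1.5, 7.5 -7.5, 4.5 -7.5, 4.5 -4.5, 1.5 -4.5))")]

# Build a 10 x 10 float32 raster from the pixel values and parse the polygon.
# Assumes an active SparkSession named `spark`.
df = spark.createDataFrame(data, ["pixels", "poly_wkt"]) \
     .withColumn("raster", RT.create_raster("pixels", 10, 10, "float32")) \
     .withColumn("polygon", ST.poly_from_text("poly_wkt"))

# Compute statistics for band 1, including every pixel that overlaps the zone.
zonal_stats = df.select(RT.zonal_statistics("raster", zone_col="polygon", band_id=1, cell_assignment="extent").alias("zonal_stats"))

# Expand the result struct into one column per statistic.
zonal_stats.select("zonal_stats.*").show()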
Result
+-----+----+----+-----+----+-----------------+------+------+-----------------+-------+--------+-------------+---------------+--------+-------------+---------------+
|count| min| max|range|mean|            stdev|   sum|median|       percentile|variety|majority|majorityCount|majorityPercent|minority|minorityCount|minorityPercent|
+-----+----+----+-----+----+-----------------+------+------+-----------------+-------+--------+-------------+---------------+--------+-------------+---------------+
|   27|22.0|77.0| 55.0|45.0|17.00980109623077|1215.0|  43.0|70.19999999999999|   NULL|    NULL|         NULL|           NULL|    NULL|         NULL|           NULL|
+-----+----+----+-----+----+-----------------+------+------+-----------------+-------+--------+-------------+---------------+--------+-------------+---------------+
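The numeric columns in this result can be cross-checked with plain Python. The sketch below assumes the zone selects rows 2 to 4, columns 2 to 7 and rows 5 to 7, columns 5 to 7 of the 10 x 10 grid. That selection is inferred from the count, min, and max in the output, not stated in the source. The `percentile` helper is hypothetical and uses linear interpolation between closest ranks, which happens to reproduce the value shown; the library's actual method is not documented here.

```python
import math
import statistics

# Assumed pixel selection (inferred from count=27, min=22, max=77).
# Pixel value = row * 10 + col, matching list(range(100)) on a 10 x 10 grid.
values = [float(r * 10 + c) for r in range(2, 5) for c in range(2, 8)]
values += [float(r * 10 + c) for r in range(5, 8) for c in range(5, 8)]


def percentile(data, q):
    """q-th percentile by linear interpolation between closest ranks."""
    ordered = sorted(data)
    pos = (len(ordered) - 1) * q / 100
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


stats = {
    "count": len(values),
    "min": min(values),
    "max": max(values),
    "range": max(values) - min(values),
    "mean": statistics.fmean(values),
    "stdev": statistics.pstdev(values),  # population standard deviation
    "sum": sum(values),
    "median": statistics.median(values),
    "percentile": percentile(values, 90),
}

for name, value in stats.items():
    print(f"{name}: {value}")
```

Under these assumptions the population standard deviation, not the sample standard deviation, matches the stdev column.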

Version table

| Release | Notes |
| --- | --- |
| 2.0.0 | Python, SQL, and Scala functions introduced |
