
RT_Statistics takes a raster column and returns an array column. Each array contains statistics about the raster, such as the minimum, maximum, mean, and standard deviation. These statistics are estimates, so they may not exactly match the true statistics of the raster data.

Function    Syntax
Python      statistics(raster_col)
SQL         RT_Statistics(raster_col)
Scala       statistics(raster)

For more details, go to the GeoAnalytics Engine API reference for statistics.

Examples

Python

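from geoanalytics.raster import functions as RT
from pyspark.sql import functions as F

# One row holding 100 pixel values (0 through 99)
data = [(list(range(100)), )]
# Build a 10 x 10 float32 raster from the pixel array
df = spark.createDataFrame(data, ["pixels"]) \
     .withColumn("raster", RT.create_raster("pixels", 10, 10, "float32"))

# Compute the statistics array for each raster
stats = df.select(RT.statistics("raster").alias("statistics"))

# Explode the array so each statistics struct becomes its own row, then expand its fields
stats.withColumn("item", F.explode("statistics")).select("item.*").show()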
Result
+---+----+----+------------------+
|min| max|mean|             stdev|
+---+----+----+------------------+
|0.0|99.0|49.5|29.011491975882016|
+---+----+----+------------------+
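
As a sanity check on the result above, the same numbers can be reproduced without Spark. The following is a minimal sketch that uses only Python's standard library; it is not part of GeoAnalytics Engine, and the names `pixels`, `expected`, and `reported_stdev` are illustrative. For this small raster, the reported stdev is consistent with the sample standard deviation (n - 1 denominator). That is an observation about this example only, not a documented guarantee of how RT_Statistics computes its estimates.

```python
import math
import statistics

# The same 100 pixel values used to build the 10 x 10 raster in the example.
pixels = [float(v) for v in range(100)]

expected = {
    "min": min(pixels),
    "max": max(pixels),
    "mean": statistics.fmean(pixels),
    # Sample standard deviation (divides by n - 1).
    "stdev": statistics.stdev(pixels),
    # Population standard deviation (divides by n), for comparison only.
    "pstdev": statistics.pstdev(pixels),
}

for name, value in expected.items():
    print(f"{name}: {value}")

# stdev value copied from the Result table in this example.
reported_stdev = 29.011491975882016
print("matches sample stdev:",
      math.isclose(expected["stdev"], reported_stdev, rel_tol=1e-9))
print("matches population stdev:",
      math.isclose(expected["pstdev"], reported_stdev, rel_tol=1e-9))
```

Because the statistics are described as estimates, a comparison like this on a larger raster may not agree exactly, so a tolerance-based check such as `math.isclose` is safer than strict equality.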

Version table

Release    Notes
2.0.0      Python, SQL, and Scala functions introduced
