
RT_ToBinary takes a raster column and a string naming a file format, and returns a binary column containing each raster encoded in that format. The supported file formats are 'GTiff', 'JPEG', and 'PNG'.

You can optionally specify a masking policy or the NoData value. A mask is used to determine whether cells are valid in the raster.

A mask is essentially a boolean array of the same length as the data. If the value at an index is false, the cell at that index is not valid and should be excluded from analysis.
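As a rough illustration of the mask described above, the following sketch uses plain Python lists to stand in for raster data and its mask. The helper name valid_cells is hypothetical and is not part of GeoAnalytics Engine; it only shows how a false entry excludes a cell.

```python
# Illustrative only: plain lists stand in for raster data and its mask.
pixels = [1, 2, 3, 4]
mask = [True, False, True, True]  # False marks a cell that is not valid


def valid_cells(data, mask):
    """Return only the cells whose mask entry is True (hypothetical helper)."""
    if len(data) != len(mask):
        raise ValueError("mask must have the same length as the data")
    return [value for value, keep in zip(data, mask) if keep]


print(valid_cells(pixels, mask))  # [1, 3, 4]
```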

The following policies are supported:

  • promotion: Converts the raster to the next larger pixel type and sets the NoData value to the largest value of the new type. If the pixel type is Float64, no conversion occurs and the NoData value is set to 1.7976931348623157E308, the maximum value for Float64.

  • <numeric value>: Sets the NoData value to the numeric value provided. Masked pixels will be set to this value.

    • WARNING: Any unmasked pixels that already have this value will be masked when the raster is read back in.
  • maximum: Sets the NoData value to the maximum value for the pixel type.

  • minimum: Sets the NoData value to the minimum value for the pixel type.

  • best-effort: No policy is applied. The NoData value present will be used if it exists.
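To make the maximum and minimum policies concrete, the sketch below computes the extreme value of a pixel type from its name using only the Python standard library. The function nodata_for is hypothetical and is not how the library is implemented. Treating the minimum of a float type as its most negative finite value is an assumption.

```python
import struct
import sys


def nodata_for(pixel_type, policy):
    """Extreme value of a pixel type for 'maximum' or 'minimum' (hypothetical helper)."""
    kind = pixel_type.rstrip("0123456789")   # e.g. "uint", "int", "float"
    bits = int(pixel_type[len(kind):])       # e.g. 8, 32, 64
    if kind == "uint":
        low, high = 0, 2**bits - 1
    elif kind == "int":
        low, high = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    elif kind == "float" and bits == 64:
        high = sys.float_info.max
        low = -high  # assumption: most negative finite value
    elif kind == "float" and bits == 32:
        # 0x7F7FFFFF is the largest finite IEEE 754 single-precision value.
        high = struct.unpack("<f", bytes.fromhex("ffff7f7f"))[0]
        low = -high  # assumption: most negative finite value
    else:
        raise ValueError(f"unsupported pixel type: {pixel_type}")
    if policy == "maximum":
        return high
    if policy == "minimum":
        return low
    raise ValueError(f"unsupported policy: {policy}")


print(nodata_for("int32", "maximum"))    # 2147483647
print(nodata_for("uint8", "minimum"))    # 0
print(nodata_for("float64", "maximum"))  # 1.7976931348623157e+308
```

The Float64 maximum printed here is the same 1.7976931348623157E308 value that the promotion policy uses when no larger type is available.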

NoData metadata is not guaranteed to be handled when an input raster is processed. To preserve the NoData value reliably when writing out the raster, either specify the NoData value explicitly in the policy parameter of RT_ToBinary or use the promotion policy.
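When choosing an explicit NoData value, the warning about unmasked pixels that share that value still applies. The sketch below is a simplified model of a write and read round trip, not the library's implementation. The helpers write_with_nodata and read_mask and the value -9999 are hypothetical.

```python
def write_with_nodata(data, mask, nodata):
    """Replace masked (invalid) cells with the NoData value, as a writer would."""
    return [value if keep else nodata for value, keep in zip(data, mask)]


def read_mask(stored, nodata):
    """Rebuild the mask on read: any cell equal to NoData is treated as invalid."""
    return [value != nodata for value in stored]


data = [0, 5, 9, 5]
mask = [True, False, True, True]  # only the second cell is invalid

# NoData value 5 collides with a valid pixel (the last cell).
stored = write_with_nodata(data, mask, nodata=5)
print(read_mask(stored, 5))       # [True, False, True, False]

# A value that does not occur in the valid data round-trips cleanly.
stored = write_with_nodata(data, mask, nodata=-9999)
print(read_mask(stored, -9999))   # [True, False, True, True]
```

In the first case the last cell was valid before writing but comes back masked, which is why a NoData value outside the range of the real data is the safer choice.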

The table below lists the supported pixel types when writing a raster column to a binary column in the specified format:

Format        Supported Pixel Types
GTiff (.tif)  uint1, uint2, uint4, uint8, int8, uint16, int16, uint32, int32, float32, float64
JPEG (.jpg)   uint8, int8
PNG (.png)    uint8, int8, uint16, int16
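One way to use the table above is to check a pixel type against a format before calling the function. The sketch below copies the table into a dictionary. The names SUPPORTED, is_supported, and formats_for are hypothetical helpers and are not part of the GeoAnalytics Engine API.

```python
# Supported pixel types per format, copied from the table above.
SUPPORTED = {
    "gtiff": {"uint1", "uint2", "uint4", "uint8", "int8", "uint16",
              "int16", "uint32", "int32", "float32", "float64"},
    "jpeg": {"uint8", "int8"},
    "png": {"uint8", "int8", "uint16", "int16"},
}


def is_supported(fmt, pixel_type):
    """True if the table lists pixel_type for the given format name."""
    return pixel_type in SUPPORTED.get(fmt.lower(), set())


def formats_for(pixel_type):
    """All formats (sorted by name) that the table lists for pixel_type."""
    return sorted(fmt for fmt, types in SUPPORTED.items() if pixel_type in types)


print(is_supported("JPEG", "int16"))  # False
print(formats_for("uint16"))          # ['gtiff', 'png']
print(formats_for("float32"))         # ['gtiff']
```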

Short pixel type names are also supported. For example, u1 can be used instead of uint1, or f32 instead of float32. Pixel type names are bit-based: u1 uses one bit per pixel, not one byte.
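The bit-based naming can be sketched as follows. The parser accepts the long names and the short forms u and f mentioned above. Accepting i as the short form of int is an assumption, and both helper names are hypothetical. The size computed is only the raw pixel payload, ignoring any row padding, headers, or compression that a real file format adds.

```python
import math
import re

# Long names plus short prefixes; "i" as a short form of "int" is an assumption.
_PIXEL_TYPE = re.compile(r"^(uint|int|float|u|i|f)(\d+)$")


def bits_per_pixel(pixel_type):
    """Read the bit width from a pixel type name such as 'uint1' or 'f32'."""
    match = _PIXEL_TYPE.match(pixel_type.lower())
    if match is None:
        raise ValueError(f"unrecognized pixel type: {pixel_type}")
    return int(match.group(2))


def packed_size_bytes(width, height, pixel_type):
    """Raw payload size in bytes if pixels are packed bit by bit."""
    total_bits = width * height * bits_per_pixel(pixel_type)
    return math.ceil(total_bits / 8)


print(bits_per_pixel("u1"))               # 1
print(packed_size_bytes(2, 2, "u1"))      # 1  (4 bits fit in one byte)
print(packed_size_bytes(2, 2, "int32"))   # 16
```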

Language  Syntax
Python    to_binary(raster_col, format="gtiff", policy=None, compression=None)
SQL       RT_ToBinary(raster_col, format, policy, compression)
Scala     toBinary(raster, format, noDataPolicy, compression)

For more details, go to the GeoAnalytics Engine API reference for to_binary.

Examples

Python

from geoanalytics.raster import functions as RT

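# One row holding a single array column of four pixel values
data = [([1,2,3,4], )]
# Build a 2x2 int32 raster from the pixel array
df = spark.createDataFrame(data, ["pixels"]) \
     .withColumn("raster", RT.create_raster("pixels", 2, 2, "int32"))

# Encode the raster as GeoTIFF bytes using the promotion policy
binary = df.select(RT.to_binary("raster", "gtiff", "promotion").alias("to_binary"))
binary.show()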

# Optional: write binary to file
# gtiff_bytes = binary.collect()[0]["to_binary"]
# with open("example.tif", "wb") as f:
#     f.write(gtiff_bytes)
Result
+--------------------+
|           to_binary|
+--------------------+
|[49 49 2A 00 08 0...|
+--------------------+
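The leading bytes 49 49 2A 00 in the result are the standard little-endian TIFF header ("II" followed by the number 42), which is consistent with the "gtiff" format requested. As a quick sanity check on bytes collected from the binary column, the sketch below compares the start of a byte string against the well-known TIFF, PNG, and JPEG signatures. The function sniff_format is a hypothetical helper, not part of GeoAnalytics Engine.

```python
# Well-known file signatures (magic bytes) for the three supported formats.
SIGNATURES = [
    (b"II*\x00", "GTiff"),          # little-endian TIFF: 49 49 2A 00
    (b"MM\x00*", "GTiff"),          # big-endian TIFF:    4D 4D 00 2A
    (b"\x89PNG\r\n\x1a\n", "PNG"),  # 89 50 4E 47 0D 0A 1A 0A
    (b"\xff\xd8\xff", "JPEG"),      # FF D8 FF
]


def sniff_format(data):
    """Guess the format from the leading bytes, or return None if unknown."""
    for signature, name in SIGNATURES:
        if data.startswith(signature):
            return name
    return None


# The first bytes shown in the result above.
print(sniff_format(bytes.fromhex("49492A0008")))  # GTiff
```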

Version table

Release  Notes
2.0.0    Python, SQL, and Scala functions introduced
