This tutorial will show you how to access the properties and functions of the Python raster class.

The Raster class provides a Python interface for working with raster data in a Spark DataFrame. Its functions and properties let you extract raster properties and band values and create new raster datasets. The Raster class is also compatible with Python UDFs. To be able to use the raster functions, a reference raster should be used.

Prerequisites

To complete the following steps, you will need:

  1. A running Spark session configured with ArcGIS GeoAnalytics Engine 2.0.0 or later.
  2. A notebook connected to your Spark session (for example, Jupyter, JupyterLab, Databricks, or EMR).

Steps

Import

  1. In your notebook, import geoanalytics and authorize the module using a username and password, an API key, or a license file. Also, import the modules required to run the examples below.

    Python
    import geoanalytics
    geoanalytics.auth(username="user1", password="p@ssword")
    
    from geoanalytics.raster import functions as RT
    from geoanalytics.sql import Raster
    import numpy as np
    from pyspark.sql.functions import udf
    import matplotlib.pyplot as plt

Create a PySpark DataFrame and collect the raster object

  1. Create a DataFrame with a 3x3 raster.

    Python
    data = [(list(range(9)), )]
    df = spark.createDataFrame(data, ["pixels"]) \
         .withColumn("raster", RT.srid(RT.create_raster("pixels", 3, 3, "float32"), 4326))
    
    df = df.withColumn("raster", RT.materialize("raster"))
  2. Collect the raster object.

    Python
    raster = df.first().raster
    print(raster)
    Result
    Raster(columns=3, rows=3, bands=1, pixel_type=Float32)

Extract raster properties

  1. Print the available raster properties.

    Python
    print(f"Shape: {(raster.num_bands, raster.num_columns, raster.num_rows)}")
    print(f"Extent: {raster.extent}")
    print(f"Spatial reference: {raster.spatial_reference}")
    print(f"Is reference raster: {raster.is_reference}")
    print(f"Pixel type: {raster.pixel_type}")
    print(f"No data values: {raster.no_data_values}")
    print(f"Colormap values: {raster.colormap_values}")
    print(f"Colormap colors: {raster.colormap_colors}")
    print(f"Attribute table: {raster.attribute_table}")
    Result
    Shape: (1, 3, 3)
    Extent: BoundingBox(min_x=-0.5, min_y=-2.5, max_x=2.5, max_y=0.5)
    Spatial reference: 4326
    Is reference raster: False
    Pixel type: PixelType.Float32
    No data values: [None]
    Colormap values: None
    Colormap colors: None
    Attribute table: None
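
The extent printed above is consistent with a cell size of 1 and the center of the upper-left cell at (0, 0), so each edge sits half a cell beyond the outer cell centers. The following sketch reproduces the bounding box under that assumption. The helper `extent_from_centers` is hypothetical and is not part of geoanalytics.

```python
def extent_from_centers(num_columns, num_rows, cell_size=1.0, origin=(0.0, 0.0)):
    """Return (min_x, min_y, max_x, max_y) for a grid whose upper-left
    cell center is at `origin` (an assumption made for this illustration)."""
    half = cell_size / 2.0
    origin_x, origin_y = origin
    # The left and top edges lie half a cell outside the first cell center.
    min_x = origin_x - half
    max_y = origin_y + half
    # The right and bottom edges follow from the grid dimensions.
    max_x = min_x + num_columns * cell_size
    min_y = max_y - num_rows * cell_size
    return (min_x, min_y, max_x, max_y)

print(extent_from_centers(3, 3))  # (-0.5, -2.5, 2.5, 0.5)
```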

Extract band values

  1. Extract the raster's band values as a list.
    Python
    print(raster.band_values(1))
    Result
    [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
  2. Extract the raster's band values as a NumPy array.
    Python
    band_values = raster.np_band_values(1)
    print(band_values.shape)
    print(band_values)
    Result
    (3, 3)
    [[0. 1. 2.]
     [3. 4. 5.]
     [6. 7. 8.]]
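
Comparing the two results suggests that the flat list is laid out row by row (row-major), so the first three values form the first row of the array. Here is a minimal sketch of that relationship in plain Python, assuming row-major order. The helper `to_rows` is hypothetical and is not part of geoanalytics.

```python
def to_rows(flat_values, num_columns):
    """Split a flat, row-major list of pixel values into rows."""
    if len(flat_values) % num_columns != 0:
        raise ValueError("length must be a multiple of num_columns")
    return [flat_values[i:i + num_columns]
            for i in range(0, len(flat_values), num_columns)]

flat = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
for row in to_rows(flat, 3):
    print(row)
```

With NumPy, `np.array(flat).reshape(3, 3)` produces the same layout, because `reshape` uses row-major order by default.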

Draw

  1. Draw the band values.
    Python
    (_, ax) = plt.subplots(figsize=(5,5))
    ax.imshow(raster.np_band_values(1));
    [Figure: plot of the raster's band values]

Use the Python Raster class with user-defined functions (UDFs)

  1. Use a UDF that returns the maximum band value of the raster object, computed with NumPy.

    Python
    @udf(returnType="double")
    def max_band_value(raster):
        return float(np.max(raster.np_band_values(1)))
    
    df.select(max_band_value("raster")).show()
    Result
    +----------------------+
    |max_band_value(raster)|
    +----------------------+
    |                   8.0|
    +----------------------+
  2. Use a UDF that creates a raster object using the Raster.create() function.

    Python
    print(df.first().raster.np_values())
    
    @udf(returnType=Raster.__UDT__)
    def process_raster(raster):
        values = raster.np_values()
    
        # Add one to every pixel value; values is a NumPy array
        values += 1
    
        # Use Raster.create() to build a new raster object
        return Raster.create(values, raster.extent, raster.spatial_reference)
    
    df = df.select(process_raster("raster").alias("raster"))
    print(df.first().raster.np_values())
    Result
    [[[0. 1. 2.]
      [3. 4. 5.]
      [6. 7. 8.]]]
    [[[1. 2. 3.]
      [4. 5. 6.]
      [7. 8. 9.]]]
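
Because the UDF bodies above are mostly NumPy expressions, their logic can be checked locally before it runs on a cluster. The sketch below does this with a hypothetical stand-in class, `FakeRaster`, which mimics only `np_values` and `np_band_values`. It is not part of geoanalytics, it assumes NumPy is installed, and it treats band numbers as 1-based because the calls above pass 1 for the only band.

```python
import numpy as np

class FakeRaster:
    """Hypothetical stand-in that mimics two Raster methods for local checks."""

    def __init__(self, values):
        # values: array shaped (num_bands, num_rows, num_columns)
        self._values = np.asarray(values)

    def np_values(self):
        # Return a copy so callers can modify the result freely.
        return self._values.copy()

    def np_band_values(self, band):
        # Band numbers are treated as 1-based (an assumption).
        return self._values[band - 1]

def max_band_value_logic(raster):
    # Same expression as the first UDF body, without the @udf decorator.
    return float(np.max(raster.np_band_values(1)))

def add_one_logic(raster):
    # The array arithmetic from the second UDF, without Raster.create().
    values = raster.np_values()
    values += 1
    return values

fake = FakeRaster(np.arange(9, dtype="float32").reshape(1, 3, 3))
print(max_band_value_logic(fake))
print(add_one_logic(fake))
```

Keeping the array logic in plain functions like these makes it easy to assert on before wrapping the same expressions in `@udf`.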

Create a new raster object

  1. Pass a NumPy array to Raster.create() to create a new raster, then use spark.createDataFrame to place it in a DataFrame.

    Python
    data_ndarray = np.array([1,2,3,4,5,6,7,8,9], dtype="uint8").reshape(3,3)
    
    spark.createDataFrame([(Raster.create(data_ndarray, extent=(10, 10, 20, 20), sr=4326), )], ["raster"])\
         .select(RT.info("raster").alias("info")).select("info.*")\
         .show()
    Result
    +----------+-------+--------+------------------+------------------+---------+----+--------------------+--------------------+
    |numColumns|numRows|numBands|         cellSizeX|         cellSizeY|pixelType|srid|              srText|              extent|
    +----------+-------+--------+------------------+------------------+---------+----+--------------------+--------------------+
    |         3|      3|       1|3.3333333333333335|3.3333333333333335|    UInt8|4326|GEOGCS["GCS_WGS_1...|[10.0, 10.0, 20.0...|
    +----------+-------+--------+------------------+------------------+---------+----+--------------------+--------------------+
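
The cellSizeX and cellSizeY values in the result are consistent with dividing the extent's width and height by the number of columns and rows. Here is a quick check of that arithmetic, assuming the extent tuple is ordered (min_x, min_y, max_x, max_y):

```python
# Extent and dimensions taken from the example above.
min_x, min_y, max_x, max_y = (10, 10, 20, 20)
num_columns, num_rows = 3, 3

cell_size_x = (max_x - min_x) / num_columns
cell_size_y = (max_y - min_y) / num_rows
print(cell_size_x, cell_size_y)  # 3.3333333333333335 3.3333333333333335
```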

What's next?

Learn how to analyze your data through raster functions and tools:
