Read from feature services

Feature services are data sources that are hosted online. This tutorial shows how to read from and manage feature service datasets. You can create Spark DataFrames from feature service data sources and use them with any operations supported on a DataFrame.

In this tutorial you will learn how to access public and protected feature services. You will create DataFrames from feature services and perform basic queries.

Steps

Import

  1. In your notebook, import geoanalytics and authorize the module using a username and password, or a license file.

    import geoanalytics
    geoanalytics.auth(username="user1", password="p@ssword")

Read from a public feature service

Read a public feature service into a DataFrame and query for countries where the average population of the listed cities is greater than 50,000.

  1. Read a feature service containing major world cities and create a DataFrame.

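    # URL of layer 0 in the public World Cities feature service
    myFS = "https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/World_Cities/FeatureServer/0"
    # Load the layer into a Spark DataFrame
    myFSDataFrame = spark.read.format("feature-service").load(myFS)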
  2. Group the DataFrame by country and find the average population per country.

    group = myFSDataFrame.selectExpr("CNTRY_NAME", "POP").groupBy("CNTRY_NAME").avg("POP")
  3. Query for countries where the average population of major cities is greater than 50,000.

    group.where("avg(POP) > 50000").show()
    Result
    +-------------+------------------+
    |   CNTRY_NAME|          avg(POP)|
    +-------------+------------------+
    |       Brazil|1536742.7333333334|
    |    Argentina|1014038.2222222222|
    |         Peru|          614617.5|
    |      Bolivia| 487888.8888888889|
    |        Chile| 684300.4285714285|
    |      Ecuador|         276217.55|
    |     Colombia|      627823.90625|
    |      Uruguay| 72796.57894736843|
    |United States| 577661.3978494623|
    |       Canada| 548105.2307692308|
    |       Mexico| 1670713.138888889|
    |    Guatemala|           75000.0|
    |         Cuba|          317952.0|
    |   Costa Rica| 172696.2857142857|
    |       Panama|           52301.2|
    |    Venezuela|        369595.625|
    |    Nicaragua|           74500.0|
    |     Honduras|111499.16666666667|
    |  El Salvador| 189369.7142857143|
    |        Haiti| 319333.3333333333|
    +-------------+------------------+
    only showing top 20 rows
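
To make the semantics of steps 2 and 3 explicit, the same group, average, and filter logic can be sketched in plain Python. This is only an illustration, not how Spark executes the query: the sample rows and the helper names `average_pop_by_country` and `countries_above` are hypothetical and are not taken from the World Cities layer.

```python
from collections import defaultdict

# Hypothetical sample rows using the same column names as the tutorial.
rows = [
    {"CNTRY_NAME": "Country A", "POP": 40000},
    {"CNTRY_NAME": "Country A", "POP": 80000},
    {"CNTRY_NAME": "Country B", "POP": 30000},
    {"CNTRY_NAME": "Country B", "POP": 50000},
]


def average_pop_by_country(rows):
    """Mirror groupBy("CNTRY_NAME").avg("POP") on a list of dicts."""
    totals = defaultdict(lambda: [0, 0])  # country -> [sum, count]
    for row in rows:
        if row["POP"] is None:
            continue  # skip missing populations instead of counting them
        acc = totals[row["CNTRY_NAME"]]
        acc[0] += row["POP"]
        acc[1] += 1
    return {country: total / count for country, (total, count) in totals.items()}


def countries_above(averages, threshold):
    """Mirror where("avg(POP) > threshold") on the averaged result."""
    return {country: avg for country, avg in averages.items() if avg > threshold}


averages = average_pop_by_country(rows)
large = countries_above(averages, 50000)
print(large)
```

Only Country A passes the filter here, because its two cities average 60,000 while Country B's average 40,000.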

Read from a protected feature service

Protected feature services can be read either by registering a GIS with the register_gis function or by passing a token. The steps below show how to access subscriber content in ArcGIS Living Atlas of the World with an organizational subscription account.

Register a GIS to get access to a secured layer

  1. Register a GIS under a name of your choice (for example, myGIS). The name is used as a reference when loading feature service layers. Pass the username and password of your ArcGIS Online account to log in.

    geoanalytics.register_gis("myGIS", "https://arcgis.com", username="User", password="p@ssw0rd")
  2. Read in a service using the registered GIS to create a DataFrame. After loading the secured feature service layer into a DataFrame, the data can be used for further analysis using GeoAnalytics tools and functions.

    # Example layer: United States ZIP Code Boundaries 2021
    url = r"https://services.arcgis.com/P3ePLMYs2RVChkJx/arcgis/rest/services/USA_Boundaries_2021/FeatureServer/0"
    df = spark.read.format("feature-service") \
              .option("gis", "myGIS") \
              .load(url)
  3. Unregister a GIS. You can unregister the GIS after the feature service layer is loaded.

    geoanalytics.unregister_gis("myGIS")
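
The register, load, and unregister steps above can be wrapped so that the GIS is always unregistered, even if loading fails. The sketch below is one possible way to do this. The helper `registered_gis` is hypothetical and not part of the geoanalytics module. It takes the register and unregister functions as arguments and assumes they accept the arguments shown in the steps above.

```python
from contextlib import contextmanager


@contextmanager
def registered_gis(register, unregister, name, url, **credentials):
    """Register a GIS on entry and unregister it on exit (hypothetical helper)."""
    register(name, url, **credentials)
    try:
        yield name
    finally:
        # Runs whether or not the body raised an exception.
        unregister(name)


# Assumed usage with the calls from this tutorial (not executed here):
#
# with registered_gis(geoanalytics.register_gis, geoanalytics.unregister_gis,
#                     "myGIS", "https://arcgis.com",
#                     username="User", password="p@ssw0rd") as gis_name:
#     df = spark.read.format("feature-service") \
#               .option("gis", gis_name) \
#               .load(url)
```

The tutorial states only that the GIS can be unregistered after the layer is loaded. If you are unsure whether later actions on the DataFrame still need the registration, run them inside the `with` block.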

Use a token to get access to a secured layer

  1. Read a service using a token. The URL and token below are examples only. Replace them with your feature service URL and a valid token from your organization.

    # Example layer: United States ZIP Code Boundaries 2021
    url = r"https://services.arcgis.com/P3ePLMYs2RVChkJx/arcgis/rest/services/USA_Boundaries_2021/FeatureServer/0"
    token = 'ABC123deFghIJKlmNOPQrs456'
    df = spark.read.format('feature-service') \
              .option('token', token) \
              .load(url)
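
Because a token grants access to secured content, it may be preferable to keep it out of the notebook source. The sketch below reads it from an environment variable instead. The variable name `ARCGIS_TOKEN` and the helper `read_token` are assumptions made for illustration and are not defined by the geoanalytics module.

```python
import os

TOKEN_ENV_VAR = "ARCGIS_TOKEN"  # hypothetical variable name, choose your own


def read_token(env=os.environ, var=TOKEN_ENV_VAR):
    """Return the token stored in the environment, or fail with a clear message."""
    token = env.get(var, "").strip()
    if not token:
        raise RuntimeError(f"Set {var} to a valid token before reading the layer")
    return token


# Assumed usage with the read call from this tutorial (not executed here):
#
# df = spark.read.format("feature-service") \
#           .option("token", read_token()) \
#           .load(url)
```

Failing early on a missing token gives a clearer error than a rejected request to the service.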

What's next?

Learn how to read other data types or analyze your data with SQL functions and analysis tools.
