Read from feature services

Feature services are data sources hosted online. This tutorial shows how to read feature service datasets into Spark DataFrames and work with them using any operation supported on a DataFrame. Writing to feature services is not supported.

In this tutorial you will learn how to access public and protected feature services. You will create DataFrames from feature services and perform basic queries.

Steps

Read from a public feature service

Read a public feature service into a DataFrame and query for countries where the average population of the listed cities is greater than 50,000.

  1. Read a feature service containing major world cities and create a DataFrame.

    Python
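    import geoanalytics

    # Layer 0 of the public World_Cities feature service
    myFS = "https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/World_Cities/FeatureServer/0"
    # Assumes an active SparkSession named `spark`
    myFSDataFrame = spark.read.format('feature-service').load(myFS)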
    
  2. Group the DataFrame by country and find the average population per country.

    Python
    group = myFSDataFrame.selectExpr("CNTRY_NAME", "POP").groupBy("CNTRY_NAME").avg("POP")
    
  3. Query for countries where the average population of major cities is greater than 50,000.

    Python
    group.where("avg(POP) > 50000").show()
    
    Result
    +-----------+------------------+
    | CNTRY_NAME|          avg(POP)|
    +-----------+------------------+
    |       Chad|108714.28571428571|
    |     Russia| 533285.3092783506|
    |      Yemen|389769.23076923075|
    |    Senegal|          207347.7|
    |     Sweden| 66458.33333333333|
    |   Kiribati|           63017.0|
    |Philippines|2116418.3333333335|
    |   Malaysia|       310692.3125|
    |  Singapore|         5703569.0|
    |     Turkey| 408656.7164179105|
    |     Malawi| 670284.6666666666|
    |       Iraq| 817833.3333333334|
    |    Germany| 660316.9166666666|
    |Afghanistan| 142620.6896551724|
    |   Cambodia|108777.77777777778|
    |     Jordan|         892820.25|
    |     Rwanda|           85933.2|
    |      Sudan| 861142.8571428572|
    |     France| 279596.6538461539|
    |     Greece|122750.53846153847|
    +-----------+------------------+
    only showing top 20 rows
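
To make the logic of steps 2 and 3 concrete, here is a minimal plain-Python sketch of the same group, average, and filter sequence. It does not use Spark or the feature service; the `rows` list and its country names are hypothetical sample data invented for illustration only.

```python
from collections import defaultdict

# Hypothetical sample rows of (country, city population); not taken from the service.
rows = [
    ("Atlantis", 120_000),
    ("Atlantis", 40_000),
    ("Borduria", 30_000),
    ("Borduria", 20_000),
    ("Costaguana", 75_000),
]

# Step 2 equivalent: collect city populations per country, then average them.
populations = defaultdict(list)
for country, pop in rows:
    populations[country].append(pop)
avg_pop = {country: sum(pops) / len(pops) for country, pops in populations.items()}

# Step 3 equivalent: keep only countries whose average exceeds 50,000.
large = {country: avg for country, avg in avg_pop.items() if avg > 50_000}
print(large)
```

Note that the filter applies to the per-country average, not to individual cities, so a country can qualify even if some of its listed cities are below the threshold.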
    

Read from a protected feature service

Feature services that are protected can be read by passing a token.

  1. Read a protected service by passing a token. The URL and token below are examples only; replace them with your own feature service URL and a valid token from your organization.
    Python
    myProtectedFS='https://myserver.domain.com/server/rest/services/Hosted/counties/FeatureServer/0'
    token = 'ABC123deFghIJKlmNOPQrs456'
    myProtectedFSDataFrame = spark.read.format('feature-service').option('token', token).load(myProtectedFS)
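
Hard-coding a token in a script or notebook makes it easy to leak. One possible alternative, sketched below, reads the token from an environment variable and builds the reader options from it. The helper name `feature_service_options` and the variable name `FS_TOKEN` are assumptions made for this illustration, not part of the GeoAnalytics API.

```python
import os


def feature_service_options(env_var="FS_TOKEN"):
    """Build reader options from an environment variable (hypothetical helper)."""
    token = os.environ.get(env_var)
    if not token:
        # Fail early with a clear message instead of sending an empty token.
        raise RuntimeError(f"Set {env_var} to a valid token before reading the service")
    return {"token": token}


# Usage, assuming an active `spark` session and the myProtectedFS URL from the step above:
#   opts = feature_service_options()
#   df = spark.read.format("feature-service").options(**opts).load(myProtectedFS)
```

The returned dictionary is passed through PySpark's `DataFrameReader.options`, which has the same effect as the single `.option('token', token)` call shown in the step.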

What's next?

Learn how to read in other data types or analyze your data through SQL functions and analysis tools.
