Reading Feature Services

ArcGIS Online and Enterprise web services can easily be read into R usingarcgislayers. Supported service types include:

Metadata for all of the above service types can be accessed using arc_open(). Feature data can be read in using arc_select() for FeatureLayer, Table, and ImageServer.

This tutorial will teach you the basics of reading data from hosted Feature Layers into R as {sf} objects usingarcgislayers.

Objective

The objective of this tutorial is to teach you how to:

  • find a Feature Layer url from ArcGIS Online
  • read in the data from the Feature Layer
  • select the Feature Layer data by column
  • filter the Feature Layer data by attributes

Obtaining a feature layer url

For this example, you will read in population data of major US cities from ArcGIS Online.

You will use the functions arc_open() and arc_select() to read data from ArcGIS Online into R. However, these functions require the url of the hosted feature service. To find this, navigate to the item in ArcGIS Online.

usa cities

When you scroll down, on the right hand side, you will see a button to view the service itself.

view url

Clicking this will bring you to the Feature Service. Inside of a Feature Server there may be many layers or tables that you can use. In this case, there is only one layer. Click the hyperlinked USA Major Cities.

usa cities server

This reveals the Feature Layer of interest.

usa cities layer

Navigate to your browser’s search bar and copy the url.

https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/USA_Major_Cities_/FeatureServer/0

Opening a Feature Layer

Before you can read in the Feature Layer, you need to load the arcgis R package. If you do not have arcgis installed, install it with pak::pak("r-arcgis/arcgis") or install.packages("arcgis").

pak is an R package that makes it faster and easier to install R packages. If you do not have it installed, run install.packages("pak") first.

library(arcgis)

Use the below code to store the Feature Layer url in an object called furl (as in feature layer url).

furl <- "https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/USA_Major_Cities_/FeatureServer/0"

Then pass this variable to arc_open() and save it to flayer (feature layer).

flayer <- arc_open(furl)
flayer
#> <FeatureLayer>
#> Name: USA Major Cities
#> Geometry Type: esriGeometryPoint
#> CRS: 4326
#> Capabilities: Query,Extract

arc_open() will create a FeatureLayer object. Under the hood, this is really just a list containing the feature layer’s metadata.

The FeatureLayer object is obtained by adding ?f=json to the feature layer url and processing the json. All of the metadata is stored in the FeatureLayer object. You can see this by running unclass(flayer). Be warned! It gets messy.

With this FeatureLayer object, you can read data from the service into R!

Reading from a Feature Layer

Once you have a FeatureLayer object, you can read its data into memory using the arc_select() function. By default, if you use arc_select() on a FeatureLayer without any additional arguments, the entire service will be brought into memory.

Avoid reading in more data than you need! Reading an entire feature service is fine for datasets with fewer than 5,000 features. But when there are more than 10,000 features, performance and memory may be throttled.

Exceptionally detailed geometries require more data to be transferred across the web and may be slower to process and may require adjustment of the page_size argument of arc_select().

Store the results of arc_select() in the object cities.

cities <- arc_select(flayer)
cities
#> Simple feature collection with 4186 features and 11 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -159.3191 ymin: 19.58272 xmax: -68.67922 ymax: 64.86928
#> Geodetic CRS:  WGS 84
#> First 10 features:
#>    OBJECTID           NAME CLASS STATE_ABBR STATE_FIPS PLACE_FIPS POPULATION POP_CLASS POP_SQMI   SQMI CAPITAL
#> 1         1      Alabaster  city         AL         01    0100820      33284         6   1300.7  25.59        
#> 2         2    Albertville  city         AL         01    0100988      22386         6    827.9  27.04        
#> 3         3 Alexander City  city         AL         01    0101132      14843         6    337.4  43.99        
#> 4         4       Anniston  city         AL         01    0101852      21564         6    469.9  45.89        
#> 5         5         Athens  city         AL         01    0102956      25406         6    625.8  40.60        
#> 6         6         Atmore  city         AL         01    0103004       8391         5    382.5  21.94        
#> 7         7         Auburn  city         AL         01    0103076      76143         7   1234.5  61.68        
#> 8         8       Bessemer  city         AL         01    0105980      26019         6    641.8  40.54        
#> 9         9     Birmingham  city         AL         01    0107000     200733         8   1342.2 149.55        
#> 10       10         Calera  city         AL         01    0111416      16494         6    674.0  24.47        
#>                      geometry
#> 1   POINT (-86.81782 33.2445)
#> 2  POINT (-86.21205 34.26421)
#> 3  POINT (-85.95631 32.94309)
#> 4   POINT (-85.81986 33.6565)
#> 5   POINT (-86.9508 34.78484)
#> 6  POINT (-87.49009 31.02226)
#> 7  POINT (-85.48999 32.60691)
#> 8   POINT (-86.9563 33.40092)
#> 9   POINT (-86.79647 33.5288)
#> 10  POINT (-86.74549 33.1244)

The result is an sf object that you can now work with using sf and any other R packages.

Specifying output fields

In some cases, you may have Feature Layers with many extraneous fields. You can specify which fields to return to R using the fields argument.

Remember to only read in the data that you need. Adding unneeded fields uses more memory and takes longer to process.

fields takes a character vector of field names. To see which fields are available in a Feature Layer, you can use the utility function list_fields().

fields <- list_fields(flayer)
fields[, 1:4]
#> # A data frame: 11 × 4
#>    name       type                      alias                  sqlType     
#>  * <chr>      <chr>                     <chr>                  <chr>       
#>  1 OBJECTID   esriFieldTypeOID          OBJECTID               sqlTypeOther
#>  2 NAME       esriFieldTypeString       Name                   sqlTypeOther
#>  3 CLASS      esriFieldTypeString       Class                  sqlTypeOther
#>  4 STATE_ABBR esriFieldTypeString       State Abbreviation     sqlTypeOther
#>  5 STATE_FIPS esriFieldTypeString       State FIPS             sqlTypeOther
#>  6 PLACE_FIPS esriFieldTypeString       Place FIPS             sqlTypeOther
#>  7 POPULATION esriFieldTypeInteger      2020 Total Population  sqlTypeOther
#>  8 POP_CLASS  esriFieldTypeSmallInteger Population Class       sqlTypeOther
#>  9 POP_SQMI   esriFieldTypeDouble       People per square mile sqlTypeOther
#> 10 SQMI       esriFieldTypeDouble       Area in square miles   sqlTypeOther
#> 11 CAPITAL    esriFieldTypeString       Capital                sqlTypeOther

For the sake of readability, only the first 4 columns are displayed.

Let’s try reading in only the "STATE_ABBR", "POPULATION", and "NAME" fields.

arc_select(
  flayer, 
  fields = c("STATE_ABBR", "POPULATION", "NAME")
)
#> Simple feature collection with 4186 features and 3 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -159.3191 ymin: 19.58272 xmax: -68.67922 ymax: 64.86928
#> Geodetic CRS:  WGS 84
#> First 10 features:
#>    STATE_ABBR POPULATION           NAME                   geometry
#> 1          AL      33284      Alabaster  POINT (-86.81782 33.2445)
#> 2          AL      22386    Albertville POINT (-86.21205 34.26421)
#> 3          AL      14843 Alexander City POINT (-85.95631 32.94309)
#> 4          AL      21564       Anniston  POINT (-85.81986 33.6565)
#> 5          AL      25406         Athens  POINT (-86.9508 34.78484)
#> 6          AL       8391         Atmore POINT (-87.49009 31.02226)
#> 7          AL      76143         Auburn POINT (-85.48999 32.60691)
#> 8          AL      26019       Bessemer  POINT (-86.9563 33.40092)
#> 9          AL     200733     Birmingham  POINT (-86.79647 33.5288)
#> 10         AL      16494         Calera  POINT (-86.74549 33.1244)

Using SQL where clauses

Not only can you limit the number of columns returned from a Feature Layer, but you can also limit the number of rows returned. This is very handy in the case of Feature Layers with hundreds of thousands of features. Reading all of those features into memory would be slow, costly (in terms of memory), and, in many cases, unnecessary!

The where argument of arc_select() permits you to provide a very simple SQL where clause to limit the features returned. Let’s explore the use of the where argument.

Let’s modify the above arc_select() statement to return only the features in California, using the where clause STATE_ABBR = 'CA'

arc_select(
  flayer,
  where = "STATE_ABBR = 'CA'",
  fields = c("STATE_ABBR", "POPULATION", "NAME")
)
#> Simple feature collection with 498 features and 3 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -124.1662 ymin: 32.57388 xmax: -114.5903 ymax: 40.93734
#> Geodetic CRS:  WGS 84
#> First 10 features:
#>    STATE_ABBR POPULATION         NAME                   geometry
#> 1          CA      38046     Adelanto  POINT (-117.4384 34.5792)
#> 2          CA      20299 Agoura Hills POINT (-118.7601 34.15363)
#> 3          CA      78280      Alameda  POINT (-122.2614 37.7672)
#> 4          CA      15314        Alamo POINT (-122.0307 37.84998)
#> 5          CA      20271       Albany POINT (-122.3002 37.88985)
#> 6          CA      82868     Alhambra POINT (-118.1355 34.08398)
#> 7          CA      52176  Aliso Viejo POINT (-117.7289 33.57922)
#> 8          CA      14696       Alpine POINT (-116.7585 32.84388)
#> 9          CA      42846     Altadena POINT (-118.1356 34.19342)
#> 10         CA      12042    Alum Rock  POINT (-121.8239 37.3694)

You can also consider finding only the places in the US with more than 1,000,000 people.

arc_select(
  flayer,
  where = "POPULATION > 1000000",
  fields = c("STATE_ABBR", "POPULATION", "NAME")
)
#> Simple feature collection with 10 features and 3 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -121.8864 ymin: 29.42354 xmax: -74.01013 ymax: 41.75649
#> Geodetic CRS:  WGS 84
#>    STATE_ABBR POPULATION         NAME                   geometry
#> 1          AZ    1608139      Phoenix POINT (-112.0739 33.44611)
#> 2          CA    3898747  Los Angeles POINT (-118.2706 34.05279)
#> 3          CA    1386932    San Diego POINT (-117.1456 32.72033)
#> 4          CA    1013240     San Jose POINT (-121.8864 37.33941)
#> 5          IL    2746388      Chicago POINT (-87.64715 41.75649)
#> 6          NY    8804190     New York POINT (-74.01013 40.71057)
#> 7          PA    1603797 Philadelphia POINT (-75.16099 39.95136)
#> 8          TX    1304379       Dallas POINT (-96.79576 32.77865)
#> 9          TX    2304580      Houston POINT (-95.36751 29.75876)
#> 10         TX    1434625  San Antonio  POINT (-98.4925 29.42354)

Now try combining both where clauses using and to find only the cities in California with a population greater than 1,000,000.

arc_select(
  flayer,
  where = "POPULATION > 1000000 and STATE_ABBR = 'CA'",
  fields = c("STATE_ABBR", "POPULATION", "NAME")
)
#> Simple feature collection with 3 features and 3 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -121.8864 ymin: 32.72033 xmax: -117.1456 ymax: 37.33941
#> Geodetic CRS:  WGS 84
#>   STATE_ABBR POPULATION        NAME                   geometry
#> 1         CA    3898747 Los Angeles POINT (-118.2706 34.05279)
#> 2         CA    1386932   San Diego POINT (-117.1456 32.72033)
#> 3         CA    1013240    San Jose POINT (-121.8864 37.33941)

Map and Feature Servers

This example has only illustrated how to work with FeatureLayer objects. However, often times you may wish to work with a collection of layers in a FeatureServer, MapServer, or GroupLayer. All of these are collections of multiple layers. Like a FeatureLayer, these are accessed with arc_open().

furl <- "https://services3.arcgis.com/ZvidGQkLaDJxRSJ2/arcgis/rest/services/PLACES_LocalData_for_BetterHealth/FeatureServer"

fsrv <- arc_open(furl)
fsrv
#> <FeatureServer <5 layers, 0 tables>>
#> CRS: 3785
#> Capabilities: Query,Extract
#>   0: PlacePoints (esriGeometryPoint)
#>   1: PlaceBoundaries (esriGeometryPolygon)
#>   2: Counties (esriGeometryPolygon)
#>   3: Tracts (esriGeometryPolygon)
#>   4: ZCTAs (esriGeometryPolygon)

This FeatureServer contains 5 layers. The individual layers can be fetched using get_layer() which lets us specify the layer by ID or by name. It is recommended to use the ID as that will be less prone to human error (for example a space is secretly a tab). The result of the function is a FeatureLayer object that can be used with arc_select() as illustrated above.

get_layer(fsrv, id = 2)
#> <FeatureLayer>
#> Name: Counties
#> Geometry Type: esriGeometryPolygon
#> CRS: 3785
#> Capabilities: Query,Extract

Some FeatureServers will also contain tables.

furl <- "https://services.arcgis.com/P3ePLMYs2RVChkJx/arcgis/rest/services/USA_Wetlands/FeatureServer"
fsrv2 <- arc_open(furl)
fsrv2
#> <FeatureServer <1 layer, 1 table>>
#> CRS: 3857
#> Capabilities: Query,Extract
#>   0: USA_Wetlands (esriGeometryPolygon)
#>   1: Pop_Up_Table (Table)

This can be fetched using get_layer() as well.

get_layer(fsrv2, 1)
#> <Table>
#> Name: Pop_Up_Table
#> Capabilities: Query,Extract

If you would like to fetch multiple items at one time there is a plural get_layers() which will fetch multiple items based on name or id and return a list.

get_layers(fsrv, id = c(0, 2, 4))
#> [[1]]
#> <FeatureLayer>
#> Name: PlacePoints
#> Geometry Type: esriGeometryPoint
#> CRS: 3785
#> Capabilities: Query,Extract
#> 
#> [[2]]
#> <FeatureLayer>
#> Name: Counties
#> Geometry Type: esriGeometryPolygon
#> CRS: 3785
#> Capabilities: Query,Extract
#> 
#> [[3]]
#> <FeatureLayer>
#> Name: ZCTAs
#> Geometry Type: esriGeometryPolygon
#> CRS: 3785
#> Capabilities: Query,Extract

There is also a helper get_all_layers() to fetch all of layers of a FeatureServer, MapServer, or GroupLayer into a list. The list has two elements layers and tables. The former containing all of the FeatureLayers and the latter containing all of the Tables in the FeatureServer.

get_all_layers(fsrv2)
#> $layers
#> $layers$`0`
#> <FeatureLayer>
#> Name: USA_Wetlands
#> Geometry Type: esriGeometryPolygon
#> CRS: 3857
#> Capabilities: Query,Extract
#> 
#> 
#> $tables
#> $tables$`1`
#> <Table>
#> Name: Pop_Up_Table
#> Capabilities: Query,Extract

Your browser is no longer supported. Please upgrade your browser for the best experience. See our browser deprecation post for more details.