ArcGIS Online and Enterprise web services can easily be read into R usingarcgislayers. Supported service types include:
Metadata for all of the above service types can be accessed using arc_open()
. Feature data can be read in using arc_select()
for FeatureLayer, Table, and ImageServer.
This tutorial will teach you the basics of reading data from hosted Feature Layers into R as {sf}
objects usingarcgislayers.
Objective
The objective of this tutorial is to teach you how to:
- find a Feature Layer url from ArcGIS Online
- read in the data from the Feature Layer
- select the Feature Layer data by column
- filter the Feature Layer data by attributes
Obtaining a feature layer url
For this example, you will read in population data of major US cities from ArcGIS Online.
You will use the functions arc_open()
and arc_select()
to read data from ArcGIS Online into R. However, these functions require the url of the hosted feature service. To find this, navigate to the item in ArcGIS Online.
When you scroll down, on the right hand side, you will see a button to view the service itself.
Clicking this will bring you to the Feature Service. Inside of a Feature Server there may be many layers or tables that you can use. In this case, there is only one layer. Click the hyperlinked USA Major Cities.
This reveals the Feature Layer of interest.
Navigate to your browser’s search bar and copy the url.
https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/USA_Major_Cities_/FeatureServer/0
Opening a Feature Layer
Before you can read in the Feature Layer, you need to load the arcgis
R package. If you do not have arcgis
installed, install it with pak::pak("r-arcgis/arcgis")
or install.packages("arcgis")
.
pak is an R package that makes it faster and easier to install R packages. If you do not have it installed, run install.packages("pak")
first.
library(arcgis)
Use the below code to store the Feature Layer url in an object called furl
(as in feature layer url).
furl <- "https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/USA_Major_Cities_/FeatureServer/0"
Then pass this variable to arc_open()
and save it to flayer
(feature layer).
flayer <- arc_open(furl) flayer
#> <FeatureLayer> #> Name: USA Major Cities #> Geometry Type: esriGeometryPoint #> CRS: 4326 #> Capabilities: Query,Extract
arc_open()
will create a FeatureLayer
object. Under the hood, this is really just a list containing the feature layer’s metadata.
The FeatureLayer
object is obtained by adding ?f=json
to the feature layer url and processing the json. All of the metadata is stored in the FeatureLayer
object. You can see this by running unclass(flayer)
. Be warned! It gets messy.
With this FeatureLayer
object, you can read data from the service into R!
Reading from a Feature Layer
Once you have a FeatureLayer
object, you can read its data into memory using the arc_select()
function. By default, if you use arc_select()
on a FeatureLayer
without any additional arguments, the entire service will be brought into memory.
Avoid reading in more data than you need! Reading an entire feature service is fine for datasets with fewer than 5,000 features. But when there are more than 10,000 features, performance and memory may be throttled.
Exceptionally detailed geometries require more data to be transferred across the web and may be slower to process and may require adjustment of the page_size
argument of arc_select()
.
Store the results of arc_select()
in the object cities
.
cities <- arc_select(flayer) cities
#> Simple feature collection with 4186 features and 11 fields #> Geometry type: POINT #> Dimension: XY #> Bounding box: xmin: -159.3191 ymin: 19.58272 xmax: -68.67922 ymax: 64.86928 #> Geodetic CRS: WGS 84 #> First 10 features: #> OBJECTID NAME CLASS STATE_ABBR STATE_FIPS PLACE_FIPS POPULATION POP_CLASS POP_SQMI SQMI CAPITAL #> 1 1 Alabaster city AL 01 0100820 33284 6 1300.7 25.59 #> 2 2 Albertville city AL 01 0100988 22386 6 827.9 27.04 #> 3 3 Alexander City city AL 01 0101132 14843 6 337.4 43.99 #> 4 4 Anniston city AL 01 0101852 21564 6 469.9 45.89 #> 5 5 Athens city AL 01 0102956 25406 6 625.8 40.60 #> 6 6 Atmore city AL 01 0103004 8391 5 382.5 21.94 #> 7 7 Auburn city AL 01 0103076 76143 7 1234.5 61.68 #> 8 8 Bessemer city AL 01 0105980 26019 6 641.8 40.54 #> 9 9 Birmingham city AL 01 0107000 200733 8 1342.2 149.55 #> 10 10 Calera city AL 01 0111416 16494 6 674.0 24.47 #> geometry #> 1 POINT (-86.81782 33.2445) #> 2 POINT (-86.21205 34.26421) #> 3 POINT (-85.95631 32.94309) #> 4 POINT (-85.81986 33.6565) #> 5 POINT (-86.9508 34.78484) #> 6 POINT (-87.49009 31.02226) #> 7 POINT (-85.48999 32.60691) #> 8 POINT (-86.9563 33.40092) #> 9 POINT (-86.79647 33.5288) #> 10 POINT (-86.74549 33.1244)
The result is an sf
object that you can now work with using sf
and any other R packages.
Specifying output fields
In some cases, you may have Feature Layers with many extraneous fields. You can specify which fields to return to R using the fields
argument.
Remember to only read in the data that you need. Adding unneeded fields uses more memory and takes longer to process.
fields
takes a character vector of field names. To see which fields are available in a Feature Layer, you can use the utility function list_fields()
.
fields <- list_fields(flayer) fields[, 1:4]
#> # A data frame: 11 × 4 #> name type alias sqlType #> * <chr> <chr> <chr> <chr> #> 1 OBJECTID esriFieldTypeOID OBJECTID sqlTypeOther #> 2 NAME esriFieldTypeString Name sqlTypeOther #> 3 CLASS esriFieldTypeString Class sqlTypeOther #> 4 STATE_ABBR esriFieldTypeString State Abbreviation sqlTypeOther #> 5 STATE_FIPS esriFieldTypeString State FIPS sqlTypeOther #> 6 PLACE_FIPS esriFieldTypeString Place FIPS sqlTypeOther #> 7 POPULATION esriFieldTypeInteger 2020 Total Population sqlTypeOther #> 8 POP_CLASS esriFieldTypeSmallInteger Population Class sqlTypeOther #> 9 POP_SQMI esriFieldTypeDouble People per square mile sqlTypeOther #> 10 SQMI esriFieldTypeDouble Area in square miles sqlTypeOther #> 11 CAPITAL esriFieldTypeString Capital sqlTypeOther
For the sake of readability, only the first 4 columns are displayed.
Let’s try reading in only the "STATE_ABBR"
, "POPULATION"
, and "NAME"
fields.
arc_select( flayer, fields = c("STATE_ABBR", "POPULATION", "NAME") )
#> Simple feature collection with 4186 features and 3 fields #> Geometry type: POINT #> Dimension: XY #> Bounding box: xmin: -159.3191 ymin: 19.58272 xmax: -68.67922 ymax: 64.86928 #> Geodetic CRS: WGS 84 #> First 10 features: #> STATE_ABBR POPULATION NAME geometry #> 1 AL 33284 Alabaster POINT (-86.81782 33.2445) #> 2 AL 22386 Albertville POINT (-86.21205 34.26421) #> 3 AL 14843 Alexander City POINT (-85.95631 32.94309) #> 4 AL 21564 Anniston POINT (-85.81986 33.6565) #> 5 AL 25406 Athens POINT (-86.9508 34.78484) #> 6 AL 8391 Atmore POINT (-87.49009 31.02226) #> 7 AL 76143 Auburn POINT (-85.48999 32.60691) #> 8 AL 26019 Bessemer POINT (-86.9563 33.40092) #> 9 AL 200733 Birmingham POINT (-86.79647 33.5288) #> 10 AL 16494 Calera POINT (-86.74549 33.1244)
Using SQL where clauses
Not only can you limit the number of columns returned from a Feature Layer, but you can also limit the number of rows returned. This is very handy in the case of Feature Layers with hundreds of thousands of features. Reading all of those features into memory would be slow, costly (in terms of memory), and, in many cases, unnecessary!
The where
argument of arc_select()
permits you to provide a very simple SQL where clause to limit the features returned. Let’s explore the use of the where
argument.
Let’s modify the above arc_select()
statement to return only the features in California, using the where clause STATE_ABBR = 'CA'
arc_select( flayer, where = "STATE_ABBR = 'CA'", fields = c("STATE_ABBR", "POPULATION", "NAME") )
#> Simple feature collection with 498 features and 3 fields #> Geometry type: POINT #> Dimension: XY #> Bounding box: xmin: -124.1662 ymin: 32.57388 xmax: -114.5903 ymax: 40.93734 #> Geodetic CRS: WGS 84 #> First 10 features: #> STATE_ABBR POPULATION NAME geometry #> 1 CA 38046 Adelanto POINT (-117.4384 34.5792) #> 2 CA 20299 Agoura Hills POINT (-118.7601 34.15363) #> 3 CA 78280 Alameda POINT (-122.2614 37.7672) #> 4 CA 15314 Alamo POINT (-122.0307 37.84998) #> 5 CA 20271 Albany POINT (-122.3002 37.88985) #> 6 CA 82868 Alhambra POINT (-118.1355 34.08398) #> 7 CA 52176 Aliso Viejo POINT (-117.7289 33.57922) #> 8 CA 14696 Alpine POINT (-116.7585 32.84388) #> 9 CA 42846 Altadena POINT (-118.1356 34.19342) #> 10 CA 12042 Alum Rock POINT (-121.8239 37.3694)
You can also consider finding only the places in the US with more than 1,000,000 people.
arc_select( flayer, where = "POPULATION > 1000000", fields = c("STATE_ABBR", "POPULATION", "NAME") )
#> Simple feature collection with 10 features and 3 fields #> Geometry type: POINT #> Dimension: XY #> Bounding box: xmin: -121.8864 ymin: 29.42354 xmax: -74.01013 ymax: 41.75649 #> Geodetic CRS: WGS 84 #> STATE_ABBR POPULATION NAME geometry #> 1 AZ 1608139 Phoenix POINT (-112.0739 33.44611) #> 2 CA 3898747 Los Angeles POINT (-118.2706 34.05279) #> 3 CA 1386932 San Diego POINT (-117.1456 32.72033) #> 4 CA 1013240 San Jose POINT (-121.8864 37.33941) #> 5 IL 2746388 Chicago POINT (-87.64715 41.75649) #> 6 NY 8804190 New York POINT (-74.01013 40.71057) #> 7 PA 1603797 Philadelphia POINT (-75.16099 39.95136) #> 8 TX 1304379 Dallas POINT (-96.79576 32.77865) #> 9 TX 2304580 Houston POINT (-95.36751 29.75876) #> 10 TX 1434625 San Antonio POINT (-98.4925 29.42354)
Now try combining both where clauses using and
to find only the cities in California with a population greater than 1,000,000.
arc_select( flayer, where = "POPULATION > 1000000 and STATE_ABBR = 'CA'", fields = c("STATE_ABBR", "POPULATION", "NAME") )
#> Simple feature collection with 3 features and 3 fields #> Geometry type: POINT #> Dimension: XY #> Bounding box: xmin: -121.8864 ymin: 32.72033 xmax: -117.1456 ymax: 37.33941 #> Geodetic CRS: WGS 84 #> STATE_ABBR POPULATION NAME geometry #> 1 CA 3898747 Los Angeles POINT (-118.2706 34.05279) #> 2 CA 1386932 San Diego POINT (-117.1456 32.72033) #> 3 CA 1013240 San Jose POINT (-121.8864 37.33941)
Map and Feature Servers
This example has only illustrated how to work with FeatureLayer
objects. However, often times you may wish to work with a collection of layers in a FeatureServer
, MapServer
, or GroupLayer
. All of these are collections of multiple layers. Like a FeatureLayer
, these are accessed with arc_open()
.
furl <- "https://services3.arcgis.com/ZvidGQkLaDJxRSJ2/arcgis/rest/services/PLACES_LocalData_for_BetterHealth/FeatureServer" fsrv <- arc_open(furl) fsrv
#> <FeatureServer <5 layers, 0 tables>> #> CRS: 3785 #> Capabilities: Query,Extract #> 0: PlacePoints (esriGeometryPoint) #> 1: PlaceBoundaries (esriGeometryPolygon) #> 2: Counties (esriGeometryPolygon) #> 3: Tracts (esriGeometryPolygon) #> 4: ZCTAs (esriGeometryPolygon)
This FeatureServer
contains 5 layers. The individual layers can be fetched using get_layer()
which lets us specify the layer by ID or by name. It is recommended to use the ID as that will be less prone to human error (for example a space is secretly a tab). The result of the function is a FeatureLayer
object that can be used with arc_select()
as illustrated above.
get_layer(fsrv, id = 2)
#> <FeatureLayer> #> Name: Counties #> Geometry Type: esriGeometryPolygon #> CRS: 3785 #> Capabilities: Query,Extract
Some FeatureServer
s will also contain tables.
furl <- "https://services.arcgis.com/P3ePLMYs2RVChkJx/arcgis/rest/services/USA_Wetlands/FeatureServer" fsrv2 <- arc_open(furl) fsrv2
#> <FeatureServer <1 layer, 1 table>> #> CRS: 3857 #> Capabilities: Query,Extract #> 0: USA_Wetlands (esriGeometryPolygon) #> 1: Pop_Up_Table (Table)
This can be fetched using get_layer()
as well.
get_layer(fsrv2, 1)
#> <Table> #> Name: Pop_Up_Table #> Capabilities: Query,Extract
If you would like to fetch multiple items at one time there is a plural get_layers()
which will fetch multiple items based on name or id and return a list.
get_layers(fsrv, id = c(0, 2, 4))
#> [[1]] #> <FeatureLayer> #> Name: PlacePoints #> Geometry Type: esriGeometryPoint #> CRS: 3785 #> Capabilities: Query,Extract #> #> [[2]] #> <FeatureLayer> #> Name: Counties #> Geometry Type: esriGeometryPolygon #> CRS: 3785 #> Capabilities: Query,Extract #> #> [[3]] #> <FeatureLayer> #> Name: ZCTAs #> Geometry Type: esriGeometryPolygon #> CRS: 3785 #> Capabilities: Query,Extract
There is also a helper get_all_layers()
to fetch all of layers of a FeatureServer
, MapServer
, or GroupLayer
into a list. The list has two elements layers
and tables
. The former containing all of the FeatureLayer
s and the latter containing all of the Table
s in the FeatureServer
.
get_all_layers(fsrv2)
#> $layers #> $layers$`0` #> <FeatureLayer> #> Name: USA_Wetlands #> Geometry Type: esriGeometryPolygon #> CRS: 3857 #> Capabilities: Query,Extract #> #> #> $tables #> $tables$`1` #> <Table> #> Name: Pop_Up_Table #> Capabilities: Query,Extract