GeoAnalytics Engine supports geocoding and network analysis tools. To use these tools, you need to set up the required components described below.
The geocoding tools require a locator and the network analysis tools require a network dataset. The locator or network dataset must be locally accessible to all nodes in your Spark cluster. There are two supported workflows for setting up these files to run geocoding or network analysis tools:
- You can first upload the locator or network dataset to a cloud storage service such as Amazon S3 and then mount or copy it to each node's local file system. The target location on each node needs enough disk space to hold the locator or network dataset. Note that the files must be copied to each node when the Spark cluster is initialized.
- In GeoAnalytics Engine 1.5.x and above, you can also load the locator or network dataset using SparkContext.addFile. This allows you to distribute files across the cluster and access them on each node. Note that you can add files using SparkContext.addFile after the Spark cluster is initialized.
See the section below for examples demonstrating these approaches.
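The disk space requirement of the first workflow can be checked before any files are copied. The following is a minimal sketch using only the Python standard library; the function names and the `headroom` safety factor are hypothetical and are not part of GeoAnalytics Engine.

```python
import os
import shutil


def dataset_size_bytes(path):
    """Return the size of a file, or the total size of all files under a directory."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def has_room_for(source, target_dir, headroom=1.1):
    """Check that the file system holding target_dir can store source.

    headroom is an assumed safety factor, not an Esri recommendation.
    """
    free = shutil.disk_usage(target_dir).free
    return free >= dataset_size_bytes(source) * headroom


# Example call, assuming a locator mounted at /dbfs/mnt/locators
# and /databricks as the node-local target:
#   has_room_for("/dbfs/mnt/locators", "/databricks")
```

The check only describes the node on which it runs, so it would need to run on every node (for example, as part of node initialization) to be meaningful for the whole cluster.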
Examples
Setting up the locator or network dataset using an init script:
- Upload the locator or network dataset to a cloud file system like Azure Blob Storage.
- Install GeoAnalytics Engine on Databricks.
- In a notebook, mount the locator or network dataset to DBFS using the dbutils.fs.mount command.
- Update the cluster-scoped init script to copy files from the mounted location to /databricks/:

      cp -r /dbfs/mnt/locators/. /databricks/locators/
      cp -r /dbfs/mnt/network_datasets/. /databricks/network_datasets/
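When several datasets need to be copied, the init script text can be generated rather than written by hand. The sketch below is one possible approach: `build_init_script` is a hypothetical helper, and the `set -e` and `mkdir -p` lines are assumptions added so that the script stops on errors and the target directories exist before copying.

```python
import shlex


def build_init_script(copies):
    """Return the text of a bash init script that copies each source to its target.

    copies maps a mounted source directory to a node-local target directory.
    """
    lines = ["#!/bin/bash", "set -e"]
    for source, target in copies.items():
        # "/." copies the directory's contents rather than the directory itself.
        src = shlex.quote(source.rstrip("/") + "/.")
        dst = shlex.quote(target.rstrip("/") + "/")
        lines.append(f"mkdir -p {dst}")
        lines.append(f"cp -r {src} {dst}")
    return "\n".join(lines) + "\n"


script = build_init_script({
    "/dbfs/mnt/locators": "/databricks/locators",
    "/dbfs/mnt/network_datasets": "/databricks/network_datasets",
})
print(script)
```

How the generated text is saved and registered as a cluster-scoped init script depends on your Databricks workspace and is not shown here.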
Setting up the locator or network dataset using SparkContext.addFile:
- Install GeoAnalytics Engine on Databricks.
- In a notebook cell, load the locator or network dataset using SparkContext.addFile:

      sc.addFile("s3://data/example.mmpk")

- Run the geocoding or network analysis tools with the locator or network dataset file (Python):

      result = CreateServiceAreas() \
          .setNetwork("example.mmpk") \
          .setCutoffs(5, "minutes") \
          .run(facilities)
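In the example above, the file is added by its full URI but passed to the tool by its bare file name. The following sketch wraps that pattern in small helpers. The helper names are hypothetical; `SparkContext.addFile` and `SparkFiles.get` are standard PySpark calls, and `SparkFiles.get` is shown only as a way to inspect where a distributed file landed on a node.

```python
import posixpath
from urllib.parse import urlparse


def distributed_file_name(uri):
    """Return the bare file name of a URI or path passed to addFile."""
    return posixpath.basename(urlparse(uri).path)


def add_dataset(sc, uri):
    """Distribute uri with SparkContext.addFile and return the file's name."""
    sc.addFile(uri)
    return distributed_file_name(uri)


def local_path_on_node(file_name):
    """Resolve the node-local path of a file previously added with addFile."""
    from pyspark import SparkFiles  # imported lazily; requires PySpark
    return SparkFiles.get(file_name)


# Example use in a notebook where sc is the active SparkContext:
#   network = add_dataset(sc, "s3://data/example.mmpk")   # "example.mmpk"
#   result = CreateServiceAreas().setNetwork(network)...
```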
Using StreetMap Premium data
A valid data license is required to use any ArcGIS StreetMap Premium locator or
network dataset with GeoAnalytics Engine tools.
To install a StreetMap Premium data license, save the runtime licensing string to a text file and store the file somewhere accessible to all nodes in your Spark cluster.
In the Spark configuration, set the spark.geoanalytics.smp.license.file property to the path of the file containing the licensing string. This property must be set when starting Spark and cannot be changed after the SparkContext has been created. For example:

    spark.geoanalytics.smp.license.file /data/engine/smp_license.txt
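When Spark is started from your own Python code, the property can be supplied while the session is being built. The sketch below assumes the license file already exists at a path visible to every node; `smp_license_conf` and `start_session` are hypothetical helper names, and the file check only covers the machine on which it runs.

```python
import os

LICENSE_KEY = "spark.geoanalytics.smp.license.file"


def smp_license_conf(license_path):
    """Return the Spark setting for a StreetMap Premium license file.

    Raises if the file is missing or empty on the machine running this check.
    """
    if not os.path.isfile(license_path):
        raise FileNotFoundError(license_path)
    if os.path.getsize(license_path) == 0:
        raise ValueError(f"license file is empty: {license_path}")
    return {LICENSE_KEY: license_path}


def start_session(conf):
    """Start Spark with the given settings applied before the SparkContext exists."""
    from pyspark.sql import SparkSession  # imported lazily; requires PySpark
    builder = SparkSession.builder
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Example use, with the path from the configuration line above:
#   spark = start_session(smp_license_conf("/data/engine/smp_license.txt"))
```

On managed platforms where a SparkContext already exists when a notebook starts, the property likely needs to be set in the cluster's Spark configuration instead, since it cannot be changed once the context has been created.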