Install and set up
ArcGIS GeoAnalytics Engine can be installed on a personal computer, a standalone Spark cluster, or a managed Spark service in the cloud. If you have a GeoAnalytics Engine subscription with a username and password, you can download the ArcGIS GeoAnalytics Engine distribution here after signing in. If you have a license file, follow the instructions provided with your license file to download the GeoAnalytics Engine distribution.
The ArcGIS GeoAnalytics Engine 1.0.0 distribution includes the following files and directories:
geoanalytics_ - ArcGIS GeoAnalytics Engine plugin for Apache Spark.
geoanalytics-1.0.0.zip - ArcGIS GeoAnalytics Engine Python distribution in zip format.
geoanalytics-1.0.0-py3-none-any.whl - ArcGIS GeoAnalytics Engine Python distribution in wheel format.
help/samples/ - Sample notebooks with example workflows using GeoAnalytics Engine.
help/doc/ - Documentation for offline users.
License - Copyright information, licenses, and user agreements.
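As a quick sanity check after extracting the download, the following sketch reports which of the items listed above are missing from a directory. It is only an illustration: the function name `missing_items` and the directory you pass in are hypothetical, and because the plugin file name is not given in full here, the plugin is matched by its `geoanalytics_` prefix only.

```python
from pathlib import Path

# Items listed for the 1.0.0 distribution. The plugin file name is not
# given in full, so it is matched by prefix (an assumption of this sketch).
EXPECTED = [
    "geoanalytics_*",
    "geoanalytics-1.0.0.zip",
    "geoanalytics-1.0.0-py3-none-any.whl",
    "help/samples",
    "help/doc",
    "License",
]


def missing_items(dist_dir):
    """Return the expected entries that are not present under dist_dir."""
    root = Path(dist_dir)
    # Path.glob yields nothing when no file or directory matches the pattern.
    return [pattern for pattern in EXPECTED if not any(root.glob(pattern))]
```

Calling `missing_items` on the extracted directory returns an empty list when every expected item is present.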
You can also install supplementary projection data with ArcGIS GeoAnalytics Engine. For more information, see Coordinate systems and transformations and the README included with the ArcGIS GeoAnalytics Engine Projection Engine Data distribution.
GeoAnalytics Engine must be authorized before you run any tool or function. For more information, see Licensing and Authorization.
Apache Spark supports a local deployment mode that is useful for testing in a shell or notebook before using resources on a larger Spark cluster. This deployment mode lets you run PySpark code using your personal computer's resources as a single-node cluster.
See this guide for instructions on using GeoAnalytics Engine in Spark local mode.
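To make the local-mode idea concrete, here is a minimal sketch of how the Spark settings for such a session might be assembled. Treat everything specific as an assumption to check against the local-mode guide: the helper names `local_mode_conf` and `start_session`, the directory and plugin file name you pass in, and the plugin class name `com.esri.geoanalytics.Plugin`. Only `spark.master`, `spark.jars`, `spark.submit.pyFiles`, and `spark.plugins` are standard Spark configuration keys.

```python
from pathlib import Path


def local_mode_conf(dist_dir, plugin_file, master="local[*]"):
    """Build Spark settings for a local-mode session.

    dist_dir and plugin_file are placeholders for wherever you extracted
    the distribution and whatever the plugin file is called there.
    """
    dist = Path(dist_dir)
    return {
        # local[*] runs Spark on this machine using all available cores.
        "spark.master": master,
        "spark.jars": str(dist / plugin_file),
        "spark.submit.pyFiles": str(dist / "geoanalytics-1.0.0.zip"),
        # Assumed plugin class name; confirm it in the local-mode guide.
        "spark.plugins": "com.esri.geoanalytics.Plugin",
    }


def start_session(conf):
    """Create a SparkSession from the settings above (requires PySpark)."""
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("geoanalytics-local")
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Keeping the settings in a plain dictionary makes it easy to inspect them before starting Spark, and the same dictionary can be reused with a different `master` value when moving to a cluster.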
For large datasets, a cluster or managed Spark service lets you scale out compute resources beyond what a single machine can provide. Spark cluster mode allows you to configure Apache Spark on any number of nodes in a cluster of machines that you deploy. See this guide for instructions on using GeoAnalytics Engine in Spark cluster mode.
GeoAnalytics Engine supports use with the following managed Spark services:
Within each service you can deploy customized Spark clusters and PySpark notebooks. The advantages of deploying a Spark cluster in the cloud include a small startup cost, the ability to deploy and shut down resources quickly, and the option to scale up or scale down resources as needed.
GeoAnalytics Engine extends Spark and thus requires Spark and its dependencies to be installed prior to using the API. The table below summarizes which versions of Spark and its dependencies are supported by each version of GeoAnalytics Engine.
Support for new versions of Spark or its dependencies may be added with any minor release, while support for older versions may be dropped with any major release. For more information, see Versioning policy. Managed Spark services hosted in the cloud are often preconfigured with Spark dependencies and ready to use. See the install guide for each cloud provider for the list of runtimes supported by GeoAnalytics Engine.
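The release rule above can be read as a simple check on version numbers. The sketch below assumes versions follow a MAJOR.MINOR.PATCH pattern like 1.0.0; the function names are hypothetical, and any version other than 1.0.0 used with them is purely illustrative, not a statement about actual releases.

```python
def parse_version(text):
    """Split a 'MAJOR.MINOR.PATCH' string into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def may_drop_support(current, target):
    """True if the upgrade crosses a major release.

    Per the policy, that is the only point at which support for older
    Spark or dependency versions may be dropped.
    """
    return parse_version(target)[0] > parse_version(current)[0]


def may_add_support(current, target):
    """True if the upgrade includes at least one minor (or major) release.

    Per the policy, that is when support for newer versions may appear.
    """
    return parse_version(target)[:2] > parse_version(current)[:2]
```

Under this reading, an upgrade that stays within the same major version should not remove support for the Spark version you already run, although the supported-versions table remains the authoritative source.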