
Run Python Script

The Run Python Script task executes a Python script on your ArcGIS GeoAnalytics Server site. In the script, you can create an analysis pipeline by chaining together multiple GeoAnalytics Tools without writing intermediate results to a data store. You can also use other Python functionality in the script that can be distributed across your GeoAnalytics Server.

For example, suppose that each week you receive a new dataset of vehicle locations containing billions of point features. Each time you receive a new dataset, you must perform the same workflow involving multiple GeoAnalytics Tools to create an information product that you share within your organization. This workflow creates several large intermediate layers that take up lots of space in your data store. By scripting this workflow in Python and executing the code in the Run Python Script task, you can avoid creating these unnecessary intermediate layers, while simplifying the steps to create the information product.

When you use Run Python Script, the Python code is executed on your GeoAnalytics Server. The script runs in the Python 3.6 environment that is installed with GeoAnalytics Server, and all console output is returned as job messages. Some Python modules can be used in your script to execute code across multiple cores of one or more machines in your GeoAnalytics Server using Spark 2.2.0 (the compute platform that distributes analysis for GeoAnalytics Tools).

A geoanalytics module is available and allows you to run GeoAnalytics Tools in the script. This module is imported automatically when you use Run Python Script. To learn more, see Using GeoAnalytics Tools in Run Python Script.

To interact directly with Spark in the Run Python Script task, use the pyspark module, which is imported automatically when you run the task. The pyspark module is the Python API for Spark and provides a collection of distributed analysis tools for data management, clustering, regression, and more that can be called in Run Python Script and run across your GeoAnalytics Server.

For examples demonstrating how to use the geoanalytics and pyspark packages, see Examples: Scripting custom analysis with the Run Python Script task.

When using the geoanalytics and pyspark packages, most functions return analysis results in memory as Spark DataFrames. These DataFrames can be written to a data store or used elsewhere in the script. This allows you to chain together multiple geoanalytics and pyspark tools while writing out only the final result to a data store, eliminating the need to create any intermediate result layers. To learn more, see Reading and writing layers in pyspark.
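As a sketch of this pattern, the pythonScript below chains the project tool (mentioned later on this page under the context parameter) into a second tool and persists only the final DataFrame. Because the full script must be submitted as a string, it is built as one here. The aggregate_points tool name, all parameter names, and the "webgis" writer format are illustrative assumptions; consult the GeoAnalytics Tools reference for exact signatures.

```python
# Hypothetical pythonScript body, built as the string you would submit.
# Tool signatures and the "webgis" writer format are assumptions.
python_script = """\
# layers and geoanalytics are provided automatically by Run Python Script.
projected = geoanalytics.project(input_layer=layers[0],
                                 output_coord_system=3857)
bins = geoanalytics.aggregate_points(point_layer=projected,
                                     bin_type="Hexagon",
                                     bin_size=1,
                                     bin_size_unit="Kilometers")
# Only the final result reaches the data store; 'projected' stays in memory.
bins.write.format("webgis").save("VehicleDensity")
"""
```

Note that the intermediate result `projected` is never written anywhere; it exists only as an in-memory DataFrame passed to the next tool.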

For advanced users, an instance of SparkContext is instantiated automatically as sc and can be used in the script to interact with Spark. This allows for the execution of custom distributed analysis across your GeoAnalytics Server.

It is recommended that you use an integrated development environment (IDE) to write your Python script, and copy the script text into the Run Python Script tool. This makes it easier to identify syntax errors and typos prior to running your script. It is also recommended that you run your script using a small subset of the input data first to verify that there are no logic errors or exceptions. You can use the Describe Dataset task to create a sample layer for this purpose.

Note:
The user running the Run Python Script task must have the administrative privilege to publish web tools.

Note:

When ArcGIS GeoAnalytics Server is installed on Linux, additional configuration steps are required prior to using the Run Python Script task. These steps are not required in Windows environments. To use Run Python Script on Linux, install and configure Python 3.6 for Linux on each machine in your GeoAnalytics Server site, ensuring that Python is installed into the same directory on each machine. Then, update the ArcGIS Server Properties on your GeoAnalytics Server site with the pysparkPython property. The value of this property should be the path to the Python executable on your GeoAnalytics server machines, for example {"pysparkPython":"/usr/bin/python"}.

Request URL

https://<analysis url>/RunPythonScript/submitJob

Request parameters


pythonScript

(Required)

The Python script that will run on your GeoAnalytics Server. This must be the full script as a string.

The layers provided in inputLayers can be accessed in the script using the layers object. To learn more, see Reading and writing layers in pyspark.

GeoAnalytics Tools can be accessed with the geoanalytics object, which is instantiated in the script environment automatically. To learn more, see Using GeoAnalytics Tools in Run Python Script.

For a collection of example scripts, see Examples: Scripting custom analysis with the Run Python Script task.

REST web example:

  • print("Hello world!")

REST scripting example:

  • "print(\"Hello world!\")"
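Rather than escaping quotes by hand, you can let json.dumps produce the escaped string form of the pythonScript parameter:

```python
import json

# json.dumps yields the JSON-escaped string to send as pythonScript.
script = 'print("Hello world!")'
print(json.dumps(script))  # → "print(\"Hello world!\")"
```

This is especially useful for multi-line scripts, where manual escaping is error-prone.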

inputLayers

A list of input layers that will be used in the Python script. Each input layer follows the same formatting as described in the Feature Input topic. This can be one of the following:

  • A URL to a feature service layer with an optional filter to select specific features
  • A URL to a big data catalog service layer with an optional filter to select specific features
  • A feature collection

In the REST web example for inputLayers shown below, two layers are used in the analysis. The layers provided can be accessed in the script using the layers object. The layer at index 0 will be filtered to only use features where OID > 2.

REST web example for inputLayers

[
   {
      "url":"https://myportal.domain.com/server/rest/services/Hosted/hurricaneTrack/FeatureServer/0",
      "filter":"OID > 2"
   },
   {
      "url":"https://myportal.domain.com/server/rest/services/Hosted/weatherPoints/FeatureServer/0"
   }
]

context

This parameter is not used by the Run Python Script tool.

  • To control the output data store, use the "dataStore" option when writing DataFrames.
  • To set the processing or output spatial reference, use the project tool in the geoanalytics package.
  • To filter a layer when converting it to a DataFrame, use the "where" or "fields" option when loading the layer's URL.
  • To limit the extent of a layer when converting it to a DataFrame, use the "extent" option when loading the layer's URL.
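As a sketch, the load options above might be applied when reading a layer URL inside pythonScript. The fragment is kept as a string because it only runs inside the Run Python Script environment; the "webgis" data source name and the availability of a SparkSession named spark are assumptions (see Reading and writing layers in pyspark), and the field names are illustrative.

```python
# Hypothetical pythonScript fragment. The "where", "fields", and
# "extent" option names come from the list above; "webgis" and the
# field names are assumptions.
snippet = """\
df = (spark.read.format("webgis")
      .option("where", "OID > 2")
      .option("fields", "OID, TrackID")
      .load("https://myportal.domain.com/server/rest/services/Hosted/hurricaneTrack/FeatureServer/0"))
"""
```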

f

The response format. The default response format is html.

Values: html | json

Response

When you submit a request, the service assigns a unique job ID for the transaction.

Syntax:
{
"jobId": "<unique job identifier>",
"jobStatus": "<job status>"
}

After the initial request is submitted, you can use jobId to periodically check the status of the job and its messages, as described in Checking job status. Once the job has successfully completed, use jobId to retrieve the results. To track the status, make a request of the following form:

https://<analysis url>/RunPythonScript/jobs/<jobId>
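For illustration, the submit and status URLs can be assembled with the standard library. The analysis endpoint below is an assumed placeholder (substitute your own analysis URL), the jobId is a made-up example, and an authentication token must be added before actually sending the request with urllib.request or a similar client.

```python
import urllib.parse

# Assumed analysis endpoint -- substitute your own <analysis url>.
analysis_url = "https://myportal.domain.com/server/rest/services/System/GeoAnalyticsTools/GPServer"

# Form-encoded POST body for submitJob (add your token before sending).
params = {"pythonScript": 'print("Hello world!")', "f": "json"}
body = urllib.parse.urlencode(params)

submit_url = analysis_url + "/RunPythonScript/submitJob"

# submitJob responds with {"jobId": ..., "jobStatus": ...}; poll with:
job_id = "j4fa1db2338f042a19eb68856afabc27e"  # example jobId, not real
status_url = "{0}/RunPythonScript/jobs/{1}".format(analysis_url, job_id)
print(status_url)
```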

Any Python console output will be returned as an informative job message. In the following example, "Hello World!" is printed to the console in pythonScript and a job message containing the print statement is returned as shown:

{
   "type": "esriJobMessageTypeInformative",
   "description": "{\"messageCode\":\"BD_101138\",\"message\":\"[Python] Hello World!\",\"params\":{\"text\":\"Hello World!\"}}"
}
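The description field of such a message is itself a JSON string, so the original printed text can be recovered by decoding it, as in this sketch:

```python
import json

# Job message as returned above; the description holds nested JSON.
message = {
    "type": "esriJobMessageTypeInformative",
    "description": "{\"messageCode\":\"BD_101138\",\"message\":\"[Python] Hello World!\",\"params\":{\"text\":\"Hello World!\"}}"
}
detail = json.loads(message["description"])
print(detail["params"]["text"])  # → Hello World!
```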

Access results

All results written to ArcGIS Enterprise are available in your portal contents.