A file geodatabase is an Esri geospatial data format that stores and manages spatial and nonspatial data. It can store various types of geographic data, including nonspatial tables, feature classes, feature datasets, and raster datasets.
GeoAnalytics for Microsoft Fabric supports loading tables and feature classes of point, multipoint, line, and polygon geometries.
After loading the file geodatabase into a Spark DataFrame, you can analyze and visualize the data using the SQL functions and tools available in GeoAnalytics for Microsoft Fabric, in addition to the functions offered in Spark.
Reading File Geodatabase in Spark
The following example demonstrates how to load a file geodatabase into a Spark DataFrame using Python.
# Load the file geodatabase catalog from an S3 bucket
spark.read.format("filegdb").option("gdbPath", "s3a://my-bucket/my-folder/example.gdb").load().show()
When you load the file geodatabase, specify the path of the file geodatabase and the name of the table or feature class using the following options.
DataFrameReader option | Example | Description |
---|---|---|
gdbPath | .option("gdbPath", "s3a://my-bucket/my-folder/example.gdb") | The path to the file geodatabase. It is required for loading the file geodatabase. |
gdbName | .option("gdbName", "us_lakes") | The name of the table or feature class in the file geodatabase. |
The name of a table or feature class is unique within a file geodatabase. To load a table or feature class stored in a feature dataset, specify its name with gdbName; you do not need to specify the name or path of the feature dataset.
For example, you can load the us_lakes feature class from the file geodatabase above with the following syntax:
us_lakes = spark.read.format("filegdb").option("gdbPath", "s3a://my-bucket/my-folder/example.gdb").option("gdbName", "us_lakes").load()
If you don't specify gdbName, the complete catalog of the datasets in the file geodatabase will be loaded.
+-------------+-------------+------------+
| Name| DatasetType|GeometryType|
+-------------+-------------+------------+
|ca_population| Table| null|
| ca_parks|Feature Class| Point|
| us_lakes|Feature Class| Polygon|
| us_rivers|Feature Class| Polyline|
| calls|Feature Class| MultiPoint|
+-------------+-------------+------------+
For example, the output table shown above is the result of loading a file geodatabase without specifying gdbName.
The table lists the dataset name (Name), the dataset type (DatasetType), and the geometry type (GeometryType) for each table and feature class. This output should match the view of the file geodatabase in the ArcGIS Pro Catalog pane.
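
The two loading modes described above (catalog versus a single table or feature class) can be wrapped in a small helper. The following is a minimal sketch, not part of the GeoAnalytics for Microsoft Fabric API: the function names load_filegdb and feature_class_names are hypothetical, and the sketch assumes an existing SparkSession named spark in an environment where the filegdb format is available. It uses only the format name and the gdbPath and gdbName options shown in the examples above.

```python
def load_filegdb(spark, gdb_path, gdb_name=None):
    """Read from a file geodatabase (hypothetical helper, not a library function).

    spark    -- an existing SparkSession where the "filegdb" format is available
    gdb_path -- path to the .gdb folder, passed as the gdbPath option
    gdb_name -- table or feature class name, passed as the gdbName option;
                when omitted, the catalog of datasets is loaded instead
    """
    reader = spark.read.format("filegdb").option("gdbPath", gdb_path)
    if gdb_name is not None:
        reader = reader.option("gdbName", gdb_name)
    return reader.load()


def feature_class_names(catalog_rows):
    """Return the names of catalog rows whose DatasetType is "Feature Class".

    catalog_rows -- any iterable of rows that support row["Name"] and
                    row["DatasetType"], for example the result of collect()
                    on the catalog DataFrame
    """
    return [
        row["Name"]
        for row in catalog_rows
        if row["DatasetType"] == "Feature Class"
    ]
```

With these helpers, load_filegdb(spark, "s3a://my-bucket/my-folder/example.gdb") would return the catalog, and passing "us_lakes" as the third argument would return that feature class. Collecting the catalog on the driver is reasonable here because it holds one row per dataset.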

Usage notes
- GeoAnalytics for Microsoft Fabric loads the date and timestamp offset data types in a table or feature class as a TimestampType column. For the timestamp offset data type, the time offset is applied but is not retained in the TimestampType column after loading into a Spark DataFrame. If the Spark DataFrame contains one TimestampType column, it is automatically set as the time field. If there are multiple TimestampType columns, you can call st.set_time_fields() to enable time.
- GeoAnalytics for Microsoft Fabric loads the date only data type as a DateType column, and maps the time only data type to a string column representing the 24-hour time in the format HH:mm:ss.
- When GeoAnalytics for Microsoft Fabric accesses a file geodatabase, it does not lock the table, feature class, or feature dataset. You can freely edit or modify the file geodatabase with other processes such as ArcGIS Pro.
- When loading the catalog of the file geodatabase, the name of the feature dataset is not included in the catalog table.
- GeoAnalytics for Microsoft Fabric does not support loading mosaic datasets or raster datasets stored in a file geodatabase.
- GeoAnalytics for Microsoft Fabric does not support saving data into file geodatabases.
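
Because time only values arrive as HH:mm:ss strings, you may want to convert them to time objects once the values are on the driver. The following is a minimal sketch in plain Python. The function name parse_time_only is hypothetical and is not part of GeoAnalytics for Microsoft Fabric; the sketch only assumes the 24-hour HH:mm:ss string format described in the notes above.

```python
from datetime import datetime, time


def parse_time_only(value: str) -> time:
    """Parse a 24-hour HH:mm:ss string into a datetime.time (hypothetical helper).

    The Python format codes %H:%M:%S correspond to the HH:mm:ss pattern.
    Raises ValueError if the string does not match that pattern.
    """
    return datetime.strptime(value, "%H:%M:%S").time()


def seconds_since_midnight(value: str) -> int:
    """Convert an HH:mm:ss string to whole seconds elapsed since 00:00:00."""
    t = parse_time_only(value)
    return t.hour * 3600 + t.minute * 60 + t.second
```

For instance, parse_time_only("14:05:30") yields time(14, 5, 30), which can then be compared or sorted like any other time value.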