Search and Query Knowledge Graphs

There are a couple of different ways to access the data in a knowledge graph. The search method returns results based on a full-text search string. The query method returns results based on an openCypher query string. This guide provides examples of searches and queries that can be performed on a knowledge graph.

Search

Searching the knowledge graph calls the search endpoint, which by default looks for the given string in any property of both entities and relationships. To search the properties of only entities or only relationships, set the category parameter on search to entities, relationships, or both (the default).

knowledge_graph.search("Esri", category="entities")

These results are in the form of a list of lists such as:

[
   [{'_objectType': 'entity',
   '_typeName': 'Company',
   '_id': UUID('f54c49c5-0aff-49ad-b38d-883b1323c709'),
   '_properties': {
      'globalid': UUID('f54c49c5-0aff-49ad-b38d-883b1323c709'),
      'objectid': 2,
      'name': 'Esri'
      }
   }]
]

Search uses Lucene syntax, which allows for more advanced searches such as using a wildcard (*), searching a specific property (name:Esri), and combining terms with boolean operators like AND and OR. For more information about Lucene syntax, see the syntax guide from Apache.
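The Lucene forms mentioned above can be assembled as ordinary strings before they are passed to search. The following is a minimal sketch: the helper names field_term and any_of are hypothetical and are not part of the API, and it covers only the wildcard, property, and boolean forms named in this guide.

```python
def field_term(field, value):
    """Build a Lucene property search such as name:Esri.

    Values containing spaces are wrapped in double quotes so Lucene
    treats them as a single phrase.
    """
    text = str(value)
    if " " in text:
        text = '"' + text.replace('"', '\\"') + '"'
    return f"{field}:{text}"


def any_of(*terms):
    """Join terms with the Lucene OR operator."""
    return " OR ".join(terms)


exact = field_term("name", "Esri")    # search a specific property
prefix = field_term("name", "Es*")    # wildcard on a specific property
either = any_of(field_term("name", "Esri"), field_term("name", "Megan"))

for search_string in (exact, prefix, either):
    print(search_string)

# Each string could then be passed to search, for example:
# knowledge_graph.search(either, category="entities")
```

Building the strings separately makes it easy to print and check them before sending a request.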

Query

Querying the knowledge graph calls the query endpoint, which runs the openCypher query against the graph and returns the results.

knowledge_graph.query("MATCH (n) RETURN n LIMIT 2")

These results are in the form of a list of lists, such as:

[
   [{'_objectType': 'entity',
   '_typeName': 'Person',
   '_id': UUID('33c1915e-2169-4a95-b07a-b141fc684a39'),
   '_properties': {
      'globalid': UUID('33c1915e-2169-4a95-b07a-b141fc684a39'),
      'objectid': 1,
      'name': 'Megan'
      }
   }],
   [{'_objectType': 'entity',
   '_typeName': 'Company',
   '_id': UUID('f54c49c5-0aff-49ad-b38d-883b1323c709'),
   '_properties': {
      'globalid': UUID('f54c49c5-0aff-49ad-b38d-883b1323c709'),
      'objectid': 2,
      'name': 'Esri'
      }
   }]
]
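Because each result row is a list and each entity is a dictionary, the output can be walked with plain Python. The sketch below groups entity names by type. It uses a literal copy of the sample output above in place of a live query, and it assumes each entity has a name property, as the sample entities do.

```python
from collections import defaultdict
from uuid import UUID

# Sample rows copied from the query output shown above.
query_results = [
    [{'_objectType': 'entity',
      '_typeName': 'Person',
      '_id': UUID('33c1915e-2169-4a95-b07a-b141fc684a39'),
      '_properties': {
          'globalid': UUID('33c1915e-2169-4a95-b07a-b141fc684a39'),
          'objectid': 1,
          'name': 'Megan'}}],
    [{'_objectType': 'entity',
      '_typeName': 'Company',
      '_id': UUID('f54c49c5-0aff-49ad-b38d-883b1323c709'),
      '_properties': {
          'globalid': UUID('f54c49c5-0aff-49ad-b38d-883b1323c709'),
          'objectid': 2,
          'name': 'Esri'}}],
]

# Group the name property of every returned entity by its type name.
names_by_type = defaultdict(list)
for row in query_results:
    for item in row:
        if isinstance(item, dict) and item.get('_objectType') == 'entity':
            names_by_type[item['_typeName']].append(item['_properties'].get('name'))

print(dict(names_by_type))
```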

Query Streaming

Query streaming accepts a query string the same way query does, and also accepts these additional parameters:

  • bind_param, which accepts any number of key: value pairs for values that are created outside the query and referenced inside it. Values can be primitive types as well as geometries, lists, and anonymous objects.
  • include_provenance, a boolean parameter that determines whether provenance records are returned as part of the query results.

Another benefit of query streaming is that the resulting records are not limited to the server's query limits. Instead, they are returned in chunks and presented as a generator, which can be used to retrieve all results at once using list() or to step through the records one at a time using next().
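Since the results arrive as a generator, a middle ground between list() and next() is to consume them in fixed-size batches. The sketch below shows one way to do that with itertools.islice. The in_batches helper and the fake_stream generator are hypothetical; fake_stream only stands in for the generator that query_streaming would return.

```python
from itertools import islice


def in_batches(records, size):
    """Yield lists of up to `size` records from any iterator or generator."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def fake_stream():
    """Stand-in for a query_streaming result: yields one row at a time."""
    for name in ['Megan', 'Emma', 'Cameron', 'Noah', 'Ava']:
        yield [name]


# In practice the first argument would be the generator returned by
# knowledge_graph.query_streaming(...).
batches = list(in_batches(fake_stream(), 2))
for batch in batches:
    print(len(batch), batch)
```

Batching keeps only a few records in memory at a time, which can help when a query returns many rows.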

# using bind parameters in queries

# list example
query_list = ['Megan', 'Emma', 'Cameron', 'Noah']
results = knowledge_graph.query_streaming("MATCH (p:Person) WHERE p.name IN $list RETURN p", bind_param={"list": query_list})

# anonymous object example
query_obj = {"props": {"name": "Megan"}, "list": ['Emma', 'Cameron', 'Noah']}
results = knowledge_graph.query_streaming("MATCH (n:Person)-[:FriendsWith]-(e:Person) WHERE n.name = $object.props.name AND e.name in $object.list RETURN n, e", bind_param={"object": query_obj})

The output of each of these queries is a generator, so it needs to be handled slightly differently from the regular query output.

# handling results - get all results
for result in list(results):
    # do something with each result
    print(result)

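# handling results - get results one at a time
# (a generator can only be consumed once, so run the query again if list() was already called on it)
next(results)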

# or loop through all results using next
while True:
    try:
        # do something with each result
        print(next(results))
    except StopIteration:
        break

# including provenance in query results
results = knowledge_graph.query_streaming("MATCH (n:Provenance) RETURN n LIMIT 1", include_provenance=True)
list(results)

The result of this query would look similar to:

[
    [
        {'_objectType': 'entity',
        '_typeName': 'Provenance',
        '_id': UUID('1794b6b2-4d91-48ad-b51f-5dfb80e58c01'),
        '_properties': {'instanceID': UUID('3e16d8fe-7f68-45ef-805a-a54d78995411'),
            'propertyName': 'name',
            'sourceType': 'String',
            'typeName': 'Document',
            'globalid': UUID('1794b6b2-4d91-48ad-b51f-5dfb80e58c01'),
            'sourceName': 'MySourceName',
            'source': 'MySource',
            'objectid': 2
            }
        }
    ]
]

Using Results

Query responses can get much more complex depending on what the query returns. openCypher allows many different types of return values, including entities, relationships, properties, anonymous objects, lists, and more. Returning the response as a list of lists, with one inner list per result row, keeps the structure predictable whatever the query returns.
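When a row mixes several kinds of values, it can help to check what each value is before using it. The following is a minimal sketch with hand-written sample data rather than real query output. It assumes that entities and relationships arrive as dictionaries carrying an _objectType key (as in the examples in this guide), and that anonymous objects and lists arrive as plain Python dictionaries and lists. The describe helper is hypothetical.

```python
def describe(value):
    """Return a short label for one value in a result row."""
    if isinstance(value, dict) and '_objectType' in value:
        # Entities and relationships carry their kind and type name.
        return f"{value['_objectType']}:{value.get('_typeName')}"
    if isinstance(value, dict):
        return 'anonymous object'
    if isinstance(value, (list, tuple)):
        return 'list'
    return type(value).__name__


# Hypothetical row, shaped like the result of a query that returns an
# entity, one of its properties, a list, and an anonymous object.
sample_rows = [
    [
        {'_objectType': 'entity', '_typeName': 'Person',
         '_properties': {'name': 'Megan'}},
        'Megan',
        ['Emma', 'Noah'],
        {'count': 2},
    ],
]

summary = [[describe(value) for value in row] for row in sample_rows]
print(summary)
```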

If the returned entities have a shape (that is, they are spatial), it can be useful to view the query results on a map. To do so, create a data frame from the properties in the results and spatially enable that data frame using the shape field so it can be plotted on a map.

import pandas as pd
# importing GeoAccessor registers the .spatial accessor used on the data frame below
from arcgis.features import GeoAccessor, GeoSeriesAccessor

# openCypher query matches all entities of a single type (OneType is a placeholder);
# this example assumes every returned entity is spatial
query_results = knowledge_graph.query("MATCH (n:OneType) RETURN n")

# create a list of all properties of the type to use as columns of our data frame
props_list = list(query_results[0][0]['_properties'])

# iterate through the results of the query, writing those results to a list to be used in the data frame
results_list = []
for result in query_results:
    single_result = []
    # write each property value to a list
    for prop in props_list:
        single_result.append(result[0]['_properties'][prop])
    # append the list of properties to the data list
    results_list.append(single_result)

# create a data frame that holds all properties of the returned entities
obs_df = pd.DataFrame(data=results_list, columns=props_list)
# set the spatial column to shape
obs_df.spatial.set_geometry('shape')

# create a map of the results (gis is assumed to be an existing GIS connection)
new_map = gis.map()
new_map.basemap = 'gray-vector'
obs_df.spatial.plot(map_widget=new_map, renderer_type='s', marker_size=5, symbol_type='simple', colors=[252,226,5,90], outline_color=[0,0,0,90], line_width=0.5)
new_map
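The loop above takes its column names from the first result, so it would raise a KeyError if a later entity lacked one of those properties. The sketch below is one tolerant alternative that collects the union of property names and fills gaps with None. The to_table helper is hypothetical, and the sample rows use placeholder dictionaries for shape rather than real geometry objects.

```python
def to_table(rows, column=0):
    """Return (columns, records) for the entity found at `column` of each row.

    Columns are the union of all property names in first-seen order, and
    properties missing from an entity are filled with None.
    """
    entities = [row[column] for row in rows]
    columns = []
    for entity in entities:
        for key in entity['_properties']:
            if key not in columns:
                columns.append(key)
    records = [
        [entity['_properties'].get(key) for key in columns]
        for entity in entities
    ]
    return columns, records


# Hypothetical rows; the second entity has no name property.
sample = [
    [{'_properties': {'objectid': 1, 'name': 'A', 'shape': {'x': 1.0, 'y': 2.0}}}],
    [{'_properties': {'objectid': 2, 'shape': {'x': 3.0, 'y': 4.0}}}],
]

columns, records = to_table(sample)
print(columns)
print(records)
```

The returned values could then be passed to pd.DataFrame(data=records, columns=columns) in place of results_list and props_list.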
