Persistence, stores and visualisation

atomRDF graphs are backed by RDFLib stores, so the same KnowledgeGraph API can target an in-memory store (the default), an on-disk SQL-backed store, or a high-performance Oxigraph store.

This notebook covers:

  1. Writing and re-loading a graph in Turtle / JSON-LD.

  2. Choosing a store at construction time.

  3. Visualising the graph in the notebook.

from atomrdf import KnowledgeGraph
import atomrdf.build as build

1. In-memory graph (the default)

kg = KnowledgeGraph()
_ = build.bulk("Fe", cubic=True, graph=kg)
_ = build.bulk("Cu", cubic=True, graph=kg)
kg.n_samples

Write the graph to disk and round-trip it:

kg.write("demo.ttl", format="ttl")
kg2 = KnowledgeGraph(graph_file="demo.ttl")
kg2.n_samples

JSON-LD also works, which is often easier to consume from web tooling:

kg.write("demo.jsonld", format="json-ld")
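
Since JSON-LD is ordinary JSON, it can be inspected without any RDF library. The sketch below counts nodes per `@type` using only the standard library. The inline document and its IRIs are hypothetical, and the exact layout of the file atomRDF writes may differ (for example, it may be wrapped in an `@graph` key), so treat this as an illustration of the idea rather than a description of the output.

```python
import json
from collections import Counter

# Hypothetical JSON-LD snippet; the IRIs are invented for illustration.
doc = json.loads("""
[
  {"@id": "http://example.org/sample_1",
   "@type": ["http://example.org/AtomicScaleSample"]},
  {"@id": "http://example.org/sample_2",
   "@type": ["http://example.org/AtomicScaleSample"]},
  {"@id": "http://example.org/cell_1",
   "@type": ["http://example.org/UnitCell"]}
]
""")


def count_types(document):
    """Count how many nodes carry each @type value."""
    # Accept either a top-level list of nodes or a {"@graph": [...]} wrapper.
    nodes = document.get("@graph", []) if isinstance(document, dict) else document
    counts = Counter()
    for node in nodes:
        types = node.get("@type", [])
        if isinstance(types, str):  # a single type may be a bare string
            types = [types]
        counts.update(types)
    return counts


type_counts = count_types(doc)
```

To apply this to a file on disk, replace the inline string with `json.load(open("demo.jsonld"))`.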

2. Picking a different store

atomRDF ships connectors for three stores. They are selected via the store argument:

from atomrdf import KnowledgeGraph

# Default in-process memory store:
kg = KnowledgeGraph(store="Memory")

# SQLAlchemy-backed (requires `pip install "atomrdf[sqlalchemy]"`):
kg = KnowledgeGraph(store="db", store_file="atomrdf.db")

# Oxigraph (requires `pip install "atomrdf[oxigraph]"`):
kg = KnowledgeGraph(store="Oxigraph", store_file="oxidir/")

All three stores expose the same Python API; pick the in-memory store for quick experiments and one of the persistent stores for long-running projects.
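
Because the persistent stores depend on optional extras, a script that should run everywhere may want to fall back to the in-memory store when an extra is missing. The helper below is a hypothetical sketch, not part of atomRDF. The module names in `OPTIONAL_BACKENDS` are assumptions about which packages the extras install; adjust them to match your environment.

```python
from importlib.util import find_spec

# Assumption: import names of the packages behind the optional extras.
OPTIONAL_BACKENDS = {
    "Oxigraph": "oxrdflib",
    "db": "rdflib_sqlalchemy",
}


def pick_store(preferred, backends=OPTIONAL_BACKENDS):
    """Return `preferred` if its backing module is importable, else "Memory"."""
    module_name = backends.get(preferred)
    # find_spec returns None for a top-level module that is not installed.
    if module_name is not None and find_spec(module_name) is not None:
        return preferred
    return "Memory"
```

The returned name can then be passed as the store argument, together with a store_file when one of the persistent stores is selected.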

3. Visualising the graph

kg.visualise() renders the RDF graph inline using Graphviz. Passing hide_types=True collapses the rdf:type edges so that only the data relationships remain.

kg.visualise(hide_types=True, size=(40, 25))

For very large graphs the inline view can become unwieldy; in that case export to a stand-alone HTML / SVG file and open it in a browser.