SPARQL queries against the atomRDF graph#
Every sample, simulation cell and crystallographic property created by atomRDF is a real RDF triple, so anything you can express in SPARQL is available to you. This notebook shows three styles of querying:
1. Raw SPARQL (kg.query("SELECT ...")).
2. Term builder for ontology-aware queries without writing SPARQL (kg.query_sample, kg.query).
3. Returning a sample object and operating on it.
from atomrdf import KnowledgeGraph
import atomrdf.build as build
Build a small heterogeneous database#
kg = KnowledgeGraph()
_ = build.bulk("Fe", cubic=True, graph=kg)
_ = build.bulk("Cu", cubic=True, graph=kg)
_ = build.bulk("Si", cubic=True, graph=kg)
_ = build.bulk("Mg", crystalstructure="hcp", graph=kg)
kg.n_samples
1. Raw SPARQL#
What are the chemical species in the graph?
q = """
PREFIX cmso: <http://purls.helmholtz-metadaten.de/cmso/>
SELECT DISTINCT ?symbol
WHERE {
?species cmso:hasElementSymbol ?symbol .
}
"""
kg.query(q)
Every sample with exactly two atoms in the cell, together with its space group symbol (the query filters on the atom count only, not on the Bravais lattice):
q = """
PREFIX cmso: <http://purls.helmholtz-metadaten.de/cmso/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?sample ?symbol
WHERE {
?sample cmso:hasNumberOfAtoms ?n ;
cmso:hasMaterial ?m .
?m cmso:hasStructure ?s .
?s cmso:hasSpaceGroupSymbol ?symbol .
FILTER (?n = "2"^^xsd:integer)
}
"""
kg.query(q)
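The atom count in the filter above is hard-coded as a typed literal. A minimal sketch of how the same query could be generated for an arbitrary count is given here. The name build_atom_count_query is hypothetical and not part of atomRDF; the prefixes and triple patterns are copied from the query above.

```python
CMSO = "http://purls.helmholtz-metadaten.de/cmso/"
XSD = "http://www.w3.org/2001/XMLSchema#"


def build_atom_count_query(n_atoms: int) -> str:
    """Return a SPARQL query selecting samples with exactly n_atoms atoms.

    Hypothetical helper for illustration; not part of atomRDF.
    """
    # Coerce to int so that only a plain number can end up in the query text.
    n = int(n_atoms)
    # Doubled braces are literal braces inside an f-string.
    return f"""
PREFIX cmso: <{CMSO}>
PREFIX xsd: <{XSD}>
SELECT ?sample ?symbol
WHERE {{
    ?sample cmso:hasNumberOfAtoms ?n ;
            cmso:hasMaterial ?m .
    ?m cmso:hasStructure ?s .
    ?s cmso:hasSpaceGroupSymbol ?symbol .
    FILTER (?n = "{n}"^^xsd:integer)
}}
"""


query = build_atom_count_query(4)
print(query)
```

The resulting string can be passed to kg.query in the same way as the hand-written query.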
2. Term builder (when the ontology network is available)#
kg.terms.cmso.AtomicScaleSample lets you express the same query without typing SPARQL. It requires the ontology network to be reachable when the KnowledgeGraph is constructed. If it is not (e.g. behind a strict firewall), kg.terms will be None and you should fall back to the raw SPARQL form above.
kg.query(
kg.terms.cmso.AtomicScaleSample,
[
kg.terms.cmso.hasSpaceGroupSymbol,
kg.terms.cmso.hasNumberOfAtoms == 2,
],
)
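The fallback described above can be wrapped in one function so that calling code does not need to know which path was taken. This is a sketch under the assumption that kg behaves like the KnowledgeGraph used in this notebook (kg.terms is None when the ontology network was unreachable, and kg.query accepts either a SPARQL string or a term plus a list of conditions). The name query_two_atom_samples is hypothetical.

```python
FALLBACK_QUERY = """
PREFIX cmso: <http://purls.helmholtz-metadaten.de/cmso/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?sample ?symbol
WHERE {
    ?sample cmso:hasNumberOfAtoms ?n ;
            cmso:hasMaterial ?m .
    ?m cmso:hasStructure ?s .
    ?s cmso:hasSpaceGroupSymbol ?symbol .
    FILTER (?n = "2"^^xsd:integer)
}
"""


def query_two_atom_samples(kg):
    """Query two-atom samples, preferring the term builder when available.

    Hypothetical helper for illustration; not part of atomRDF.
    """
    if kg.terms is None:
        # Ontology network was not reachable: use the raw SPARQL form.
        return kg.query(FALLBACK_QUERY)
    cmso = kg.terms.cmso
    # Same call as in the cell above, expressed through the term builder.
    return kg.query(
        cmso.AtomicScaleSample,
        [cmso.hasSpaceGroupSymbol, cmso.hasNumberOfAtoms == 2],
    )
```

With the graph built earlier, query_two_atom_samples(kg) would run whichever form is available.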
3. Return a single sample and write it out#
q = """
PREFIX cmso: <http://purls.helmholtz-metadaten.de/cmso/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?sample
WHERE {
?sample cmso:hasNumberOfAtoms ?n .
FILTER (?n = "2"^^xsd:integer)
}
"""
df = kg.query(q)
df
sample = df['sample'].values[0]
kg.to_file(sample, "selected.poscar", format="vasp")
! head -10 selected.poscar
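Indexing the first row with df['sample'].values[0] raises an IndexError when the query matches nothing. A small guard, sketched below, returns None in that case. It assumes only that the result supports len() and column access with a .values attribute, as the DataFrame used above does; first_sample is a hypothetical name.

```python
def first_sample(df, column="sample"):
    """Return the first entry of `column`, or None if the result is empty.

    Hypothetical helper for illustration; not part of atomRDF.
    """
    if len(df) == 0:
        # No rows matched the query, so there is nothing to write out.
        return None
    return df[column].values[0]
```

A caller can then check the return value before passing it to kg.to_file.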