SPARQL queries against the atomRDF graph

Every sample, simulation cell and crystallographic property created by atomRDF is stored as RDF triples, so anything you can express in SPARQL can be asked of the graph. This notebook shows three styles of querying:

Raw SPARQL (kg.query("SELECT ...")).
Term-builder for ontology-aware queries without writing SPARQL (kg.query_sample, kg.query).
Returning a sample object and operating on it.

from atomrdf import KnowledgeGraph
import atomrdf.build as build

Build a small heterogeneous database

kg = KnowledgeGraph()
_ = build.bulk("Fe", cubic=True, graph=kg)
_ = build.bulk("Cu", cubic=True, graph=kg)
_ = build.bulk("Si", cubic=True, graph=kg)
_ = build.bulk("Mg", crystalstructure="hcp", graph=kg)
kg.n_samples

1. Raw SPARQL

What are the chemical species in the graph?

q = """
PREFIX cmso: <http://purls.helmholtz-metadaten.de/cmso/>
SELECT DISTINCT ?symbol
WHERE {
    ?species cmso:hasElementSymbol ?symbol .
}
"""
kg.query(q)

	symbol

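Every sample with exactly two atoms in the unit cell, together with its space group symbol. The query filters only on the atom count, not on the Bravais lattice, so non-cubic structures are returned as well: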

q = """
PREFIX cmso: <http://purls.helmholtz-metadaten.de/cmso/>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>
SELECT ?sample ?symbol
WHERE {
    ?sample  cmso:hasNumberOfAtoms ?n ;
             cmso:hasMaterial      ?m .
    ?m       cmso:hasStructure     ?s .
    ?s       cmso:hasSpaceGroupSymbol ?symbol .
    FILTER (?n = "2"^^xsd:integer)
}
"""
kg.query(q)

	sample	symbol
0	sample:04478872-1802-45f3-9093-ed60d06de64d	Im-3m
1	sample:a9737877-3034-4e45-893c-4853d410d757	P6_3/mmc
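
If you need the same filter for other cell sizes, the query text can be generated rather than edited by hand. The following is a minimal sketch using only the Python standard library. `atoms_query` is a hypothetical helper, not part of atomRDF; it reproduces the query above with the atom count substituted.

```python
from string import Template

# Query text copied from the example above; only the atom count is a placeholder.
# SPARQL variables use "?", so they do not clash with Template's "$" syntax.
_ATOMS_QUERY = Template("""\
PREFIX cmso: <http://purls.helmholtz-metadaten.de/cmso/>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>
SELECT ?sample ?symbol
WHERE {
    ?sample  cmso:hasNumberOfAtoms ?n ;
             cmso:hasMaterial      ?m .
    ?m       cmso:hasStructure     ?s .
    ?s       cmso:hasSpaceGroupSymbol ?symbol .
    FILTER (?n = "$n_atoms"^^xsd:integer)
}
""")


def atoms_query(n_atoms):
    """Return SPARQL text selecting samples with exactly n_atoms atoms (hypothetical helper)."""
    # Reject bools and non-integers so nothing unexpected is spliced into the query.
    if isinstance(n_atoms, bool) or not isinstance(n_atoms, int) or n_atoms < 1:
        raise ValueError("n_atoms must be a positive integer")
    return _ATOMS_QUERY.substitute(n_atoms=n_atoms)


print(atoms_query(4))
```

The returned string can be passed to kg.query in the same way as the hand-written queries in this section, for example kg.query(atoms_query(4)).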

2. Term builder (when the ontology network is available)

kg.terms.cmso.AtomicScaleSample lets you express the same query without typing SPARQL. It requires the ontology network to be reachable at construction time. If it is not (e.g. behind a strict firewall), kg.terms will be None and you should fall back to the raw SPARQL form above.

kg.query(
    kg.terms.cmso.AtomicScaleSample,
    [
        kg.terms.cmso.hasSpaceGroupSymbol,
        kg.terms.cmso.hasNumberOfAtoms == 2,
    ],
)
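
The fallback described above can be wrapped in a small function so the calling code does not need to care which path was taken. This is only a sketch: `two_atom_samples` is a hypothetical helper name, and it assumes that kg.terms is None when the ontology network could not be loaded, as stated above.

```python
def two_atom_samples(kg, sparql_fallback):
    """Use the term builder when kg.terms is available, else run raw SPARQL.

    Hypothetical helper; `kg` is expected to behave like the KnowledgeGraph
    used in this notebook, and `sparql_fallback` is a hand-written query string.
    """
    terms = getattr(kg, "terms", None)
    if terms is None:
        # Ontology network was not reachable: run the hand-written query instead.
        return kg.query(sparql_fallback)
    cmso = terms.cmso
    # Same term-builder call as in the cell above.
    return kg.query(
        cmso.AtomicScaleSample,
        [cmso.hasSpaceGroupSymbol, cmso.hasNumberOfAtoms == 2],
    )
```

Called as two_atom_samples(kg, q) with q set to the raw query from section 1, this returns whatever kg.query returns for the path that was taken.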

3. Return a single sample and write it out

q = """
PREFIX cmso: <http://purls.helmholtz-metadaten.de/cmso/>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>
SELECT ?sample
WHERE {
    ?sample cmso:hasNumberOfAtoms ?n .
    FILTER (?n = "2"^^xsd:integer)
}
"""
df = kg.query(q)
df

	sample
0	sample:04478872-1802-45f3-9093-ed60d06de64d
1	sample:a9737877-3034-4e45-893c-4853d410d757

sample = df['sample'].values[0]
kg.to_file(sample, "selected.poscar", format="vasp")
! head -10 selected.poscar

Fe
 1.0000000000000000
     2.8700000000000001    0.0000000000000000    0.0000000000000000
     0.0000000000000000    2.8700000000000001    0.0000000000000000
     0.0000000000000000    0.0000000000000000    2.8700000000000001
 Fe 
   2
Cartesian
  0.0000000000000000  0.0000000000000000  0.0000000000000000
  1.4350000000000001  1.4350000000000001  1.4350000000000001
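
To confirm that the written file matches what the query selected, the header can be read back and compared with the atom count used in the filter. The sketch below uses only the standard library and assumes the VASP 5 layout shown above (comment, scale, three lattice vectors, species line, counts line). `read_poscar_header` is a hypothetical helper; for anything beyond a quick check, a dedicated structure reader is the more robust choice.

```python
# Text copied from the output above, so the example runs without the file on disk.
POSCAR_TEXT = """\
Fe
 1.0000000000000000
     2.8700000000000001    0.0000000000000000    0.0000000000000000
     0.0000000000000000    2.8700000000000001    0.0000000000000000
     0.0000000000000000    0.0000000000000000    2.8700000000000001
 Fe
   2
Cartesian
  0.0000000000000000  0.0000000000000000  0.0000000000000000
  1.4350000000000001  1.4350000000000001  1.4350000000000001
"""


def read_poscar_header(text):
    """Parse comment, scale, lattice, species and counts from POSCAR text.

    Assumes the VASP 5 layout, i.e. a species line directly before the counts.
    """
    lines = text.splitlines()
    return {
        "comment": lines[0].strip(),
        "scale": float(lines[1]),
        # Lines 3-5 of the file hold the three lattice vectors.
        "lattice": [[float(x) for x in lines[i].split()] for i in (2, 3, 4)],
        "species": lines[5].split(),
        "counts": [int(x) for x in lines[6].split()],
    }


header = read_poscar_header(POSCAR_TEXT)
print(header["species"], header["counts"])
```

For the file written above, the same function can be applied to the contents of selected.poscar; the sum of the counts should then equal the atom count used in the FILTER clause.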