Getting Molecules

This example shows how to retrieve a molecule from QCArchive by its ID, or from a ReactionDataset, an OptimizationDataset, or a TorsionDriveDataset.

From an ID

Every molecule computed with QCArchive is assigned a unique ID. If a molecule’s ID is known, it can be queried from the Molecules table.

[1]:
import qcportal as ptl
client = ptl.FractalClient()

For example, molecule 1234 is 1,2,3-trimethylbenzene.

[2]:
mol = client.query_molecules(1234)[0]
mol

[2]:
<Molecule(name='C9H12' formula='C9H12' hash='572b510')>
[3]:
print(mol)
    Geometry (in Angstrom), charge = 0.0, multiplicity = 1:

       Center              X                  Y                   Z
    ------------   -----------------  -----------------  -----------------
    C                 0.776479871994     1.156134463385     0.121542591228
    C                 0.438429690334     0.679567908122    -1.141595091975
    C                 0.439577078821     0.423533055514     1.255585387764
    C                -0.363723536834    -0.465178778108    -1.279725991730
    C                -0.415502828385    -0.685937227907     1.160631416613
    C                -0.792912983429    -1.170236644458    -0.121804279943
    C                -0.744392084678    -0.917923156500    -2.666766549983
    C                -0.856925058179    -1.374181477949     2.427060703777
    C                -1.703936690413    -2.374380900784    -0.246989621254
    H                 1.380610203168     2.049406423411     0.216714048921
    H                 0.770290662964     1.232461941773    -2.011963177510
    H                 0.769502950936     0.784464203584     2.222141623291
    H                -0.238962510978    -1.878436765084    -2.898916777516
    H                -0.447809351101    -0.177691478927    -3.439954373507
    H                -1.844638825192    -1.050455805875    -2.735084841327
    H                -1.962016543060    -1.480103641644     2.438815782834
    H                -0.562925111565    -0.802128403465     3.332572326307
    H                -0.383242656300    -2.377541231755     2.485353500027
    H                -2.761425129123    -2.038610393380    -0.229251405356
    H                -1.542976842368    -3.097214459361     0.578338572599
    H                -1.519884697209    -2.938478658464    -1.182927991461
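
As a quick sanity check on a retrieved geometry, interatomic distances can be computed directly from the printed coordinates. The sketch below copies the first two carbon atoms from the output above and uses only the Python standard library (`math.dist` needs Python 3.8 or later); the variable names are illustrative and not part of qcportal.

```python
import math

# Coordinates (in Angstrom) of the first two carbon atoms, copied from the
# printed geometry of molecule 1234 above.
c1 = (0.776479871994, 1.156134463385, 0.121542591228)
c2 = (0.438429690334, 0.679567908122, -1.141595091975)

# Euclidean distance between the two atoms.
cc_distance = math.dist(c1, c2)
print(f"C-C distance: {cc_distance:.2f} Angstrom")
```

The distance comes out near 1.39 Angstrom, which is in the range expected for a carbon-carbon bond in a benzene ring.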

The following sections show how to find molecule IDs from Collections.

From a ReactionDataset

Load a ReactionDataset:

[4]:
import qcportal as ptl
client = ptl.FractalClient()

ds = client.get_collection("ReactionDataset", "S22")
ds.df  # list available reactions
[4]:
                                        S220    S22a    S22b
Ammonia Dimer                          -3.17   -3.15  -3.133
Water Dimer                            -5.02   -5.07  -4.989
Formic Acid Dimer                     -18.61  -18.81 -18.753
Formamide Dimer                       -15.96  -16.11 -16.062
Uracil Dimer HB                       -20.65  -20.69 -20.641
2-Pyridone-2-Aminopyridine Complex    -16.71  -17.00 -16.934
Adenine-Thymine Complex WC            -16.37  -16.74 -16.660
Methane Dimer                          -0.53   -0.53  -0.527
Ethene Dimer                           -1.51   -1.48  -1.472
Benzene-Methane Complex                -1.50   -1.45  -1.448
Benzene Dimer PD                       -2.73   -2.62  -2.654
Pyrazine Dimer                         -4.42   -4.20  -4.255
Uracil Dimer Stack                    -10.12   -9.74  -9.805
Indole-Benzene Complex Stack           -5.22   -4.59  -4.524
Adenine-Thymine Complex Stack         -12.23  -11.66 -11.730
Ethene-Ethine Complex                  -1.53   -1.50  -1.496
Benzene-Water Complex                  -3.28   -3.29  -3.275
Benzene-Ammonia Complex                -2.35   -2.32  -2.312
Benzene-HCN Complex                    -4.46   -4.55  -4.541
Benzene Dimer T-Shape                  -2.74   -2.71  -2.717
Indole-Benzene Complex T-Shape         -5.73   -5.62  -5.627
Phenol Dimer                           -7.05   -7.09  -7.097

Each reaction has a stoichiometry describing which molecules are involved in the reactants and products:

[5]:
ds.get_rxn('Adenine-Thymine Complex WC').stoichiometry
[5]:
{'default1': {'25': 1.0, '26': 1.0},
 'cp1': {'27': 1.0, '28': 1.0},
 'default': {'29': 1.0},
 'cp': {'29': 1.0}}

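For the S22 dataset, default corresponds to the dimer (molecule ID 29) and default1 to the monomers (molecule IDs 25 and 26), both without counterpoise correction. The cp and cp1 entries are their counterpoise-corrected counterparts: the same dimer (molecule ID 29) and the monomers with ghost atoms (molecule IDs 27 and 28).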
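
To see at a glance which molecules a reaction involves, the IDs can be collected from the stoichiometry dictionary. This is a minimal sketch in plain Python that works on a copy of the dictionary printed above, so it runs without a server connection; the variable names are illustrative.

```python
# Stoichiometry of 'Adenine-Thymine Complex WC', copied from the output above.
stoichiometry = {
    "default1": {"25": 1.0, "26": 1.0},
    "cp1": {"27": 1.0, "28": 1.0},
    "default": {"29": 1.0},
    "cp": {"29": 1.0},
}

# Gather every molecule ID that appears under any stoichiometry key,
# dropping duplicates (ID 29 is listed under both 'default' and 'cp').
molecule_ids = sorted(
    {mol_id for entry in stoichiometry.values() for mol_id in entry}
)
print(molecule_ids)
```

Each of these IDs can then be looked up with client.query_molecules, as in the cells that follow.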

[6]:
client.query_molecules('25')[0]

[6]:
<Molecule(name='C10H11N7O2 ((0,),[])' formula='C5H5N5' hash='c0e7ed3')>
[7]:
client.query_molecules('26')[0]

[7]:
<Molecule(name='C10H11N7O2 ((1,),[])' formula='C5H6N2O2' hash='a4f9749')>
[8]:
client.query_molecules('29')[0]

[8]:
<Molecule(name='C10H11N7O2' formula='C10H11N7O2' hash='5357c2c')>

Monomers used in counterpoise-corrected calculations contain ghost atoms:

[9]:
client.query_molecules('27')[0]

[9]:
<Molecule(name='C10H11N7O2 ((0,),[1])' formula='C10H11N7O2' hash='d3955aa')>
[10]:
client.query_molecules('28')[0]

[10]:
<Molecule(name='C10H11N7O2 ((1,),[0])' formula='C10H11N7O2' hash='e63c41f')>

From an OptimizationDataset

Load an OptimizationDataset:

[11]:
import qcportal as ptl
client = ptl.FractalClient()

client.list_collections()  # lists available collections (result is not displayed in this cell)
ds = client.get_collection("OptimizationDataset", "SMIRNOFF Coverage Set 1")

Show some available molecules:

[12]:
ds.df.head()
[12]:
COC(O)OC-0
C[S-]-0
CS-0
CO-0
CCO-0

Show available specifications:

[13]:
ds.list_specifications()
[13]:
Name     Description
default  Standard OpenFF optimization quantum chemistry...

Obtain a specific record from a molecule and specification:

[14]:
r = ds.get_record("CCO-0","default")

Get the optimized molecule:

[15]:
r.get_final_molecule()

[15]:
<Molecule(name='C2H6O' formula='C2H6O' hash='422ad57')>

Get the optimization trajectory:

[16]:
r.get_molecular_trajectory()
[16]:
[<Molecule(name='C2H6O' formula='C2H6O' hash='29df3ae')>,
 <Molecule(name='C2H6O' formula='C2H6O' hash='93989e4')>,
 <Molecule(name='C2H6O' formula='C2H6O' hash='14261f7')>,
 <Molecule(name='C2H6O' formula='C2H6O' hash='3b6db86')>,
 <Molecule(name='C2H6O' formula='C2H6O' hash='b35d632')>,
 <Molecule(name='C2H6O' formula='C2H6O' hash='c900f12')>,
 <Molecule(name='C2H6O' formula='C2H6O' hash='a1e9d7a')>,
 <Molecule(name='C2H6O' formula='C2H6O' hash='422ad57')>]
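
In the outputs above, the last entry of the trajectory carries the same hash as the molecule returned by get_final_molecule(). A minimal sketch of that check follows. It uses the abbreviated hashes copied from the outputs as plain strings, so it runs without a server connection; the variable names are illustrative.

```python
# Abbreviated hashes copied from the trajectory output above, in order.
trajectory_hashes = [
    "29df3ae", "93989e4", "14261f7", "3b6db86",
    "b35d632", "c900f12", "a1e9d7a", "422ad57",
]
# Abbreviated hash of the molecule returned by get_final_molecule().
final_hash = "422ad57"

# Number of geometries in the trajectory, and whether it ends at the
# optimized molecule.
n_steps = len(trajectory_hashes)
ends_at_final = trajectory_hashes[-1] == final_hash
print(n_steps, ends_at_final)
```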

From a TorsionDriveDataset

Load a TorsionDriveDataset:

[17]:
import qcportal as ptl
client = ptl.FractalClient()

ds = client.get_collection("TorsionDriveDataset", "SMIRNOFF Coverage Torsion Set 1")

Show some available torsions:

[18]:
ds.df.head()
[18]:
[CH3:1][O:2][CH:3]([OH:4])OC
[CH3:1][O:2][CH:3](O)[O:4]C
CO[CH:3]([OH:4])[O:2][CH3:1]
C[O:4][CH:3](O)[O:2][CH3:1]
[H:4][C:3](O)([O:2][CH3:1])OC

Show available specifications:

[19]:
ds.list_specifications()
[19]:
Name     Description
default  Standard OpenFF torsiondrive specification.

Get a specific torsiondrive:

[20]:
td = ds.get_record("CO[CH:3]([OH:4])[O:2][CH3:1]", "default")

Get molecules for each angle along the torsion scan:

[21]:
td.get_final_molecules()
[21]:
{(-75,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='60e16ca')>,
 (-90,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='c337c03')>,
 (-60,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='b4ff4d4')>,
 (-105,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='5b05d3a')>,
 (-45,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='240c817')>,
 (-120,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='399d214')>,
 (-30,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='8737c8f')>,
 (-135,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='530c77d')>,
 (-15,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='05c30a0')>,
 (-150,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='1c56b54')>,
 (0,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='f1b0dd1')>,
 (-165,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='c81a1fc')>,
 (15,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='f329f87')>,
 (30,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='99156ab')>,
 (180,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='d299528')>,
 (45,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='e1f13fa')>,
 (165,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='df216e3')>,
 (60,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='e69654b')>,
 (150,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='810b759')>,
 (75,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='5c12648')>,
 (135,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='35f87a2')>,
 (90,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='cdbfa17')>,
 (120,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='5271be0')>,
 (105,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='c0f46d7')>}
[22]:
td.get_final_molecules()[(30,)]

[22]:
<Molecule(name='C3H8O3' formula='C3H8O3' hash='99156ab')>
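
The dictionary returned by get_final_molecules() is keyed by one-element tuples of grid angles and, as the output above shows, is not sorted by angle. The sketch below orders a scan by angle, assuming a dictionary of that shape. To keep it runnable without a server connection, the values are a few abbreviated hashes copied from the output rather than Molecule objects.

```python
# A few entries copied from the output above; the values are abbreviated
# hashes standing in for the Molecule objects.
final_molecules = {
    (-75,): "60e16ca",
    (0,): "f1b0dd1",
    (180,): "d299528",
    (30,): "99156ab",
    (-165,): "c81a1fc",
}

# Tuples compare element-wise, so sorting the items orders them by angle.
scan = sorted(final_molecules.items())
angles = [key[0] for key, _ in scan]
print(angles)
```

The same sorted() call should apply unchanged to the full dictionary, because the keys are unique and the values are therefore never compared.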