
Exploring Datasets and ReactionDatasets

When examining a Dataset or ReactionDataset, it helps to see which fields and values are available. This section shows how to list the basis sets, methods, programs, and other parameters represented in a dataset's results.

[1]:
import qcportal as ptl
client = ptl.FractalClient()

ds = client.get_collection("ReactionDataset", "S22")

To see available search parameters:

[2]:
ds.list_values().reset_index().columns
[2]:
Index(['native', 'driver', 'program', 'method', 'basis', 'keywords',
       'stoichiometry', 'name'],
      dtype='object')

To see available search values (e.g. which basis sets and methods are available in the dataset):

[3]:
ds.list_values().reset_index()['method'].unique()
[3]:
array(['Unknown', 'b2plyp', 'b2plyp-d3', 'b2plyp-d3(bj)', 'b2plyp-d3m',
       'b2plyp-d3m(bj)', 'b3lyp', 'b3lyp-d3', 'b3lyp-d3(bj)', 'b3lyp-d3m',
       'b3lyp-d3m(bj)', 'hf', 'mp2', 'pbe', 'sapt0', 'wb97m-v', 'wb97x-d'],
      dtype=object)
[4]:
ds.list_values().reset_index()['basis'].unique()
[4]:
array(['Unknown', 'aug-cc-pvdz', 'aug-cc-pvtz', 'def2-svp', 'def2-tzvp',
       'sto-3g', 'jun-cc-pvdz'], dtype=object)
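
Rather than calling `unique()` once per column, the distinct values of several fields can be collected in one pass. The following is a minimal sketch using plain pandas. The `records` frame is a hypothetical stand-in for `ds.list_values().reset_index()`, its contents (including the `psi4` program entry) are illustrative only, and `unique_values` is a helper name invented for this example, not part of QCPortal.

```python
import pandas as pd

# Hypothetical stand-in for ds.list_values().reset_index(); values are illustrative only.
records = pd.DataFrame(
    {
        "method": ["b3lyp", "b3lyp", "mp2", "hf"],
        "basis": ["def2-svp", "def2-tzvp", "def2-svp", "sto-3g"],
        "program": ["psi4", "psi4", "psi4", "psi4"],
    }
)


def unique_values(frame, columns):
    """Map each requested column to a sorted list of its distinct values."""
    return {col: sorted(frame[col].unique().tolist()) for col in columns}


summary = unique_values(records, ["method", "basis", "program"])
for field, values in summary.items():
    print(f"{field}: {values}")
```

With a real dataset, the same helper could presumably be applied to the frame returned by `ds.list_values().reset_index()`.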

List combinations of method and basis set (the unnamed column `0` holds the number of entries for each combination):

[5]:
ds.list_values().reset_index().groupby(['method','basis']).size().reset_index()
[5]:
method basis 0
0 Unknown Unknown 3
1 b2plyp aug-cc-pvdz 2
2 b2plyp aug-cc-pvtz 2
3 b2plyp def2-svp 2
4 b2plyp def2-tzvp 2
5 b2plyp-d3 aug-cc-pvdz 2
6 b2plyp-d3 aug-cc-pvtz 2
7 b2plyp-d3 def2-svp 2
8 b2plyp-d3 def2-tzvp 2
9 b2plyp-d3(bj) aug-cc-pvdz 2
10 b2plyp-d3(bj) aug-cc-pvtz 2
11 b2plyp-d3(bj) def2-svp 2
12 b2plyp-d3(bj) def2-tzvp 2
13 b2plyp-d3m aug-cc-pvdz 2
14 b2plyp-d3m aug-cc-pvtz 2
15 b2plyp-d3m def2-svp 2
16 b2plyp-d3m def2-tzvp 2
17 b2plyp-d3m(bj) aug-cc-pvdz 2
18 b2plyp-d3m(bj) aug-cc-pvtz 2
19 b2plyp-d3m(bj) def2-svp 2
20 b2plyp-d3m(bj) def2-tzvp 2
21 b3lyp aug-cc-pvdz 2
22 b3lyp aug-cc-pvtz 2
23 b3lyp def2-svp 2
24 b3lyp def2-tzvp 2
25 b3lyp-d3 aug-cc-pvdz 2
26 b3lyp-d3 aug-cc-pvtz 2
27 b3lyp-d3 def2-svp 2
28 b3lyp-d3 def2-tzvp 2
29 b3lyp-d3(bj) aug-cc-pvdz 2
30 b3lyp-d3(bj) aug-cc-pvtz 2
31 b3lyp-d3(bj) def2-svp 2
32 b3lyp-d3(bj) def2-tzvp 2
33 b3lyp-d3m aug-cc-pvdz 2
34 b3lyp-d3m aug-cc-pvtz 2
35 b3lyp-d3m def2-svp 2
36 b3lyp-d3m def2-tzvp 2
37 b3lyp-d3m(bj) aug-cc-pvdz 2
38 b3lyp-d3m(bj) aug-cc-pvtz 2
39 b3lyp-d3m(bj) def2-svp 2
40 b3lyp-d3m(bj) def2-tzvp 2
41 hf sto-3g 2
42 mp2 aug-cc-pvdz 2
43 mp2 aug-cc-pvtz 2
44 mp2 def2-svp 2
45 mp2 def2-tzvp 2
46 pbe aug-cc-pvdz 2
47 pbe aug-cc-pvtz 2
48 pbe def2-svp 2
49 pbe def2-tzvp 2
50 sapt0 aug-cc-pvdz 1
51 sapt0 aug-cc-pvtz 1
52 sapt0 jun-cc-pvdz 1
53 wb97m-v def2-svp 2
54 wb97m-v def2-tzvp 2
55 wb97x-d def2-svp 2
56 wb97x-d def2-tzvp 2
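
The count column in the table above is labeled `0` because `size()` returns an unnamed Series. As a sketch of two possible refinements in plain pandas, the count column can be given a name with `reset_index(name=...)`, and `pd.crosstab` can lay the same counts out as a method-by-basis grid. The `records` frame below is a hypothetical stand-in for `ds.list_values().reset_index()` with illustrative values.

```python
import pandas as pd

# Hypothetical stand-in for ds.list_values().reset_index(); values are illustrative only.
records = pd.DataFrame(
    {
        "method": ["b3lyp", "b3lyp", "b3lyp", "mp2", "mp2", "hf"],
        "basis": ["def2-svp", "def2-svp", "def2-tzvp", "def2-svp", "def2-svp", "sto-3g"],
    }
)

# Same groupby as above, but with an explicit name for the count column.
counts = (
    records.groupby(["method", "basis"])
    .size()
    .reset_index(name="count")
)
print(counts)

# Methods as rows, basis sets as columns; absent combinations show up as 0.
table = pd.crosstab(records["method"], records["basis"])
print(table)
```

The grid form makes gaps easy to spot: a zero marks a method and basis set pair that has no entries.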

Most datasets define default programs and keywords. In those cases, the same method and basis set combinations can be listed more directly with:

[7]:
ds.list_values(native=True).reset_index()['name']

[7]:
0      cp-B2PLYP/aug-cc-pvdz
1         B2PLYP/aug-cc-pvdz
2      cp-B2PLYP/aug-cc-pvtz
3         B2PLYP/aug-cc-pvtz
4            B2PLYP/def2-svp
               ...
104        WB97M-V/def2-tzvp
105      cp-WB97X-D/def2-svp
106         WB97X-D/def2-svp
107        WB97X-D/def2-tzvp
108     cp-WB97X-D/def2-tzvp
Name: name, Length: 109, dtype: object
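
If the names need to be split back into their parts, a regular expression is one option. This sketch assumes the names follow the pattern visible in the output above: an optional `cp-` prefix (assumed here to be a stoichiometry label), then the method, a slash, and the basis set. The sample names are copied from that output. The pattern is an illustration, not a QCPortal convention, and it would mislabel a method whose own name begins with `cp-`.

```python
import pandas as pd

# Sample names copied from the output above.
names = pd.Series(
    [
        "cp-B2PLYP/aug-cc-pvdz",
        "B2PLYP/aug-cc-pvdz",
        "cp-WB97X-D/def2-svp",
        "WB97X-D/def2-tzvp",
    ],
    name="name",
)

# Assumed layout: [cp-]METHOD/BASIS. Methods may contain hyphens (e.g. WB97X-D).
pattern = r"^(?:(?P<prefix>cp)-)?(?P<method>[^/]+)/(?P<basis>.+)$"

parsed = names.str.extract(pattern)
# Lowercase the method to match the values in the 'method' column shown earlier.
parsed["method"] = parsed["method"].str.lower()
print(parsed)
```

Rows without a prefix get a missing value in the `prefix` column, which can be tested with `parsed["prefix"].isna()`.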