Exploring Datasets and ReactionDatasets
When examining a Dataset or ReactionDataset, it can be helpful to see which fields and associated values are available. This section shows how to list the basis sets, methods, programs, and other values represented in a dataset’s results.
[1]:
import qcportal as ptl
client = ptl.FractalClient()
ds = client.get_collection("ReactionDataset", "S22")
To see available search parameters:
[2]:
ds.list_values().reset_index().columns
[2]:
Index(['native', 'driver', 'program', 'method', 'basis', 'keywords',
'stoichiometry', 'name'],
dtype='object')
To see available search values (e.g. which basis sets and methods are available in the dataset):
[3]:
ds.list_values().reset_index()['method'].unique()
[3]:
array(['Unknown', 'b2plyp', 'b2plyp-d3', 'b2plyp-d3(bj)', 'b2plyp-d3m',
'b2plyp-d3m(bj)', 'b3lyp', 'b3lyp-d3', 'b3lyp-d3(bj)', 'b3lyp-d3m',
'b3lyp-d3m(bj)', 'hf', 'mp2', 'pbe', 'sapt0', 'wb97m-v', 'wb97x-d'],
dtype=object)
[4]:
ds.list_values().reset_index()['basis'].unique()
[4]:
array(['Unknown', 'aug-cc-pvdz', 'aug-cc-pvtz', 'def2-svp', 'def2-tzvp',
'sto-3g', 'jun-cc-pvdz'], dtype=object)
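To survey every field at once rather than one column at a time, the same unique() call can be applied in a loop. The sketch below uses a small hand-built DataFrame as a stand-in for ds.list_values().reset_index() so that it runs without a server connection. The rows, the "psi4" program entry, and the helper name unique_values_by_field are illustrative assumptions, not part of QCPortal or the S22 dataset.

```python
import pandas as pd

# Hypothetical stand-in for ds.list_values().reset_index();
# the rows below are illustrative, not real S22 contents.
values = pd.DataFrame(
    {
        "program": ["psi4", "psi4", "psi4", None],
        "method": ["b3lyp", "b3lyp", "mp2", "Unknown"],
        "basis": ["def2-svp", "def2-tzvp", "def2-svp", "Unknown"],
    }
)


def unique_values_by_field(df):
    """Map each column name to the sorted list of its distinct non-null values."""
    return {
        col: sorted(df[col].dropna().astype(str).unique())
        for col in df.columns
    }


summary = unique_values_by_field(values)
for field, options in summary.items():
    print(f"{field}: {options}")
```

With a live dataset, the same helper could presumably be called on ds.list_values().reset_index() directly. Missing entries are dropped and the rest are cast to strings so that columns with mixed or empty values still sort cleanly.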
List combinations of method and basis set:
[5]:
ds.list_values().reset_index().groupby(['method','basis']).size().reset_index()
[5]:
| | method | basis | 0 |
---|---|---|---|
0 | Unknown | Unknown | 3 |
1 | b2plyp | aug-cc-pvdz | 2 |
2 | b2plyp | aug-cc-pvtz | 2 |
3 | b2plyp | def2-svp | 2 |
4 | b2plyp | def2-tzvp | 2 |
5 | b2plyp-d3 | aug-cc-pvdz | 2 |
6 | b2plyp-d3 | aug-cc-pvtz | 2 |
7 | b2plyp-d3 | def2-svp | 2 |
8 | b2plyp-d3 | def2-tzvp | 2 |
9 | b2plyp-d3(bj) | aug-cc-pvdz | 2 |
10 | b2plyp-d3(bj) | aug-cc-pvtz | 2 |
11 | b2plyp-d3(bj) | def2-svp | 2 |
12 | b2plyp-d3(bj) | def2-tzvp | 2 |
13 | b2plyp-d3m | aug-cc-pvdz | 2 |
14 | b2plyp-d3m | aug-cc-pvtz | 2 |
15 | b2plyp-d3m | def2-svp | 2 |
16 | b2plyp-d3m | def2-tzvp | 2 |
17 | b2plyp-d3m(bj) | aug-cc-pvdz | 2 |
18 | b2plyp-d3m(bj) | aug-cc-pvtz | 2 |
19 | b2plyp-d3m(bj) | def2-svp | 2 |
20 | b2plyp-d3m(bj) | def2-tzvp | 2 |
21 | b3lyp | aug-cc-pvdz | 2 |
22 | b3lyp | aug-cc-pvtz | 2 |
23 | b3lyp | def2-svp | 2 |
24 | b3lyp | def2-tzvp | 2 |
25 | b3lyp-d3 | aug-cc-pvdz | 2 |
26 | b3lyp-d3 | aug-cc-pvtz | 2 |
27 | b3lyp-d3 | def2-svp | 2 |
28 | b3lyp-d3 | def2-tzvp | 2 |
29 | b3lyp-d3(bj) | aug-cc-pvdz | 2 |
30 | b3lyp-d3(bj) | aug-cc-pvtz | 2 |
31 | b3lyp-d3(bj) | def2-svp | 2 |
32 | b3lyp-d3(bj) | def2-tzvp | 2 |
33 | b3lyp-d3m | aug-cc-pvdz | 2 |
34 | b3lyp-d3m | aug-cc-pvtz | 2 |
35 | b3lyp-d3m | def2-svp | 2 |
36 | b3lyp-d3m | def2-tzvp | 2 |
37 | b3lyp-d3m(bj) | aug-cc-pvdz | 2 |
38 | b3lyp-d3m(bj) | aug-cc-pvtz | 2 |
39 | b3lyp-d3m(bj) | def2-svp | 2 |
40 | b3lyp-d3m(bj) | def2-tzvp | 2 |
41 | hf | sto-3g | 2 |
42 | mp2 | aug-cc-pvdz | 2 |
43 | mp2 | aug-cc-pvtz | 2 |
44 | mp2 | def2-svp | 2 |
45 | mp2 | def2-tzvp | 2 |
46 | pbe | aug-cc-pvdz | 2 |
47 | pbe | aug-cc-pvtz | 2 |
48 | pbe | def2-svp | 2 |
49 | pbe | def2-tzvp | 2 |
50 | sapt0 | aug-cc-pvdz | 1 |
51 | sapt0 | aug-cc-pvtz | 1 |
52 | sapt0 | jun-cc-pvdz | 1 |
53 | wb97m-v | def2-svp | 2 |
54 | wb97m-v | def2-tzvp | 2 |
55 | wb97x-d | def2-svp | 2 |
56 | wb97x-d | def2-tzvp | 2 |
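The count column in the table above is labelled 0 because size() returns an unnamed Series. As a minimal sketch, the count can be given an explicit name with reset_index(name=...), and pandas.crosstab offers a wide view of which method and basis pairs are present. The DataFrame here is a hypothetical stand-in for ds.list_values().reset_index(), and its rows are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for ds.list_values().reset_index();
# the rows are illustrative, not real dataset contents.
values = pd.DataFrame(
    {
        "method": ["b3lyp", "b3lyp", "b3lyp", "mp2", "mp2", "sapt0"],
        "basis": [
            "def2-svp", "def2-svp", "def2-tzvp",
            "def2-svp", "def2-svp", "jun-cc-pvdz",
        ],
    }
)

# Same grouping as in the cell above, but with the count column named explicitly.
counts = values.groupby(["method", "basis"]).size().reset_index(name="count")
print(counts)

# Wide view: one row per method, one column per basis set, 0 where a pair is absent.
coverage = pd.crosstab(values["method"], values["basis"])
print(coverage)
```

The wide table makes gaps easier to spot, for example a method that was only run with a single basis set.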
Most datasets support default programs and keywords. For those datasets, the available method and basis set combinations can also be listed by name:
[7]:
ds.list_values(native=True).reset_index()['name']
[7]:
0 cp-B2PLYP/aug-cc-pvdz
1 B2PLYP/aug-cc-pvdz
2 cp-B2PLYP/aug-cc-pvtz
3 B2PLYP/aug-cc-pvtz
4 B2PLYP/def2-svp
...
104 WB97M-V/def2-tzvp
105 cp-WB97X-D/def2-svp
106 WB97X-D/def2-svp
107 WB97X-D/def2-tzvp
108 cp-WB97X-D/def2-tzvp
Name: name, Length: 109, dtype: object
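The names in this output appear to follow a [prefix-]METHOD/basis pattern, where a leading cp- presumably marks the counterpoise-corrected stoichiometry. Assuming that pattern holds (it is inferred from the output above, not from a documented naming rule), the names can be split back into their parts with plain string handling. The helper split_value_name and its prefixes parameter are hypothetical and not part of QCPortal.

```python
def split_value_name(name, prefixes=("cp",)):
    """Split a value name into (prefix, method, basis).

    Assumes the '[prefix-]METHOD/basis' pattern seen in the output above.
    Only prefixes listed in `prefixes` are stripped, because method names
    such as 'WB97X-D' contain hyphens themselves.
    """
    prefix = None
    for candidate in prefixes:
        if name.startswith(candidate + "-"):
            prefix = candidate
            name = name[len(candidate) + 1:]
            break
    # partition() leaves `sep` empty when the name has no '/' at all.
    method, sep, basis = name.partition("/")
    return prefix, method, (basis if sep else None)


names = [
    "cp-B2PLYP/aug-cc-pvdz",
    "B2PLYP/aug-cc-pvdz",
    "cp-WB97X-D/def2-svp",
    "WB97X-D/def2-tzvp",
]
parsed = [split_value_name(n) for n in names]
for name, parts in zip(names, parsed):
    print(name, "->", parts)
```

Splitting on the first hyphen would misread WB97X-D/def2-svp as having a WB97X prefix, which is why the sketch strips only an explicit list of known prefixes.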