Torsion angle scan with rdkit & xtb

Log of the hassle doing torsion angle scans with rdkit & xtb - ultimately comparing results to the crystallography open database
torsion
dihedral
oss
opensource
rdkit
xtb
energy
Author

Peter Schmidtke

Published

February 16, 2021

Post #4 will be along the lines of dihedral / torsion angle analysis again. The aim is to log some of the hurdles I had to overcome to run a torsion angle analysis with xtb and / or rdkit.

What I’m trying to accomplish here is mainly to see how torsion angle scans can be performed - easily and robustly - using open source tools available. This is in preparation of a larger work track on analysing the data from the COD, where first steps have already been set here.

Aim

Given an input molecule and a particular torsion angle I’d like to see what the energy langscape of the molecule looks like when rotating around that torsion angle. I’d like to know how easy/complicated this is using two different tools:

  1. rdkit with the integrated MMFF
  2. xtb from the Grimme lab

The post is also inspired by an older post done by iwatobipen analyzing openforcefield with Ani2 on some torsion energy predictions using the torsion drive dataset from the openforcefield initiative

All code used for this post is available on this separate repo, as it uses a slightly different environment (you’ll see why).

Dihedral scan with rdkit

I don’t really know why, but I started out with this molecule here:

Code
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import Draw
mol=Chem.AddHs(Chem.MolFromSmiles('CCCOC1=CC=C(Cl)C(C)=C1'))
for i, a in enumerate(mol.GetAtoms()):
        a.SetAtomMapNum(i)
AllChem.EmbedMolecule(mol,randomSeed=10)    #generate an initial random conformation (randomSeed is fixed to have something reproducible)
#mol
Draw.MolToFile(mol,"mol.png",includeAtomNumbers=True,highlightAtoms=(1,2,3,4))        #sorry - rdkit 2019.03 issues with newer python versions (I guess)

The torsion angle I’m interested in is highlighted in red and situated between carbons 1,2,3 & 4. Now let’s try to set up a torsion angle scan in rdkit using MMFF (UFF should be similar procedure … )

from rdkit.Chem import rdMolTransforms
import copy
import pandas as pd
from rdkit.Chem import rdForceFieldHelpers
from rdkit.Chem import ChemicalForceFields
from rdkit.Chem import rdMolTransforms


conformer=mol.GetConformer(0)

m2=copy.deepcopy(mol)
mp2 = AllChem.MMFFGetMoleculeProperties(m2)

energy=[]
confid=0
angles=range(-180,182,2)

print("Initial angle:")
print(rdMolTransforms.GetDihedralDeg(conformer,1,2,3,4))
for angle in angles:
    confid+=1
    ff2 = AllChem.MMFFGetMoleculeForceField(m2, mp2)
    ff2.MMFFAddTorsionConstraint(1,2,3,4, False, angle - .2, angle + .2, 1000.0)
    ff2.Minimize()
    energy.append(ff2.CalcEnergy())

    xyz=ff2.Positions()
    new_conf = Chem.Conformer(mol.GetNumAtoms())
    for i in range(mol.GetNumAtoms()):
        new_conf.SetAtomPosition(i, (m2.GetConformer(-1).GetAtomPosition(i)))
    new_conf.SetId(confid)
    mol.AddConformer(new_conf)


dfrdkit = pd.DataFrame({'angle':angles, 'energy':energy})
Initial angle:
179.99983170835984
Code
#! echo: false
import seaborn as sns

sns.lineplot(
    data=dfrdkit,
    x="angle", y="energy",
    markers=True, dashes=False
)
<Axes: xlabel='angle', ylabel='energy'>

I’m printing the angle of the initial conformation here as well & start the scan from around that angle. Before I didn’t and obviously it created a lot of issues with the minimization of the subsequent conformers. Here is the plot from iwatobipen’s post on the same molecule & same torsion scan:

Let’s first have a look at the minimized conformers. Here’s a bit of code to browse through them. I collected all conformations generated with the torsion scan before.

Code
patt = Chem.MolFromSmarts('c1ccccc1');patt
match = mol.GetSubstructMatch(patt)

AllChem.AlignMolConformers(mol,atomIds=match)

conformerIds=[conf.GetId() for conf in mol.GetConformers()]
w = Chem.SDWriter("conformers.sdf")
for cid in range(mol.GetNumConformers()):
    w.write(mol, confId=cid)
w.close()

ojs_define(conformerIds=list(conformerIds))

You can use the slider above to browse through the first conformers. They look fairly reasonable to me so far.

Even though we can observe a peak around 0° that is common between what I have here & iwatobipens post. The wells around -75° and 75° are not that easily distinguishible unfortunately.

Dihedral scan with xtb

XTB is a toolkit implementing semiempirical quantum mechanics and you can do quite a lot of things. Among these: energy optimization, dihedral scans, constrained optimizations, metadynamics etc…It’s conda packaged, so easy to deploy anywhere. In contrary to things like Gaussian, Jaguar etc, it’s:

  1. free
  2. opensource
  3. actually quite fast

Running an energy optimization with xtb is rather straightforward and would work like this:

xtb mol.sdf --opt --charge 0

This will write the optimized molecule in an SD file, together with the energy (in hartree).

XTB also supports dihedral scans as described in the documentation and the examples work well on ethane. The thing is, as usual … we are not working with ethane or 1-Bromo-2-chloroethane (the other example).

Long story short, I tried to integrate a dihedral scan as described in the documentation, but on my molecule above (which still remains rather simple)

This resulted in a ton of segmentation faults (fond memories of Gaussian came back to me) after a few dihedral scan cycles. My suspicion is that the xtb optimizer is not robust enough to allow to resolve really ugly clashes generated during the scan (I’m just guessing here).

So here’s the workaround I came up with: preparing “good enough” starting conformations with rdkit and constrained optimizing with xtb and last, gathering the final energy values.

Code
import os
# angles=range(-180,180,5)
xtbenergy=[]
w = Chem.SDWriter('mol.sdf')
w.write(mol,confId=1)
w.close()
#loop over the previous conformations we obtained with rdkit
for idx,deg in enumerate(angles):
#   Chem.MolToMolFile(mol,'molecule.mol')
  atoms = '2,3,4,5' #set atoms to define the dihedral - NB: xtb indexes start at 1, rdkit at 0
  # Now write the xtb input file:
  fh = open("dihedral_constraint.inp","w")
  fh.write("""$constrain
    force constant=1.0
    dihedral: {},{}
  $end""".format(atoms,float(deg)))
  fh.close()
  # run xtb
  if idx==1: 
    inputfilename="mol.sdf"
  else:
    inputfilename="xtbopt.sdf"
  os.popen("obabel -isdf "+inputfilename+" -osdf -Oobabelmol.sdf && export OMP_STACKSIZE=48G && export OMP_NUM_THREADS=12 && xtb obabelmol.sdf --opt vtight --charge 0 --input dihedral_constraint.inp").read()
  with open('xtbopt.sdf') as f:
    first_line = f.readline()
    xtbenergy.append(float(first_line.split("gnorm")[0].split(":")[1]))

print(xtbenergy)
print(angles)
print(dfrdkit["energy"])

sns.lineplot(
    data=dfrdkit,
    x="angle", y="energy",
    markers=True, dashes=False
)
[-36.667513607102, -36.667309140177, -36.667281342484, -36.667239620489, -36.667184526773, -36.667117066126, -36.667037048311, -36.666945068211, -36.666842035773, -36.666727935772, -36.666604060224, -36.666472299839, -36.666331713852, -36.666184708292, -36.666031740213, -36.665874503518, -36.665713626478, -36.665550482059, -36.66538825599, -36.665225539683, -36.665065448331, -36.664908207787, -36.664756200554, -36.664609973167, -36.664471049714, -36.664340711715, -36.664220117532, -36.664110543681, -36.664013292087, -36.663929734669, -36.663861827436, -36.663811930617, -36.663785358953, -36.663798737338, -36.664979277748, -36.66520794, -36.665435059634, -36.665662744478, -36.665885124937, -36.666103090883, -36.66631293046, -36.666513718719, -36.666703584448, -36.666881133011, -36.66704466986, -36.667193605558, -36.667326090946, -36.667441655993, -36.667539140451, -36.667617824736, -36.667676838596, -36.667715513415, -36.667733191444, -36.667729259909, -36.667703174391, -36.667654498956, -36.667582681905, -36.667487621575, -36.667369122472, -36.667227133502, -36.667061760273, -36.666873378488, -36.666662375664, -36.666429356721, -36.666175209343, -36.665900780799, -36.665607020687, -36.665295092731, -36.664966346322, -36.664622108195, -36.664264015668, -36.663893894368, -36.663513617705, -36.663125457688, -36.662731505843, -36.662334108898, -36.661935609452, -36.661538373839, -36.661144767473, -36.660757102954, -36.660377606895, -36.660008348515, -36.659651419864, -36.659308746758, -36.658982148344, -36.65867330741, -36.658383810707, -36.658115108876, -36.657868511125, -36.657644953523, -36.657446596701, -36.657273573907, -36.65712674186, -36.657007487457, -36.656916139656, -36.656853449108, -36.656820030572, -36.65681652035, -36.656843563867, -36.656901487252, -36.656991065908, -36.657113141754, -36.657268773622, -36.657460253667, -36.657691643264, -36.657995009871, -36.658377284791, -36.658734779587, -36.659101032723, -36.65947082494, -36.65983907601, -36.660205762865, -36.660568445789, -36.660925547559, -36.661276290406, -36.661622045455, -36.661961151255, -36.662293542142, -36.662618617756, -36.662935539654, -36.663243196539, -36.663539682536, -36.663825629787, -36.664097088781, -36.66435469442, -36.664595178208, -36.664818301433, -36.66502278909, -36.665207515844, -36.66537160691, -36.665514564786, -36.665636254032, -36.665736540925, -36.66581574069, -36.665874412542, -36.665913153582, -36.665933064251, -36.665933964303, -36.665917822483, -36.665884812253, -36.665836013797, -36.665772163003, -36.665693615985, -36.665604836221, -36.665503223322, -36.665391578063, -36.665270323778, -36.665142434253, -36.665007952167, -36.664869075563, -36.664728209023, -36.664586960773, -36.664449514058, -36.664316973593, -36.664201080863, -36.664966618261, -36.665079218636, -36.665198049586, -36.665324534034, -36.665457378424, -36.665594877696, -36.665735947942, -36.665879336208, -36.666023480717, -36.666167690138, -36.666309918055, -36.666449518595, -36.666584866403, -36.666714785838, -36.666838170524, -36.666953893713, -36.667061007153, -36.667158583164, -36.667245753906, -36.667321770371, -36.667385989984, -36.667437794223, -36.667476728724, -36.667502584037, -36.667514883009, -36.667513577254]
range(-180, 182, 2)
0      19.936138
1      19.942480
2      19.958324
3      19.989942
4      20.018098
         ...    
176    19.900734
177    19.870128
178    19.849113
179    19.837853
180    19.835944
Name: energy, Length: 181, dtype: float64
<Axes: xlabel='angle', ylabel='energy'>

This runs for a while, but it’s still reasonably fast. Also, you might have noticed that I specified an argument to the --opt flag. This argument allows you to tweak how loose or precise the optimisation should be. More information on that can be found in the xtb documentation here. In this example I specified a rather precise method. Feel free to play around with them and check the outcome (rather interesting as well).

import numpy as np
# angle=np.array(range(180,-180,-1))

dfxtb = pd.DataFrame({'angle':angles, 'xtb':xtbenergy,'MMFF':dfrdkit["energy"]})
# dfxtb = pd.DataFrame({'angle':angles, 'xtb':xtbenergy})

import matplotlib.pyplot as plt
sns.lineplot(x="angle",y="xtb",data=dfxtb, color="g")
ax2 = plt.twinx()
sns.lineplot(x="angle",y="MMFF",data=dfxtb, color="b", ax=ax2)
<Axes: xlabel='angle', ylabel='MMFF'>

On this plot we can see both results, from rdkit’s MMFF implementation and xtb. Good news…at least they agree on the maximum around 0°. Other than that there are quite important discrepancies on the location of the global minimum and the importance & extent of the energy barriers between them. Interestingly, openFF and ani2 also predict 180° as a global minimum.

Comparing vs COD

Let’s double check these results now with the initial work done on the COD (not yet cleaned and curated - so there will be some noise in here still). This is for sure not the only experimental data-source one should use, but I’ll define it as my golden source of truth here within the scope of this post. First I’ll try to identify the smarts patterns from the torsion library that match the dihedral under investigation here:

patterns=pd.read_table("list_torsion_patterns.txt",header=None,usecols=[1])
selectedPatterns=[]
for torsionSmarts in patterns[1]:
    torsionQuery = Chem.MolFromSmarts(torsionSmarts)
    matches = mol.GetSubstructMatches(torsionQuery)
    if(len(matches)>0):
        if (matches==((1,2,3,4),)):
            selectedPatterns.append(torsionQuery)
            print("selected: ",torsionSmarts)
selected:  [C:1][CX4H2:2]!@;-[OX2:3][c:4]
selected:  [!#1:1][CX4H2:2]!@;-[OX2:3][c:4]
selected:  [C:1][CX4H2:2]!@;-[OX2:3][!#1:4]
selected:  [!#1:1][CX4H2:2]!@;-[OX2:3][!#1:4]
selected:  [!#1:1][CX4:2]!@;-[OX2:3][!#1:4]

Now we have the selected patterns, let’s run these through the prepared COD molecules (a huge local sd file right now) and gather statistics on angles. NB: the smart patterns used here might be redundant and map the same molecules. So here I’m keeping track of which molecule was previously selected and don’t include it in a subsequent calculation anymore:

[C:1][C&X4&H2:2]!@&-[O&X2:3][c:4]
[!#1:1][C&X4&H2:2]!@&-[O&X2:3][c:4]
[C:1][C&X4&H2:2]!@&-[O&X2:3][!#1:4]
[!#1:1][C&X4&H2:2]!@&-[O&X2:3][!#1:4]
[!#1:1][C&X4:2]!@&-[O&X2:3][!#1:4]
[21:00:06] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:00:06] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:00:07] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:00:10] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:00:12] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:00:19] 

****
Post-condition Violation
Element '6 H' not found
Violation occurred on line 93 in file /Users/runner/miniforge3/conda-bld/rdkit_1678104751770/work/Code/GraphMol/PeriodicTable.h
Failed Expression: anum > -1
****

[21:00:19] ERROR: Element '6 H' not found
[21:00:19] ERROR: moving to the beginning of the next molecule
[21:00:19] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:00:21] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:00:26] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:00:31] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:00:37] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:00:44] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:00:50] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:00:50] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:00:50] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:00:50] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:00:50] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:00:50] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:00:50] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:00:58] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:01:06] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:01:06] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:01:06] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:01:07] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:01:07] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:01:07] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:01:11] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:01:16] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:01:20] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:01:20] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:01:20] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:01:21] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:01:57] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:02:13] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:02:29] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:02:29] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:02:29] Conflicting single bond directions around double bond at index 5.
[21:02:29]   BondStereo set to STEREONONE and single bond directions set to NONE.
[21:02:31] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:02:36] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:02:46] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:02:47] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:02:48] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:02:51] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:02:54] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:03:01] 

****
Post-condition Violation
Element '6 H' not found
Violation occurred on line 93 in file /Users/runner/miniforge3/conda-bld/rdkit_1678104751770/work/Code/GraphMol/PeriodicTable.h
Failed Expression: anum > -1
****

[21:03:01] ERROR: Element '6 H' not found
[21:03:01] ERROR: moving to the beginning of the next molecule
[21:03:02] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:03:04] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:03:10] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:03:14] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:03:22] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:03:30] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:03:36] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:03:36] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:03:36] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:03:36] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:03:36] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:03:36] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:03:36] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:03:45] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:03:54] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:03:54] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:03:54] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:03:54] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:03:55] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:03:55] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:03:59] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:04:05] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:04:09] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:04:09] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:04:09] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:04:10] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:04:49] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:05:04] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:05:21] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:05:22] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:05:22] Conflicting single bond directions around double bond at index 5.
[21:05:22]   BondStereo set to STEREONONE and single bond directions set to NONE.
[21:05:23] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:05:29] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:05:39] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:05:40] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:05:41] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:05:44] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:05:47] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:05:55] 

****
Post-condition Violation
Element '6 H' not found
Violation occurred on line 93 in file /Users/runner/miniforge3/conda-bld/rdkit_1678104751770/work/Code/GraphMol/PeriodicTable.h
Failed Expression: anum > -1
****

[21:05:55] ERROR: Element '6 H' not found
[21:05:55] ERROR: moving to the beginning of the next molecule
[21:05:55] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:05:58] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:06:04] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:06:09] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:06:17] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:06:27] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:06:34] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:06:34] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:06:34] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:06:34] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:06:34] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:06:34] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:06:34] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:06:44] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:06:54] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:06:54] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:06:54] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:06:54] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:06:54] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:06:54] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:06:59] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:07:06] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:07:10] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:07:10] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:07:10] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:07:11] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:07:55] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:08:14] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:08:34] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:08:35] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:08:35] Conflicting single bond directions around double bond at index 5.
[21:08:35]   BondStereo set to STEREONONE and single bond directions set to NONE.
[21:08:37] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:08:43] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:08:56] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:08:57] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:08:58] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:09:02] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:09:05] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:09:14] 

****
Post-condition Violation
Element '6 H' not found
Violation occurred on line 93 in file /Users/runner/miniforge3/conda-bld/rdkit_1678104751770/work/Code/GraphMol/PeriodicTable.h
Failed Expression: anum > -1
****

[21:09:14] ERROR: Element '6 H' not found
[21:09:14] ERROR: moving to the beginning of the next molecule
[21:09:15] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:09:17] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:09:25] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:09:32] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:09:42] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:09:52] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:10:01] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:10:01] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:10:01] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:10:01] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:10:01] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:10:01] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:10:01] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:10:12] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:10:23] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:10:23] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:10:23] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:10:24] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:10:24] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:10:24] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:10:29] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:10:36] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:10:41] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:10:41] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:10:41] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:10:42] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:11:30] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:11:50] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:12:11] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:12:12] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:12:12] Conflicting single bond directions around double bond at index 5.
[21:12:12]   BondStereo set to STEREONONE and single bond directions set to NONE.
[21:12:14] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:12:21] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:12:33] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:12:34] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:12:36] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:12:40] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:12:43] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:12:54] 

****
Post-condition Violation
Element '6 H' not found
Violation occurred on line 93 in file /Users/runner/miniforge3/conda-bld/rdkit_1678104751770/work/Code/GraphMol/PeriodicTable.h
Failed Expression: anum > -1
****

[21:12:54] ERROR: Element '6 H' not found
[21:12:54] ERROR: moving to the beginning of the next molecule
[21:12:55] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:12:58] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:13:07] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:13:14] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:13:27] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:13:37] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:13:47] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:13:47] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:13:47] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:13:47] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:13:47] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:13:47] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:13:47] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:13:59] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:14:11] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:14:11] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:14:11] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:14:11] Warning: molecule is tagged as 3D, but all Z coords are zero
[21:14:12] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:14:12] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:14:17] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:14:25] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:14:30] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:14:30] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:14:31] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:14:32] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:15:26] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:15:49] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:16:14] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:16:14] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:16:14] Conflicting single bond directions around double bond at index 5.
[21:16:14]   BondStereo set to STEREONONE and single bond directions set to NONE.
[21:16:17] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
[21:16:24] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.

This runs again for quite some time…so patience is needed. Once this is done, you should see something like that:

#lets convert that to numbers we show in the line-plot together with the previous results
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
angles=range(-180,182,2)
intervals=pd.interval_range(start=-180, end=180,periods=181)
binangles=pd.cut(angles, bins=intervals).value_counts()

dfall = pd.DataFrame({'angle':angles, 'xtb':xtbenergy,'MMFF':dfrdkit["energy"],'cod':np.array(binangles)})

fig, ax = plt.subplots()
sns.histplot(codangles)
ax.set(xlabel='Angle',ylabel= 'Count')
ax2 = ax.twinx()
sns.lineplot(x="angle",y="xtb",data=dfall, color="g")
<Axes: xlabel='angle', ylabel='xtb'>

So here we have in blue the results from the COD, in green xtb. Good news is that the positions of the wells seem to be estimated rather well with xtb. MMFF seems slightly off for the preference on 280°. Analysing relative energy differences between local minima, xtb doesn’t follow the trend observed using MMFF & results from the COD here.

Conclusion

Initially I wanted to run this on a larger set of molecules, but as usual things turn out to be much less robust than anticipated. So more to come in an upcoming post, on other torsion angles - especially the challenging ones.

The encouraging aspect here is, that despite all difficulties, one could potentially use this on a larger set of molecules. The instability of xtb on the integrated dihedral scan should be investigated a bit further as well … essentially to make sure that this is not only due to my totally “noob” use of the toolkit - which is very likely.