openff-toolkit: Custom exceptions
Is your feature request related to a problem? Please describe. The toolkit, and our software in general, should raise descriptive, custom exceptions instead of re-using built-in exceptions.
Describe the solution you’d like We should have our own exceptions that are fairly specific and provide detailed feedback to the user as to what went wrong and why, and possibly hints at how to resolve the error. For example, this behavior (currently in the code) checks all the boxes:
In [1]: from openforcefield.topology.molecule import Molecule
Warning: Unable to load toolkit 'AmberTools'.
In [2]: from openforcefield.utils.toolkits import BuiltInToolkitWrapper
In [3]: mol = Molecule.from_smiles('CCO')
In [4]: BuiltInToolkitWrapper().assign_partial_charges(molecule=mol, partial_charge_method='am1bcc')
---------------------------------------------------------------------------
ChargeMethodUnavailableError Traceback (most recent call last)
<ipython-input-4-5e6d5f4e9e1e> in <module>
----> 1 BuiltInToolkitWrapper().assign_partial_charges(molecule=mol, partial_charge_method='am1bcc')
~/software/openforcefield/openforcefield/utils/toolkits.py in assign_partial_charges(self, molecule, partial_charge_method, use_conformers, strict_n_conformers)
466 if partial_charge_method not in PARTIAL_CHARGE_METHODS:
467 raise ChargeMethodUnavailableError(
--> 468 f'Partial charge method "{partial_charge_method}"" is not supported by '
469 f"the Built-in toolkit. Available charge methods are "
470 f"{list(PARTIAL_CHARGE_METHODS.keys())}"
ChargeMethodUnavailableError: Partial charge method "am1bcc"" is not supported by the Built-in toolkit. Available charge methods are ['zeros', 'formal_charge']
Today, I went through most of the codebase and recorded when we raise built-in exceptions to get a better picture of how much would be changed by implementing this (nearly) everywhere in the codebase. Some of these (maybe a little less than a half?) probably fit well with existing exceptions, but many will require new exceptions. I have yet to fill out the last column.
| Context | Existing exception | Proposed change |
|---|---|---|
| In many places: | ||
| Method not implemented | NotImplementedError |
|
In topology/topology.py: |
||
| Aromaticity model not in list of known, allowed atomaticity models | ValueError |
|
| Charge model not in … | ValueError |
|
| Fractional bond order model not in … | ValueError |
|
chemical_environment_matches got an argument that can’t be converted to SMARTS |
ValueError |
|
| Input data source looks to be missing connectivity, and parametrization will probably be bad | ValueError |
|
| After adding in atoms from the chains/residues in an OFFMol, OEMol has incorrect number of atoms | Exception |
|
Bad arguments passed to get_bond_between |
Exception |
|
| Trying to constrain a pair of atoms that are already constrained | Exception |
|
In topology/molecule.py: |
||
| Trying to set an OFFMol’s name to something not a string | Exception |
|
Trying to get an atom’s molecule_atom_index or molecule_particle_index when it does not belong to any molecules |
ValueError |
|
| Trying to make a virtual site with a different number of atoms and charge increments | Exception |
|
| Trying to make a virtual site with both or neither of sigma, epsilon | Exception |
|
Trying to get a bonds’s molecule_bond_index when it does not belong to any molecules |
ValueError |
|
Some broad set of things went wrong in FrozenMolecule.__init__() |
ValueError |
|
| Trying to add a virtual site to a molecule when one of the same type already exists | Exception |
|
Bad arguments to FrozenMolecule._add_bond |
Exception |
|
| Trying to add a bond when one already exists | Exception |
|
| Trying to add a conformer with bad units or wrong shape | Exception |
|
chemical_environment_matches got an argument that can’t be converted to SMARTS |
ValueError |
|
Invalid toolkit registry passed to FrozenMolecule.{to|from}_iupac |
Exception |
|
| Trying to make an OFFMol from an OFFTop that has multiple molecules | Exception |
|
from_file needs file_format argument specifed |
Exception |
|
to_file couldn’t find a toolkit to do its writing for it |
ValueError |
|
| Tried to import QCElemental, couldn’t | ImportError |
|
from_qcschema got something that’s not JSON-encodable |
AttributeError |
|
from_qcschema was passed something without explicit hydrogen mapped SMILES or client otherwise failed to convert input to OFFMol |
KeyError |
|
FrozenMolecule.remap was given mapping with a different number of hydrogens |
ValueError |
|
Bad arguments passed to get_bond_between |
TypeError |
|
| Tried to visualize with NGLView sans conformers, or otherwise couldn’t get a backend | ValueError |
|
In typing/engines/smirnoff/io.py: |
||
| Couldn’t convert given unit to a SimTK unit | ValueError |
|
In typing/engines/smirnoff/parameters.py: |
||
| An attribute seems to be specified with and without indices | TypeError |
|
| Different indexed attributes have different numbers of terms | TypeError |
|
| Trying to access an indexed attributes out of the bounds of the attribute | TypeError |
|
| An object, or possibly a subclass, does not have a requested attribute | AttributeError |
|
Trying to ParameterList.extend with something not another instance of it |
TypeError |
|
Impossible combination of arguments passed to .add_parameter |
TypeError/ValueError |
|
Something that can’t be turned into a parameter passed to .add_parameter |
ValueError |
|
Some molecules passed to check_partial_bond_orders_from_molecules_duplicates are isomorphic |
ValueError |
|
assign_partial_bond_orders_from_molecules was told to use user bond orders, but not given any |
ValueError |
|
Trying to set up bond WBO with only one value of k |
ValueError |
|
Either ElectrostaticsHandler or ToolkitAM1BCCHandler found a particle that’s not a TopologyAtom or TopologyVirtualSite |
ValueError |
|
Some collection of input failures in VirtualSiteHandler.add_parameter |
ValueError |
|
In typing/engines/smirnoff/forcefield.py: |
||
| Trying to register a parameter handler who tag name has already been registered | Exception |
|
| Missing valence terms were found | Exception |
|
| Could not find a ParameterIOHandler for a given tag name | KeyError |
|
| Something went wrong in file parsing | IOError |
|
| Could not resolve order in which to parameter handlers are meant to run | RuntimeError |
|
Unknown kwargs passed to create_openmm_system |
ValueError |
|
| Tried to look up a parameter handler that was not registered | KeyError |
|
In utils/utils.py: |
||
get_data_file_path failed to get anything |
ValueError |
|
A set of unit incompatibility errors in {a|de}tach_units |
ValueError |
|
| get_molecule_parameterIDs was giving a list of molecules that contain some duplicates | ValueError |
|
In utils/toolkits.py (many are copied code across toolkit wrappers): |
||
| Provided aromaticity model not supported by OpenEyeToolkitWrapper or AmberToolsToolkitWrapper | ValueError |
|
| Provided aromaticity model not recognized by OpenEye or RDKit itself | ValueError |
|
| OpenEye atom or bond stereochemistry assumptions failed | Exception |
|
OpenEye failed to add excplicit hydrogens (possible during from_iupac) |
ValueError |
|
| OpenEye or RDKit failed to parse the InChi string | RuntimeError |
|
| OpenEye Omega conformer generation failed | Exception |
|
assign_fractional_bond_orders was given an OFFMol without conformers |
Exception |
|
| Bond order model not supported | ValueError |
|
| OpenEye was unable to assign charges in the process of calculating fractional bond orders | Exception |
|
| OpenEye or RDKit ran into an error parsing SMARTS | ValueError |
|
| RDKit cannot read PDB files | Exception |
|
OpenEye or RDKit are told hydrogens_are_explicit, but detect implicit hydrogens |
ValueError |
|
| RDKit bond stereochemistry was somehow neither Z nor E | ValueError |
|
| Some atoms in an rdmol have partial charges, but others do not | ValueError |
|
| Bizarre RDKit stereochemistry encountered | RuntimeError |
|
| Unexpected elements found when parsing an sqm.out | ValueError |
Step 2 here could be to do a similar survey on which exceptions are actually used, possibly considering how often and/or how similar any are to others, to inform what sort of inheritance structure we want.
Describe alternatives you’ve considered Continuing with built-in exceptions is not ideal, long-term. We’ve already been slowly moving in this direction - most of our PRs the past few months are fairly aligned with the idea here - but slowly picking away at it won’t provide the benefits of a more unified exception structure.
Additional context This idea has been thrown around in a few places (https://github.com/openforcefield/openforcefield/issues/514#issuecomment-585253391 started it) and a few different contexts but I don’t think there’s a stub issue.
Some other things to consider
- How much we can break the API doing this? A function raising a different exception may not be an issue to our downstream users, or it may be a big problem.
- Would inheriting custom exceptions from multiple built-in classes is worth it?
- Should these exceptions be grouped into a single file (maybe
openforcefield/exceptions.py) or, as it stands now, scattered across files closer to where they’d be raised, or something in between? - Should we have an exception hierarchy in which some of our custom exceptions inherit from each other? How deep should such a tree go?
About this issue
- Original URL
- State: open
- Created 4 years ago
- Comments: 18 (14 by maintainers)
Thanks for the feedback!
This has been in the codebase for ages (I think back when the PIs were the only developers) and I have always been confused by it as well. Things were different back in ~2017 … maybe there’s some Python 2 compatibility that I have never come across? Maybe it was part of a more elaborate idea that never got fully implemented. I don’t know what it does that the built-in exceptions don’t do, and therefore I’d like to ditch it.
Your last code snippet (with
class ToolkitUnavailableException(MissingPackageError):) is exactly the design I’m hoping can be implemented here.I agree that each docstring, on average, does a poor job of communicating which exceptions can be raised. I’d like to say that we could go through everything and make a good effort to list out everything that could be raised by any function/method call, but that’s probably a significant amount of work to wrangle up and a decent amount of work to keep things in sync over time. I’m even skeptical that it can be done well manually - I wonder if there are automated solutions for this?
I think the motivation here is similar to why we’d want to do custom exceptions to begin with; most users will reach for a somewhat broad exception (i.e. one that broadly captures molecule parsing errors) like you describe but some specific cases might require dealing with one or more of the more detailed exceptions. In cases like those, it’s more useful to have a sub(-sub-sub…)classed exception that handles one extremely specific case, even if the contents of the message are similar, just like it’s better to have a
ToolkitUnavailableExceptionin addition toModuleNotFoundError. I would assume that really specific exceptions (maybe one for RDKit’s SMILES parsing failing for reason xyz) are mostly used by developers/powerusers and not of much interest to most users.You can still do this with XYZError, but for people who want to
excepta general class of errors that they know could be raised (e.g. ImportErrors because the toolkit comes with optional dependencies), this requires them to not only dig through the code to locate and correctly import the potential error, but to branch through all possibilities and import all potentially raised errors, which then requires experience and high familiarity with the layout and organization of the code. I would classify anyone who can do this as an OpenFF Toolkit expert who probably wrote some of the code. What I would expect developers to actually do (and what I often do) is run through some test cases in a notebook and just make a long tuple of all exceptions that are eventually raised (except (MissingDependencyError, MissingPackageError, ToolkitUnavailableException, LicenseError)). This also requires a non-trivial amount of work and familiarity. What I expect at least some users to do is a blanktry/except Exception, especially if they’re just doing something quick and dirty, or a little prototype, and don’t have the time to mess around with the toolkit code, or remember what an exception is called.ValueErroris the most generic of the errors –IndexError,KeyError,TypeError,AttributeErrorall have fairly standard meanings, and often the main Python library depends on them in ways that break if you don’t subclass these main errors (e.g. https://github.com/openforcefield/openff-toolkit/issues/986). One intermediate solution could be to avoid subclassingValueErrors where possible.I personally would look at this from a user perspective. Keeping Toolkit-specific errors but subclassing generic ones still allows for approaches 1 and 2 above – they can still just import whichever specific error they want to catch by commenting out the
try/exceptparts and letting the traceback tell them. However, it might minimise approach 3 (blank except) for the quick scripter by allowing them access to a generic class of “expected-things-going-wrong”, especially if they vaguely know which exception is going to be raised but can’t remember the very long name of the specific exception (e.g.FractionalBondOrderInterpolationMethodUnsupportedError)If there’s going to be custom errors, it probably makes more sense for these to (also) inherit from the Python core library errors. E.g.
MissingDependencyErrorin #1157 sounds a lot like anImportErrorAs a smaller task, I think it would be nice to remove some of the
assertlines in the non-testing parts of the toolkit. For example,this isn’t very useful, and could be replaced by a
InvalidAtomIndexErrorwith a descriptive message instead of displaying the internal check and forcing the user to do into what went wrong. (In my case I was just sloppy with atom indexing vs GROMACS, but the point remains)