scikit-learn: Use a safe and short repr in error messages and warnings
We print the value of an offending parameter in many of our error messages and warnings. However, when printing the object fails, or when its representation is too long, the resulting message is not useful.
We should:
- Use a safe_repr function to never fail printing (I am pasting an example below, with a test)
- Only print the first 300 characters (or some similar limit) of its return value, using a "short_repr" function
These two functions should be added in the utils submodule and used in error messages and warnings in the codebase.
import pydoc


def safe_repr(value):
    """Hopefully pretty robust repr equivalent."""
    # this is pretty horrible but should always return *something*
    try:
        return pydoc.text.repr(value)
    except KeyboardInterrupt:
        raise
    except:
        try:
            return repr(value)
        except KeyboardInterrupt:
            raise
        except:
            try:
                # all still in an except block so we catch
                # getattr raising
                name = getattr(value, '__name__', None)
                if name:
                    # ick, recursion
                    return safe_repr(name)
                klass = getattr(value, '__class__', None)
                if klass:
                    return '%s instance' % safe_repr(klass)
            except KeyboardInterrupt:
                raise
            except:
                return 'UNRECOVERABLE REPR FAILURE'


def short_repr(obj):
    msg = safe_repr(obj)
    if len(msg) > 300:
        # keep only the first 300 characters and mark the truncation
        return '%s...' % msg[:300]
    return msg

# For testing (in a test file, not in the same file)
class Vicious(object):
    def __repr__(self):
        raise ValueError


def test_safe_repr():
    safe_repr(Vicious())

safe_repr is borrowed from joblib, but as it is not exposed in joblib's public API, we shouldn't import it from our vendored version of joblib (otherwise, the "unvendoring" performed by Debian will break the import).
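To illustrate how these helpers might be used in an error message, here is a minimal sketch. It uses a simplified stand-in for safe_repr (not the joblib-derived version above), and `check_positive` and the `max_len` parameter are hypothetical names invented for this example, not existing scikit-learn API.

```python
def safe_repr(value):
    """Simplified stand-in for the safe_repr proposed above."""
    try:
        return repr(value)
    except Exception:
        # repr() itself failed: fall back to the class name
        return '%s instance' % type(value).__name__


def short_repr(obj, max_len=300):
    """Return at most max_len characters of safe_repr(obj), then '...'."""
    msg = safe_repr(obj)
    if len(msg) > max_len:
        return '%s...' % msg[:max_len]
    return msg


def check_positive(value, name):
    """Hypothetical validator whose error message is bounded and never fails."""
    if not (isinstance(value, (int, float)) and value > 0):
        raise ValueError(
            '%s must be a positive number, got %s' % (name, short_repr(value))
        )
    return value


class Vicious(object):
    def __repr__(self):
        raise ValueError


# Both a failing repr and a huge repr still give a usable ValueError.
for bad in (Vicious(), 'x' * 10000):
    try:
        check_positive(bad, 'alpha')
    except ValueError as exc:
        print(str(exc)[:60])
```

In both cases the exception raised is the intended ValueError about the parameter, rather than an unrelated error from `__repr__` or a message thousands of characters long.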
About this issue
- State: open
- Created 7 years ago
- Comments: 16 (10 by maintainers)
pprint is not robust to failing repr:
Hence, I do believe that the problem is not addressed.
While the above example can seem contrived, things like this can happen in repr of estimators in some rare cases.
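The failure mode described in this comment can be reproduced with a small sketch. It uses the standard-library `pprint` module as a stand-in; whether scikit-learn's own estimator pretty-printer behaves identically is an assumption here, and the `Vicious` class and `params` dict are illustrative only.

```python
import pprint


class Vicious(object):
    def __repr__(self):
        raise ValueError('repr failed')


# A parameter dict containing one object whose repr raises.
params = {'alpha': 1.0, 'kernel': Vicious()}

try:
    pprint.pformat(params)
    pformat_failed = False
except ValueError:
    # pprint calls repr() on the nested object and lets the error propagate
    pformat_failed = True

print(pformat_failed)
```

The exception from the nested `__repr__` escapes `pprint.pformat`, so an error message built this way would itself fail to format.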
There is probably some work to salvage from pull request #11601
The length of the error message may be improved now with the default print_changed_only=True. We should confirm that our pprint is robust to errors before closing this.