datasets-server: Raise specific errors (and error_code) instead of UnexpectedError
The following query on the production database gives the number of datasets with at least one cache entry with error_code “UnexpectedError”, grouped by the underlying “cause_exception”.
For the most common ones (DatasetGenerationError
, HfHubHTTPError
, OSError
, etc.) we would benefit from raising a specific error with its error code. It would allow to:
- retry automatically, if needed
- show an adequate error message to the users
- even: adapt the way we show the dataset viewer on the Hub
null
means it has no details.cause_exception
. These cache entries should be inspected more closely. See https://github.com/huggingface/datasets-server/issues/1123 in particular, which is one of the cases where no cause exception is reported.
db.cachedResponsesBlue.aggregate([
{$match: {error_code: "UnexpectedError"}},
{$group: {_id: {cause: "$details.cause_exception", dataset: "$dataset"}, count: {$sum: 1}}},
{$group: {_id: "$_id.cause", count: {$sum: 1}}},
{$sort: {count: -1}}
])
{ _id: 'DatasetGenerationError', count: 1964 }
{ _id: null, count: 1388 }
{ _id: 'HfHubHTTPError', count: 1154 }
{ _id: 'OSError', count: 433 }
{ _id: 'FileNotFoundError', count: 242 }
{ _id: 'FileExistsError', count: 198 }
{ _id: 'ValueError', count: 186 }
{ _id: 'TypeError', count: 160 }
{ _id: 'ConnectionError', count: 146 }
{ _id: 'RuntimeError', count: 86 }
{ _id: 'NonMatchingSplitsSizesError', count: 83 }
{ _id: 'FileSystemError', count: 62 }
{ _id: 'ClientResponseError', count: 52 }
{ _id: 'ArrowInvalid', count: 45 }
{ _id: 'ParquetResponseEmptyError', count: 43 }
{ _id: 'RepositoryNotFoundError', count: 41 }
{ _id: 'ManualDownloadError', count: 39 }
{ _id: 'IndexError', count: 28 }
{ _id: 'AttributeError', count: 16 }
{ _id: 'KeyError', count: 15 }
{ _id: 'GatedRepoError', count: 13 }
{ _id: 'NotImplementedError', count: 11 }
{ _id: 'ExpectedMoreSplits', count: 9 }
{ _id: 'PermissionError', count: 8 }
{ _id: 'BadRequestError', count: 7 }
{ _id: 'NonMatchingChecksumError', count: 6 }
{ _id: 'AssertionError', count: 4 }
{ _id: 'NameError', count: 4 }
{ _id: 'UnboundLocalError', count: 3 }
{ _id: 'JSONDecodeError', count: 3 }
{ _id: 'ZeroDivisionError', count: 3 }
{ _id: 'InvalidDocument', count: 3 }
{ _id: 'DoesNotExist', count: 3 }
{ _id: 'EOFError', count: 3 }
{ _id: 'ImportError', count: 3 }
{ _id: 'NotADirectoryError', count: 2 }
{ _id: 'RarCannotExec', count: 2 }
{ _id: 'ReadTimeout', count: 2 }
{ _id: 'ChunkedEncodingError', count: 2 }
{ _id: 'ExpectedMoreDownloadedFiles', count: 2 }
{ _id: 'InvalidConfigName', count: 2 }
{ _id: 'ModuleNotFoundError', count: 2 }
{ _id: 'Exception', count: 2 }
{ _id: 'MissingBeamOptions', count: 2 }
{ _id: 'HTTPError', count: 1 }
{ _id: 'BadZipFile', count: 1 }
{ _id: 'OverflowError', count: 1 }
{ _id: 'HFValidationError', count: 1 }
{ _id: 'IsADirectoryError', count: 1 }
{ _id: 'OperationalError', count: 1 }
About this issue
- Original URL
- State: open
- Created a year ago
- Comments: 16 (7 by maintainers)
Current state:
For
KeyError
, see https://github.com/huggingface/huggingface_hub/issues/1853After doing some cache maintenance actions manually (removing obsolete records which config or split no longer exist) this is the updated list mostly AttributeError and ClientResponseError reduced: