array-api: Would it make sense to add ShapeError and BroadcastingError?

In 2011, one of the NumPy authors proposed the exceptions ShapeError, BroadcastingError, and ConvergenceError here. Would it make sense to consider these for the array API? They could inherit from ValueError for ease of porting.
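To make the porting argument concrete, here is a minimal sketch of what such exceptions could look like. The class names come from the proposal; the choice of ValueError as the direct base and the `check_same_shape` and `legacy_caller` helpers are assumptions for illustration only, not an existing NumPy or array API definition.

```python
class ShapeError(ValueError):
    """Hypothetical: an array has a shape the operation cannot accept."""


class BroadcastingError(ValueError):
    """Hypothetical: the given shapes cannot be broadcast together."""


def check_same_shape(a_shape, b_shape):
    """Illustrative helper: raise ShapeError when two shapes differ."""
    if tuple(a_shape) != tuple(b_shape):
        raise ShapeError(f"shape mismatch: {tuple(a_shape)} vs {tuple(b_shape)}")


def legacy_caller(a_shape, b_shape):
    """Code written before the new classes existed only catches ValueError."""
    try:
        check_same_shape(a_shape, b_shape)
    except ValueError as exc:
        # The subclass is still caught here, so old handlers keep working.
        return type(exc).__name__
    return None
```

Because both classes subclass ValueError, an existing `except ValueError` handler would keep working, while the traceback would show the more specific name.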

About this issue

  • State: closed
  • Created a year ago
  • Comments: 17 (11 by maintainers)

Most upvoted comments

do you actually have real-world use cases for this?

Not really. You’re right that I don’t think I ever catch any of these exceptions. It would just make the exception names a bit more readable when one is raised.

With that being said, thinking about this, there might be a benefit to specifying that the broadcasting functions should at least raise ValueError (or a subclass of it), if not adding BroadcastingError as you suggest. There’s no clear way in the standard to check if two arrays are broadcast compatible.

Right now I think it’s except Exception. It’d be nice, of course, if that could change to ValueError. However, a quick check shows that NumPy and JAX do raise ValueError but PyTorch doesn’t:

>>> import torch
>>> torch.broadcast_to(torch.ones((4, 3)), (2,5,2))
Traceback (most recent call last):
  Cell In[5], line 1
    torch.broadcast_to(torch.ones((4, 3)), (2,5,2))
RuntimeError: The expanded size of the tensor (2) must match the existing size (3) at non-singleton dimension 2.  Target sizes: [2, 5, 2].  Tensor sizes: [4, 3]

>>> torch.broadcast_shapes((4, 3), (2,5,2))
Traceback (most recent call last):
  Cell In[6], line 1
    torch.broadcast_shapes((4, 3), (2,5,2))
  File ~/mambaforge/envs/torch/lib/python3.11/site-packages/torch/functional.py:126 in broadcast_shapes
    raise RuntimeError("Shape mismatch: objects cannot be broadcast to a single shape")
RuntimeError: Shape mismatch: objects cannot be broadcast to a single shape

This would be quite a pain to get changed, and I don’t think it’ll happen in practice: it’s too low-priority to work on, and it’d break backward compatibility if it were to change. It’s also not even completely clear-cut that RuntimeError is wrong.

So I think I’ll stay with my opinion that we should avoid specifying any exception types. At most we can recommend a type for some common causes of exceptions.

For the spec, I’d say we should specify a specific exception if it benefits downstream consumer library authors so that they can catch a uniform exception. If on the other hand, the point of a specific exception subclass is just to make things more legible for users, then I can see the benefit of an individual library like NumPy implementing it, but I don’t see why that needs to be specified in the standard.

With that being said, thinking about this, there might be a benefit to specifying that the broadcasting functions should at least raise ValueError (or a subclass of it), if not adding BroadcastingError as you suggest. There’s no clear way in the standard to check if two arrays are broadcast compatible. The best you can do is call broadcast_arrays, but it doesn’t specify what exception is raised when they aren’t https://data-apis.org/array-api/draft/API_specification/generated/array_api.broadcast_arrays.html. The spec is also missing broadcast_shapes (probably because it is still very new in NumPy), although that would also raise an exception on failure.
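As a rough illustration of the kind of check being discussed, here is a pure-Python sketch that applies NumPy-style broadcasting rules to shape tuples and raises a ValueError subclass on failure. It is a stand-in written for this thread, not the implementation used by NumPy or anything in the standard; `BroadcastingError` and `broadcast_compatible` are hypothetical names.

```python
from itertools import zip_longest


class BroadcastingError(ValueError):
    """Hypothetical exception; not part of NumPy or the array API standard."""


def broadcast_shapes(*shapes):
    """Return the broadcast result shape, or raise BroadcastingError."""
    result = []
    # Align shapes on their trailing dimensions; missing leading dims act as 1.
    for dims in zip_longest(*(reversed(s) for s in shapes), fillvalue=1):
        non_one = {d for d in dims if d != 1}
        if len(non_one) > 1:
            raise BroadcastingError(
                f"shapes {shapes} cannot be broadcast to a single shape"
            )
        result.append(non_one.pop() if non_one else 1)
    return tuple(reversed(result))


def broadcast_compatible(*shapes):
    """Return True if the shapes broadcast together, False otherwise."""
    try:
        broadcast_shapes(*shapes)
    except BroadcastingError:
        return False
    return True
```

If the exception type were specified, a consumer could write the same `broadcast_compatible` wrapper around any conforming library's broadcasting function.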

ShapeError (and AxisError) seem to be more about user error. Library code can easily check the actual condition rather than relying on an exception, and in most cases the most appropriate thing is to let the exception bubble up to the user. Also note that AxisError is a subclass of IndexError in NumPy, which is already required in a few places: https://data-apis.org/array-api/draft/search.html?q=indexerror (perhaps all out-of-bounds axis arguments should require IndexError or a subclass for consistency?).
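To illustrate "check the actual condition", here is a small sketch of an axis check that a library could run up front, raising IndexError for an out-of-bounds axis. The `normalize_axis` name and its error message are assumptions for illustration, not something the standard defines.

```python
def normalize_axis(axis, ndim):
    """Map a possibly negative axis into [0, ndim), or raise IndexError."""
    if not -ndim <= axis < ndim:
        raise IndexError(
            f"axis {axis} is out of bounds for array of dimension {ndim}"
        )
    # Negative axes count from the end, as with Python sequence indexing.
    return axis + ndim if axis < 0 else axis
```

A caller that wants to avoid exceptions entirely can test `-ndim <= axis < ndim` directly; the helper only packages that condition with a consistent exception type.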

For ConvergenceError, what functions would this specifically apply to? This seems like something that is not appropriate to the array API, because it would tie a function to a given implementation. But we generally want libraries to be free to use different algorithms or implementations for any given function.

Thanks for the suggestion @NeilGirdhar. I’d be +1 for the first two for NumPy. For the array API standard I’m less sure - we’ve shied away from specifying any specific exceptions or what should or shouldn’t happen for incorrect/undefined usage of the API. The problems are that (a) it’s impossible to get exhaustive or even uniform coverage or draw a line somewhere, and (b) it’s pretty hard to align on this kind of thing.

E.g., for invalid shape= input, that can already raise ValueError, TypeError or NameError within numpy.ndarray.reshape alone. And what’s an invalid type for one library may not be so for another (the spec says tuple of ints, NumPy accepts a list of ints).

So I’d be inclined to not go into this.