numba: np.mean and np.einsum not supported by numba

I want to find the most efficient way to calculate the mean for each row. Three approaches are presented:

import numpy as np
from numba import jit

@jit(nopython=True)
def func1(m):
	'''
	mean of each row in matrix m
	m: matrix
	'''
	ncol = m.shape[1]
	sumRows = np.einsum('ij->i', m)
	return sumRows / ncol
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Use of unsupported NumPy function 'numpy.einsum' or unsupported use of the function.
@jit(nopython=True)
def func2(m):
	'''
	mean of each row in matrix m
	m: matrix 
	'''
	ncol = m.shape[1]
	sumRows = m.sum(1)
	return sumRows/ncol
@jit(nopython=True)
def func3(m):
	'''
	mean of each row in matrix m
	m: matrix 
	'''
	ret = m.mean(1)
	return ret

numba.errors.InternalError: 
[1] During: resolving callee type: BoundFunction(array.mean for array(int64, 2d, C))
[2] During: typing of call at <ipython-input-244-335b7380960c> (27)

With no jit, func1 is the most efficient, thanks to the Einstein summation convention, but func1 and func3 are not supported by Numba. How can I fix this problem? Thanks so much.
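
One workaround, given the errors above, is to write the reduction as an explicit loop, which avoids both np.einsum and the axis form of mean. This is only a sketch; the function name and output dtype are illustrative, and it assumes a 2-D numeric array:

import numpy as np
from numba import jit

@jit(nopython=True)
def row_mean(m):
    # Explicit-loop version of func1/func3: sum each row, then divide by
    # the column count, using only features nopython mode compiles.
    nrow, ncol = m.shape
    out = np.empty(nrow, dtype=np.float64)
    for i in range(nrow):
        s = 0.0
        for j in range(ncol):
            s += m[i, j]
        out[i] = s / ncol
    return out

Note that func2 above (m.sum(1) / ncol) already compiles, so the explicit loop is mainly useful if you want to fuse more per-row work into the same compiled pass.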

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 16 (10 by maintainers)

Most upvoted comments

> but fundamentally it’s just about looping over data, and Numba can compile and optimize arbitrary loops very effectively.

> With this reasoning you might as well use C in the first place.

> To be honest I’m quite frustrated by this attitude. Why use numba if I can’t use the elegant features of Python and numpy?

@clemisch In the context of the conversation above, i.e. implementing einsum.

> By the way, Einstein summation convention is really fast in lots of cases; is there any method to support it in the future?

Eventually I would think that this can be supported. “fast”, however, is subjective; in NumPy this may be fast in some situations, but fundamentally it’s just about looping over data, and Numba can compile and optimize arbitrary loops very effectively.

I am trying to explain that einsum is essentially implemented as loop nests: it is in NumPy (this is the source), and were Numba to implement einsum, internally it too would generate loop nests, but the API would still be the same as NumPy’s. Numba is good at optimising loop nests because it hands them off (practically, it operates a function at a time) to LLVM to do the optimisation.
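
To make the loop-nest point concrete, here is a hand-written equivalent of np.einsum('ij,jk->ik', a, b) (i.e. a matrix product). This is only an illustration of the kind of loop nest being described, not Numba’s or NumPy’s actual implementation:

import numpy as np
from numba import jit

@jit(nopython=True)
def contract_ij_jk_ik(a, b):
    # Loop-nest equivalent of np.einsum('ij,jk->ik', a, b): the repeated
    # index j is summed over; the free indices i and k label the output.
    ni, nj = a.shape
    nk = b.shape[1]
    out = np.zeros((ni, nk), dtype=np.float64)
    for i in range(ni):
        for j in range(nj):
            for k in range(nk):
                out[i, k] += a[i, j] * b[j, k]
    return out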

> With this reasoning you might as well use C in the first place.

You can write loops and compile them with Numba, but most users use a mixture of the supported Python/NumPy features and loops as needed. Python and C are very different languages and are used for different reasons; Numba helps get C-like performance with the ease of Python.
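
As a small sketch of that mixture (the function and names are purely illustrative): a supported NumPy reduction and a plain loop can live in the same nopython function:

import numpy as np
from numba import jit

@jit(nopython=True)
def center_rows(m):
    # Supported NumPy reduction for the row means...
    mu = m.sum(1) / m.shape[1]
    out = np.empty(m.shape, dtype=np.float64)
    # ...and an explicit loop for the per-element work.
    for i in range(m.shape[0]):
        for j in range(m.shape[1]):
            out[i, j] = m[i, j] - mu[i]
    return out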

> To be honest I’m quite frustrated by this attitude. Why use numba if I can’t use the elegant features of Python and numpy?

Numba is an open-source project with finite resources; a large amount of Python and NumPy is already supported, and this continuously improves. Contributions and suggestions are welcomed.

Not sure if I’m going to get to einsum in the near future, so it seems useful (and perhaps even if I do) to write down what I was planning to do:

  1. The opt_einsum package has a sketch of doing einsum using np.tensordot, an initial version of which is in #7289, so I was going to use that. There’s Python parsing code for the formulas in a bunch of places, including NumPy.
  2. By requiring the formula (e.g. "ji->ij") to be a literal, which probably won’t restrict the vast majority of users, at least some of the parsing/preprocessing can be done in normal Python; see the sketch after this list. Unfortunately, optimization depends on array shapes, which are not known at this stage.
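
For the simple two-operand case, the mapping from a literal formula to np.tensordot can be illustrated in plain NumPy. This is only a sketch of the idea, not the planned Numba implementation; it ignores repeated indices within one operand, output transposes, and multi-operand optimisation, and the helper name is made up:

import numpy as np

def einsum_two_operands(formula, a, b):
    # e.g. 'ij,jk->ik': indices present in both inputs but absent from the
    # output are the contracted axes handed to np.tensordot.
    inputs, output = formula.replace(' ', '').split('->')
    sub_a, sub_b = inputs.split(',')
    contracted = [c for c in sub_a if c in sub_b and c not in output]
    axes_a = [sub_a.index(c) for c in contracted]
    axes_b = [sub_b.index(c) for c in contracted]
    return np.tensordot(a, b, axes=(axes_a, axes_b))

a = np.random.rand(3, 4)
b = np.random.rand(4, 5)
assert np.allclose(einsum_two_operands('ij,jk->ik', a, b),
                   np.einsum('ij,jk->ik', a, b))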

I am going to attempt einsum.

@stuartarchibald

> but fundamentally it’s just about looping over data, and Numba can compile and optimize arbitrary loops very effectively.

With this reasoning you might as well use C in the first place.

To be honest I’m quite frustrated by this attitude. Why use numba if I can’t use the elegant features of Python and numpy?