pythran: Pythran twice as slow compared to native numpy when filling numpy array

Background

We have a function where we cannot use numba because `int()` is not supported yet, so we were excited to try Pythran! :) It really sped up our code. However, when we perform NumPy operations, such as filling an array, native NumPy is significantly faster. This seems counterintuitive, so I wonder whether we are missing something here.

Pythran much slower when we fill numpy arrays

For ease, I also created a Google Colab notebook here where I put all the code.

Generate some test data

import random
import numpy as np
random.seed(10)
np.random.seed(10)
def test_str():
    arr  = np.random.randint(10000, size=10000000)
    sign = np.random.choice(['+','-'], size=10000000)
    return ' '.join(["{}{}".format(a,b) for a,b in zip(arr, sign)])

The native numpy function

def test1(l):
  numbers = np.array([int(k[:-1]) for k in l.split(' ')])
  signs = np.array([(k[-1] == '+') for k in l.split(' ')]) * 2 - 1
  N = len(numbers)
  arr = np.empty(shape = (N, 3), dtype = np.int64)
  arr[:, 0] = numbers
  arr[:, 1] = signs
  arr[:, 2] = np.arange(N)
  return arr
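To make the expected output layout concrete, here is a small sanity check on a three-token input. This is only a sketch: the name `parse_tokens` and the sample string are illustrative and not part of the original post. The logic mirrors `test1`, except that the string is split once instead of twice.

```python
import numpy as np

def parse_tokens(l):
    # Same steps as test1, but the input string is split only once.
    tokens = l.split(' ')
    numbers = np.array([int(k[:-1]) for k in tokens])           # digits before the sign
    signs = np.array([(k[-1] == '+') for k in tokens]) * 2 - 1  # '+' -> 1, '-' -> -1
    n = len(numbers)
    arr = np.empty(shape=(n, 3), dtype=np.int64)
    arr[:, 0] = numbers       # column 0: parsed value
    arr[:, 1] = signs         # column 1: sign as +1 / -1
    arr[:, 2] = np.arange(n)  # column 2: position of the token
    return arr

sample = parse_tokens("12+ 7- 305+")
print(sample.tolist())  # [[12, 1, 0], [7, -1, 1], [305, 1, 2]]
```

Each row is (value, sign, index), which is the shape both `test1` and the Pythran version are expected to return.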

The Pythran function

Just the same function but with the pythran export 😃

%%pythran
#pythran export test2(str)
import numpy as np
def test2(l):
  numbers = np.array([int(k[:-1]) for k in l.split(' ')])
  signs = np.array([(k[-1] == '+') for k in l.split(' ')]) * 2 - 1
  N = len(numbers)
  arr = np.empty(shape = (N, 3), dtype = np.int64)
  arr[:, 0] = numbers
  arr[:, 1] = signs
  arr[:, 2] = np.arange(N)
  return arr

TIMING

ts = test_str()
%timeit test1(ts)
%timeit test2(ts)

1 loop, best of 3: 5.84 s per loop
1 loop, best of 3: 13.7 s per loop

Pythran is MUCH faster when we skip the array filling

def test3(l):
  numbers = np.array([int(k[:-1]) for k in l.split(' ')])
  signs = np.array([(k[-1] == '+') for k in l.split(' ')]) * 2 - 1
  N = len(numbers)
  arr = np.empty(shape = (N, 3), dtype = np.int64)

and:

%%pythran
#pythran export test4(str)
import numpy as np
def test4(l):
  numbers = np.array([int(k[:-1]) for k in l.split(' ')])
  signs = np.array([(k[-1] == '+') for k in l.split(' ')]) * 2 - 1
  N = len(numbers)
  arr = np.empty(shape = (N, 3), dtype = np.int64)

Timing (test3, then test4):

1 loop, best of 3: 6.01 s per loop
The slowest run took 6.42 times longer than the fastest. This could mean that an intermediate result is being cached.
100 loops, best of 3: 6.51 ms per loop

Interestingly, when we add a single fill at the end of the above functions, e.g. `arr[:, 0] = numbers`, Pythran becomes slower again. Any idea why this occurs? Are we missing something?
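One way to narrow this down is to time the parsing and the filling as separate stages, with each stage returning its result so that none of the work goes unused. The sketch below does this in plain Python and NumPy only. The helper names (`parse_only`, `fill_only`, `time_stage`) and the smaller input size are assumptions made for illustration. Note also that the idea that a compiler may skip work whose result is never returned is an assumption here, not something confirmed in this thread.

```python
import time
import numpy as np

def parse_only(l):
    # Hypothetical helper: string parsing stage only, results are returned.
    tokens = l.split(' ')
    numbers = np.array([int(k[:-1]) for k in tokens])
    signs = np.array([(k[-1] == '+') for k in tokens]) * 2 - 1
    return numbers, signs

def fill_only(numbers, signs):
    # Hypothetical helper: array filling stage only.
    n = len(numbers)
    arr = np.empty(shape=(n, 3), dtype=np.int64)
    arr[:, 0] = numbers
    arr[:, 1] = signs
    arr[:, 2] = np.arange(n)
    return arr

def time_stage(fn, *args, repeat=3):
    # Return the best wall-clock time over `repeat` runs and the last result.
    best = float('inf')
    result = None
    for _ in range(repeat):
        start = time.perf_counter()
        result = fn(*args)
        best = min(best, time.perf_counter() - start)
    return best, result

# Smaller input than in the post, so the check runs quickly.
rng = np.random.default_rng(10)
values = rng.integers(10000, size=100000)
marks = rng.choice(['+', '-'], size=100000)
text = ' '.join("{}{}".format(a, b) for a, b in zip(values, marks))

t_parse, (numbers, signs) = time_stage(parse_only, text)
t_fill, arr = time_stage(fill_only, numbers, signs)
print("parse: {:.4f} s, fill: {:.4f} s".format(t_parse, t_fill))
```

If the parsing stage dominates, as it typically does for string-heavy code like this, the fill itself is unlikely to be the real bottleneck. The same split could then be applied to the Pythran-compiled version.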

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 15 (9 by maintainers)

Most upvoted comments

With master:

parse0                           :     1 * norm
norm = 0.000348 s
parse0_pythran                   :   1.4 * norm
parse_nofill                     : 0.989 * norm
parse_nofill_pythran             :   1.4 * norm

With feature/fast-str-chr:

parse0                           :     1 * norm
norm = 0.000356 s
parse0_pythran                   : 0.301 * norm
parse_nofill                     : 0.986 * norm
parse_nofill_pythran             :  0.31 * norm

Pythran is no longer slower than CPython & Numpy!
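For readers who want to produce a report in the same "x * norm" style, here is a minimal sketch built on `timeit`. The benchmark script behind the numbers above is not shown in this thread, so the `report` helper and the two stand-in functions below are hypothetical. Only the output layout is modeled on the comment.

```python
import timeit

def report(funcs, arg, number=5, repeat=3):
    # Time each (name, func) pair and express it relative to the first entry.
    times = {}
    for name, fn in funcs:
        best = min(timeit.repeat(lambda: fn(arg), number=number, repeat=repeat))
        times[name] = best / number  # seconds per call
    norm = times[funcs[0][0]]        # the first function is the baseline
    lines = []
    for i, (name, _) in enumerate(funcs):
        lines.append("{:<33}: {:>5.3g} * norm".format(name, times[name] / norm))
        if i == 0:
            lines.append("norm = {:.3g} s".format(norm))
    return times, lines

# Stand-in workloads (assumptions, not the functions benchmarked in the thread).
def parse0(l):
    return [int(k[:-1]) for k in l.split(' ')]

def parse_nofill(l):
    return [k[-1] == '+' for k in l.split(' ')]

text = ' '.join("{}+".format(i) for i in range(1000))
times, lines = report([("parse0", parse0), ("parse_nofill", parse_nofill)], text)
print("\n".join(lines))
```

Normalizing against a baseline makes runs on different machines or branches easier to compare than raw seconds.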