numba: Significant performance regression for datashader in numba 0.49.x
Using the latest released numba (0.49.1), I'm seeing a significant performance regression compared to numba 0.48 in a simple datashader aggregation example. I downgraded from 0.49.1 to 0.48 with `conda install numba=0.48 --no-deps` to ensure no other packages changed. The simple test case I'm using is the following:
```python
import timeit
from functools import partial

import datashader as ds
import numba
import numpy as np
import pandas as pd

canvas = ds.Canvas(plot_height=1000, plot_width=1000)

def agg(df):
    canvas.points(df, 'x', 'y', agg=ds.mean('value'))

def test_agg_performance(N, repeats=10):
    df = pd.DataFrame({'x': np.random.randn(N),
                       'y': np.random.randn(N),
                       'value': np.random.rand(N)})
    agg(df)  # Warm up JIT
    return timeit.timeit(partial(agg, df), number=repeats) / repeats

print(f'Numba version: {numba.__version__}')
[(n, test_agg_performance(int(n))) for n in np.logspace(0, 8, 9)]
```
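The benchmark above follows a warm-up-then-average pattern: call the function once to trigger JIT compilation, then divide a `timeit` total over several repeats to get a per-call time. A minimal stdlib-only sketch of that same pattern, with a hypothetical `work` function standing in for the aggregation:

```python
import timeit
from functools import partial

def work(data):
    # Stand-in for the aggregation; any deterministic workload will do.
    return sum(x * x for x in data)

def per_call_time(func, *args, repeats=10):
    func(*args)  # warm-up call, analogous to triggering JIT compilation
    # Average over `repeats` calls so one-off noise matters less.
    return timeit.timeit(partial(func, *args), number=repeats) / repeats

data = list(range(10_000))
t = per_call_time(work, data)
print(f'{t:.6f} s per call')
```

Warming up first matters here because otherwise the one-time compilation cost would be folded into the average and mask the steady-state regression.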
Here is a graph of the performance difference by the number of points being aggregated:

[graph: aggregation time vs. number of points, numba 0.48 vs 0.49.1]
And here is the notebook which I used to generate the plot: https://anaconda.org/philippjfr/profiling_numba/notebook
About this issue
- State: closed
- Created 4 years ago
- Comments: 15 (12 by maintainers)
FWIW I’m bisecting now.