altair: Interactive chart slow with large number of data points

When I create an interactive scatter plot that displays a text element on mouseover, it becomes unusably slow once the number of data points goes into the thousands. If I take the example from here, everything works fine with the roughly 400 data points of the cars dataset, but if I increase the number of data points to about 4000, things get very slow, and with 20000 (the size of my data set) it becomes much worse:

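import altair as alt
import pandas as pd

# Repeat the ~400-row cars dataset 10x to get ~4000 points
cars = alt.load_dataset("cars")
cars = pd.concat([cars]*10, ignore_index=True)
# Shuffle Horsepower; .values avoids re-aligning on the index during assignment
cars["Horsepower"] = cars["Horsepower"].sample(frac=1).values

# Select the single point nearest to the mouse; nothing is selected initially
pointer = alt.selection_single(on='mouseover', nearest=True, empty='none')

base = alt.Chart().encode(
    x='Miles_per_Gallon', y='Horsepower'
)

chart = alt.layer(
    base.mark_point().properties(selection=pointer).encode(color='Origin'),
    # Show the car name only for the point under the pointer
    base.mark_text(dx=8, dy=3, align='left').encode(
        text=alt.condition(pointer, 'Name', alt.value(''))
    ),
    data=cars
)

chart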

(It does not necessarily make much sense to have 20k data points that can be hovered over for more information, but I’m wondering whether there’s a way to speed this up or whether it will generally be slow.)

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 18 (12 by maintainers)

Most upvoted comments

OK, I’ve implemented chart.display(renderer='svg') in #925

Not all frontends support it yet, but you can pass renderer metadata via the renderers registry… for example:

alt.renderers.enable(renderer_name, embed_options={'renderer': 'svg'})

where renderer_name is jupyterlab, notebook, colab, etc.

I plan to document this better once the next version of JupyterLab is released, because the current release ignores any metadata.