vscode-jupyter: Jupyter for vscode continues to be slow (for large notebooks with markdown cells & large outputs)
Every few months I try to use vscode for jupyter because I would really love to just use vscode for everything. Every few months, I am disappointed and switch back to the web version.
There are two reasons for this:
1) Jupyter for vscode continues, stubbornly, to essentially always be slower than traditional jupyter lab on localhost. Look at the run times in this screenshot. It took me a minute to run imports; when I ran the exact same code on the localhost version, it took 7.7 seconds (pictures attached). This is an extremely consistent theme in vscode jupyter. Cells will sometimes randomly take minutes to run, and will sometimes not even run at all until you press ‘shift-enter’ on them twice. This has been true for me across multiple computers, in many different dev environments.
Cells also just randomly take forever to run, for god knows what reason. Here is a screenshot of assigning a string to a variable taking 27.4 seconds:
Note that I am not trying to blame the team here, I am just frustrated because this is so close to being a great product, but this one thing holds it back, and it keeps not being fixed for years on end. The very first thing I would do as a product manager if I were in charge of vscode-jupyter is to pause all current tasks and plan, with the team, a multiple-month effort to speed things up, and get cells to run effectively instantly (or as close to the amount of time the python processing of the code takes as possible), every time.
2) Jupyter for vscode sucks at inline documentation, the equivalent of shift+tab in vscode jupyter. I am aware of the existence of the trigger parameter hints and show hover settings in the keyboard shortcuts. These are extremely unreliable, and actually show documentation when I press the button maybe 1/5 of the time. When they do show documentation, there is a ‘loading’ tag for a while. Browser jupyter, on the other hand, is immediate with this. Basically every time. Below is an example.
The other issue with inline documentation is that, as far as I can tell, hover documentation for methods on instantiated variables simply doesn’t work. When I am using pandas, for instance, typing df.unique( and then pressing the show hover hotkey while my typing caret is to the right of the parenthesis pops up a documentation window saying exactly nothing. In contrast, in the web version, typing the same thing produces full documentation, as expected.
I don’t understand how these two issues aren’t your guys’s top priority. Everyone I’ve spoken to who uses jupyter has had exactly the same experience as I have, and everyone I’ve spoken to who uses jupyter uses the web version exclusively for exactly these issues. Even Kaggle notebooks are better. I love copilot and it’d be great to bring it into my jupyter notebook experience, but it has just never been viable to switch if I don’t want a workflow where I have to wait for 30 seconds every time I press command-enter, or I am frustratingly making a new cell above the current one and typing function? just to see documentation.
These issues have been ongoing since vscode jupyter started. They are the only things holding me and everyone else I’ve spoken to back from using it. Without fixing these issues, the whole thing is unusable, and no other features you guys put in matter. Why are you guys working on anything besides this when they are the only things anyone I know cares about?
I should note that this is all running in a docker container with access to 7 of my 8 cpus and 10gb of RAM. I am on a 2022 macbook air. I realize that this is a rant, so thank you for reading it. Nothing personal, I just think this product has a bunch of potential and I hate to see it unusable for so long.
About this issue
- State: closed
- Created 9 months ago
- Reactions: 62
- Comments: 131 (46 by maintainers)
Hi, I think this issue should not be closed because it is not solved.
Or does someone have a solution?
When I run notebooks in jupyter lab in the browser everything is instant but in vscode everything runs delayed.
Closing this issue as it’s been over 4 weeks since the information was requested. We’ll be happy to reopen the issue when the requested information has been provided.
I’m having the exact same issue as all here! I don’t know how this could not be related to VS Code since it is happening to all of us when using the editor.
In my experience, jupyter notebooks performance degrades very quickly in the size of the notebook. This is especially true for plotly.express plots, and is independent of whether I am using a .ipynb file or the interactive cell views for a .py file.
Describing the experience for a .py file: when there are no plots and no LaTeX in the interactive window, everything is snappy. But if I have even just a handful of plots (or many lines of rendered LaTeX from Markdown cells), then it takes multiple seconds between when I press Shift+Enter and when the interactive window starts running the command. If I click “clear all”, everything is quick again. This seems to largely depend on how many plots are in the interactive window, not how many are currently visible.
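For the .ipynb case, the rough equivalent of “clear all” can be sketched in plain Python. This is only an illustration under assumptions: it assumes the usual nbformat 4 layout (a `cells` list whose code cells carry `outputs` and `execution_count`), and `clear_outputs` is a hypothetical helper, not part of VS Code or Jupyter.

```python
import copy


def clear_outputs(notebook):
    """Return a copy of a notebook dict with all code-cell outputs removed.

    Hypothetical helper; assumes the nbformat 4 layout.
    """
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Tiny made-up notebook: one code cell with an output, one markdown cell.
example = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "source": "fig.show()",
            "outputs": [{"output_type": "stream", "name": "stdout", "text": "plot"}],
        },
        {"cell_type": "markdown", "metadata": {}, "source": "Some $LaTeX$"},
    ],
}

cleaned = clear_outputs(example)
```

For a real file, the dict would come from `json.load` on the .ipynb and be written back with `json.dump`; whether that restores responsiveness in VS Code is exactly what this thread is trying to establish.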
Some other observations:
Experiencing same issue - disabled “all” extensions. Having lags / freezes / delays even on markup cells.
When notebook initially opened it is smoother - gets worse after a few minutes. Restarting vscode is the only thing that helps temporarily - which makes it practically impossible to work.
I have exactly the same issues. The notebooks get especially slow as they get bigger. But many of the problems already exist in an empty notebook.
This issue has been closed automatically because it needs more information and has not had recent activity. See also our issue reporting guidelines.
Happy Coding!
I’m experiencing the same issue and I have exactly the same observation as @JasonGross … which pushed me to switch to web-based version…
@JasonGross All that is true, but the bug where it gets stuck on a cell does not depend on notebook size or plot complexity.
Not sure if this is what you’re seeing, but I’ve noticed a regression of an old bug. I have a code cell that should run in a fraction of a second. I run it. It’s stuck for about 1 minute. Then all of a sudden it runs.
Very annoying. Because of this and other, numerous bugs, I’m thinking to go back to Jupyter Notebook in a browser.
@DonJayamanne those getNeighborFiles & detectCellLanguage calls are from copilot. This sounds like pretty similar behavior to what I was seeing with copilot trying to gather all that context from a large notebook https://github.com/microsoft/vscode/issues/211154
@rebornix was looking into reducing those calls at a certain point
I’m running into this as well and it is preventing me from continuing to work in VS Code. Thank you for trying to get to the bottom of the issue. In the meantime, is there any setting to toggle as a workaround to turn off the backup? I can only find “Autosave”, which is already turned off. In JupyterLab I don’t notice any slowdown at all for the same notebook.
Thanks @amunger , the code example and the referenced comment is very helpful. I can reproduce this and this might explain why we see a performance slow down for large notebook, especially when we have widgets or rich media.
My hypothesis is
This is also something we want to look into.
Since I disabled completions for copilot, it is fast (at least for now).
This issue is making VSCode with Jupyter basically unworkable for me. It used to not be like this however, wonder when it changed.
Hi @DonJayamanne , @amunger
I have just tested with the largest notebook I have, which includes a lot of markdown cells, and it indeed runs faster. Some functions still take time to execute, but I guess that is just inherent to the libraries (sns.regplot and clustering functions; someone else would need to confirm). I was also monitoring the use of the CPU in the MacOS’ Activity Monitor and I noticed it now barely goes above 500 MB.
Regarding the issue with the completions, I guess these were solved, as there was no lag nor any problems with them after running all the cells (around 180 code cells alone), when before it would begin to get stuck after 70 or so.
The only caveat I would add is it was done only with these packages active and no changes to the settings.json:
So, my suggestion would be to start adding our normal extensions back, just to see if any of them chokes the improvement, since most likely we all work with different ones. In my normal VsCode I run with these and several changes to the settings.json (for font, font size, ligatures, colours, conda path, semantic highlight, tree views, etc):
(2 theme extensions excluded)
Best
Something else that suggests it is indeed the backup taking time is that if I try to exit VS Code while a large slow notebook is open, I see this:
@tlkaufmann @Liam3851 That’s great news, great because we’ve been able to identify the cause and there’s a work around. We will work with copilot to get this resolved
Highlighting this in case it got lost:
@DonJayamanne do you have any idea what would cause almost 90% of the time to not show up in the profile at all?
@tlkaufmann The second json is empty. Also what extensions do you have installed? There are two methods, getNeighborFiles & detectCellLanguage, that get invoked and I’m not sure what extensions these are coming from. Please can you share the list of the extensions you have installed.
If saving is an issue, then please go to the bottom of the profile view and select the Bottom Up tab and sort the list by Self Time and Total Time as below and send the screen shots.
Thank you for your patience and help,
I’d like to see the top items in the sorted list along with the names and file paths.
@DonJayamanne what I meant is that I loaded a csv of 2.5 GB into a notebook (or a Pandas data frame If you’d like). This file is so large because it has over 25 million lines and 10 columns, so a little over 250 million data points. I haven’t made anything yet with the data, but so far the notebook is 80 kb (not sure why…)
Best
Try the insiders’ version. It has been working smoothly for me the last days
Thanks for the extra info and repro notebooks @ale-dg, that sounds like something different than what I’m trying to solve here, so I’ll split it out into another issue.
Thanks @amunger ! I thought I had restarted VS Code but it turns out there was a window open on another desktop and closing that fixed it. So far I’m noticing much better performance on notebooks with large interactive charts (Altair/Vega charts), thanks for all your work on this issue! I will report back when I test it more with larger and longer-running notebooks if I run into issue.
I haven’t got a chance to try the new solution… I was just giving a bit of feedback on the other one.
I’m fairly convinced that the big perf hit comes from serializing the notebook as part of the backup, in which case shrinking the file size isn’t really going to help.
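A rough way to get a feel for serialization cost, independent of VS Code, is to time how long it takes to turn a notebook-shaped object into bytes. The sketch below is an analogy only: it uses Python’s `json` module, not VS Code’s actual backup code path, and the notebook, its 100,000-point output, and the helper `measure_serialization` are all made up for illustration.

```python
import json
import time


def measure_serialization(notebook, indent=None):
    """Serialize a notebook dict to UTF-8 JSON; return (byte count, seconds taken)."""
    t0 = time.perf_counter()
    payload = json.dumps(notebook, indent=indent).encode("utf-8")
    elapsed = time.perf_counter() - t0
    return len(payload), elapsed


# Synthetic stand-in for a notebook with one large plot-like output (made-up data).
points = list(range(100_000))
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": "fig.show()",
            "outputs": [
                {
                    "output_type": "display_data",
                    "metadata": {},
                    "data": {"application/json": {"x": points, "y": points}},
                }
            ],
        }
    ],
}

compact_bytes, compact_seconds = measure_serialization(notebook)
indented_bytes, indented_seconds = measure_serialization(notebook, indent=1)

print(f"compact:  {compact_bytes} bytes in {compact_seconds:.4f}s")
print(f"indented: {indented_bytes} bytes in {indented_seconds:.4f}s")
```

The indented form puts every number on its own line with leading whitespace, so the same data produces a noticeably larger payload. If the backup re-serializes the whole document each time, that cost is paid repeatedly no matter how small the file on disk is.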
here are some perf snapshots of the backup process, top is for a text editor
Some of the above hypotheses validated:
- … Uint8Array(313597405), which is ~313MB.
- … 100000 x and 100000 y).
- … \ts before each number, and a line break after:
- … x and y axis, the more tabs/spaces and linebreaks we generate in VS Code
I managed to repro the behavior in this comment by just running this cell ~50 times in the interactive window:
It still doesn’t happen every time, but it’s enough to be able to investigate
I have the same issue with Code 1.87.2 on Ubuntu 23.10.
I have the same problem working on Fedora 39 - Linux. It’s driving me nuts.
Will do in the next couple of days. Just to be clear, this issue is not exclusive to notebooks with markdown cells. I will try to provide example notebooks with no markdown and with markdown that experience this unresponsiveness.
Hi @DonJayamanne
I have run a large notebook, both with MD and without MD. Find below the logs for both.
Best
1-Jupyter-no-MD.log 1-Jupyter-with-MD.log
Is this behavior related? I sometimes see code execution hanging for multiple minutes on trying to write to the interactive window. Maybe there’s a similar blocking IO writing call that is deadlocked or something in the other cases?
@DonJayamanne Could you record a video of the instructions above? I tried, but the steps are too abstract for me to follow. For example, when I type Developer: Set log level in the command palette, I see nothing pop up. If you have time to record the video, I would be very happy to test it. Thanks.
The output is attached. As far as I could see, the output only changed when the cell started executing. The time between me trying to execute and the actual execution seems not to be logged. logs.txt
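Since the wait before execution does not show up in the logs, one rough way to capture it by hand is to have the cell itself print when the kernel actually starts running it, and compare that with the moment the shortcut was pressed. This is a minimal sketch; `cell_timer` is a hypothetical helper, not something provided by the Jupyter extension.

```python
import time
from contextlib import contextmanager
from datetime import datetime


@contextmanager
def cell_timer(label="cell"):
    """Print when the kernel starts the body and how long the body takes.

    Hypothetical helper for manual measurements.
    """
    timing = {"started_at": datetime.now(), "elapsed": None}
    t0 = time.perf_counter()
    print(f"[{label}] kernel started at {timing['started_at']:%H:%M:%S.%f}")
    try:
        yield timing
    finally:
        timing["elapsed"] = time.perf_counter() - t0
        print(f"[{label}] kernel time: {timing['elapsed']:.3f}s")


# Usage: wrap the body of a cell that seems to hang.
with cell_timer("assign") as timing:
    greeting = "hello"
```

If the printed kernel time is a few milliseconds while the cell appeared stuck for much longer, the delay happened before the code reached the kernel, which is the part the logs above do not cover.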
I don’t use the powertoys extension at all. Maybe it’s also important to mention that the problems with jupyter notebooks are even more severe when developing on a remote server (via ssh or Kubernetes). However, they still persist when developing locally.