wandb: `_disable_stats` doesn't work
`_disable_stats` doesn't work. With `wandb.init(settings=wandb.Settings(_disable_stats=True))`, the client still sends stats to wandb, which in turn leads to a BSOD due to an incompatibility with the old pynvml dependency in the vendor folder.
_Originally posted by @CosmicHazel in https://github.com/wandb/client/issues/473#issuecomment-1094362410_
Can confirm that this is causing BSODs on Windows with an Nvidia GPU and the latest drivers. And since there's no way to disable it, there's practically no way to use wandb on Windows.
About this issue
- State: open
- Created 2 years ago
- Reactions: 2
- Comments: 31 (5 by maintainers)
I have great news: I installed Nvidia driver version 516.94 (Before I had 516.59) and now it doesn’t crash anymore! Now I can continue advertising wandb to all my colleagues and friends! I even plan to do a presentation about wandb in one of my courses because nobody knows it, even though they are Deep Learning enthusiasts and wandb is awesome!
@dmitryduev Thank you so much for your efforts! @benjamincburns Also thank you for your valuable inputs! 😃
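For anyone who wants to check their own machine against the driver versions mentioned above, here is a minimal sketch. It assumes `nvidia-smi` is on the PATH, and it treats 516.94 as a threshold only because that is the version reported as working in this one comment, not because it is a confirmed minimum. The helper names are made up for illustration.

```python
import subprocess

# Version reported as working in the comment above (one user's report,
# not a confirmed minimum).
REPORTED_WORKING = (516, 94)


def parse_driver_version(text):
    """Turn a version string such as '516.94' into a comparable tuple."""
    return tuple(int(part) for part in text.strip().split("."))


def installed_driver_version():
    """Return the first GPU's driver version, or None if it can't be read."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = result.stdout.strip().splitlines()
    return parse_driver_version(lines[0]) if lines else None


if __name__ == "__main__":
    version = installed_driver_version()
    if version is None:
        print("Could not read the driver version from nvidia-smi.")
    elif version < REPORTED_WORKING:
        print("Driver", version, "is older than the version reported as working.")
    else:
        print("Driver", version, "is at or above the version reported as working.")
```

A driver at or above that version is no guarantee; other comments in this thread report crashes on other driver versions.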
Hey all, many thanks for bringing this to our attention, and please accept my apologies for taking so long to look into it properly. We have updated the vendored version of nvidia-ml-py here and that PR has been merged into master. Could you please try installing wandb from master and let us know if it works now? Would really appreciate that!
I’m so sorry for the wait! I talked to the engineer in charge of this and they mentioned that they would work on it this week.
Right now this is the only thing in the FAQ that addresses crashes caused by WandB’s client. I think the lack of clear resolution here really doesn’t align with the values being conveyed in this FAQ entry, as a BSOD clearly affects my training run.
@lesliewandb @dmitryduev why was this issue closed? Running the `wandb` python client with most nvidia driver versions in use today still causes BSODs.
If the issue is going to be closed as completed, you should at least capture notes about the workaround on the troubleshooting FAQ page. Given that I don’t see that here, I strongly suspect that many users will continue encountering this problem for quite some time. https://docs.wandb.ai/guides/technical-faq/troubleshooting
Ah interesting. I’m really curious to know why it doesn’t repro for you on all of those boxes. I know Tesla GPUs are using a different driver series, but I wouldn’t expect much of any difference between the 2080 and the 2080 Ti. Thanks for going on such a scavenger hunt!
Unfortunately unless there has been a change, per the title of this issue, running with `_disable_stats=True` wasn’t enough (at the time of writing, anyway) to avoid the BSOD. I’ll give it another try sometime in the next week and report back, however.
Edit: oh, I see - we need the extra `_disable_meta` arg. Thanks, I’ll make sure to include that when I test next time.
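Putting the two settings from this thread together, the workaround would look roughly like the sketch below. `_disable_stats` and `_disable_meta` are the names used in the comments here. They are private, underscore-prefixed settings, so they may be renamed or missing in other wandb releases. The wrapper function and the `STATS_OFF` name are made up for illustration.

```python
# Private settings named in this thread; not a stable public API, so they
# may not exist under these names in every wandb release.
STATS_OFF = {"_disable_stats": True, "_disable_meta": True}


def init_without_system_stats(**init_kwargs):
    """Start a run with system stats and metadata collection turned off."""
    # Imported here so the settings dict above can be inspected even on a
    # machine where wandb is not installed.
    import wandb

    settings = wandb.Settings(**STATS_OFF)
    return wandb.init(settings=settings, **init_kwargs)
```

It would be called as `run = init_without_system_stats(project="my-project")`, where the project name is a placeholder. Whether this actually avoids the BSOD on a given driver version still needs to be tested, as noted above.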