wandb: [Q]Exception: The wandb backend process has shutdown?

INFO: Model loaded from data_models/checkpoint_epoch200.pth
INFO: Creating dataset with 260 examples
INFO: setting login settings: {}
wandb: Currently logged in as: wensimokedoudou (use wandb login --relogin to force relogin)
wandb: wandb version 0.12.7 is available! To upgrade, please run:
wandb: $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.10.8
wandb: Syncing run stellar-firefly-3
wandb: ⭐️ View project at https://wandb.ai/wensimokedoudou/u-net
wandb: 🚀 View run at https://wandb.ai/wensimokedoudou/u-net/runs/175ajqdz
wandb: Run data is saved locally in wandb/run-20211215_195933-175ajqdz
wandb: Run wandb off to turn off syncing.

Traceback (most recent call last):
  File "train.py", line 206, in <module>
    amp=args.amp)
  File "train.py", line 64, in train_net
    amp=amp))
  File "/usr/local/lib/python3.6/dist-packages/wandb/sdk/wandb_config.py", line 87, in update
    self._callback(data=self._as_dict())
  File "/usr/local/lib/python3.6/dist-packages/wandb/sdk/wandb_run.py", line 592, in _config_callback
    self._backend.interface.publish_config(data)
  File "/usr/local/lib/python3.6/dist-packages/wandb/interface/interface.py", line 497, in publish_config
    self._publish_config(cfg)
  File "/usr/local/lib/python3.6/dist-packages/wandb/interface/interface.py", line 501, in _publish_config
    self._publish(rec)
  File "/usr/local/lib/python3.6/dist-packages/wandb/interface/interface.py", line 429, in _publish
    raise Exception("The wandb backend process has shutdown")
Exception: The wandb backend process has shutdown

wandb: Waiting for W&B process to finish, PID 52394
wandb: Program failed with code 1. Press ctrl-c to abort syncing.
wandb: ERROR Problem finishing run
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/wandb/sdk/wandb_run.py", line 1271, in _atexit_cleanup
    self._on_finish()
  File "/usr/local/lib/python3.6/dist-packages/wandb/sdk/wandb_run.py", line 1415, in _on_finish
    self._backend.interface.publish_exit(self._exit_code)
  File "/usr/local/lib/python3.6/dist-packages/wandb/interface/interface.py", line 572, in publish_exit
    self._publish(rec)
  File "/usr/local/lib/python3.6/dist-packages/wandb/interface/interface.py", line 429, in _publish
    raise Exception("The wandb backend process has shutdown")
Exception: The wandb backend process has shutdown
/usr/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 6 leaked semaphores to clean up at shutdown
  len(cache))

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 24 (5 by maintainers)

Most upvoted comments

The same problem occurred to me.

wandb: ERROR Internal wandb error: file data was not synced 
...
Exception: The wandb backend process has shutdown

It turned out that I had logged an unsupported value (an object instance) to wandb, and the problem went away after I removed it. The error message is somewhat misleading, though.
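For anyone hitting the same thing, here is a minimal sketch of how one might guard against passing arbitrary objects to wandb. The helper name to_plain is hypothetical, and the set of "plain" types below is my own assumption rather than wandb's documented contract; anything outside that set is simply replaced by its repr() string.

```python
from typing import Any

# Types treated as safe to pass through unchanged.
# This set is an assumption for illustration, not wandb's official list.
_PLAIN_TYPES = (bool, int, float, str, type(None))


def to_plain(value: Any) -> Any:
    """Recursively convert a value into plain scalars, lists and dicts.

    Hypothetical helper: objects that are not plain scalars or containers
    are replaced by their repr() string.
    """
    if isinstance(value, _PLAIN_TYPES):
        return value
    if isinstance(value, dict):
        # Keys are stringified so the result stays JSON-friendly.
        return {str(k): to_plain(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_plain(v) for v in value]
    # Arbitrary instances (the case described above) become strings.
    return repr(value)
```

With something like this, the dict could be cleaned before it reaches wandb, for example wandb.config.update(to_plain(my_config)), where my_config stands for whatever dict your script builds.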

I realized that this was caused by a long tag that I was using. I truncated the tag and it worked seamlessly. I think the error should be more explicit than "wandb backend process has shutdown".

I ran into the same issue when running my code on Google Colab after restarting the kernel. At least for me, the problem was due to authentication (maybe the auth token was not cached properly when I restarted the kernel: I appeared to be already logged in, but wandb.init() failed). I could solve it by running:

!wandb login --relogin

When working locally in a shell, omit the “!” character; it is Jupyter notebook syntax.

The tag was more than 64 characters long. Now it is less than 64 and it works.
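The tag-length workaround reported in this thread can be sketched as a small helper. The name truncate_tags and the limit of 63 are my own choices: the limit is a conservative reading of "less than 64 works" from the comments here, not a value taken from the wandb documentation.

```python
from typing import Iterable, List

# Conservative limit inferred from this thread ("less than 64 works").
# This is an assumption, not an officially documented wandb constant.
MAX_TAG_LEN = 63


def truncate_tags(tags: Iterable[str], max_len: int = MAX_TAG_LEN) -> List[str]:
    """Cut every tag down to at most max_len characters (hypothetical helper)."""
    return [tag[:max_len] for tag in tags]
```

The result could then be passed to the run, e.g. wandb.init(tags=truncate_tags(my_tags)), where my_tags is whatever list of tag strings your script builds. Plain truncation can make two long tags collide, so shorter hand-written tags may be the better fix.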