wandb: wandb.watch does not work with torch/jit/ScriptModules
Hi, first of all, I would like to say a huge thank you for a great tool. I really like wandb and it helps me a lot.
Here is a problem I ran into while working with wandb:
- Weights and Biases version: wandb, version 0.8.12
- Python version: Python 3.6.1
- Operating System: Darwin
Description
wandb.watch does not work with any model that contains a layer that is a ScriptModule.
What I Did
I am working with the detection module from torchvision (https://github.com/pytorch/vision/tree/master/references/detection). Here is a small example of the problem. If you run this code:
import wandb
from torchvision.models.detection import fasterrcnn_resnet50_fpn
wandb.init(project='common')
model = fasterrcnn_resnet50_fpn()
wandb.watch(model)
you will get this error
wandb: Started W&B process version 0.8.12 with PID 38398
wandb: Local directory: wandb/run-20190928_163622-ngu052su
wandb: Syncing run curious-deluge-85: https://app.wandb.ai/truskovskiyk/common/runs/ngu052su
wandb: Run `wandb off` to turn off syncing.
Traceback (most recent call last):
File "bug_rep.py", line 8, in <module>
wandb.watch(model)
File "/Users/kyryl/Projects/Kyryl/common/client/wandb/__init__.py", line 148, in watch
model, criterion, graph_idx=idx)
File "/Users/kyryl/Projects/Kyryl/common/client/wandb/wandb_torch.py", line 227, in hook_torch
graph.hook_torch_modules(model, criterion, graph_idx=graph_idx)
File "/Users/kyryl/Projects/Kyryl/common/client/wandb/wandb_torch.py", line 291, in hook_torch_modules
self.hook_torch_modules(sub_module, prefix=name)
File "/Users/kyryl/Projects/Kyryl/common/client/wandb/wandb_torch.py", line 291, in hook_torch_modules
self.hook_torch_modules(sub_module, prefix=name)
File "/Users/kyryl/Projects/Kyryl/common/client/wandb/wandb_torch.py", line 326, in hook_torch_modules
sub_module.register_forward_hook(self.create_forward_hook(name, modules)))
File "/Users/kyryl/vn/common/lib/python3.6/site-packages/torch/jit/__init__.py", line 1834, in fail
raise RuntimeError(name + " is not supported on ScriptModules")
RuntimeError: register_forward_hook is not supported on ScriptModules
wandb: Waiting for W&B process to finish, PID 38398
wandb: Program failed with code 1. Press ctrl-c to abort syncing.
wandb: Process crashed early, not syncing files
The problem here is caused by this module: https://github.com/pytorch/vision/blob/master/torchvision/ops/misc.py#L135
Here is a toy example that reproduces it:
import wandb
import torch
class MyModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.rand(2, 2))
        self.linear = torch.nn.Linear(2, 2)

    def forward(self, x):
        output = self.weight.mv(x)
        output = self.linear(output)
        return output
module = MyModule()
scripted_module = torch.jit.script(MyModule())
wandb.init(project='common')
wandb.watch(scripted_module)
Output:
wandb: Started W&B process version 0.8.12 with PID 38764
wandb: Local directory: wandb/run-20190928_164606-sdgo358a
wandb: Syncing run wobbly-crypt-89: https://app.wandb.ai/truskovskiyk/common/runs/sdgo358a
wandb: Run `wandb off` to turn off syncing.
Traceback (most recent call last):
File "bug_rep.py", line 34, in <module>
wandb.watch(scripted_module)
wandb: Waiting for W&B process to finish, PID 38764
File "/Users/kyryl/Projects/Kyryl/common/client/wandb/__init__.py", line 148, in watch
model, criterion, graph_idx=idx)
File "/Users/kyryl/Projects/Kyryl/common/client/wandb/wandb_torch.py", line 227, in hook_torch
graph.hook_torch_modules(model, criterion, graph_idx=graph_idx)
File "/Users/kyryl/Projects/Kyryl/common/client/wandb/wandb_torch.py", line 326, in hook_torch_modules
sub_module.register_forward_hook(self.create_forward_hook(name, modules)))
File "/Users/kyryl/vn/common/lib/python3.6/site-packages/torch/jit/__init__.py", line 1834, in fail
raise RuntimeError(name + " is not supported on ScriptModules")
RuntimeError: register_forward_hook is not supported on ScriptModules
wandb: Program failed with code 1. Press ctrl-c to abort syncing.
wandb: Process crashed early, not syncing files
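In case it helps, here is a rough sketch of the kind of guard I have in mind. It is not wandb's code or API. The helper name hook_supported_modules is hypothetical, and it assumes the failure always shows up as the RuntimeError from the tracebacks above. It walks the children of a model, registers the forward hook where that is allowed, and records the names of submodules that reject it instead of crashing. A submodule that rejects the hook is skipped together with its children.

```python
def hook_supported_modules(model, hook, prefix=""):
    """Register `hook` on every submodule that accepts forward hooks.

    Hypothetical helper, not part of wandb or torch. Returns a pair
    (handles, skipped): the hook handles for submodules that accepted
    the hook, and the dotted names of submodules that rejected it.
    """
    handles, skipped = [], []
    for name, child in model.named_children():
        full_name = prefix + name
        try:
            handles.append(child.register_forward_hook(hook))
        except RuntimeError:
            # Assumption: ScriptModules raise RuntimeError here, as in the
            # tracebacks above. Skip this submodule and everything under it.
            skipped.append(full_name)
            continue
        # The child accepted the hook, so descend into its own children.
        sub_handles, sub_skipped = hook_supported_modules(
            child, hook, prefix=full_name + "."
        )
        handles.extend(sub_handles)
        skipped.extend(sub_skipped)
    return handles, skipped
```

With something like this, the skipped names could be logged as a warning so the user knows which layers are not being watched, while the rest of the model is still hooked.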
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Reactions: 2
- Comments: 17 (6 by maintainers)
@cvphelps do you have an update on this? I’m running into the same problem.