torchdynamo: Debug python key tracing error with hf_BigBird

Repro

./torchbench.py --no-skip --python-key -n 1 -k hf_BigBird --devices cuda --float32

Old output (this no longer reflects master)

Output

cpu  eval  hf_BigBird                         ERROR:root:unhandled error
Traceback (most recent call last):
  File "./torchbench.py", line 911, in run_one_model
    new_result = model_iter_fn(model, example_inputs)
  File "./torchbench.py", line 456, in forward_pass
    def forward_pass(mod, inputs, collect_outputs=True):
  File "/home/jansel/pytorch/torch/nn/modules/module.py", line 1111, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jansel/conda/envs/torchdynamo/lib/python3.8/site-packages/transformers/models/big_bird/modeling_big_bird.py", line 2321, in forward
    @add_start_docstrings_to_model_forward(BIG_BIRD_INPUTS_DOCSTRING.format("(batch_size, sequence_length)"))
  File "/home/jansel/pytorch/torch/nn/modules/module.py", line 1111, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jansel/conda/envs/torchdynamo/lib/python3.8/site-packages/transformers/models/big_bird/modeling_big_bird.py", line 1920, in forward
    @add_start_docstrings_to_model_forward(BIG_BIRD_INPUTS_DOCSTRING.format("(batch_size, sequence_length)"))
  File "/home/jansel/pytorch/torch/nn/modules/module.py", line 1111, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jansel/conda/envs/torchdynamo/lib/python3.8/site-packages/transformers/models/big_bird/modeling_big_bird.py", line 1615, in forward
    layer_outputs = layer_module(
  File "/home/jansel/pytorch/torch/nn/modules/module.py", line 1111, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jansel/conda/envs/torchdynamo/lib/python3.8/site-packages/transformers/models/big_bird/modeling_big_bird.py", line 1451, in forward
    def forward(
  File "/home/jansel/pytorch/torch/nn/modules/module.py", line 1111, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jansel/conda/envs/torchdynamo/lib/python3.8/site-packages/transformers/models/big_bird/modeling_big_bird.py", line 1381, in forward
    self_outputs = self.self(
  File "/home/jansel/pytorch/torch/nn/modules/module.py", line 1111, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jansel/conda/envs/torchdynamo/lib/python3.8/site-packages/transformers/models/big_bird/modeling_big_bird.py", line 435, in forward
    def forward(
  File "/home/jansel/conda/envs/torchdynamo/lib/python3.8/site-packages/transformers/models/big_bird/modeling_big_bird.py", line 502, in bigbird_block_sparse_attention
    def bigbird_block_sparse_attention(
  File "/home/jansel/conda/envs/torchdynamo/lib/python3.8/site-packages/transformers/models/big_bird/modeling_big_bird.py", line 502, in bigbird_block_sparse_attention
    def bigbird_block_sparse_attention(
  File "/home/jansel/conda/envs/torchdynamo/lib/python3.8/site-packages/transformers/models/big_bird/modeling_big_bird.py", line 502, in bigbird_block_sparse_attention
    def bigbird_block_sparse_attention(
  [Previous line repeated 2 more times]
  File "/home/jansel/torchdynamo/torchdynamo/eval_frame.py", line 58, in _fn
    return fn(*args, **kwargs)
  File "/home/jansel/torchdynamo/torchdynamo/optimizations/python_key.py", line 129, in call_fn
    return inner(*params_flat, *args)
  File "<eval_with_key>.20", line 105, in forward
    unsqueeze__1 = torch.ops.aten.unsqueeze_(detach_36, 2);  detach_36 = None
  File "/home/jansel/pytorch/torch/_ops.py", line 142, in __call__
    return self._op(*args, **kwargs or {})
RuntimeError: set_storage_offset is not allowed on a Tensor created from .data or .detach().
If your intent is to change the metadata of a Tensor (such as sizes / strides / storage / storage_offset)
without autograd tracking the change, remove the .data / .detach() call and wrap the change in a `with torch.no_grad():` block.
For example, change:
    x.data.set_(y)
to:
    with torch.no_grad():
        x.set_(y)
ERROR

Somehow the code produced by python key tracing triggers an error.
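For reference, the failing node in the traced graph is an in-place `unsqueeze_` applied to the result of a `detach`. Below is a minimal sketch of that pattern outside of torchdynamo, next to two alternatives: an out-of-place `unsqueeze`, and the `torch.no_grad()` form that the error message itself suggests. The helper names are made up for illustration and are not part of torchdynamo or the benchmark. Whether the in-place call is rejected depends on the PyTorch build, so the sketch reports the outcome rather than assuming it.

```python
import torch


def inplace_unsqueeze_on_detached(x):
    """Mimic the traced graph: detach, then in-place unsqueeze_ on the result.

    Returns the error message if this PyTorch build rejects the metadata
    change, or None if the call is allowed.
    """
    detached = x.detach()
    try:
        detached.unsqueeze_(2)
    except RuntimeError as err:
        return str(err)
    return None


def out_of_place_unsqueeze(x):
    # unsqueeze (no trailing underscore) returns a new view, so the
    # detached tensor's own sizes/strides are never modified.
    return x.detach().unsqueeze(2)


def no_grad_unsqueeze(x):
    # The form recommended by the error message: drop the detach and
    # do the in-place metadata change under no_grad instead.
    with torch.no_grad():
        x.unsqueeze_(2)
    return x


message = inplace_unsqueeze_on_detached(torch.randn(2, 3))
print("in-place on detached:", "rejected" if message else "allowed")
print(out_of_place_unsqueeze(torch.randn(2, 3)).shape)
print(no_grad_unsqueeze(torch.randn(2, 3)).shape)
```

This only illustrates the eager-mode behaviour of the op sequence; it does not explain why python key tracing emits `detach` followed by `unsqueeze_` in the first place.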

cc @Chillee @anijain2305

About this issue

State: closed
Created 2 years ago
Comments: 15 (14 by maintainers)

Commits related to this issue

RFC: Delete ProxyTensor wrapper subclass I was working on https://github.com/pytorch/torchdynamo/issues/80 and my working hypothesis for what was causing the error was that proxy tensor was not adver... — committed to pytorch/pytorch by ezyang 2 years ago
Update on "RFC: Delete ProxyTensor wrapper subclass" I was working on https://github.com/pytorch/torchdynamo/issues/80 and my working hypothesis for what was causing the error was that proxy tenso... — committed to pytorch/pytorch by ezyang 2 years ago
Update base for Update on "RFC: Delete ProxyTensor wrapper subclass" I was working on https://github.com/pytorch/torchdynamo/issues/80 and my working hypothesis for what was causing the error was ... — committed to pytorch/pytorch by ezyang 2 years ago
Update base for Update on "Delete ProxyTensor wrapper subclass" I was working on https://github.com/pytorch/torchdynamo/issues/80 and my working hypothesis for what was causing the error was that ... — committed to pytorch/pytorch by ezyang 2 years ago
Update on "Delete ProxyTensor wrapper subclass" I was working on https://github.com/pytorch/torchdynamo/issues/80 and my working hypothesis for what was causing the error was that proxy tensor wa... — committed to pytorch/pytorch by ezyang 2 years ago

Most upvoted comments

Without functionalization, if we allow resizing detached tensors (@albanD has approved this), BigBird now gets through tracing and only fails the accuracy check:

ezyang@devfair040:/scratch/ezyang/torchdynamo$ ./benchmarks/torchbench.py --no-skip --accuracy-aot-nop -n 1 -k hf_BigBird --devices cuda --training
cuda train hf_BigBird                         Accuracy failed for key name bert.embeddings.token_type_embeddings.weight.grad
INCORRECT

ezyang on Aug 16, 2022