data: Graph traversal is broken for custom iter datapipes
from torch.utils.data.graph import traverse
from torchdata.datapipes.iter import IterDataPipe, IterableWrapper


class CustomIterDataPipe(IterDataPipe):
    def noop(self, x):
        return x

    def __init__(self):
        self._dp = IterableWrapper([]).map(self.noop)

    def __iter__(self):
        yield from self._dp


traverse(CustomIterDataPipe())
RecursionError: maximum recursion depth exceeded
Without the `.map()` call it works fine. I don’t think this is specific to `.map()` though. From trying a few datapipes, this always happens if `self._dp` is composed in some way.
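One plausible reading of the failure, in line with the cyclic reference discussed later in the thread, is that the bound method `self.noop` keeps a reference back to the outer datapipe, so a traversal that follows every reference never terminates. The sketch below is a toy model in plain Python and not the `torch.utils.data.graph` implementation: `ToyPipe`, `CustomPipe`, `children`, `traverse_naive`, and `reachable_names` are all hypothetical names. It shows how a set of already visited objects, similar in spirit to the cache mentioned in the related commits, breaks the cycle.

```python
class ToyPipe:
    """Hypothetical stand-in for a datapipe (not the torchdata class)."""

    def __init__(self, name, fn=None, source=None):
        self.name = name
        self.fn = fn
        self.source = source


class CustomPipe(ToyPipe):
    def noop(self, x):
        return x

    def __init__(self):
        super().__init__("custom")
        # The bound method self.noop holds a reference back to self,
        # which creates the cycle: custom -> map -> custom.noop -> custom
        self._dp = ToyPipe("map", fn=self.noop, source=ToyPipe("wrapper"))


def children(pipe):
    """Yield ToyPipe objects referenced by pipe, looking through bound methods."""
    for value in vars(pipe).values():
        owner = getattr(value, "__self__", None)  # instance behind a bound method
        if isinstance(value, ToyPipe):
            yield value
        elif isinstance(owner, ToyPipe):
            yield owner


def traverse_naive(pipe):
    """Follows every reference; never terminates on a cyclic graph."""
    return {pipe.name: [traverse_naive(child) for child in children(pipe)]}


def reachable_names(pipe, seen=None):
    """Same walk, but each object is visited at most once."""
    seen = {} if seen is None else seen
    if id(pipe) in seen:
        return list(seen.values())
    seen[id(pipe)] = pipe.name
    for child in children(pipe):
        reachable_names(child, seen)
    return list(seen.values())


print(reachable_names(CustomPipe()))  # ['custom', 'map', 'wrapper']
```

Calling `traverse_naive(CustomPipe())` in this toy model raises `RecursionError`, mirroring the report above, while the version that tracks visited objects terminates.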
About this issue
- State: closed
- Created 2 years ago
- Reactions: 3
- Comments: 24 (24 by maintainers)
Commits related to this issue
- Update on "[DataPipe] only apply special serialization when dill is installed" This is a quick fix. Only applies the custom serialization logic when `dill` is installed. The specific issue mentione... — committed to pytorch/pytorch by NivekT 2 years ago
- [DataPipe] apply dill serialization for _Demux and add cache to traverse - Fix _Demux can not be pickled with DILL presented https://github.com/pytorch/pytorch/pull/74958#issuecomment-1084637227 - An... — committed to pytorch/pytorch by ejguan 2 years ago
- [DataPipe] apply dill serialization for _Demux and add cache to traverse (#75034) Summary: - Fix _Demux can not be pickled with DILL presented https://github.com/pytorch/pytorch/pull/74958#issuecomme... — committed to pytorch/pytorch by ejguan 2 years ago
- Update base for Update on "[DataPipe] Restrict special serialization for non-method functions" This PR further restricts the application of our custom serialization function in `DataPipe`. It will ... — committed to pytorch/pytorch by NivekT 2 years ago
- Update on "[DataPipe] Restrict special serialization for non-method functions" This PR further restricts the application of our custom serialization function in `DataPipe`. It will only be applied ... — committed to pytorch/pytorch by NivekT 2 years ago
- Update base for Update on "[DataPipe] Revamp serialization logic of DataPipes" This PR removes the custom serialization logic (i.e. try with `pickle` then with `dill`) inside `__getstate__` and `__... — committed to pytorch/pytorch by NivekT 2 years ago
- Update on "[DataPipe] Revamp serialization logic of DataPipes" This PR removes the custom serialization logic (i.e. try with `pickle` then with `dill`) inside `__getstate__` and `__setstate__` of D... — committed to pytorch/pytorch by NivekT 2 years ago
Yep, the PR is meant to be a quick fix to unblock your work and I marked that as a TODO. I agree with your point and am looking into it.
@ejguan @NivekT, going back to @pmeier's https://github.com/pytorch/data/issues/237#issuecomment-1080651807:
dill
?dill
is needed? Hopefully we can still find a fix for the cyclic reference issue while still supporting `dill`? Fixing the cyclic reference issue would allow us to move forward with our preferred design for torchvision's new datasets.
It will be my top priority next week. I need to prepare all release-related stuff this week.
I think adding `dill` as a hard dependency is even more BC-breaking. I’m adding a PR to only use the custom serialization when `dill` is available. This should mean that your code snippet above will work if `dill` is not installed.
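The idea of applying the custom serialization only when `dill` is available can be sketched as an optional-import guard. This is a minimal illustration and not the code from the PR: `HAS_DILL` and `dumps_function` are hypothetical names, and the fallback order (try `pickle`, then `dill`) follows the description in the related commits.

```python
import pickle

try:
    import dill  # optional dependency
    HAS_DILL = True
except ImportError:
    dill = None
    HAS_DILL = False


def dumps_function(fn):
    """Hypothetical helper: serialize fn, using dill only if it is installed."""
    if not HAS_DILL:
        # No dill: plain pickle, no special handling at all.
        return pickle.dumps(fn)
    try:
        return pickle.dumps(fn)
    except (pickle.PicklingError, AttributeError, TypeError):
        # pickle cannot handle e.g. lambdas or local functions; fall back to dill.
        return dill.dumps(fn)


# A picklable function round-trips whether or not dill is installed.
restored = pickle.loads(dumps_function(len))
print(restored is len)  # True
```

With this shape, environments without `dill` never enter the special path, so the behaviour there matches plain `pickle`.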