tensorflow: DistributionStrategy is not supported by tf.keras.models.Model.fit_generator
Hi! Recently I encountered a NotImplementedError while trying to use the fit_generator method of a tf.keras.models.Model with a MultiWorkerDistributionStrategy. It has been almost a year since these handlers were added to the code ( https://github.com/tensorflow/tensorflow/commit/9541ce3475ea70fd8eb9552f60de462127f15440#diff-de9b96ac2d81503324cbbbe21732031f ), and I’m wondering whether an implementation can be expected any time soon (with the release of TF 2.0, for example)?
While looking for a workaround, I tried to convert the generator into a TF Dataset with tf.data.Dataset.from_generator so I could replace fit_generator with fit, but I ran into a similar problem: the resulting object has type DatasetV1Adapter, which is also incompatible with distribution strategies.
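For concreteness, the conversion described above looks roughly like this (a minimal sketch; the generator and array shapes are illustrative, not the original code):

```python
import numpy as np
import tensorflow as tf

def data_gen():
    """Yield (features, label) pairs; stands in for a custom generator
    over a dataset too large to fit into memory."""
    for i in range(8):
        yield np.full((4,), float(i), dtype=np.float32), np.float32(i % 2)

# Wrap the Python generator in a tf.data.Dataset so it can be passed
# to Model.fit() instead of fit_generator().
dataset = tf.data.Dataset.from_generator(
    data_gen,
    output_types=(tf.float32, tf.float32),
    output_shapes=((4,), ()),
).batch(2)
```

In TF 1.x / early TF 2.0 nightlies this wrapping produced a DatasetV1Adapter, which is where the incompatibility appeared.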
I dare to assume that this functionality would be of great interest to a wide community of TF users, myself included. When dealing with large, domain-specific datasets that don’t fit into memory, one often has no choice but to write a custom data generator, and when big data is involved, distributed training can be crucial.
I would highly appreciate any information from the TensorFlow developer team on the current state of the problem or possible workarounds. Thanks in advance!
System information
- TensorFlow version (you are using): 2.0.0.dev20190729
- Are you willing to contribute it (Yes/No): No
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Reactions: 2
- Comments: 23 (4 by maintainers)
Hi, I have found that the workaround does not work if the model has multiple inputs. The following code fails:
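(The original snippet was not preserved; the following is a hypothetical reconstruction of the kind of multi-input setup that triggers the problem. The model architecture, shapes, and generator are illustrative.)

```python
import numpy as np
import tensorflow as tf

# Hypothetical reconstruction: a two-input Keras functional model.
inp_a = tf.keras.Input(shape=(4,))
inp_b = tf.keras.Input(shape=(2,))
merged = tf.keras.layers.concatenate([inp_a, inp_b])
out = tf.keras.layers.Dense(1)(merged)
model = tf.keras.Model(inputs=[inp_a, inp_b], outputs=out)
model.compile(optimizer="sgd", loss="mse")

def multi_input_gen():
    # Each element is ((input_a, input_b), label); the nested tuple is
    # what tf.data uses to infer the multi-input structure.
    for _ in range(8):
        yield (np.random.rand(4).astype(np.float32),
               np.random.rand(2).astype(np.float32)), np.float32(1.0)

dataset = tf.data.Dataset.from_generator(
    multi_input_gen,
    output_types=((tf.float32, tf.float32), tf.float32),
    output_shapes=(((4,), (2,)), ()),
).batch(4)
```

Whether model.fit(dataset) then fails depends on the TF version in use; the nested-structure handling for multi-input models changed across the 2.x releases.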
Should I open a new thread for this issue?
I have the same error with TPUStrategy. I am trying to run a Keras model on a TPU with significant CPU preprocessing of data that I want done in parallel with running batches on the TPU. The strategy, TPUStrategy, is a simple one that distributes everything to one TPU core; there is no reason why it cannot run preprocessing on the CPU in parallel. I shall have to switch from fit_generator to fit and run everything sequentially.
I suggest reopening this issue.
Try returning tuples instead of lists.
This issue has been resolved with TF v2.1.0 by replacing model.fit_generator() with model.fit().
Any update to this? Particularly curious about tf.keras.utils.Sequence
any optimal solution for this issue?
Model.fit() was recently made to work with generators and distribution strategies. Could you try the latest version of TF 2? Does it provide a workaround for you?
This works:
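(The working snippet was not preserved in this copy of the thread; the following is a hedged sketch of the pattern that works on TF >= 2.1: build and compile the model inside a strategy scope, then pass a tf.data.Dataset built from the generator to Model.fit(). Shapes and the toy generator are illustrative.)

```python
import numpy as np
import tensorflow as tf

# MirroredStrategy falls back to a single device on a CPU-only machine,
# so this sketch runs anywhere; on a multi-GPU host it distributes batches.
strategy = tf.distribute.MirroredStrategy()

def gen():
    for i in range(16):
        yield np.random.rand(4).astype(np.float32), np.float32(i % 2)

dataset = tf.data.Dataset.from_generator(
    gen,
    output_types=(tf.float32, tf.float32),
    output_shapes=((4,), ()),
).batch(4)

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(4,)),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")

# fit() accepts the dataset directly; no fit_generator() needed.
history = model.fit(dataset, epochs=1, verbose=0)
```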
did you solve this?
Please reopen if it doesn’t provide a workaround.