hydra: [Bug] submitit launcher doesn't work with dynamically defined enums
This works:
python fairseq_cli/hydra_train.py --multirun distributed_training.distributed_world_size=2 checkpoint.save_dir=/checkpoint/abaevski/tmp task.data=/checkpoint/abaevski/data/speech/libri/960h/wav2vec/raw/ +optimization.update_freq='[2]' --config-path /private/home/abaevski/fairseq-py/examples/wav2vec/config/pretraining --config-name wav2vec2_large_librivox
This does not work:
python fairseq_cli/hydra_train.py --multirun hydra/launcher=submitit_slurm hydra.launcher.cpus_per_task=80 hydra.launcher.gpus_per_node=8 hydra.launcher.tasks_per_node=1 hydra.launcher.nodes=2 hydra.launcher.partition=dev hydra.launcher.mem_gb=400 distributed_training.distributed_world_size=16 checkpoint.save_dir=/checkpoint/abaevski/tmp task.data=/checkpoint/abaevski/data/speech/libri/960h/wav2vec/raw/ +optimization.update_freq='[2]' --config-path /private/home/abaevski/fairseq-py/examples/wav2vec/config/pretraining --config-name wav2vec2_large_librivox
The error is:
File "/private/home/abaevski/.conda/envs/fairseq-fp16-20200821/lib/python3.6/site-packages/hydra/_internal/config_loader_impl.py", line 156, in load_configuration
from_shell=from_shell,
File "/private/home/abaevski/.conda/envs/fairseq-fp16-20200821/lib/python3.6/site-packages/hydra/_internal/config_loader_impl.py", line 262, in _load_configuration
run_mode=run_mode,
File "/private/home/abaevski/.conda/envs/fairseq-fp16-20200821/lib/python3.6/site-packages/hydra/_internal/config_loader_impl.py", line 804, in _merge_defaults_into_config
hydra_cfg = merge_defaults_list_into_config(hydra_cfg, system_list)
File "/private/home/abaevski/.conda/envs/fairseq-fp16-20200821/lib/python3.6/site-packages/hydra/_internal/config_loader_impl.py", line 772, in merge_defaults_list_into_config
merged_cfg.merge_with(job_cfg)
File "/private/home/abaevski/.conda/envs/fairseq-fp16-20200821/lib/python3.6/site-packages/omegaconf/basecontainer.py", line 327, in merge_with
self._format_and_raise(key=None, value=None, cause=e)
File "/private/home/abaevski/.conda/envs/fairseq-fp16-20200821/lib/python3.6/site-packages/omegaconf/base.py", line 101, in _format_and_raise
type_override=type_override,
File "/private/home/abaevski/.conda/envs/fairseq-fp16-20200821/lib/python3.6/site-packages/omegaconf/_utils.py", line 610, in format_and_raise
_raise(ex, cause)
File "/private/home/abaevski/.conda/envs/fairseq-fp16-20200821/lib/python3.6/site-packages/omegaconf/_utils.py", line 591, in _raise
raise ex # set end OC_CAUSE=1 for full backtrace
File "/private/home/abaevski/.conda/envs/fairseq-fp16-20200821/lib/python3.6/site-packages/omegaconf/basecontainer.py", line 325, in merge_with
self._merge_with(*others)
File "/private/home/abaevski/.conda/envs/fairseq-fp16-20200821/lib/python3.6/site-packages/omegaconf/basecontainer.py", line 343, in _merge_with
BaseContainer._map_merge(self, other)
File "/private/home/abaevski/.conda/envs/fairseq-fp16-20200821/lib/python3.6/site-packages/omegaconf/basecontainer.py", line 290, in _map_merge
dest_node._merge_with(src_value)
File "/private/home/abaevski/.conda/envs/fairseq-fp16-20200821/lib/python3.6/site-packages/omegaconf/basecontainer.py", line 343, in _merge_with
BaseContainer._map_merge(self, other)
File "/private/home/abaevski/.conda/envs/fairseq-fp16-20200821/lib/python3.6/site-packages/omegaconf/basecontainer.py", line 301, in _map_merge
dest._format_and_raise(key=key, value=src_value, cause=e)
File "/private/home/abaevski/.conda/envs/fairseq-fp16-20200821/lib/python3.6/site-packages/omegaconf/base.py", line 101, in _format_and_raise
type_override=type_override,
File "/private/home/abaevski/.conda/envs/fairseq-fp16-20200821/lib/python3.6/site-packages/omegaconf/_utils.py", line 675, in format_and_raise
_raise(ex, cause)
File "/private/home/abaevski/.conda/envs/fairseq-fp16-20200821/lib/python3.6/site-packages/omegaconf/_utils.py", line 591, in _raise
raise ex # set end OC_CAUSE=1 for full backtrace
omegaconf.errors.ValidationError: Value json (Choices) is not a valid input for <enum 'Choices'>
full_key: common.log_format
reference_type=CommonConfig
object_type=CommonConfig
common.log_format is defined as follows:
log_format: Optional[LOG_FORMAT_CHOICES] = field(
    default=None, metadata={"help": "log format to use"}
)
where LOG_FORMAT_CHOICES is defined as LOG_FORMAT_CHOICES = ChoiceEnum(["json", "none", "simple", "tqdm"])
and ChoiceEnum is
from enum import Enum
from typing import List

class StrEnum(Enum):
    def __str__(self):
        return self.value

    def __eq__(self, other: str):
        return self.value == other

    def __repr__(self):
        return self.value

    def __hash__(self):
        return hash(str(self))

def ChoiceEnum(choices: List[str]):
    """return the Enum class used to enforce list of choices"""
    return StrEnum("Choices", {k: k for k in choices})
(you can find these here: https://github.com/pytorch/fairseq/blob/master/fairseq/dataclass/constants.py)
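The error message looks self-contradictory: a value of type Choices is rejected by an enum also named Choices. One plausible reading, which is an assumption on my part and not something confirmed in this report, is that the launcher's serialization round trip ends up with a second copy of the dynamically built class. The sketch below uses only the standard library (no Hydra, OmegaConf, or submitit) and re-creates the definitions above; the variable names first and second are made up for illustration. It shows that two classes built by ChoiceEnum share a name but are distinct types, so a member of one fails a type check against the other even though the custom __eq__ still treats the members as equal.

```python
from enum import Enum
from typing import List


class StrEnum(Enum):
    # Same overrides as in the fairseq snippet quoted above.
    def __str__(self):
        return self.value

    def __eq__(self, other):
        return self.value == other

    def __repr__(self):
        return self.value

    def __hash__(self):
        return hash(str(self))


def ChoiceEnum(choices: List[str]):
    """Return the Enum class used to enforce a list of choices."""
    return StrEnum("Choices", {k: k for k in choices})


# Hypothetical stand-ins for "the class in the config schema" and
# "the class that a value arrives with after being rebuilt elsewhere".
first = ChoiceEnum(["json", "none", "simple", "tqdm"])
second = ChoiceEnum(["json", "none", "simple", "tqdm"])

same_name = first.__name__ == second.__name__   # both classes are named "Choices"
same_class = first is second                    # yet they are separate class objects
member_ok = isinstance(first.json, second)      # a member of one is not an instance of the other
equal_by_value = first.json == second.json      # the custom __eq__ only compares the string values

print(same_name, same_class, member_ok, equal_by_value)
```

If a validator checks the value's type against the declared enum class, as the ValidationError suggests, equality by value is not enough to pass that check.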
About this issue
- State: closed
- Created 4 years ago
- Comments: 20
Submitit recently switched to cloudpickle for all serialization, but it looks like they messed up the release (the change is not actually released, even though it should have been).
Using a minimal repro with a standard enum, things seem to work for me, but only if I install submitit from GitHub. Otherwise I get a completely different error.
I filed an issue against submitit to update their release.