omegaconf: Structured configs do not respect `dataclasses.field(init=False)`

The documentation of dataclasses has this example for init=False:

@dataclass
class C:
    a: float
    b: float
    c: float = field(init=False)

    def __post_init__(self):
        self.c = self.a + self.b

It just means that c is not an argument to the __init__. I find this useful sometimes when I have a configuration value that can be computed from other configuration values.
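As a quick check of that statement, the following sketch uses only the standard library to inspect the `__init__` that `@dataclass` generates for the class above. The variable names `init_params` and `rejected` are introduced here purely for illustration.

```python
import inspect
from dataclasses import dataclass, field


@dataclass
class C:
    a: float
    b: float
    c: float = field(init=False)

    def __post_init__(self):
        self.c = self.a + self.b


# The generated __init__ only accepts the init=True fields.
init_params = list(inspect.signature(C).parameters)

# Passing c explicitly is rejected by the generated __init__.
try:
    C(a=1.0, b=2.0, c=3.0)
    rejected = False
except TypeError:
    rejected = True
```

This is the same `TypeError` that shows up at the end of the traceback further down, where `c` is passed as a keyword argument.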

Describe the solution you’d like

OmegaConf should just ignore fields that have init=False set.

Describe alternatives you’ve considered

One alternative is to not give a type annotation to the field and just set it to MISSING:

>>> @dataclass
... class C:
...     a: float
...     b: float
...     c = MISSING
...     def __post_init__(self):
...         self.c = self.a + self.b
...
>>> d = OmegaConf.structured(C)
>>> d.a = 3
>>> d.b = 4
>>> o = OmegaConf.to_object(d)
>>> o.c
7.0

This works, but it is a bit sad that we have to accept the loss of the type annotation.

Additional context

This is what currently happens when you use init=False:
>>> from dataclasses import dataclass, field
>>> @dataclass
... class C:
...     a: float
...     b: float
...     c: float = field(init=False)
...     def __post_init__(self):
...         self.c = self.a + self.b
...
>>> c = C(a=3, b=4)
>>> c.c
7
>>> from omegaconf import OmegaConf
>>> d = OmegaConf.structured(C)
>>> d.a = 3
>>> d.b = 4
>>> OmegaConf.to_object(d)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/tmk/.conda/envs/palbolts/lib/python3.9/site-packages/omegaconf/omegaconf.py", line 574, in to_object
    return OmegaConf.to_container(
  File "/home/tmk/.conda/envs/palbolts/lib/python3.9/site-packages/omegaconf/omegaconf.py", line 553, in to_container
    return BaseContainer._to_content(
  File "/home/tmk/.conda/envs/palbolts/lib/python3.9/site-packages/omegaconf/basecontainer.py", line 249, in _to_content
    return conf._to_object()
  File "/home/tmk/.conda/envs/palbolts/lib/python3.9/site-packages/omegaconf/dictconfig.py", line 735, in _to_object
    self._format_and_raise(
  File "/home/tmk/.conda/envs/palbolts/lib/python3.9/site-packages/omegaconf/base.py", line 190, in _format_and_raise
    format_and_raise(
  File "/home/tmk/.conda/envs/palbolts/lib/python3.9/site-packages/omegaconf/_utils.py", line 821, in format_and_raise
    _raise(ex, cause)
  File "/home/tmk/.conda/envs/palbolts/lib/python3.9/site-packages/omegaconf/_utils.py", line 719, in _raise
    raise ex.with_traceback(sys.exc_info()[2])  # set end OC_CAUSE=1 for full backtrace
omegaconf.errors.MissingMandatoryValue: Structured config of type `C` has missing mandatory value: c
    full_key: c
    object_type=C
>>> d.c = 0
>>> OmegaConf.to_object(d)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/tmk/.conda/envs/palbolts/lib/python3.9/site-packages/omegaconf/omegaconf.py", line 574, in to_object
    return OmegaConf.to_container(
  File "/home/tmk/.conda/envs/palbolts/lib/python3.9/site-packages/omegaconf/omegaconf.py", line 553, in to_container
    return BaseContainer._to_content(
  File "/home/tmk/.conda/envs/palbolts/lib/python3.9/site-packages/omegaconf/basecontainer.py", line 249, in _to_content
    return conf._to_object()
  File "/home/tmk/.conda/envs/palbolts/lib/python3.9/site-packages/omegaconf/dictconfig.py", line 752, in _to_object
    result = object_type(**field_items)
TypeError: __init__() got an unexpected keyword argument 'c'

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 17 (5 by maintainers)

Most upvoted comments

Great! FYR most of the to_object tests are here.

Ah, thanks for tackling this!

Your suggestion to call MyDataClass.__init__ with init_field_items, followed by calling setattr for each of the non_init_field_items, feels very natural, and I would like to move forward with it.

I feel that a provision should be made for the case where an init=False field has the special value omegaconf.MISSING, in which case it makes sense to skip calling setattr.

Great, I will take a swing at this; hopefully relatively soon 😄

@thomkeh I think using @property is an excellent alternative to using __post_init__+init=False, and it avoids the issues brought up by @rsokl.

By the way @thomkeh, I’ve discovered a workaround for your original issue:

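from dataclasses import dataclass

@dataclass
class C:
    a: float
    b: float

    def __post_init__(self):
        # c is a plain instance attribute here, not a dataclass field
        self.c: float = self.a + self.b

obj = C(1, 2)
assert obj.c == 3
reveal_type(obj.c)  # for type checkers only; not a runtime builtin, remove before running
# "c" is not a field of C, but type checkers still reveal obj.c to be of type `float`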

The obj.c attribute behaves “like an init=False field”.


@rsokl:

Looking at the implementation of to_object, I believe it would be relatively straightforward to support init=False fields in the specific case where __post_init__ is not defined on the dataclass object. The implementation would simply need to adjust object_type_field_names so that it only accumulates those fields that are present in the structured config’s __init__ signature.
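A minimal sketch of that filtering step, using only the standard library. The helper `split_field_names` is hypothetical and is not part of OmegaConf; it only illustrates how the `init` flag on each `dataclasses.Field` could be used to separate the two groups.

```python
from dataclasses import dataclass, field, fields


@dataclass
class C:
    a: float
    b: float
    c: float = field(init=False)


def split_field_names(cls):
    """Hypothetical helper: separate init=True field names from init=False ones."""
    init_names = [f.name for f in fields(cls) if f.init]
    non_init_names = [f.name for f in fields(cls) if not f.init]
    return init_names, non_init_names


init_names, non_init_names = split_field_names(C)
```

Only the names in `init_names` would be safe to pass as keyword arguments to `C(...)`.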

Your suggestion to call MyDataClass.__init__ with init_field_items, followed by calling setattr for each of the non_init_field_items, feels very natural, and I would like to move forward with it.

I feel that a provision should be made for the case where an init=False field has the special value omegaconf.MISSING, in which case it makes sense to skip calling setattr. Here is the motivating example:

from dataclasses import dataclass, field
from omegaconf import OmegaConf

@dataclass
class MyDataClass:
    foo: int = field(init=False)

obj1 = MyDataClass()
assert not hasattr(obj1, "foo")

cfg = OmegaConf.create(MyDataClass)
assert OmegaConf.is_missing(cfg, "foo")
obj2 = OmegaConf.to_object(cfg)
# It would be strange if in this case `obj2.foo == MISSING`.
# I think `hasattr(obj2, "foo")` should be False here,
# agreeing with the case of `obj1` above.

If you are inclined to submit a PR implementing this, it would be most welcome! Otherwise, I will put it on my docket.
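For readers following along, here is a rough sketch of the approach described in this comment, written against plain dicts so that it runs without OmegaConf. The helper `build` and the sentinel `_MISSING` are assumptions made for illustration: `_MISSING` stands in for omegaconf.MISSING, and `build` is not the actual to_object implementation.

```python
from dataclasses import dataclass, field, fields

_MISSING = object()  # stand-in sentinel for this sketch (assumption, not omegaconf.MISSING)


@dataclass
class MyDataClass:
    bar: int
    foo: int = field(init=False)


def build(cls, items):
    """Hypothetical helper: instantiate cls from a plain dict of field values."""
    # Step 1: call __init__ with the init=True fields only.
    init_items = {f.name: items[f.name] for f in fields(cls) if f.init}
    obj = cls(**init_items)
    # Step 2: setattr the init=False fields, skipping missing values.
    for f in fields(cls):
        if f.init:
            continue
        value = items.get(f.name, _MISSING)
        if value is not _MISSING:
            setattr(obj, f.name, value)
    return obj


obj_missing = build(MyDataClass, {"bar": 1, "foo": _MISSING})
obj_set = build(MyDataClass, {"bar": 1, "foo": 2})
```

With this shape, a missing init=False field leaves the attribute unset, matching the behavior of `obj1` in the motivating example.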


The __post_init__ problem

As mentioned earlier, this __post_init__ problem is really independent from supporting init=False fields.

Agreed.

You’ve pointed out that strange things can occur if __post_init__ gets run twice, specifically if __post_init__ changes the value of init=True fields (e.g. as in your BadGuy example above). I think using __post_init__ to change the value of an init=True field is a strange thing to do, and I’m not sure what use-case would require such logic.

Let me point out that, to ensure __post_init__ is run only once, one can call OmegaConf.create on the dataclass itself (rather than on a dataclass instance):

cfg = OmegaConf.create(BadGuy)
assert cfg.a is MISSING
cfg.a = 4.
obj = OmegaConf.to_object(cfg)
assert obj.a == 16.

This is the pattern that should be used e.g. if __post_init__ has side-effects and must be invoked exactly once.
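For context, the double-invocation effect can be reproduced with plain dataclasses. The BadGuy definition is not shown in this thread excerpt, so the class below is an assumption, chosen to be consistent with the `obj.a == 16.` assertion above: its __post_init__ squares `a`.

```python
from dataclasses import asdict, dataclass


@dataclass
class BadGuy:
    a: float

    def __post_init__(self):
        # Mutates an init=True field, so every extra call changes the result.
        self.a = self.a ** 2


# __post_init__ runs once here.
first = BadGuy(a=2.0)

# Rebuilding from the instance's field values runs __post_init__ a second time.
second = BadGuy(**asdict(first))
```

Starting from the class rather than an instance avoids the first of these two calls.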

I understand your view that attributes of structured DictConfig instances should correspond closely to attributes of the dataclass instances returned from to_object. That being said, I fear that undoing the effects of a call to __post_init__ by invoking setattr could lead to confusion. In the absence of a motivating use-case, I think for now we should preserve the current behavior of OmegaConf.to_object in this case where __post_init__ is defined and where all fields have init=True.

The case where __post_init__ is defined and some fields have init=False:

I think we agree that, in this case, OmegaConf.to_object should raise an error.
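One way such a check could look, as a sketch only. The function `check_to_object_supported`, the choice of `ValueError`, and the example classes `Plain` and `Computed` are all hypothetical and do not reflect OmegaConf's actual code or error types.

```python
from dataclasses import dataclass, field, fields


def check_to_object_supported(cls):
    """Hypothetical guard: reject __post_init__ combined with init=False fields."""
    has_post_init = hasattr(cls, "__post_init__")
    non_init = [f.name for f in fields(cls) if not f.init]
    if has_post_init and non_init:
        raise ValueError(
            f"{cls.__name__} defines __post_init__ and has init=False fields: {non_init}"
        )


@dataclass
class Plain:
    a: float
    c: float = field(init=False)


@dataclass
class Computed:
    a: float
    c: float = field(init=False)

    def __post_init__(self):
        self.c = self.a * 2


check_to_object_supported(Plain)  # passes: no __post_init__ defined

try:
    check_to_object_supported(Computed)
    raised = False
except ValueError:
    raised = True
```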

FWIW, since opening this issue, I have realized that my use case was better handled by @property or even @cached_property (added in Python 3.8). So, consider my suggestion retracted.
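A small sketch of what that alternative could look like for the original example, using only the standard library. The attribute name `c_cached` is introduced here only to show both decorators side by side.

```python
from dataclasses import dataclass, fields
from functools import cached_property


@dataclass
class C:
    a: float
    b: float

    @property
    def c(self) -> float:
        # Recomputed on every access, so it tracks later changes to a and b.
        return self.a + self.b

    @cached_property
    def c_cached(self) -> float:
        # Computed on first access and then stored on the instance.
        return self.a + self.b


obj = C(3.0, 4.0)

# Neither c nor c_cached is a dataclass field, so __init__ only takes a and b.
field_names = [f.name for f in fields(C)]
```

Because the computed values are not fields, there is nothing for a structured config to treat as missing. The trade-off with the cached variant is that it does not update if `a` or `b` change after the first access.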

(And I don’t understand @rsokl's requirements well enough to comment on his proposal.)

Hi, I would also like to register interest in excluding init=False fields. We are creating a framework within PyTorch3D where we try to unify the definitions of PyTorch modules and structured configs. Thus, the same class is passed to structured() and then instantiated in the code. It is useful to annotate types for dynamic fields, and init=False generally fulfils that purpose.

Hi @thomkeh, thanks for the feature request!

OmegaConf should just ignore fields that have init=False set.

What is the motivation for ignoring fields that have init=False? Is this so that the dataclass’s __post_init__ method will be able to initialize that field later (when it’s time to initialize a dataclass instance using the DictConfig’s data)?

EDIT: Here is a WORKAROUND:

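from dataclasses import dataclass

@dataclass
class C:
    a: float
    b: float

    def __post_init__(self):
        # c is a plain instance attribute here, not a dataclass field
        self.c: float = self.a + self.b

obj = C(1, 2)
assert obj.c == 3
reveal_type(obj.c)  # for type checkers only; not a runtime builtin, remove before running
# "c" is not a field of C, but type checkers still reveal obj.c to be of type `float`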

The obj.c attribute behaves “like an init=False field”.

>>> from omegaconf import OmegaConf
>>> d = OmegaConf.create(C)
>>> d.a = 3
>>> d.b = 4
>>> o = OmegaConf.to_object(d)
>>> o.c
7.0