pydantic: Class attributes starting with underscore do not support assignment

Bug

For bugs/questions:

  • OS: macOS
  • Python version (`import sys; print(sys.version)`): 3.6.6
  • Pydantic version (`import pydantic; print(pydantic.VERSION)`): 0.14

In #184 it was suggested to use variables starting with an underscore; however, this does not work. The last comment in #184 referred to the same problem, but it is technically a separate issue.

from pydantic import BaseModel
m = BaseModel()
m._foo = "bar"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/dmfigol/projects/my/public/simple-smartsheet/.venv/lib/python3.6/site-packages/pydantic/main.py", line 176, in __setattr__
    raise ValueError(f'"{self.__class__.__name__}" object has no field "{name}"')
ValueError: "BaseModel" object has no field "_foo"

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 19
  • Comments: 57 (17 by maintainers)

Most upvoted comments

I read the above but saw no explanation for this limitation. From brief study of the source code I could not determine any danger in sunder fields.

This check was introduced in the very first commit (a8e844da), and another check was added in f9cf6b42, but no explanation is given in the commit messages.

Python itself, by design, does not enforce any special meaning for sunder and dunder names (the “we are all consenting adults here” approach). I am used to the convention of marking private-like fields with a single underscore.

Using aliases is clumsy. I use Pydantic to save lines of code and make code more readable, so I would rather not add this magic. It also prevents me from doing the arguably dirty trick of having both _response and response (the processed response).

I would very much like to know the motivation or have this restriction lifted.

Still, I adore Pydantic, thanks!

Hello there, I use pydantic with FastAPI as well, and I use MongoDB as the database.

Mongo stores identifiers in the _id field of collections and query outputs, and handling those via Pydantic models is quite confusing because of this issue. Let me show what I do, and please tell me if I'm doing something wrong; these are the situations I've met in my use case:

  1. When sending the id as output, I have to create a model with an alias:
class Foo(BaseModel):
    id: str = Field(..., alias='_id')
    ...

query = db.collection.find(...)
validated = Foo(**query).dict()
# where validated = {'id': MONGO_HASH, ...}
# send validated as response

Note: if I want to send the id the way Mongo does (for example, if my API is used by another service that deals with queries and Mongo directly), I need to use

validated = Foo(**query).dict(by_alias=True)
# where validated = {'_id': MONGO_HASH, ...}
  2. When storing a PUT in the database, the above requires the input to use id instead of _id, like:
data = read_from_source(...)
# data = {'id': EXISTING_MONGO_HASH, ...}
db.collection.update(Foo(**data).dict(by_alias=True))

But if some other service is sending me raw data, I would get _id in the input, so I need to change the model to

class Foo(BaseModel):
    id: str = Field(..., alias='_id')
    ...
    class Config:
        allow_population_by_field_name = True

otherwise I get a validation error
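For completeness, a minimal sketch (assuming pydantic v1) of what that config buys: with allow_population_by_field_name enabled, both the aliased and the plain field name are accepted on input.

from pydantic import BaseModel, Field

class Foo(BaseModel):
    id: str = Field(..., alias='_id')

    class Config:
        allow_population_by_field_name = True

# populating by the alias always works; by the field name only with the config above
Foo(**{'_id': 'abc'})  # ok
Foo(**{'id': 'abc'})   # ok only because allow_population_by_field_name is set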


There are other situations that I can't recall well enough to show with code, but in general pydantic currently forces me to take care of these differences, since underscores are not accepted. I wouldn't use the id translation at all; I'd always go with _id, because otherwise I risk non-deterministic behaviour in wrapper methods for all models in the codebase and in the responses.

Hope this helps to reason on it, but thanks for this awesome library!

Have you tried using field aliases?

eg,

class MyModel(BaseModel):
    foobar: str
    class Config:
        fields = {'foobar': '_foobar'}
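If it helps, a small usage sketch (assuming pydantic v1, where a string value in the fields mapping is treated as the field's alias; the '_foobar' input below is illustrative):

from pydantic import BaseModel

class MyModel(BaseModel):
    foobar: str

    class Config:
        fields = {'foobar': '_foobar'}

m = MyModel(_foobar='hello')    # input uses the underscored alias
print(m.foobar)                 # 'hello'
print(m.dict(by_alias=True))    # {'_foobar': 'hello'}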

As @frndlytm said:

so really, any identifier that isn’t explicitly prohibited by the python language (the way something like @timestamp is) should probably be supported natively.

Regardless of the underlying database, Pydantic should not enforce what is only a convention. This is definitely not expected behavior.

The same thing happens in the connexion framework. I want to use the default Mongo _id as the id.

class FileMetadata(BaseModel):
    _id: str
    created_date: str

Set id

data = FileMetadata.parse_obj(info.to_dict())
data._id = my_custom_id

Error

ValueError: "FileMetadata" object has no field "_id"

The change shown below (allowing extra fields) solved the problem:

from pydantic import BaseModel, Extra

class FileMetadata(BaseModel):
    _id: str
    created_date: str

    class Config:
        extra = Extra.allow

I have a use case that is kind of the opposite of what others are doing. Effectively, I have a use case where the raw input needs to be processed or fetched lazily and then cached. In my case, if a value is absent, then I need to make an API call to a remote system to fetch the data.

I need to load the raw value that arrives and keep it in a private member (something prefixed with an underscore _). Users will access a property of the class which will do the needful to ensure that a proper value is returned.

Here is what I am trying to do:

from typing import Any, Dict, Optional
from pydantic import BaseModel, Field


class SomeThing(BaseModel):

    some_field: str = Field(..., alias="someField")
    some_other_field: str = Field(..., alias="someOtherField")

    # Lazy-loading fields if they are absent
    _lazy_field: Optional[Dict[str, Any]] = Field(None, alias="lazyField")

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

        self._all_loaded = self._lazy_field is not None

    def _lazy_load(self):
        # reach out to external API...
        # self._lazy_field = processed result from API...
        self._all_loaded = True

    @property
    def lazy_field(self):
        if not self._all_loaded:
            self._lazy_load()

        return self._lazy_field

Unfortunately, this does not work, because fields that are prefixed with an underscore are ignored. To make this work, I need to drop the underscore from the field (name it raw_lazy_load or something equivalent). I would prefer to stick with the “expected” Python behavior and keep it underscored to indicate that you should not directly access that field.

I tried the other class Config options mentioned in the thread, but they don’t work for me. If it’s easy to fix, I’m happy to submit a PR for it.

+1 – Fields starting with _ are also useful for ArangoDB, each vertex object there has a _key, _id, and _rev, and edges also have _from and _to.

I earlier thought I needed that feature, and found quite an easy way to get it via monkey patching:

import pydantic.main

# monkey patch to get underscore fields
def is_valid_field(name: str) -> bool:
    if not name.startswith('__'):
        return True
    elif name == '__root__':
        return True
    return False

pydantic.main.is_valid_field = is_valid_field
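A small usage sketch for the patch above (assuming pydantic v1 internals at the time; the patch must run before any model classes are defined, because the field filtering happens at class-creation time, and WithUnderscore is a hypothetical example):

from pydantic import BaseModel

class WithUnderscore(BaseModel):
    _foo: str = 'bar'  # accepted as a regular field once the patch is applied

print(WithUnderscore().dict())  # {'_foo': 'bar'}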

However I have to say that using an alias works very well now for me!

Yeah, I second all the _attribute support. It’s not just Mongo; any Elasticsearch work behind FastAPI misses out on the meta-properties on documents. And in general, as a DB developer, I find _attributes an expressive way of working around language-specific keywords when necessary, like _id and _timestamp.

To me, it seems like the by_alias workaround, while sufficient to solve any immediate issues, does not express the schema being parsed.

It seems that the justification for not supporting it is that Python conventionally (but without enforcement) uses underscores to mark privacy, but I would argue that Python metaclasses, being a way for the language to rewrite itself, throw that concern out the window, particularly given their use in pydantic.

Additionally, I would ask: what is the core responsibility or value proposition of pydantic? I believe it is a model-parsing library with applications in messaging (e.g. there’s nothing stopping a protobuf loader in the Model Config), and if that IS the value proposition, then the BaseModel declaration should look as close as possible to the model being parsed.

so really, any identifier that isn’t explicitly prohibited by the python language (the way something like @timestamp is) should probably be supported natively.

Just my two cents.

If there’s anything I can do to help push these changes along, I’m dying to get involved in OSS.

I had the same problem with pydantic. In my case this is a must. It looks like this problem is still being ignored and the only solution is to use a different library such as Marshmallow, attrs, or dataclasses.

+1, “allowing fields with underscores” is necessary for the MongoDB issue. Using aliases is just a workaround and not very elegant.

I’m very new to pydantic, and I’m really impressed and really like it. However, the current implementation filters out any field starting with an underscore (main.py 162), and this is completely unconfigurable except by monkey patching, which is not really acceptable. So I think it is really necessary to add an additional configuration parameter.

Hello @samuelcolvin 👋

I just started using pydantic along with FastAPI and pymongo. I used aliases as @pdonorio mentions above, but I would like to understand why attributes starting with an underscore are disallowed in the first place.

Thanks for your suggestions. It appears I have managed to solve this problem, at least for the moment, by changing the design and going for composition instead of inheritance. My model classes now have a class attribute called schema_class, which is type-bound to my schema superclass (a Pydantic BaseModel extension), and they have from_data and to_data methods, which read the serialized structure, validate it using the schema_class, and store the result as self.data on the model class. With the composition design I have also managed to create a PersistenceEnabled mixin which allows me to extend the main Model with SQLAlchemy persistence, using the same pattern: declaring a db_class class attribute and providing saving, loading and searching features on the model. This of course requires three classes to be implemented:

  1. The main domain-related Model class (e.g. MyModel), which can now mix in with all other regular Python classes;
  2. The Pydantic schema class (e.g. MyModelSchema), which does all of the validation and serialization business;
  3. If the domain model needs persistence, also a db class (e.g. MyModelDB), which does all of the CRUD business behind the scenes and also allows for transaction participation (in the case of multi-item operations).

I didn’t really solve the problem, but I worked around it. Every piece is smaller, has clearer responsibilities, and doesn’t cause interference between the different frameworks.
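A rough sketch of the shape this takes (schema_class, from_data and to_data follow the description above; the example fields and everything else are illustrative):

from typing import Any, Dict, Type
from pydantic import BaseModel

class MyModelSchema(BaseModel):
    # does all of the validation and serialization business
    name: str
    count: int = 0

class MyModel:
    # domain model; validation is delegated to schema_class by composition
    schema_class: Type[BaseModel] = MyModelSchema

    def __init__(self, data: BaseModel) -> None:
        self.data = data

    @classmethod
    def from_data(cls, raw: Dict[str, Any]) -> "MyModel":
        # read the serialized structure and validate it with the schema
        return cls(cls.schema_class(**raw))

    def to_data(self) -> Dict[str, Any]:
        return self.data.dict()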

Can the original question be solved in any way? I tried using an alias, but model._foo = False still raises the same error.

#1679 would probably be a good solution

The solution for me was using aliases and by_alias when exporting the model. Using the example from @Seshirantha that would look as follows:

from datetime import date
from pydantic import BaseModel, Field

class FileMetadata(BaseModel):
    id: str = Field(alias="_id")  # the alias can also be set via the Config class
    created_date: str

input_data = {"_id": "some id", "created_date": str(date.today())}
FileMetadata(**input_data).dict(by_alias=True)

That should return {"_id": "some id", "created_date": "whatever today is"}

I am using Pydantic with pymongo, so having access to _id and _meta on the validated model is very important.

I still think most python users would not expect “private” attributes to be exposed as fields.

I know python doesn’t formally prevent attributes whose names start with an underscore from being accessed externally, but it’s a pretty solid convention that you’re hacking if you have to access attributes with leading underscores. (Yes, I know there are some exceptions, but in the grand scheme they’re rare).

In the example quoted above regarding mongo, I would personally feel it’s inelegant design to keep the mongo attribute _id in python with the same name. You should be able to use an alias generator and a shared custom base model to change the field name in all models without repeated code.
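A minimal sketch of that approach (assuming pydantic v1; mongo_alias, MongoModel and User are illustrative names):

from pydantic import BaseModel

def mongo_alias(field_name: str) -> str:
    # expose Mongo's '_id' under the python-friendly name 'id'
    return '_id' if field_name == 'id' else field_name

class MongoModel(BaseModel):
    class Config:
        alias_generator = mongo_alias
        allow_population_by_field_name = True

class User(MongoModel):
    id: str
    name: str

print(User(**{'_id': 'abc', 'name': 'x'}).dict(by_alias=True))  # {'_id': 'abc', 'name': 'x'}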

Still, if people really want this, I’d accept a PR to allow field names with underscores. I guess the best solution would be to add a new method to Config which replaces this logic, perhaps returning None to fallback to the default logic.

Then in v2 we can completely remove Config.keep_untouched; it’s a bad name and a confusing attribute.


None of my business, but: @gpakosz, unless I’m missing something, you’re using a synchronous library for IO (pymongo) while using an async web framework (starlette + fastapi). I would worry that your entire process will hang during every database call, thereby negating the point of asyncio.

@rreusser sorry, I don’t follow. Just saying, I used PrivateAttr and it seemed to resolve all the commotion in this thread.
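For anyone landing here, a small sketch of what that looks like (assuming pydantic v1.5+, where PrivateAttr is available; Thing and _cache are illustrative names):

from typing import Optional
from pydantic import BaseModel, PrivateAttr

class Thing(BaseModel):
    name: str
    # private attribute: assignable on instances, but not a field and never in dict() or the schema
    _cache: Optional[str] = PrivateAttr(default=None)

t = Thing(name='x')
t._cache = 'computed'  # no "object has no field" error
print(t.dict())        # {'name': 'x'}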

I just got bitten by this issue while migrating from dataclasses. I’m using a pydantic model to describe an aggregate result from MongoDB:

class CollectionAggregateId(BaseModel):
    field1: str
    field2: str
    field3: str


class CollectionAggregateModel(BaseModel):
    _id: CollectionAggregateId

    count: int

AttributeError: 'CollectionAggregateModel' object has no attribute '_id'

Using an alias is not an option for me because I have data with both _id and id fields in my database.

I also need the possibility to use underscore attributes. As @samuelcolvin himself suggested, the easiest way would probably be to implement a simple switch in the Config which would allow the “_attribute” style. Is that something you would accept in a PR?

Hey @emremrah, probably because you want to call super(), not super 😉

Ahh, you are right @PrettyWood:

from pydantic import BaseModel

class A(BaseModel):
    id = 5

    def dict(self, **kwargs):
        d = super().dict(**kwargs)
        d['_id'] = d['id']
        del d['id']
        return d

returned {'_id': 5}. So, if anyone is interested, I just have to call the .dict() method when inserting into MongoDB. But initializing A with a MongoDB document like

doc = collection.find_one({})
A(**doc)

still won’t work as we wanted. Any suggestions on that?
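One possible answer, sketched under pydantic v1: the alias approach shown earlier in the thread handles both directions without a custom dict().

from pydantic import BaseModel, Field

class A(BaseModel):
    id: int = Field(..., alias='_id')

doc = {'_id': 5}              # e.g. a MongoDB document
a = A(**doc)                  # populating by the alias works out of the box
print(a.dict(by_alias=True))  # {'_id': 5}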

@153957, @PrettyWood, just to make it clear: underscore_attrs_are_private and private attributes themselves are not meant to be used as fields. They’re meant to be implemented internally and used only inside the model class, not anywhere outside.

Hello, should we close this issue now that private model attributes have been implemented?

from pydantic import BaseModel

class Model(BaseModel):
    _foo: str
    _id: str

    a: str

    class Config:
        underscore_attrs_are_private = True

m = Model(a='pika')
m._foo = 'foo'
m._id = 'id'
print(m.dict())
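# note: with underscore_attrs_are_private, _foo and _id become private attributes
# rather than fields, so the dict() call above prints only {'a': 'pika'}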