sqlalchemy: Add a way to fetch a model or create one if it doesn't exist

Migrated issue, originally created by Markus Meskanen (@Mahi)

Django has a get_or_create() method for times when you don’t know if data for an entity already exists. There are multiple implementations available for SQLAlchemy at http://stackoverflow.com/questions/2546207/does-sqlalchemy-have-an-equivalent-of-djangos-get-or-create/, but I thought it would be nice to have a native support for such widely used feature.

I’m not sure of the ideal implementation, but one that pleases my eye is to add a keyword arugment for one() etc:

session.query(Player).filter_by(id=my_player_id).one(create=True)

Or simply add a new method:

session.query(Player).filter_by(id=my_player_id).get_or_create()

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Comments: 19

Most upvoted comments

Michael Bayer (@zzzeek) wrote:

get_or_create() as being proposed (and as in the blog post EDIT: the very last example accommodates it, at great complexity) does not solve for race conditions. A conflicting row can easily be inserted in between the time that we ask for the row vs. add one to the Session ourselves, causing an INSERT conflict (assuming proper constraints exist, silent data corruption if not). Solving for a race like this can be done in many different ways which are highly dependent on the database and scenario in use - such as using “select…for update” [edit: sorry, keep jumbling up UPDATE/INSERT w/ these issues, for pessimistic this needs to be like LOCK TABLE or serializable isolation], using Postgresql’s “INSERT…ON DUPLICATE UPDATE” is another (works only on Postgresql though).

if django claims their “get_or_create()” solves for race conditions and isn’t trying to do some sort of locking or optimistic concurrency checking, then the irony would be that this bug report claiming django has “solved” this problem is in fact uncovering a bug in django 😃.

I think people here have to appreciate what a big deal “get_or_create()” in the entirely generalized sense really is. People’s applications will work much better if they produce this function themselves since they will be able to tailor for and know exactly what limitations they’ve chosen (since there will have to be some kind of limitations).