sqlalchemy: Chained joinedload causes duplicate sqla objects, when run with pypy
Migrated issue, originally created by Ashley Chaloner
This bug appears when running under PyPy but not CPython. It occurs with both the SQLite and PostgreSQL backends (the only two I’ve tested). It does not raise an exception or produce a stack trace; instead, the same database row is returned more than once, as distinct (non-identical) SQLAlchemy objects.
Versions
- pypy: Python 2.7.13 (5.8.0+dfsg-2~ppa2~ubuntu16.04, Jun 17 2017, 18:50:19) [PyPy 5.8.0 with GCC 5.4.0 20160609]
- cffi==1.10.1
- greenlet==0.4.12
- readline==6.2.4.1
- six==1.10.0
- SQLAlchemy==1.1.13
- (if using UUIDType: SQLAlchemy-Utils==0.32.16)
bash script for venv setup
virtualenv -p /usr/bin/pypy ~/test-venv
source ~/test-venv/bin/activate
pip install \
    cffi==1.10.1 \
    greenlet==0.4.12 \
    readline==6.2.4.1 \
    six==1.10.0 \
    SQLAlchemy==1.1.13
# if using UUIDType:
pip install SQLAlchemy-Utils==0.32.16
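Since the report says the behaviour depends on the interpreter, it may help to confirm what the virtualenv is actually running before trying the reproduction. The following is a minimal sketch rather than part of the original report: the helper names `describe_interpreter` and `sqlalchemy_version` are made up for illustration, and it relies only on the standard `platform` module and SQLAlchemy's `__version__` attribute.

```python
import platform
import sys


def describe_interpreter():
    """Return (implementation, version) for the running interpreter."""
    # python_implementation() yields strings such as "CPython" or "PyPy".
    return platform.python_implementation(), platform.python_version()


def sqlalchemy_version():
    """Return the installed SQLAlchemy version string, or None if not importable."""
    try:
        import sqlalchemy
    except ImportError:
        return None
    return sqlalchemy.__version__


implementation, version = describe_interpreter()
sys.stdout.write("%s %s, SQLAlchemy %s\n"
                 % (implementation, version, sqlalchemy_version()))
if implementation != "PyPy":
    # Per the report, the duplicates were only observed under PyPy.
    sys.stdout.write("Not running under PyPy; the bug may not reproduce.\n")
```

Running this inside the activated virtualenv should make it obvious if the script is accidentally being run with a CPython interpreter.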
Python script to reproduce bug
from sqlalchemy import (
    Column,
    ForeignKey,
    Index,
    Table,
    create_engine
)
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import joinedload, relationship, sessionmaker
from sqlalchemy.types import Integer

Base = declarative_base()


class Base2(Base):
    __abstract__ = True
    attrlist = ["id"]

    # for easier debugging
    def __repr__(self):
        attrs = ["%s:%s" % (a, getattr(self, a)) for a in self.attrlist]
        return "<%s.%s object at 0x%016x: %s>" % (
            self.__class__.__module__, self.__class__.__name__,
            id(self), ", ".join(attrs))


class Scene(Base2):
    __tablename__ = "scene"
    id = Column(Integer, primary_key=True)


class ActScenes(Base2):
    __tablename__ = "act_scenes"
    attrlist = ["id", "act_id", "scene_id"]
    id = Column(Integer, primary_key=True)
    act_id = Column(
        Integer(),
        ForeignKey("act.id"),
        nullable=False)
    scene_id = Column(
        Integer(),
        ForeignKey("scene.id"),
        nullable=False)
    scene = relationship(Scene, lazy="joined")


class Act(Base2):
    __tablename__ = "act"
    id = Column(Integer, primary_key=True)
    scenes = relationship(ActScenes)


acts = Table(
    "acts", Base.metadata,
    Column("play_id", Integer(), ForeignKey("play.id")),
    Column("act_id", Integer(), ForeignKey("act.id"))
)
Index("ix_acts", acts.c.play_id, acts.c.act_id, unique=True)


class Play(Base2):
    __tablename__ = "play"
    id = Column(Integer, primary_key=True)
    acts = relationship("Act", secondary=acts)


def run_test(i):
    """Run one test, and return True if something failed."""
    engine = create_engine('sqlite:///:memory:')
    asessionmaker = sessionmaker()
    asessionmaker.configure(bind=engine)
    Base.metadata.create_all(engine)
    session = asessionmaker()

    play1 = Play()
    act = Act()
    act.scenes = [
        ActScenes(
            act_id=act.id,
            scene=Scene()
        )
        for _ in xrange(1000)
    ]
    play1.acts.append(act)
    scenecounts1 = len(play1.acts[0].scenes)
    session.add(play1)
    session.commit()

    # Comment out this block and watch the bug vanish.
    session.query(Play).options(
        joinedload(Play.acts).joinedload(Act.scenes)
    ).filter_by(id=play1.id).first()
    # End block.

    scenecounts1_again = len(play1.acts[0].scenes)
    if scenecounts1 != scenecounts1_again:
        print "Iteration {} failed".format(i)
        seen_scene_ids = set()
        for scene in act.scenes:
            if scene.id in seen_scene_ids:
                # import pdb; pdb.set_trace()
                print "Duplicate scene.id spotted: {}".format(scene.id)
                scenes_with_id = [s for s in act.scenes if s.id == scene.id]
                print "Scenes with this id:\n{}".format(
                    "\n".join(repr(s) for s in scenes_with_id))
            else:
                seen_scene_ids.add(scene.id)
    return scenecounts1 != scenecounts1_again


if __name__ == "__main__":
    N = 25
    results = [run_test(i) for i in xrange(N)]
    print "{} out of {} failed.".format(sum(results), N)
Sample output
(test-venv) user@host:~$ pypy test_chained_joinedload.py
Iteration 0 failed
Duplicate scene.id spotted: 185
Scenes with this id:
<__main__.ActScenes object at 0x00007f77de383948: id:185, act_id:1, scene_id:185>
<__main__.ActScenes object at 0x00007f77de9c4790: id:185, act_id:1, scene_id:185>
1 out of 25 failed.
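In the output above, the two `ActScenes` reprs share `id:185` but have different memory addresses, i.e. one row is represented by two distinct Python objects. A check along these lines could be sketched as a small standalone helper; this is an illustration only, not code from the report. The names `find_duplicate_identities` and `Row` are hypothetical, and the helper assumes each object exposes its primary key as `.id`, as the models in the script do.

```python
from collections import defaultdict


def find_duplicate_identities(objects, key=lambda obj: (type(obj), obj.id)):
    """Return {identity_key: [objects]} for keys held by more than one object."""
    by_key = defaultdict(list)
    for obj in objects:
        bucket = by_key[key(obj)]
        # Compare with `is`, so the same object listed twice counts only once.
        if not any(existing is obj for existing in bucket):
            bucket.append(obj)
    return {k: objs for k, objs in by_key.items() if len(objs) > 1}


class Row(object):
    """Hypothetical stand-in for a mapped class with an `id` primary key."""

    def __init__(self, id):
        self.id = id


first, twin, other = Row(185), Row(185), Row(7)
# `first` appears twice, but only `first` and `twin` are distinct objects
# sharing one primary key.
duplicates = find_duplicate_identities([first, twin, other, first])
```

In the reproduction script, something like `find_duplicate_identities(act.scenes)` would then return a non-empty dict on a failing iteration, without depending on the length comparison.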
About this issue
- State: closed
- Created 7 years ago
- Comments: 19
Michael Bayer (@zzzeek) wrote:
Gerrit review: https://gerrit.sqlalchemy.org/#/q/I9f6ae3fe5b078f26146af82b15d16f3a549a9032
A patched version is available for early testing at https://gerrit.sqlalchemy.org/changes/504/revisions/1ceb88eb53bdce1fa98c6b044f996fb995645876/archive?format=tgz
Thanks for the effort on this great bug report; this must have been very difficult to isolate!