neomodel: Upgrading to version 3.3.0 causes a KeyError on initialising connection to neo4j DB

We have a neo4j instance, populated with the help of neomodel v3.2.9. The v3.3.0 of neomodel breaks when attempting to initialise the connection to the database:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/neomodel/util.py", line 150, in _object_resolution
    resolved_object = self._NODE_CLASS_REGISTRY[frozenset(a_result_attribute[1].labels)].inflate(
KeyError: frozenset({'Person'})

This appears to be as a result of the self._NODE_CLASS_REGISTRY changes included in v3.3.0, which appear to be breaking. Is this a known issue? Given that this is a minor version bump, we would not expect breaking changes to be included.

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Reactions: 7
  • Comments: 78 (18 by maintainers)

Commits related to this issue

Most upvoted comments

@aanastasiou The changes appears to work:

  • I populated a flask+neo4j app with some data using neomodel 3.2.9
  • Using neomodel 3.2.9, I was able to POST additional entities successfully
  • Using neomodel 3.3.0, I was unable to POST additional entities as described in this issue
  • Using the version in the branch specified above, I was able to post entities successfully

Subsequent switching between the versions has the behaviour you would expect (3.2.9 and the above branch work; 3.3.0 results in the node registry problems).

@jonadaly @manoadamro @theterg @mzgajner @mostafa @mjmare @MardanovTimur @phoebebright @robertlagrant @robinedwards @jberends @oikonomou

Can I please ask you guys to give it a go and test this branch?

It incorporates a number of pull requests the most important of which for this issue is the one by @oikonomou that so far seems to have solved this issue:

  • It works with this Flask test-case (main_naive.py);
  • It works with this very simple Django test case
    • @mostafa, there is a tiny little bit of configuration you need to do so that the Django app gets to connect successfully to your server; and
  • It also seems to work in a very brief multithreaded test I run over a tiny little cluster to test explicit transactions.

@mjmare, @robertlagrant talked about Pylons / Pyramid before so it would be nice to hear from you if it works there too, although, I do not see why it should not.

Looking forward to hearing from you

I am also running into this issue in a different context which may be of use. I’m attempting to use neomodel from a django environment via django-neomodel. In this environment I have even less control over the ordering of import statements or the threading context in which neomodel is imported. I’m not very well versed on django internals, but it looks as though the global db object is re-instantiated many times during the lifecycle of a django process.

During django startup I see the db object being recreated and my models successfully added to _NODE_CLASS_REGISTRY. However during a webrequest to a View object, I see the db object have been re-instantiated immediately before servicing the request, and my model never gets added back to the _NODE_CLASS_REGISTRY.

It’s worth mentioning that neomodel works perfectly fine when invoked from a normal django shell python manage.py shell. I may try to dig a little deepr, but thought i’d give you a heads-up. Maybe a global db object is difficult to manage in this context?

EDIT: I can confirm that reverting to 3.2.9 solves the problem.

@jberends Yes, I did:

Version 3.3.1 2019-02-08
 * Fixed a number of warnings due to deprecations both within neomodel and pytest and overall improvements in
   code style (#381) - Abhishek Modi
 * Added support for spatial data through neomodel.contrib.spatial_datatypes (#384) - Athanasios Anastasiou
 * Added the ability to filter "left-hand statements" (in NodeSet expressions) too (#395) - Grzegorz Grzywacz
 * Refactor the Node Class Registry to make util.Database a true Singleton (#403) - Giorgos Oikonomou
     * Many thanks to Giorgos Oikonomou, Jon Daly, Adam Romano, Andrew Tergis, Mato Žgajner, Mostafa Moradian, mjmare,
       Phoebe Bright, Robert Grant, jberends and anyone else who helped flag, track and correct this bug. For more
       information, please see: https://github.com/neo4j-contrib/neomodel/issues/378
     * Added the ClassAlreadyDefined exception to prevent against accidental redefinitions of data model classes (#409)
       - Athanasios Anastasiou
 * Added the ability to lazily `.nodes.get()` and `.nodes.all()` (#406) - Matan Hart
 * Fixed a bug in the assumed direction of relationships in _build_merge_query (#408) - MrAnde7son

Will make sure that these changes are reflected in the repository too.

Hello everyone, the release was made last Friday. We are now on 3.3.1. I am sorry I did not make a special announcement.

I would also like to thank everyone who took the time to look into any aspect of this issue and give us feedback or suggestions.

All the best

We’ve had it running in dev for a couple of weeks and it seems stable to us.

@jberends Thanks for taking the time to perform any additional tests, just bear with me as I am going through a number of other contributions at the moment prior to making the next release please.

@robertlagrant A possible way forward if waitress simply does multiple threads (?). Please see the pull request (work not completed as far as I can tell) and discussion there.

I’ve tested the @oikonomou changes, but to no avail! The issue persists!

@jberends

AFAIK in Django you have a DatabaseWrapper object that is thread specific that manages the connection to the database…

My question was specifically about how Django’s middleware uses neomodel. However, from what I see here, this is not very far off from the way neomodel is set up as well. The only problem is maintaining the state of the neomodel.db object (more about this further below).

Is neomodel assuming a single open connection?

I am assuming you mean one connection that needs to remain connected to the server. If that is the case, then no.

I am curious to see how neomodel manages this as we experience the same problems with 3.3.0 as anyone else. Will make some effort to do testing.

Please see this gist which is also linked in my last post. This is a very crude outline of how the problem is dealt with in Flask and it shows how to “save” the state between calls.

I have not looked any further details into this but I am fairly confident that applications can be isolated from each other so that it is possible to run different applications on the same server, all using neomodel but over different connections and schemata.

As mentioned above, I think that this is going to have to be solved on a “per-framework” basis. Django has its own middleware infrastructure so does Pyramid and other frameworks.

I think that it would help if there was a repository that demonstrated how Django models integrate with neomodel (which provides its own models. Do you inherit from both? Do you build Django wrappers around neomodel classes to “force” it to talk to the database via neomodel?). We could then use that one to discuss ways forward. If you have experience with Django middleware and would like to create one for neomodel, even better.

All the best

Hi @aanastasiou I did some more spelunking. In pyramid I tried several approaches to save and restore the neomodel.db. They failed. Well, technically they worked but the db is still empty ie an empty Database() object in the view. Other test objects were OK. I also found out, by setting breakpoints, that the models’ definition code is only run once.

Then I noticed in neomodel.util that Database is defined as:

class Database(thread.local):

Doesn’t that mean that it is to be expected/intended for the class to be newly created for each request? IOW db._CLASS_REGISTRY will always be empty except in the thread that runs the program startup code. It is not a singleton!

I changed the code to

class Database():

and everything works again! Don’t know whether this messes up something else, but for now I’m happy.

Let me chime in here. I have the exact same problem in a Pyramid app. On app init everything is well, labels etc are installed. db._NODE_CLASS_REGISTRY is populated etc. However, in a view, db._NODE_CLASS_REGISTRY is empty and obviously I get a neomodel.exceptions.ModelDefinitionMismatch exception. Using

import neomodel
from blah import MyModel 

does not solve the problem. neomodel.config.DATABASE_URL is still correct in the view.

Downgrading to neomodel 3.2.9 “fixes” the problem.

@mostafa I agree. I can see that for myself too. If you notice further above, I am trying to give temporary solutions to problems that users face until we can come up with a better solution. I don’t think that this comes across very efficiently, so I am going to have to try harder 😃

Thanks for letting us know that the g option appears to be a deadend.

Will try this evening and report back

@manoadamro

Background:

  1. neomodel 3.3.0 solves a specific issue with it that allows it to instantiate classes properly in a hierarchy. Prior to this, if a class had an association with an abstract class, then upon instantiation it would instantiate a node to that abstract class, even if subsequent specialisations of that class pointed to specialised versions of the abstract class. You can read more about this here.

  2. This was solved by adding a mapping (a dictionary) in neomodel.util.Database() (tuple(node labels)=>node class) that is populated by the NodeMeta class upon a given StructuredNode definition.

The issue:

  1. The consequence of this is that if a class is not defined, it is unknown to the mapping and consequently it will raise ModelDefinitionMismatch

  2. Now, the question is, how do you get into this condition? So far, I have not been able to reproduce the problem EXACTLY as you mention. But here is the closest I got to:

    • Suppose that we have a project with modules neomodel, A, B, C, D where A,B,C,D are purely various specialisations of StructuredNode (i.e. an application’s models).
      • A depends on B
      • C depends on A
      • D depends on A
  3. From the point of view of Node inheritance, it is impossible for a node class to be imported and not end up in the neomodel mapping.

    • The reason why, here, is trivial. Any import, imports the whole module and depending on its form will choose to import a particular symbol to the calling namespace. So doing from D import AClassDefinedInD, imports D (which will import A which imports B) and subsequently AClassDefinedInD to the local namespace.
    • Therefore upon doing this, the mapping is populated with the full lineage of the class and it can instantiate it properly.
  4. BUT, Nodes can maintain relationships with other nodes. In this case, from the point of view of referential integrity we need to know:

    • The data types of the relationship’s end points, the direction of the relationship and the multiplicity (one-to-one, one-to-many, etc)
    • The properties of the relationship.
  5. And here is where the issue is because neomodel (currently) does not follow the relationships of a class to check if it might attempt to load a class that has not yet been defined. Therefore, if AClassDefinedInD contains some some_aclass_attribute = neomodel.Relationship("C.ModelWhatever", "REL_TO", model=AClassToWhatever), the C module will not have been imported when importing D which will cause a runtime error if the property is traversed.

    • Notice here, if you did not have to traverse the property, no error will be raised because neomodel is not going to try to instantiate the object on the other side of the relationship which is the one whose definition is not yet in the mapping.
    • This was partly why it was a bit difficult to reproduce the error with trivial attempts.

Temporary Fix:

Now, is the solution for neomodel to actually try and traverse a relationship and automate the loading? (maybe that’s too slow (?)). Shall we make the type parameter of the Relationship an actual data type rather than a string? (Maybe that’s a bit too constraining (?)).

I have not thought about this yet, maybe the fix ends up being a combination of the two.

As a temporary solution however, I would suggest that you examine your Relationship definitions within a particular module and make sure that you import the classes that might be referenced within that module. Yes, this means that you might end up importing classes that are not used within the module, but this is a temporary fix so that you continue using 3.3.0 which is better at handling inheritance (and also includes a number of other very useful contributions).

Can I please ask you to confirm if this fixes the issue for you?

@manoadamro That is great, thank you for letting me know. I am not advocating bad code style with my suggestion but since that worked, I now have something to work with on my end and a better idea of how to go about fixing the problem.

If you come up with any specific fixes in the meantime, please let me know or go ahead with a PR.

All the best

Adding import neomodel at the top of the file producing the error has no effect.

The full stack trace is:

File "./lib/python3.6/site-packages/neomodel/match.py", line 564, in get_or_none
    return self.get(**kwargs)
  File "./lib/python3.6/site-packages/neomodel/match.py", line 548, in get
    result = self._get(limit=2, **kwargs)
  File "./lib/python3.6/site-packages/neomodel/match.py", line 539, in _get
    return self.query_cls(self).build_ast()._execute()
  File "./lib/python3.6/site-packages/neomodel/match.py", line 445, in _execute
    results, _ = db.cypher_query(query, self._query_params, resolve_objects=True)
  File "./lib/python3.6/site-packages/neomodel/util.py", line 32, in wrapper
    return func(self, *args, **kwargs)
  File "./lib/python3.6/site-packages/neomodel/util.py", line 202, in cypher_query
    results = self._object_resolution(results)
  File "./lib/python3.6/site-packages/neomodel/util.py", line 162, in _object_resolution
    raise ModelDefinitionMismatch(a_result_attribute[1], self._NODE_CLASS_REGISTRY)
neomodel.exceptions.ModelDefinitionMismatch: Node with labels Person does not resolve to any of the known objects

Hello @jdalydt

Thanks for letting us know about this. No, it is not a known issue so far and I am sorry if it has caused a headache to you guys.

Can you please confirm if import neomodel comes before any other neomodel imports you might be doing on the file that raises the error?

This is related to the way neomodel creates a mapping between labels and native classes to be able to properly instantiate a class given a generic Node as retrieved from the database. To make sure that every call references the same neomodel “instance”, import model has to be placed first.

All the best

P.S.: Can I please ask @Winawer, @mstefaniak10 and @manoadamro to confirm if they also receive a similar error and let us know if the fix that is suggested in this post fixed it?