pyRDF2Vec: SPARQLConnector error when building locally

🐛 Bug

When I build this project locally (with pip install .), I get the following error message:

TypeError: unhashable type: 'SPARQLConnector'

I do not get this error message when I install this project from PyPI (with pip install pyRDF2Vec).

Expected Behavior

No error, regardless of whether I build locally or install from a remote release.

Current Behavior

The error message shared above.

Steps to Reproduce

  1. Clone this repository.
  2. Enter the directory of the resulting clone and run pip install .
  3. Create a file main.py with the following content and run python main.py:
import pandas as pd
from pyrdf2vec import RDF2VecTransformer
from pyrdf2vec.embedders import Word2Vec
from pyrdf2vec.graphs import KG
from pyrdf2vec.walkers import RandomWalker
data = pd.read_csv("https://raw.githubusercontent.com/IBCNServices/pyRDF2Vec/master/samples/countries-cities/entities.tsv", sep="\t")
entities = [entity for entity in data["location"]]
knowledge_graph = KG("https://dbpedia.org/sparql")
transformer = RDF2VecTransformer(
    Word2Vec(epochs=10),
    walkers=[RandomWalker(4, 10, with_reverse=False, n_jobs=2)])
embeddings, literals = transformer.fit_transform(knowledge_graph, entities)
  4. Observe the following traceback:
Traceback (most recent call last):
  File "dbpedia.py", line 12, in <module>
    embeddings, literals = transformer.fit_transform(knowledge_graph, entities)
  File "/home/wouter/.local/lib/python3.8/site-packages/pyrdf2vec/rdf2vec.py", line 146, in fit_transform
    self.fit(kg, entities, is_update)
  File "/home/wouter/.local/lib/python3.8/site-packages/pyrdf2vec/rdf2vec.py", line 107, in fit
    walks = self.get_walks(kg, entities)
  File "/home/wouter/.local/lib/python3.8/site-packages/pyrdf2vec/rdf2vec.py", line 166, in get_walks
    if kg.skip_verify is False and not kg.is_exist(entities):
  File "/home/wouter/.local/lib/python3.8/site-packages/pyrdf2vec/graphs/kg.py", line 374, in is_exist
    responses = [self.connector.fetch(query) for query in queries]
  File "/home/wouter/.local/lib/python3.8/site-packages/pyrdf2vec/graphs/kg.py", line 374, in <listcomp>
    responses = [self.connector.fetch(query) for query in queries]
  File "/home/wouter/.local/lib/python3.8/site-packages/cachetools/__init__.py", line 686, in wrapper
    return c[k]
  File "/home/wouter/.local/lib/python3.8/site-packages/cachetools/__init__.py", line 414, in __getitem__
    link = self.__getlink(key)
  File "/home/wouter/.local/lib/python3.8/site-packages/cachetools/__init__.py", line 501, in __getlink
    value = self.__links[key]
  File "/home/wouter/.local/lib/python3.8/site-packages/cachetools/keys.py", line 19, in __hash__
    self.__hashvalue = hashvalue = hash(self)
TypeError: unhashable type: 'SPARQLConnector'

Environment

  • Operating system: Ubuntu 20.04.4 LTS
  • pyRDF2Vec version: 0.2.3
  • Python version: 3.8.10

Additional remarks

I see the same error message in #64, but the problem description there is very different (use of a specific IDE IIUC).
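
For reference, the unhashable part can be reproduced in isolation. The sketch below assumes SPARQLConnector is a plain attrs class (attrs sets __hash__ to None when eq is true and the class is not frozen, which makes instances unhashable) and that the cache key is built from the connector instance; Connector is a hypothetical stand-in:

import attr
from cachetools import LRUCache
from cachetools.keys import hashkey

@attr.s  # eq=True by default; with frozen=False, attrs sets __hash__ to None
class Connector:
    endpoint = attr.ib()

cache = LRUCache(maxsize=128)
# The key tuple contains the connector instance itself.
key = hashkey("fetch", Connector("https://dbpedia.org/sparql"), "SELECT ...")
try:
    cache[key]  # hashing the key hashes its elements, including the connector
except TypeError as error:
    print(error)  # unhashable type: 'Connector'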

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 16 (7 by maintainers)

Most upvoted comments

@rememberYou Running pip install . on the last commit in master indeed works for me. Thanks for fixing this!

Commit d039a702c9b072983b5f60ba7892b8d7a2cf1993 on the develop branch fixes the issue with cachetools. If you clone master it should work too, since the git reset there went back to an older version of cachetools.

@wouterbeek Does it work on your side?

Using the develop branch, I’m able to extract the walks from the TriplyDB endpoint without issues. I haven’t tested with the master branch.

Last comment for this issue: cachetools 5.0.0 works if we simply remove the explicit key function, replacing:

@cachedmethod(operator.attrgetter("cache"), key=partial(hashkey, "fetch"))
def foo(self, bar):
    pass

by:

@cachedmethod(operator.attrgetter("cache"))
def foo(self, bar):
    pass
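
For context, with its imports the fixed usage would look roughly like this (a sketch rather than the actual pyrdf2vec source; the Connector class, the fetch body, and the TTLCache parameters are stand-ins):

import operator

from cachetools import TTLCache, cachedmethod

class Connector:
    def __init__(self):
        # The decorator looks the cache up on the instance via attrgetter("cache").
        self.cache = TTLCache(maxsize=1024, ttl=1200)

    # With no explicit key function, the default key in cachetools 5.0.0 is
    # built from the method arguments without the instance, which is
    # presumably why the unhashable connector no longer breaks the lookup.
    @cachedmethod(operator.attrgetter("cache"))
    def fetch(self, query):
        ...  # perform the SPARQL request here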

I’ll double-check that there are no more problems with cachetools and then push this fix.

I’ve already tried that, but the error is always the same. I will look into fixing this as soon as possible and also update the attrs syntax, which has changed since then and is now easier to use.

I did a git reset and push -f to roll our repo back to the latest working commits. All commits after that point were removed, however, so it’s best to store a copy locally if you want to salvage anything from them.

cachedmethod is probably applied to a method with a self parameter, so self (here the SPARQLConnector) ends up in the cache key as well.

This throws an error because Python doesn’t know how to hash it.
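
A tiny sketch of that mechanism (Thing is a hypothetical stand-in for the connector; like an attrs class with eq=True, it defines __eq__ without __hash__, which makes it unhashable):

from cachetools.keys import hashkey

class Thing:
    def __eq__(self, other):  # defining __eq__ sets __hash__ to None
        return self is other

print(hash(hashkey("fetch", "query")))  # fine: only hashable arguments

try:
    hash(hashkey("fetch", Thing(), "query"))  # the instance slipped into the key
except TypeError as error:
    print(error)  # unhashable type: 'Thing'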