gruut: Two test failures in French phonemizer (sqlite3 connection)

Not entirely sure what db_path it’s trying to access here, and these are the only two tests failing for me.

======================================================================
ERROR: test_last_token (tests.test_fr_phonemizer.FrenchPhonemizerTestCase)
Ensure liason does not leave last token
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/build/source/tests/test_fr_phonemizer.py", line 75, in test_last_token
    phonemes = text_to_phonemes("Est-ce-que", lang="fr")
  File "/build/source/gruut/__init__.py", line 224, in text_to_phonemes
    return_tuples.extend(
  File "/build/source/gruut/__init__.py", line 224, in <genexpr>
    return_tuples.extend(
  File "/build/source/gruut/lang.py", line 858, in phonemize
    for (token1, token1_pron), (token2, token2_pron) in pairwise(
  File "/build/source/gruut/utils.py", line 125, in pairwise
    next(b, None)
  File "/build/source/gruut/phonemize.py", line 271, in phonemize
    token_pron = self.get_pronunciation(token)
  File "/build/source/gruut/phonemize.py", line 328, in get_pronunciation
    word_prons = list(pron for _word, pron in self.select_prons(word))
  File "/build/source/gruut/phonemize.py", line 328, in <genexpr>
    word_prons = list(pron for _word, pron in self.select_prons(word))
  File "/build/source/gruut/phonemize.py", line 607, in select_prons
    self._connect()
  File "/build/source/gruut/phonemize.py", line 481, in _connect
    self.db_conn = sqlite3.connect(self.db_path)
sqlite3.OperationalError: unable to open database file

======================================================================
ERROR: test_liason (tests.test_fr_phonemizer.FrenchPhonemizerTestCase)
Test addition of liason
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/build/source/tests/test_fr_phonemizer.py", line 51, in test_liason
    sentence = self._without_and_with_liason(
  File "/build/source/tests/test_fr_phonemizer.py", line 93, in _without_and_with_liason
    phonemes_no_liason = list(phonemizer_no_liason.phonemize(sentence.tokens))
  File "/build/source/gruut/lang.py", line 849, in phonemize
    yield from token_phonemes
  File "/build/source/gruut/phonemize.py", line 271, in phonemize
    token_pron = self.get_pronunciation(token)
  File "/build/source/gruut/phonemize.py", line 328, in get_pronunciation
    word_prons = list(pron for _word, pron in self.select_prons(word))
  File "/build/source/gruut/phonemize.py", line 328, in <genexpr>
    word_prons = list(pron for _word, pron in self.select_prons(word))
  File "/build/source/gruut/phonemize.py", line 607, in select_prons
    self._connect()
  File "/build/source/gruut/phonemize.py", line 481, in _connect
    self.db_conn = sqlite3.connect(self.db_path)
sqlite3.OperationalError: unable to open database file

----------------------------------------------------------------------

I tried to debug this by printing out db_path, but the only paths it reports opening are :memory: and gruut/data/en-us/lexicon.db.

diff --git a/gruut/phonemize.py b/gruut/phonemize.py
index 4650fd3..934af86 100644
--- a/gruut/phonemize.py
+++ b/gruut/phonemize.py
@@ -477,8 +477,13 @@ class SqlitePhonemizer(Phonemizer):
         if self.db_conn is None:
             assert self.db_path is not None, "No sqlite3 database path"
 
-            _LOGGER.debug("Connecting to %s", self.db_path)
+            _LOGGER.error("Connecting to %s", self.db_path)
             self.db_conn = sqlite3.connect(self.db_path)
+            _LOGGER.error("OK")
+
+            import sys
+            sys.stdout.flush()
+            sys.stderr.flush()
 
             if not self.feature_to_id:
                 # Try to load feature names from the database
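
One thing that may help narrow this down: sqlite3.connect normally creates a database file that does not exist yet, so "unable to open database file" tends to point at a missing (or unwritable) parent directory rather than only a missing lexicon.db. Below is a minimal sketch reproducing that behaviour outside of gruut. The paths are hypothetical and only for illustration; nothing here reflects gruut's actual data layout.

```python
import sqlite3
import tempfile
from pathlib import Path


def try_connect(db_path):
    """Return None if the connection succeeds, else the sqlite3 error message."""
    try:
        conn = sqlite3.connect(str(db_path))
        conn.close()
        return None
    except sqlite3.OperationalError as err:
        return str(err)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical paths, used only to demonstrate the two cases
    missing_dir_db = Path(tmp) / "fr-fr" / "lexicon.db"  # parent directory absent
    existing_dir_db = Path(tmp) / "lexicon.db"  # parent directory present

    missing_dir_error = try_connect(missing_dir_db)
    existing_dir_error = try_connect(existing_dir_db)

    print("missing parent dir:", missing_dir_error)
    print("existing parent dir:", existing_dir_error)
```

If the first case matches the error in the tracebacks, checking whether the directory that db_path points into exists at all is probably the quickest next step.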

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 19 (8 by maintainers)

Most upvoted comments

Lol, I missed the extra “i” in liaison this whole time 🤦‍♂️

Please keep bothering me 🙂

The problem here is that the lexicon.db files for some languages are over the size limit that GitHub has for files stored in a repo. So I either need to use Git LFS or assume that the files can be pulled from a release.

Any suggestions?

Yep, that works! Thank you.

I might wear out the facepalm emoji here 🤦‍♂️

OK, got the POS model added to the French model. I can at least confirm that tests pass for me with a clean pull of v1.2 after running the create-venv.sh script.

Thanks for your patience 👍

I’ve pushed the files up to a side branch: https://github.com/rhasspy/gruut/tree/v1.2

Turns out the biggest file was still under the limit!
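
For anyone wanting to check this before pushing, here is a rough sketch that lists files above a size threshold. It assumes GitHub's hard per-file limit of 100 MiB; the function name and the default threshold constant are made up for this example and are not part of gruut.

```python
from pathlib import Path

# Assumption: GitHub rejects files larger than 100 MiB in a regular repo
GITHUB_FILE_LIMIT = 100 * 1024 * 1024


def files_over_limit(root, limit=GITHUB_FILE_LIMIT):
    """Return (path, size) pairs for files under root larger than limit bytes."""
    results = []
    for path in Path(root).rglob("*"):
        if path.is_file():
            size = path.stat().st_size
            if size > limit:
                results.append((path, size))
    # Largest files first
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Calling files_over_limit on the repository's data directory returns an empty list when every file fits under the threshold.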

With my package maintainer hat on: We would be fine with fetching via git-lfs, and it is probably easier to work with than having to track/manage separate release artifacts.

Ah, it seems my language files packaging is at fault.

/nix/store/fih6hsnikzc4pmb6ri6nri58jghacz7z-python3.8-gruut-lang-fr-1.2.2/
/nix/store/fih6hsnikzc4pmb6ri6nri58jghacz7z-python3.8-gruut-lang-fr-1.2.2/lib
/nix/store/fih6hsnikzc4pmb6ri6nri58jghacz7z-python3.8-gruut-lang-fr-1.2.2/lib/python3.8
/nix/store/fih6hsnikzc4pmb6ri6nri58jghacz7z-python3.8-gruut-lang-fr-1.2.2/lib/python3.8/site-packages
/nix/store/fih6hsnikzc4pmb6ri6nri58jghacz7z-python3.8-gruut-lang-fr-1.2.2/lib/python3.8/site-packages/gruut_lang_fr
/nix/store/fih6hsnikzc4pmb6ri6nri58jghacz7z-python3.8-gruut-lang-fr-1.2.2/lib/python3.8/site-packages/gruut_lang_fr/__pycache__
/nix/store/fih6hsnikzc4pmb6ri6nri58jghacz7z-python3.8-gruut-lang-fr-1.2.2/lib/python3.8/site-packages/gruut_lang_fr/__pycache__/__init__.cpython-38.pyc
/nix/store/fih6hsnikzc4pmb6ri6nri58jghacz7z-python3.8-gruut-lang-fr-1.2.2/lib/python3.8/site-packages/gruut_lang_fr/VERSION
/nix/store/fih6hsnikzc4pmb6ri6nri58jghacz7z-python3.8-gruut-lang-fr-1.2.2/lib/python3.8/site-packages/gruut_lang_fr/__init__.py
/nix/store/fih6hsnikzc4pmb6ri6nri58jghacz7z-python3.8-gruut-lang-fr-1.2.2/lib/python3.8/site-packages/gruut_lang_fr-1.2.dist-info
/nix/store/fih6hsnikzc4pmb6ri6nri58jghacz7z-python3.8-gruut-lang-fr-1.2.2/lib/python3.8/site-packages/gruut_lang_fr-1.2.dist-info/INSTALLER
/nix/store/fih6hsnikzc4pmb6ri6nri58jghacz7z-python3.8-gruut-lang-fr-1.2.2/lib/python3.8/site-packages/gruut_lang_fr-1.2.dist-info/METADATA
/nix/store/fih6hsnikzc4pmb6ri6nri58jghacz7z-python3.8-gruut-lang-fr-1.2.2/lib/python3.8/site-packages/gruut_lang_fr-1.2.dist-info/RECORD
/nix/store/fih6hsnikzc4pmb6ri6nri58jghacz7z-python3.8-gruut-lang-fr-1.2.2/lib/python3.8/site-packages/gruut_lang_fr-1.2.dist-info/REQUESTED
/nix/store/fih6hsnikzc4pmb6ri6nri58jghacz7z-python3.8-gruut-lang-fr-1.2.2/lib/python3.8/site-packages/gruut_lang_fr-1.2.dist-info/WHEEL
/nix/store/fih6hsnikzc4pmb6ri6nri58jghacz7z-python3.8-gruut-lang-fr-1.2.2/lib/python3.8/site-packages/gruut_lang_fr-1.2.dist-info/direct_url.json
/nix/store/fih6hsnikzc4pmb6ri6nri58jghacz7z-python3.8-gruut-lang-fr-1.2.2/lib/python3.8/site-packages/gruut_lang_fr-1.2.dist-info/top_level.txt

Not exactly sure where the package data specified in the language’s setup.py is located tbh.
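
Package data declared in a setup.py would normally end up inside the package directory itself, and the listing above shows only __init__.py and VERSION there. A small sketch for checking what an installed package actually ships follows; list_package_files is a hypothetical helper written for this comment, not something gruut provides.

```python
import importlib.util
from pathlib import Path


def list_package_files(package_name):
    """Return sorted relative paths of all files inside an installed package."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        raise ModuleNotFoundError(f"{package_name} is not an importable package")

    files = []
    for location in spec.submodule_search_locations:
        root = Path(location)
        # Collect every regular file, not just Python modules
        files.extend(
            p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
        )
    return sorted(files)
```

Running list_package_files("gruut_lang_fr") in the build environment should show whether a lexicon.db (or any other data file) made it into the installed package.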