modernmt: Tuning error
Hi Andrea and Davide,
I am opening a new issue for the tuning error.
First, I upgraded to the latest release “1.0.3” to see if this would resolve the issue, but apparently it didn’t.
Here is the error again at step (3 of 4) Tuning. The durations of the first two phases also seem strange: 6s for corpora tokenization and 0s for merging the corpus! Is this normal?
```
INFO: (1 of 4) Corpora tokenization... DONE (in 6s)
INFO: (2 of 4) Merging corpus... DONE (in 0s)
INFO: (3 of 4) Tuning... DONE (in 5s)
ERROR Unexpected exception: Command '/root/ModernMT/cli/opt/mert-moses.perl /root/ModernMT/runtime/default/tmp/tuning/corpus.en /root/ModernMT/runtime/default/tmp/tuning/corpus.ar /root/ModernMT/cli/opt/mertinterface.py /tmp/tmputq3Uv --threads 4 --mertdir /root/ModernMT/build/bin --mertargs '--binary --sctype BLEU' --working-dir /root/ModernMT/runtime/default/tmp/tuning/mert --nbest 100 --decoder-flags "--port 8045" --nonorm --closest --no-filter-phrase-table --bleuscorer /root/ModernMT/cli/opt/mmt-bleu.perl --bleuscorer-flags "-nt" --early-stopping-value 1 --predictable-seeds --maximum-iterations=25' failed with exit code 2
```
I have also uploaded the logs @
Please let me know how to resolve this issue. Could this be a memory issue?
One more thing: is there a way to know whether my old engine is compatible with the latest release? If it works fine after the upgrade and returns translations from the server, does that mean it's compatible and doesn't need porting?
Also, what is the command to get the currently installed version of ModernMT?
Thanks again for your support.
Kind regards, Mohamed
About this issue
- State: closed
- Created 7 years ago
- Comments: 19 (10 by maintainers)
Commits related to this issue
- #271 fix in cluster.py: if context is empty do not pass it to translate request — committed to modernmt/modernmt by andrea-de-angelis 7 years ago
- hotfix #271: ignore context elements without domain — committed to modernmt/modernmt by andrea-de-angelis 7 years ago
Hi @mzeidhassan,
I apologize for the very delayed response. We've been quite busy with the new neural releases lately, so I haven't had much time to investigate this issue until now.
However, I have good news 😃 I believe that the commits and fixes we have pushed in the last few weeks have solved this issue.
More in detail: I replicated your issue on an old 1.0.3 engine using the context and db models you sent me a while ago. I found out that the context analysis tried to use a domain (domain 14) that apparently was not contained in the `domains` table in the DB. That was the reason why the tuning failed.

Having domains in the context analysis models but not finding them in the `domains` DB table should not be possible, so I'm not really sure what caused this; maybe a Cassandra inconsistency?

However, some of our recent commits (and my old hotfix!) should definitely prevent situations like this. So, since this issue finally seems solved, I'm going to close it.
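For the curious, the idea behind those fixes (per the commit messages above) is simple: drop any context entry whose domain cannot be resolved, and don't pass a context at all when nothing valid remains. Here is a rough Python sketch of that logic; the function names and request shape are hypothetical, not the actual cluster.py code:

```python
# Hypothetical sketch of the idea behind the fixes; not the actual cluster.py code.

def sanitize_context(context, known_domains):
    """Keep only context entries whose domain exists in the domains table."""
    return [entry for entry in context if entry.get('domain') in known_domains]

def build_translate_request(text, context, known_domains):
    request = {'q': text}  # hypothetical request shape
    valid_context = sanitize_context(context or [], known_domains)
    if valid_context:
        # An empty context is omitted entirely instead of being sent along.
        request['context'] = valid_context
    return request
```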
Of course, if you need further support do not hesitate to re-open this issue 😃
Have a nice day, and thanks again for your patience!
Andrea
Hi @mzeidhassan,
I’m glad the hotfix worked.
You can send the link to my mail address: andrea@translated.net
Thanks a million for your support. We really appreciate it!
I’ll keep you updated.
Kind regards, Andrea
Hi @mzeidhassan,
Thank you for the script result, and sorry for keeping you waiting. Over the past few days I've been investigating your issue, and it seems that one of the domains in your DB is not working as it should.
I've just committed a hotfix on `master` that should let you ignore the problem domain. This should allow you to perform the tuning.

However, this is a strange issue indeed, and we would like to investigate it further. Could you please compress and send us your `<your_mmt_root>/engines/<your_engine>/models/db` and `<your_mmt_root>/engines/<your_engine>/models/context` folders?
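In case it helps, here is one way to package those two folders (a minimal sketch; `MMT_ROOT` and `ENGINE` are assumptions you should adjust to your installation):

```python
# Minimal packaging sketch; adjust MMT_ROOT and ENGINE to your installation.
import os
import shutil

MMT_ROOT = "/root/ModernMT"  # assumed install location
ENGINE = "default"           # assumed engine name

for name in ("db", "context"):
    folder = os.path.join(MMT_ROOT, "engines", ENGINE, "models", name)
    # Creates db-models.tar.gz and context-models.tar.gz in the current directory.
    shutil.make_archive(name + "-models", "gztar", folder)
```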
Please let me know if everything goes well with this hotfix. Once again, thank you for your patience and support.
Andrea
Hi @mzeidhassan,
Thank you for the logs, they are really helping me understand the issue.
As for the duration of the first two tuning steps, don’t worry, they seem just fine to me.
Those steps are usually really fast: on an engine trained with 10M words, they take 2s and 0s respectively when I run tuning, so I think everything is OK there.
It doesn’t look like a memory issue either.
As for compatibility: if your old engine is v1.0 or more recent, it is directly compatible with v1.0.3. You can find more information on backwards compatibility here.
To find your MMT version, look at the README.md file in your MMT home folder: it should start with the version number.
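If it helps, something like this prints it (a minimal sketch, assuming it runs from the MMT home folder):

```python
# Print the first line of README.md, which should contain the version number.
with open("README.md") as readme:
    print(readme.readline().strip())
```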
To give me even more information on the problem, could you please run `./mmt tune --debug -e <your_engine_name>` and send me the resulting `runtime/<your_engine_name>/logs` and `runtime/<your_engine_name>/tmp/tuning` folders? It would be really useful.

Thank you for your patience!
Best, Andrea