modernmt: Cannot load my engine - failed to start node

Hello MMT,

I am having some issues with MMT after porting my old engine, and I am not sure what is causing them.

Here is what I see in the terminal after running ./mmt start:

Starting MMT engine 'default'... OK
Loading models... FAIL
ERROR Unexpected exception: Exception('failed to start node, check log file for more details: /home/mlearner/ModernMT/runtime/default/logs/node.log',)

and here is the node.log output.

By the way, I have already done a clean build using these commands:

`git checkout -- .`
`git clean -fd .`

`rm -rf build/*`
`git pull origin master`

`cd vendor`
`./compile`
`cd ../src`
`mvn clean install`

I also noticed something else:

According to this issue, the command was:

extras/porting/convert_from_v2.0.2_to_v2.0.3 --engine-path engines/default

But now I see a different script, convert_from_v2.0.2_to_v2.1, in 'ModernMT/extras/porting'.

Any idea?

I hope there is a quick fix.

Thanks, Mohamed

2017-12-16 12:55:23,048 INFO eu.modernmt.cluster.ClusterNode [main] Node is joining the cluster
2017-12-16 12:55:23,209 INFO com.hazelcast.instance.AddressPicker [main] [LOCAL] [eu.modernmt] [3.9.1] Interfaces is disabled, trying to pick one address from TCP-IP config addresses: []
2017-12-16 12:55:23,209 INFO com.hazelcast.instance.AddressPicker [main] [LOCAL] [eu.modernmt] [3.9.1] Prefer IPv4 stack is true.
2017-12-16 12:55:23,209 WARN com.hazelcast.instance.AddressPicker [main] [LOCAL] [eu.modernmt] [3.9.1] Could not find a matching address to start with! Picking one of non-loopback addresses.
2017-12-16 12:55:23,213 INFO com.hazelcast.instance.AddressPicker [main] [LOCAL] [eu.modernmt] [3.9.1] Picked [172.17.0.1]:5016, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5016], bind any local is true
2017-12-16 12:55:23,213 INFO com.hazelcast.instance.AddressPicker [main] [LOCAL] [eu.modernmt] [3.9.1] Using public address: [127.0.1.1]:5016
2017-12-16 12:55:23,218 INFO com.hazelcast.system [main] [127.0.1.1]:5016 [eu.modernmt] [3.9.1] Hazelcast 3.9.1 (20171130 - feca534) starting at [127.0.1.1]:5016
2017-12-16 12:55:23,218 INFO com.hazelcast.system [main] [127.0.1.1]:5016 [eu.modernmt] [3.9.1] Copyright © 2008-2017, Hazelcast, Inc. All Rights Reserved.
2017-12-16 12:55:23,218 INFO com.hazelcast.system [main] [127.0.1.1]:5016 [eu.modernmt] [3.9.1] Configured Hazelcast Serialization version: 1
2017-12-16 12:55:23,284 INFO com.hazelcast.spi.impl.operationservice.impl.BackpressureRegulator [main] [127.0.1.1]:5016 [eu.modernmt] [3.9.1] Backpressure is disabled
2017-12-16 12:55:23,472 INFO com.hazelcast.instance.Node [main] [127.0.1.1]:5016 [eu.modernmt] [3.9.1] Creating TcpIpJoiner
2017-12-16 12:55:23,532 INFO com.hazelcast.spi.impl.operationexecutor.impl.OperationExecutorImpl [main] [127.0.1.1]:5016 [eu.modernmt] [3.9.1] Starting 8 partition threads and 5 generic threads (1 dedicated for priority tasks)
2017-12-16 12:55:23,534 INFO com.hazelcast.internal.diagnostics.Diagnostics [main] [127.0.1.1]:5016 [eu.modernmt] [3.9.1] Diagnostics disabled. To enable add -Dhazelcast.diagnostics.enabled=true to the JVM arguments.
2017-12-16 12:55:23,536 INFO com.hazelcast.core.LifecycleService [main] [127.0.1.1]:5016 [eu.modernmt] [3.9.1] [127.0.1.1]:5016 is STARTING
2017-12-16 12:55:23,543 INFO com.hazelcast.system [main] [127.0.1.1]:5016 [eu.modernmt] [3.9.1] Cluster version set to 3.9
2017-12-16 12:55:23,543 INFO com.hazelcast.internal.cluster.ClusterService [main] [127.0.1.1]:5016 [eu.modernmt] [3.9.1]

Members {size:1, ver:1} [ Member [127.0.1.1]:5016 - 02b7ed04-9a06-4508-9fad-8e24088d7eb6 this ]

2017-12-16 12:55:23,553 INFO com.hazelcast.core.LifecycleService [main] [127.0.1.1]:5016 [eu.modernmt] [3.9.1] [127.0.1.1]:5016 is STARTED
2017-12-16 12:55:23,554 INFO eu.modernmt.cluster.ClusterNode [main] Node joined the cluster in 0.364s
2017-12-16 12:55:23,554 WARN eu.modernmt.cluster.ClusterNode [main] Model syncing not supported in this version!
2017-12-16 12:55:23,555 INFO eu.modernmt.cluster.ClusterNode [main] Model loading started
2017-12-16 12:55:23,841 INFO eu.modernmt.decoder.neural.execution.NativeProcess [Thread-4] (nmmt.NMTDecoder) Loading "en__ar" model from checkpoint... START
2017-12-16 12:55:28,339 INFO eu.modernmt.decoder.neural.execution.NativeProcess [Thread-4] (nmmt.NMTDecoder) Loading "en__ar" model from checkpoint END 4.50s
2017-12-16 12:55:28,339 ERROR eu.modernmt.decoder.neural.execution.NativeProcess [Thread-4] (root) __init__() got an unexpected keyword argument 'dim'
Traceback (most recent call last):
  File "main_loop.py", line 171, in run_main
    decoder = NMTDecoder(args.model, gpu_id=args.gpu, random_seed=3435)
  File "/home/mlearner/ModernMT/build/lib/pynmt/nmmt/NMTDecoder.py", line 66, in __init__
    self._engines[key].running_state = NMTEngine.HOT
  File "/home/mlearner/ModernMT/build/lib/pynmt/nmmt/NMTEngine.py", line 384, in running_state
    self.__load()
  File "/home/mlearner/ModernMT/build/lib/pynmt/nmmt/NMTEngine.py", line 162, in __load
    decoder = Models.Decoder(self.metadata, self.trg_dict)
  File "/home/mlearner/ModernMT/build/lib/pynmt/onmt/Models.py", line 119, in __init__
    self.attn = onmt.modules.GlobalAttention(opt.rnn_size)
  File "/home/mlearner/ModernMT/build/lib/pynmt/onmt/modules/GlobalAttention.py", line 31, in __init__
    self.sm = nn.Softmax(dim=1)
TypeError: __init__() got an unexpected keyword argument 'dim'
2017-12-16 12:55:28,419 FATAL eu.modernmt.facade.ModernMT [main] Unexpected exception thrown by thread [main]
java.lang.NullPointerException
  at java.io.StringReader.<init>(StringReader.java:50)
  at com.google.gson.JsonParser.parse(JsonParser.java:45)
  at eu.modernmt.decoder.neural.execution.NativeProcess.deserialize(NativeProcess.java:225)
  at eu.modernmt.decoder.neural.execution.NativeProcess.<init>(NativeProcess.java:134)
  at eu.modernmt.decoder.neural.execution.NativeProcess$Builder.start(NativeProcess.java:101)
  at eu.modernmt.decoder.neural.execution.NativeProcess$Builder.startOnGPU(NativeProcess.java:75)
  at eu.modernmt.decoder.neural.execution.StartNativeProcessGpuTask.call(StartNativeProcessGpuTask.java:34)
  at eu.modernmt.decoder.neural.execution.SingletonExecutionQueue.forGPU(SingletonExecutionQueue.java:47)
  at eu.modernmt.decoder.neural.execution.ExecutionQueue.newGPUInstance(ExecutionQueue.java:29)
  at eu.modernmt.decoder.neural.NeuralDecoder.<init>(NeuralDecoder.java:65)
  at eu.modernmt.engine.impl.NeuralEngine.<init>(NeuralEngine.java:32)
  at eu.modernmt.engine.Engine.load(Engine.java:70)
  at eu.modernmt.cluster.ClusterNode.start(ClusterNode.java:314)
  at eu.modernmt.facade.ModernMT.start(ModernMT.java:49)
  at eu.modernmt.cli.ClusterNodeMain.main(ClusterNodeMain.java:168)
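
The Python traceback in the log stops at `nn.Softmax(dim=1)`, which suggests that the PyTorch build loaded by MMT is too old to know the `dim` keyword. The snippet below is only a minimal sketch, not part of MMT, for checking a version string against that requirement. It assumes the keyword is accepted from PyTorch 0.3.0 onwards (the version that resolved this thread); `parse_release` and `softmax_accepts_dim` are hypothetical helper names.

```python
import re

# Assumption: nn.Softmax accepts the `dim` keyword from PyTorch 0.3.0 onwards.
MIN_VERSION_WITH_DIM = (0, 3, 0)


def parse_release(version):
    """Return the leading numeric part of a version string as a tuple of ints.

    Suffixes such as ".post4" are ignored, so "0.3.0.post4" yields (0, 3, 0).
    """
    match = re.match(r"(\d+(?:\.\d+)*)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    return tuple(int(part) for part in match.group(1).split("."))


def softmax_accepts_dim(version):
    """Return True if this PyTorch version should accept nn.Softmax(dim=...)."""
    release = parse_release(version)
    # Pad short versions so that "0.3" compares like "0.3.0".
    padding = (0,) * (len(MIN_VERSION_WITH_DIM) - len(release))
    return release + padding >= MIN_VERSION_WITH_DIM
```

With PyTorch installed, one would pass `torch.__version__` to `softmax_accepts_dim`, using the same interpreter that MMT launches.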

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 19 (8 by maintainers)

Most upvoted comments

Great! Thanks Mohamed for your constant and amazing support to the project.

Have a great day and happy holidays! Davide

Thanks @davidecaroselli !

OK, I have now installed PyTorch 0.3.0 for Python 2.7 using this command:

sudo python2.7 /usr/bin/pip install http://download.pytorch.org/whl/cu90/torch-0.3.0.post4-cp27-cp27mu-linux_x86_64.whl

I had to specify the Python 2.7 interpreter and the full path to pip so that the package would be installed for Python 2.7.
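
The wheel file name itself shows which interpreter the package targets. As a rough sketch (the helper `parse_wheel_filename` and the `WheelTags` tuple are hypothetical names, and the layout follows the standard wheel naming convention `name-version[-build]-python-abi-platform.whl`), the tags can be read like this:

```python
from collections import namedtuple

# Hypothetical container for the dash-separated fields of a wheel file name.
WheelTags = namedtuple(
    "WheelTags", "distribution version build python abi platform"
)


def parse_wheel_filename(path_or_url):
    """Split a wheel file name (or a URL ending in one) into its tags."""
    filename = path_or_url.rsplit("/", 1)[-1]
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        # No optional build tag present.
        distribution, version, python, abi, platform = parts
        build = None
    elif len(parts) == 6:
        distribution, version, build, python, abi, platform = parts
    else:
        raise ValueError("unexpected wheel file name: %r" % filename)
    return WheelTags(distribution, version, build, python, abi, platform)


tags = parse_wheel_filename(
    "http://download.pytorch.org/whl/cu90/"
    "torch-0.3.0.post4-cp27-cp27mu-linux_x86_64.whl"
)
print(tags.python)  # the "cp27" tag marks a CPython 2.7 build
```

A wheel tagged `cp27` is only installable by a pip that runs under Python 2.7, which is consistent with having to call pip through `python2.7` explicitly.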

Now everything is working just fine, and when I run mmt I see the correct version:

“PyTorch version is 0.3.0.post4”

I hope this will help someone else who may struggle with this issue. This was a headache indeed, but I am happy that it all ended well.

Thanks again Davide for your support. Mohamed

So, can MMT work with Python 3.6 and PyTorch 0.3.0, or do I need Python 2.7 to make it work?

MMT only works with Python 2.7. In your Anaconda environment, is PyTorch 0.3.0 also installed for Python 2.7? If not, try installing it and then run MMT again. Anaconda itself may be fine; the problem may simply be that PyTorch 0.3.0 is not installed for the right version of Python.
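
One way to check this is to ask each interpreter directly which PyTorch it can import. The following is a minimal sketch under stated assumptions: `probe_torch` and `report` are hypothetical helpers, and the interpreter names passed in (for example `python2.7`) are whatever happens to be on your PATH, not something MMT defines.

```python
import os
import subprocess
import sys


def probe_torch(interpreter):
    """Return the torch version importable by `interpreter`, or None.

    None means the interpreter could not be started or torch failed to import.
    """
    command = [interpreter, "-c", "import torch; print(torch.__version__)"]
    try:
        with open(os.devnull, "w") as devnull:
            output = subprocess.check_output(command, stderr=devnull)
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = output.decode("utf-8", "replace").strip().splitlines()
    # Use the last printed line in case the import emits anything before it.
    return lines[-1].strip() if lines else None


def report(interpreters):
    """Map each interpreter name or path to the torch version it sees."""
    return {name: probe_torch(name) for name in interpreters}


# Example: compare Python 2.7 with the interpreter running this script.
# print(report(["python2.7", sys.executable]))
```

If the Python 2.7 entry comes back as `None` or as a version below 0.3.0 while another interpreter reports 0.3.0, PyTorch was installed for the wrong Python.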