deeplearning4j: `WordVectorSerializer` cannot read gensim word2vec model
in gensim v2.3.0-py27
from gensim.models import Word2Vec

model = Word2Vec(sentences,
                 size=self._wv_config.vector_size,
                 window=self._wv_config.window_size,
                 min_count=self._wv_config.min_count,
                 workers=self._wv_config.workers,
                 sg=int(self._wv_config.use_skip_kgram),
                 iter=self._wv_config.num_epoch)
# save() writes gensim's own pickle-based format
model.save(self._model_path)
in dl4j v0.9.1
import org.deeplearning4j.models.embeddings.loader.WordVectorSerializer
WordVectorSerializer.readWord2VecModel(new java.io.File("../data/wordvec.model"))
error
Exception in thread "main" java.lang.RuntimeException: Unable to guess input file format. Please use corresponding loader directly
at org.deeplearning4j.models.embeddings.loader.WordVectorSerializer.readWord2VecModel(WordVectorSerializer.java:2480)
at org.deeplearning4j.models.embeddings.loader.WordVectorSerializer.readWord2VecModel(WordVectorSerializer.java:2266)
at LoadWordVecModel$.delayedEndpoint$LoadWordVecModel$1(RunTensorflowModel.scala:47)
at LoadWordVecModel$delayedInit$body.apply(RunTensorflowModel.scala:45)
at scala.Function0.apply$mcV$sp(Function0.scala:34)
at scala.Function0.apply$mcV$sp$(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App.$anonfun$main$1$adapted(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:389)
at scala.App.main(App.scala:76)
at scala.App.main$(App.scala:74)
at LoadWordVecModel$.main(RunTensorflowModel.scala:45)
at LoadWordVecModel.main(RunTensorflowModel.scala)
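The "Unable to guess input file format" message suggests the loader did not recognise the file as any format it knows. gensim's `model.save` writes a pickle-based file, whereas the original word2vec format starts with a plain-text header line holding the vocabulary size and the vector size. A minimal sketch, using only the Python standard library, of a sanity check for that header follows. The helper names (`open_maybe_gzip`, `read_word2vec_header`, `write_sample_word2vec`) are hypothetical, and the check is an assumption about what a word2vec-style file looks like, not a copy of dl4j's own detection logic.

```python
import gzip
import pickle
import struct
import tempfile
from pathlib import Path


def open_maybe_gzip(path):
    """Open a file for binary reading, transparently handling gzip."""
    with open(path, "rb") as fh:
        magic = fh.read(2)
    if magic == b"\x1f\x8b":  # gzip magic number
        return gzip.open(path, "rb")
    return open(path, "rb")


def read_word2vec_header(path):
    """Return (vocab_size, vector_size) if the first line looks like a
    word2vec header ("<vocab> <dim>\\n"), otherwise None."""
    with open_maybe_gzip(path) as fh:
        first_line = fh.readline(64)
    try:
        # UnicodeDecodeError is a subclass of ValueError
        parts = first_line.decode("ascii").split()
        if len(parts) != 2:
            return None
        vocab, dim = int(parts[0]), int(parts[1])
    except ValueError:
        return None
    return (vocab, dim) if vocab > 0 and dim > 0 else None


def write_sample_word2vec(path, vectors):
    """Write a tiny gzipped file in word2vec binary layout (illustration only):
    a text header, then each word, a space, and its float32 values."""
    dim = len(next(iter(vectors.values())))
    with gzip.open(path, "wb") as fh:
        fh.write(f"{len(vectors)} {dim}\n".encode("ascii"))
        for word, vec in vectors.items():
            fh.write(word.encode("utf-8") + b" " + struct.pack(f"<{dim}f", *vec))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        exported = Path(tmp) / "abc.bin.gz"
        write_sample_word2vec(exported, {"hello": [0.1, 0.2], "world": [0.3, 0.4]})
        pickled = Path(tmp) / "wordvec.model"
        pickled.write_bytes(pickle.dumps({"not": "word2vec"}))
        print(exported.name, read_word2vec_header(exported))
        print(pickled.name, read_word2vec_header(pickled))
```

Running `read_word2vec_header` on the file passed to `readWord2VecModel` gives a quick hint: if it returns `None`, the file is probably not in word2vec text or binary format, and it may need to be re-exported from gensim in that format.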
About this issue
- State: closed
- Created 7 years ago
- Comments: 15 (6 by maintainers)
I’ve got the same problem:
I solved it like this:
gensim (3.0.1, python 2.7.5):
model.wv.save_word2vec_format("abc.bin.gz", binary=True)
in dl4j 0.9.1: