open_spiel: Build with tensorflow_cc: tf_trajectories_example, vpnet_test, and alpha_zero_example_test fail

Ubuntu 18.04; cmake version 3.18.4; clang version 10.0.0-4ubuntu1~18.04.2; bazel 2.0.0; Python 3.8.8

The build succeeds when following the standard installation instructions (i.e. BUILD_WITH_TENSORFLOW_CC=OFF). Following the instructions in alphazero.md and https://github.com/deepmind/open_spiel/issues/172, I installed @mrdaliri's open_spiel branch of tensorflow_cc. tensorflow_cc built and installed fine, as did its usage examples. After running BUILD_WITH_TENSORFLOW_CC=ON ./install.sh and then CXX=/usr/bin/clang++ BUILD_WITH_TENSORFLOW_CC=ON ./open_spiel/scripts/build_and_run_tests.sh, I noticed that tf_trajectories_example, vpnet_test, and alpha_zero_example_test each fail with the same error:

tf_trajectories_example:

1/189 Test #94: tf_trajectories_example …Child aborted***Exception: 0.75 sec
2021-03-17 23:21:02.238356: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2021-03-17 23:21:02.357149: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 3600000000 Hz
2021-03-17 23:21:02.357492: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x24a7d20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-03-17 23:21:02.357505: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-03-17 23:21:02.499277: E tensorflow/core/framework/op_segment.cc:54] Create kernel failed: Invalid argument: NodeDef mentions attr 'allowed_devices' not in Op<name=VarHandleOp; signature= -> resource:resource; attr=container:string,default=""; attr=shared_name:string,default=""; attr=dtype:type; attr=shape:shape; is_stateful=true>; NodeDef: {{node beta1_power}}. (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
2021-03-17 23:21:02.508800: F /…/open_spiel/open_spiel/contrib/tf_trajectories.cc:114] Non-OK-status: tf_session_->Run({}, {}, {"init_all_vars_op"}, nullptr) status: Invalid argument: NodeDef mentions attr 'allowed_devices' not in Op<name=VarHandleOp; signature= -> resource:resource; attr=container:string,default=""; attr=shared_name:string,default=""; attr=dtype:type; attr=shape:shape; is_stateful=true>; NodeDef: {{node beta1_power}}. (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.). [[beta1_power]]

vpnet_test:

2021-03-17 23:21:12.672911: E tensorflow/core/framework/op_segment.cc:54] Create kernel failed: Invalid argument: NodeDef mentions attr 'allowed_devices' not in Op<name=VarHandleOp; signature= -> resource:resource; attr=container:string,default=""; attr=shared_name:string,default=""; attr=dtype:type; attr=shape:shape; is_stateful=true>; NodeDef: {{node beta1_power}}. (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
2021-03-17 23:21:12.697392: F /…/open_spiel/open_spiel/algorithms/alpha_zero/vpnet.cc:92] Non-OK-status: tf_session_->Run({}, {}, {"init_all_vars_op"}, nullptr) status: Invalid argument: NodeDef mentions attr 'allowed_devices' not in Op<name=VarHandleOp; signature= -> resource:resource; attr=container:string,default=""; attr=shared_name:string,default=""; attr=dtype:type; attr=shape:shape; is_stateful=true>; NodeDef: {{node beta1_power}}. (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.). [[beta1_power]]

alpha_zero_example_test:

2021-03-17 23:21:45.369748: E tensorflow/core/framework/op_segment.cc:54] Create kernel failed: Invalid argument: NodeDef mentions attr 'allowed_devices' not in Op<name=VarHandleOp; signature= -> resource:resource; attr=container:string,default=""; attr=shared_name:string,default=""; attr=dtype:type; attr=shape:shape; is_stateful=true>; NodeDef: {{node beta1_power}}. (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
2021-03-17 23:21:45.371901: F /…/open_spiel/open_spiel/algorithms/alpha_zero/vpnet.cc:92] Non-OK-status: tf_session_->Run({}, {}, {"init_all_vars_op"}, nullptr) status: Invalid argument: NodeDef mentions attr 'allowed_devices' not in Op<name=VarHandleOp; signature= -> resource:resource; attr=container:string,default=""; attr=shared_name:string,default=""; attr=dtype:type; attr=shape:shape; is_stateful=true>; NodeDef: {{node beta1_power}}. (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.). [[beta1_power]]

After some searching, it looks like this error is usually caused by a version mismatch between TensorFlow and the TensorFlow model server, but I haven't seen any dependency on the model server in open_spiel or tensorflow_cc. I have a feeling I am missing a simple but key piece here. Have you seen this error before?
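For anyone triaging the same failure, the error text in the logs above already names the op, the node, and the attribute that the runtime does not recognize. The following is a minimal sketch, assuming the message format shown in those logs; `explain_nodedef_error` is a hypothetical helper written for illustration and is not part of OpenSpiel, tensorflow_cc, or TensorFlow.

```python
import re

# Quote characters that may surround the attribute name. Copied logs often
# contain typographic quotes, so both kinds are accepted (an assumption made
# for convenience, not something TensorFlow itself emits).
_QUOTE = "['\"\u2018\u2019]"

# Core of the "NodeDef mentions attr ... not in Op<...>" message.
_ERROR_RE = re.compile(
    r"NodeDef mentions attr\s+" + _QUOTE + r"(?P<attr>\w+)" + _QUOTE
    + r"\s+not in\s+Op<name=(?P<op>\w+);(?P<opdef>.*?)>;\s+"
    + r"NodeDef:\s+\{\{node (?P<node>[^}]+)\}\}",
    re.DOTALL,
)


def explain_nodedef_error(message):
    """Extract the op, node, and attribute names from the error text.

    Returns a dict, or None if the message does not match the expected format.
    """
    match = _ERROR_RE.search(message)
    if match is None:
        return None
    # Attributes that the runtime's registered op definition does list.
    known_attrs = re.findall(r"attr=(\w+):", match.group("opdef"))
    return {
        "op": match.group("op"),
        "node": match.group("node"),
        "unknown_attr": match.group("attr"),
        "known_attrs": known_attrs,
    }
```

Applied to the messages above, this would report that the graph's `beta1_power` node uses an attribute that the runtime's `VarHandleOp` definition does not list. That pattern suggests the graph was generated by a different TensorFlow build than the one interpreting it, as the message's own hint about GraphDef-generating and GraphDef-interpreting binaries says. Which two builds are involved here is not established in this report.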

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 19 (9 by maintainers)

Most upvoted comments

@lanctot Apologies for the delay – I was tied up at the end of last week / weekend responding to reviewers 😃 Anyway, I just submitted this PR for the minor fix: https://github.com/deepmind/open_spiel/pull/547#issue-602959881

As for libtorch vs. tensorflow_cc, I am planning to use OpenSpiel for research as well, so I have started using the libtorch version for now just to start making progress. I can try to pick up where we left off with tensorflow_cc if I have a spare moment.