TensorFlow.NET: Segmentation fault in multithread app (v0.11.2)
The app crashes during sess.run with the message "Segmentation fault (core dumped)" on Docker, or "Attempted to read or write protected memory. This is often an indication that other memory is corrupt." on Windows.
Stack trace on Windows
at Tensorflow.c_api.TF_SessionRun(IntPtr session, TF_Buffer* run_options, TF_Output[] inputs, IntPtr[] input_values, Int32 ninputs, TF_Output[] outputs, IntPtr[] output_values, Int32 noutputs, IntPtr[] target_opers, Int32 ntargets, IntPtr run_metadata, IntPtr status)
at Tensorflow.BaseSession._call_tf_sessionrun(KeyValuePair`2[] feed_dict, TF_Output[] fetch_list, List`1 target_list)
at Tensorflow.BaseSession._do_run(List`1 target_list, List`1 fetch_list, Dictionary`2 feed_dict)
at Tensorflow.BaseSession._run(Object fetches, FeedItem[] feed_dict)
at Tensorflow.BaseSession.run(Tensor fetche, FeedItem[] feed_dict)
I created a small example project to reproduce this: https://github.com/deadman2000/TensorFlowNetMultithreading
About this issue
- State: open
- Created 5 years ago
- Comments: 22 (11 by maintainers)
Commits related to this issue
- MultithreadingTests.cs: Added unit-test for case #380 — committed to SciSharp/TensorFlow.NET by Nucs 5 years ago
I think the issue is caused by the following usage of nd.GetData() in Tensor.Creation.cs. My guess is that the resulting pointer refers to GC-controlled memory, with no guarantee that the data stays at the same address after a garbage collection.
Changing it to the following helped (probably with some performance degradation, which I didn't notice because my input dataset is small):
After this change I can no longer reproduce the crash, but I will keep testing.
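For readers following along, here is a minimal sketch of the two usual ways to hand native code an address that the GC will not move: copy the data into unmanaged memory, or pin the managed array for the duration of the call. This is not the code from Tensor.Creation.cs or from the fix itself, neither of which is reproduced here. The class StableBufferSketch and its method names are hypothetical, and ReadBack only stands in for a native consumer such as TF_SessionRun.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical illustration only; not part of TensorFlow.NET.
public static class StableBufferSketch
{
    // Copies the managed array into unmanaged memory. The GC never moves
    // this block; the caller must release it with Marshal.FreeHGlobal.
    public static IntPtr CopyToUnmanaged(float[] data)
    {
        int bytes = data.Length * sizeof(float);
        IntPtr ptr = Marshal.AllocHGlobal(bytes);
        Marshal.Copy(data, 0, ptr, data.Length);
        return ptr;
    }

    // Reads 'count' floats from a raw pointer. This stands in for the
    // native code that would consume the buffer.
    public static float[] ReadBack(IntPtr ptr, int count)
    {
        var result = new float[count];
        Marshal.Copy(ptr, result, 0, count);
        return result;
    }

    // Alternative without a copy: pin the array so the GC cannot relocate
    // it while the handle is alive, then free the handle afterwards.
    public static float[] ReadWhilePinned(float[] data)
    {
        GCHandle handle = GCHandle.Alloc(data, GCHandleType.Pinned);
        try
        {
            IntPtr ptr = handle.AddrOfPinnedObject();
            GC.Collect(); // a collection here cannot move the pinned array
            return ReadBack(ptr, data.Length);
        }
        finally
        {
            handle.Free();
        }
    }
}
```

Copying costs an extra allocation and a memcpy per tensor, which would fit the performance degradation mentioned above. Pinning avoids the copy, but the handle has to stay alive for as long as native code can still touch the buffer.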
@Mghobadid the fix in #533 should do the trick. It is not the most efficient approach, but it seems to work at least on CPU (I reproduced exactly the same issue on GPU, so the fix should apply there too)…
Once I get my hands on a dump, I'll research it and let you know.
If you need to write multi-threaded unit tests in the future, you are welcome to use the MultiThreadedUnitTestExecuter I wrote for the library: https://github.com/SciSharp/TensorFlow.NET/blob/master/test/TensorFlowNET.UnitTest/Utilities/MultiThreadedUnitTestExecuter.cs
Usage: