tensorflow: Quantization:quantize_graph produces corrupted graph

Hi, sorry to bother you again. To save some space I would like to quantize my graph, but every time I run quantize_graph, the result becomes unusable in my Android app. More generally, how is this facility meant to be used? Here is what I have done:

  1. retrained graph (works fine)
  2. optimized it using optimize_for_inference (works fine)
  3. bazel-bin/tensorflow/tools/quantization/quantize_graph
    --input=/tmp/optimized.pb
    --output=/tmp/eightbit.pb
    --output_node_names="final_result_a,final_result_b"
    --mode=eightbit

Then it throws the following on Android (excerpt):

I/native: tensorflow_inference_jni.cc:85 Creating new session variables for 635cce3c13fa1ff4
I/native: tensorflow_inference_jni.cc:113 Loading Tensorflow.
I/native: tensorflow_inference_jni.cc:120 Session created.
I/native: tensorflow_inference_jni.cc:126 Acquired AssetManager.
I/native: tensorflow_inference_jni.cc:128 Reading file to proto: file:///android_asset/eightbit.pb
I/native: jni_utils.cc:111 Opening asset eightbit.pb from disk with zero-copy.
I/native: tensorflow_inference_jni.cc:132 GraphDef loaded from file:///android_asset/eightbit.pb with 1345 nodes.
I/native: stat_summarizer.cc:38 StatSummarizer found 1345 nodes
I/native: tensorflow_inference_jni.cc:139 Creating TensorFlow graph from GraphDef.
I/native: tensorflow_inference_jni.cc:151 Initialization done in 234.141ms

Which I suppose is correct.

But then:

I/native: tensorflow_inference_jni.cc:228 End computing. Ran in 2665ms (2665ms avg over 1 runs)
A/libc: Fatal signal 11 (SIGSEGV), code 1, fault addr 0xb95eb000 in tid 31217 (inference)

The application is terminated. Would you please help once again? Thanks

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 54 (23 by maintainers)

Most upvoted comments

I have tracked down the issue with the transform_graph command line. It's in the argument-parsing code, and I have a fix pending; in the meantime you can remove the newlines (i.e., put the whole command on a single line) and it should work:

bazel-bin/tensorflow/tools/graph_transforms/transform_graph --in_graph=output.pb --out_graph=stripped.pb --inputs='Mul:0' --outputs="final_result_musoth,final_result" --transforms='strip_unused_nodes(type=float, shape="1,299,299,3") remove_nodes(op=Identity, op=CheckNumerics) fold_constants(ignore_errors=true) fold_batch_norms fold_old_batch_norms'

To be clear, you have instantiated two unique TensorFlowInferenceInterface objects in your app, and you use one for each layer you’re interested in? If you make the calls in the opposite order, does it still crash?

To work around your issue: if the input data is the same when computing final_result_a and final_result_b, you can and should fetch both output layers from a single runInference call on a single TensorFlowInferenceInterface instance, as in the sketch below. This will also save you multiple seconds of duplicated computation.
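
For concreteness, here is a minimal sketch of that pattern. It assumes the older contrib/android Java API that the tensorflow_inference_jni.cc log lines above come from (fillNodeFloat / runInference / readNodeFloat; newer releases expose feed / run / fetch instead). The node names are taken from this thread ("Mul" from the transform_graph inputs, the two final_result_* nodes from the quantize_graph command); the input size and the result array lengths are placeholders you would replace for your own graph.

import android.content.res.AssetManager;
import org.tensorflow.contrib.android.TensorFlowInferenceInterface;

public class TwoOutputClassifier {
  // Node names as used in this thread; adjust if your graph differs.
  private static final String MODEL_FILE = "file:///android_asset/eightbit.pb";
  private static final String INPUT_NODE = "Mul";  // some builds expect "Mul:0"
  private static final String[] OUTPUT_NODES = {"final_result_a", "final_result_b"};
  private static final int INPUT_SIZE = 299;       // retrained Inception v3 default

  private final TensorFlowInferenceInterface inference = new TensorFlowInferenceInterface();

  public void init(AssetManager assets) {
    // One graph, one native session, shared by both output layers.
    inference.initializeTensorFlow(assets, MODEL_FILE);
  }

  // resultsA / resultsB must be sized to the number of labels of each output layer.
  public void classify(float[] pixels, float[] resultsA, float[] resultsB) {
    // Feed the image tensor once.
    inference.fillNodeFloat(INPUT_NODE, new int[] {1, INPUT_SIZE, INPUT_SIZE, 3}, pixels);
    // A single run computes both heads, so the shared layers execute only once.
    inference.runInference(OUTPUT_NODES);
    // Read each output tensor back.
    inference.readNodeFloat(OUTPUT_NODES[0], resultsA);
    inference.readNodeFloat(OUTPUT_NODES[1], resultsB);
  }

  public void close() {
    inference.close();
  }
}

If the two outputs really do live in two different graph files, two separate TensorFlowInferenceInterface instances should also work, but for a single retrained graph with two heads the single-run approach above avoids recomputing the shared layers and sidesteps the call-ordering question entirely.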

It's possible you've run into a bug in TF, but without double-checking that all the identifiers are correct, that you're not sharing any buffers, etc., it's hard to tell whether the problem is in TF or in your app. Sometimes things can seem to work in one situation when there is really still a problem and the other code paths just aren't as picky about the data access. If you have a minimal reproducible example you'd be willing to share, I can take a look, but otherwise I'd suggest just using a single TensorFlowInferenceInterface.