tensorflow: Quantized graph fails to work on NVIDIA Jetson TX1 architecture although it worked on a normal PC?
System Information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow)?: No
- OS Platform and Distribution (i.e. Linux Ubuntu 16.0): Ubuntu 16.04 LTS on NVIDIA Jetson TX1 (Linux for Tegra 24.1)
- TensorFlow installed from (source or binary)?: Compiled from source and installed via this wheel available here: https://github.com/rwightman/tensorflow/releases/tag/v1.0.0-alpha-tegra-ugly_hack
- TensorFlow version (use command below): 1.0 Alpha
- CUDA/cuDNN version: 8.0/5.1
- GPU Model and Memory: Tegra X1, 4GB
Describe the problem clearly
When I ran a quantized graph on my laptop, it works quite as expected (just slightly lower accuracy compared to the frozen graph, while much lower size). However, when I transferred the exact same quantized graph to run on an Jetson TX1, the file gives me a highly inaccurate class prediction, at a 100% probability for the random class. On the other hand, when I tried to perform inference from the frozen graph (from which the quantized graph was derived) on both my laptop and the Jetson TX1, I got the exact same answers as expected.
So I suspected it could have been an issue of data transfer causing the file to be slightly corrupted. I checked the files byte by byte at all points of transfer (from my laptop to memory stick, then memory stick to the Jetson), but I found the files are exactly the same. This is the command I used to check: cmp $old_file $new_file || echo "different files".
Thus, I am suspecting it could be an issue of how tensorflow performs on the Jetson TX1 ARM architecture (aarch64). Is there anyway to verify this, and if it is indeed a performance issue on an ARM architecture, is there a way to resolve this?
Thank you for your help.
About this issue
- Original URL
- State: closed
- Created 7 years ago
- Comments: 19 (14 by maintainers)
@kwotsin
The “official” way to quantize a graph is described here, but I’m in the process of updating it slightly. In any case, I think the problem you are encountering is unrelated to the actual quantization steps.
My theory is that a different code path is triggered on a Jetson, because it supports NEON instructions. I’ll try to get a fix in the nightly builds before the end of next week. Meanwhile, if you’d like to build it yourself, try:
bazel build -c opt --copt="-DTENSORFLOW_DISABLE_META" //tensorflow/contrib/android:libtensorflow_inference.so --config=android_arm(or the equivalent for your architecture)Sorry @kwotsin . I don’t know what those errors are about. I usually build like this:
I just tried them. (The full instructions are on tensorflow’s website)
The random single-category symptom is the same as the issue seen in https://github.com/tensorflow/tensorflow/issues/7150#issuecomment-282343046, so maybe this is related.
As @andrehentz suggests there, maybe try quantizing with both quantize_weights and quantize_nodes?
@andrewharp might be more up-to-date on the Jetson progress.