tensorflow: Compiler Test cases with tf-mlir-translate pass/crash with specific build flag on s390x architecture
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A
- TensorFlow installed from (source or binary): source
- TensorFlow version (use command below): 2.3.1
- Python version: 3.6.9
- Bazel version (if compiling from source): 3.4.1
- GCC/Compiler version (if compiling from source): Ubuntu 7.5.0-3ubuntu1~18.04
- CUDA/cuDNN version: N/A
- GPU model and memory: N/A
Describe the current behavior
When running the test case //tensorflow/compiler/mlir/xla/tests/translate:while.hlotxt.test on an s390x machine, the test passes if I include the build flag --per_file_copt=mlir,llvm-project@-UNDEBUG, but if I remove the flag it fails with a bad alloc crash. The backtrace is attached below.
Another test case, //tensorflow/compiler/tf2xla:fused_batchnorm_reserve_space_test, behaves the opposite way: unlike while.hlotxt.test, it fails with this build flag but passes without it.
There are multiple test case failures in //tensorflow/compiler/... with a similar crash. The command I am using to test:
bazel --host_jvm_args="-Xms1024m" --host_jvm_args="-Xmx2048m" test --host_javabase="@local_jdk//:jdk" --test_tag_filters=-gpu,-benchmark-test,-v1only,-no_oss,-oss_serial -k --test_timeout 300,450,1200,3600 --build_tests_only --test_output=errors --per_file_copt=mlir,llvm-project@-UNDEBUG -- //tensorflow/compiler/...
Please note that there is no regression in the compiler test cases with --per_file_copt=mlir,llvm-project@-UNDEBUG on an x86 machine.
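The with/without comparison described above can be narrowed to a single target. The following is a minimal sketch, not the reporter's actual procedure: the function name bazel_test_cmd is hypothetical, and it only prints the two command lines (a dry run) instead of invoking bazel. For context, -UNDEBUG undefines the NDEBUG macro for the matched files, which re-enables assert() checks in the MLIR and LLVM sources.

```shell
#!/bin/sh
# Sketch: print the bazel command for one test target, with or without the
# flag from the report. bazel_test_cmd is an illustrative helper name, not
# part of TensorFlow or Bazel. Nothing is executed; commands are only printed.

FLAG='--per_file_copt=mlir,llvm-project@-UNDEBUG'
TARGET='//tensorflow/compiler/mlir/xla/tests/translate:while.hlotxt.test'

# Usage: bazel_test_cmd MODE TARGET
# MODE is "with" (include the flag) or anything else (omit it).
bazel_test_cmd() {
  mode=$1
  target=$2
  cmd='bazel test --test_output=errors'
  if [ "$mode" = with ]; then
    cmd="$cmd $FLAG"
  fi
  printf '%s %s\n' "$cmd" "$target"
}

# Print both variants so they can be run one after the other.
bazel_test_cmd with "$TARGET"
bazel_test_cmd without "$TARGET"
```

Running each printed command on the s390x machine and on an x86 machine should show whether the result changes only with the flag or also with the architecture.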
Describe the expected behavior
The test cases should pass, and their behaviour should not vary with the build flag.
Other info / logs
while.hlotxt.test.log
About this issue
- State: closed
- Created 4 years ago
- Comments: 21 (18 by maintainers)
@smit-hinsu @kun-lu20, Thank you for the confirmation. Glad the issue is resolved for you. Please feel free to move this to closed status.
Hi @tilakrayal,
I am @skribm9's colleague. Thanks for your response.
Yes, the test case //tensorflow/compiler/mlir/xla/tests/translate:while.hlotxt.test passes with both the optimized binary and the debug binary. We’ve verified it on v2.9.1.