AMDMIGraphX: Resnext50 failing to run on MIGraphX Driver

While trying https://zenodo.org/record/6617879/files/resnext50_32x4d_fpn.onnx with:

migraphx-driver perf ./resnext50_32x4d_fpn.onnx

I hit an error indicating that an operator is not supported by MIGraphX:

what(): /workspace/AMDMIGraphX/src/onnx/onnx_parser.cpp:318: parse_graph: Unknown operator: Mod

Investigate and add support for the Mod operator.
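
For context, the ONNX Mod operator named in the error has two modes, selected by its fmod attribute. With fmod=0 (the default, for integer inputs) the result takes the sign of the divisor. With fmod=1 it follows C's fmod, so the result takes the sign of the dividend, and this mode is required for floating-point inputs. The following is a minimal scalar sketch of those semantics in Python. It is not MIGraphX's implementation, and the helper name onnx_mod is hypothetical.

```python
import math


def onnx_mod(a, b, fmod=0):
    """Scalar sketch of ONNX Mod semantics (hypothetical helper, not MIGraphX code)."""
    if fmod:
        # fmod=1: C-style fmod, the result takes the sign of the dividend a.
        return math.fmod(a, b)
    if isinstance(a, float) or isinstance(b, float):
        # The ONNX spec requires fmod=1 for floating-point inputs.
        raise ValueError("floating-point inputs require fmod=1")
    # fmod=0: integer modulus, the result takes the sign of the divisor b.
    # Python's % operator already behaves this way for ints.
    return a % b


print(onnx_mod(-7, 3))               # integer mode: sign follows the divisor
print(onnx_mod(-7.0, 3.0, fmod=1))   # fmod mode: sign follows the dividend
```

The difference only shows up when the operands have opposite signs, which is why test cases for both integer and floating-point types are worth having.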

About this issue

State: closed
Created 2 years ago
Comments: 38

Commits related to this issue

Add in changes for onnx Mod operator. Initial operator for mod implementation and test cases for integer and floating based types. Need to use fmod from stdlib for floating point types. half_float::h... (committed to ROCm/AMDMIGraphX by TedThemistokleous 2 years ago)
Initial Changes for gpu and cpu implementation for mod and fmod. Initial changes for gpu/cpu implementation to get recognized. Still needs some changes for tests and if HIP supports fmod and remainde... (committed to ROCm/AMDMIGraphX by TedThemistokleous 2 years ago)
Add onnx mod operator (#1302) * Add in changes for onnx Mod operator. Initial operator for mod implementation and test cases for integer and floating based types. Need to use fmod from stdlib fo... (committed to ROCm/AMDMIGraphX by TedThemistokleous 2 years ago)
Add tupleVisitor for from_gpu (#1465). Need this for when we debug and use MIGRAPHX_TRACE_EVAL() to show tuples. Without this we break when reading our buffer due to the use of visit(). This came up a... (committed to ROCm/AMDMIGraphX by TedThemistokleous 2 years ago)
Adding testcase for large max boxes and sanity nms test. Realize we didn't actually have a compute test for nonmaxsuppression to just run this during our test runs. Add it as a part of comparison wit... (committed to ROCm/AMDMIGraphX by TedThemistokleous a year ago)

Most upvoted comments

We’re finally getting runs on resnext50 thanks to @pfultz2 and to finding and fixing an issue with concat vectorization (#1653). In the meantime, I was trying to make our NMS multithreaded for a single huge batch, thinking that was still the issue, as we worked in parallel.

Adding both of these fixes on top of each other gives us the following result:

Summary:
gpu::topk: 25017.3ms / 5 = 5003.47ms, 85%
nonmaxsuppression: 3860.93ms / 1 = 3860.93ms, 14%
hip::copy_from_gpu: 183.658ms / 4 = 45.9146ms, 1%
gpu::code_object::reduce_kernel: 143.436ms / 1 = 143.436ms, 1%
gpu::nonzero: 127.553ms / 5 = 25.5106ms, 1%
hip::copy_to_gpu: 70.1181ms / 1 = 70.1181ms, 1%
gpu::miopen_fusion: 23.6265ms / 49 = 0.482174ms, 1%
gpu::convolution: 20.3926ms / 53 = 0.384766ms, 1%
gpu::code_object::gather_kernel: 15.7473ms / 50 = 0.314946ms, 1%
gpu::code_object::concat_kernel: 9.74119ms / 19 = 0.512694ms, 1%
gpu::code_object::mul_add_kernel: 6.44609ms / 21 = 0.306956ms, 1%
gpu::code_object::convert_kernel: 4.20801ms / 12 = 0.350667ms, 1%
gpu::code_object::max_min_kernel: 3.53736ms / 10 = 0.353737ms, 1%
gpu::code_object::sub_kernel: 3.49076ms / 20 = 0.174538ms, 1%
gpu::code_object::min_exp_mul_mul_kernel: 3.35976ms / 10 = 0.335976ms, 1%
gpu::code_object::add_relu_kernel: 3.00536ms / 37 = 0.0812258ms, 1%
gpu::code_object::concat_add_kernel: 2.82924ms / 1 = 2.82924ms, 1%
gpu::code_object::add_kernel: 2.11693ms / 17 = 0.124525ms, 1%
gpu::code_object::contiguous_kernel: 1.44307ms / 10 = 0.144307ms, 1%
gpu::code_object::add_add_relu_kernel: 1.2778ms / 12 = 0.106483ms, 1%
gpu::code_object::concat_mod_kernel: 0.97079ms / 1 = 0.97079ms, 1%
gpu::code_object::less_convert_convert_logical_xor_mod_equal_convert_convert_not_logical_and_mul_add_where_kernel: 0.739805ms / 5 = 0.147961ms, 1%
gpu::code_object::gathernd_kernel: 0.723255ms / 5 = 0.144651ms, 1%
gpu::code_object::sigmoid_kernel: 0.655794ms / 5 = 0.131159ms, 1%
gpu::code_object::greater_convert_kernel: 0.479331ms / 5 = 0.0958662ms, 1%
gpu::code_object::mul_kernel: 0.47863ms / 1 = 0.47863ms, 1%
load: 0.365675ms / 407 = 0.000898464ms, 1%
gpu::pooling: 0.250205ms / 1 = 0.250205ms, 1%
hip::hip_copy_literal: 0.146673ms / 147 = 0.000997776ms, 1%
slice: 0.076922ms / 46 = 0.00167222ms, 1%
unsqueeze: 0.06204ms / 60 = 0.001034ms, 1%
broadcast: 0.060922ms / 53 = 0.00114947ms, 1%
multibroadcast: 0.055462ms / 56 = 0.000990393ms, 1%
step: 0.0426ms / 33 = 0.00129091ms, 1%
reshape: 0.035711ms / 36 = 0.000991972ms, 1%
gpu::code_object::relu_kernel: 0.0212ms / 1 = 0.0212ms, 1%
get_tuple_elem: 0.02079ms / 10 = 0.002079ms, 1%
squeeze: 0.019261ms / 12 = 0.00160508ms, 1%
transpose: 0.01602ms / 15 = 0.001068ms, 1%
@param: 0.006191ms / 4 = 0.00154775ms, 1%
flatten: 0.00549ms / 5 = 0.001098ms, 1%
hip::sync_stream: 0.00529ms / 1 = 0.00529ms, 1%
check_context::migraphx::version_2_6_0::gpu::context: 0.00369ms / 1 = 0.00369ms, 1%
hip::hip_allocate_memory: 0.00231ms / 1 = 0.00231ms, 1%

Batch size: 1
Rate: 0.0331548/sec
Total time: 30161.5ms
Total instructions time: 29509.5ms
Overhead time: 0.636814ms, 652.028ms
Overhead: 0%, 2%
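
The arithmetic behind this report can be checked with a short Python sketch. It parses a few of the rows above and recomputes the per-call average, the percentage column, the rate, and the overhead. Two points are inferences from the printed numbers, not confirmed from the MIGraphX source: the percentages look like each row's share of "Total instructions time" rounded up (which would explain why every small entry shows 1%), and the second overhead figure looks like total time minus total instructions time. The names ROW and parse_row are hypothetical.

```python
import math
import re

# One summary row looks like: "<name>: <total>ms / <count> = <avg>ms, <pct>%"
ROW = re.compile(
    r"^(?P<name>.+): (?P<total>[\d.]+)ms / (?P<count>\d+)"
    r" = (?P<avg>[\d.]+)ms, (?P<pct>\d+)%$"
)


def parse_row(line):
    """Parse one summary row into a dict (hypothetical helper)."""
    m = ROW.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized summary row: {line!r}")
    return {
        "name": m["name"],
        "total_ms": float(m["total"]),
        "count": int(m["count"]),
        "avg_ms": float(m["avg"]),
        "pct": int(m["pct"]),
    }


rows = [parse_row(s) for s in (
    "gpu::topk: 25017.3ms / 5 = 5003.47ms, 85%",
    "nonmaxsuppression: 3860.93ms / 1 = 3860.93ms, 14%",
    "hip::copy_from_gpu: 183.658ms / 4 = 45.9146ms, 1%",
)]

# Values copied from the footer of the report.
batch_size = 1
total_ms = 30161.5
instr_ms = 29509.5

for r in rows:
    avg = r["total_ms"] / r["count"]                   # per-call average
    share = math.ceil(100 * r["total_ms"] / instr_ms)  # assumed: rounded up
    print(f'{r["name"]}: avg {avg:.2f}ms, share {share}% (reported {r["pct"]}%)')

rate = batch_size * 1000.0 / total_ms   # inferences per second
overhead_ms = total_ms - instr_ms       # assumed meaning of the second overhead figure
overhead_pct = round(100 * overhead_ms / total_ms)
print(f"rate {rate:.7f}/sec, overhead {overhead_ms:.1f}ms ({overhead_pct}%)")
```

Read this way, gpu::topk and nonmaxsuppression together account for nearly all of the roughly 30 seconds per inference, and the remaining kernels are negligible by comparison.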

We still have an issue with accuracy when using the accuracy checker, and we’re currently seeing an HSA_FAULT when running with MIGRAPHX_GPU_DEBUG=1:

./migraphx/kernels/gather.hpp:59: operator(): error: Out of bounds access at offset: 23760000
:0:rocdevice.cpp            :2647: 4812524667207 us: 3344692: [tid:0x7f5677d9a700] Device::callbackQueue aborting with error : HSA_STATUS_ERROR_EXCEPTION: An HSAIL operation

This is something I’m looking into between reviews.

TedThemistokleous on Mar 29, 2023