java: Failed to fetch latest snapshot for tensorflow-core-api linux gpu mkl

Hi, DJL’s TensorFlow engine depends on the tensorflow-core-api SNAPSHOT package. Our dependencies are declared here: https://github.com/awslabs/djl/blob/master/tensorflow/tensorflow-native-auto/build.gradle#L19 We found that the tensorflow-core-api SNAPSHOT was updated on 04/28, but the corresponding linux-gpu-mkl.jar is missing, and the same is true for Windows.

Did the upload fail? https://oss.sonatype.org/#nexus-search;quick~tensorflow-core-api

We get a 404 when trying to download the jar; both the Gradle build and manually opening the following link fail. https://oss.sonatype.org/service/local/artifact/maven/redirect?r=snapshots&g=org.tensorflow&a=tensorflow-core-api&v=0.1.0-SNAPSHOT&e=jar&c=windows-x86_64
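For anyone who wants to reproduce the check, here is a minimal sketch that rebuilds the redirect link above for a given classifier and prints the HTTP status. The class and method names are hypothetical, and it assumes Java 11+ and that the Nexus redirect endpoint answers HEAD requests; only the windows-x86_64 classifier comes from the link above, any other classifier has to be passed as an argument.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper, not part of DJL or TensorFlow Java.
final class SnapshotProbe {
    static final String BASE =
            "https://oss.sonatype.org/service/local/artifact/maven/redirect";

    // Builds the same redirect URL as the link above, varying only the classifier.
    static String redirectUrl(String classifier) {
        return BASE + "?r=snapshots&g=org.tensorflow&a=tensorflow-core-api"
                + "&v=0.1.0-SNAPSHOT&e=jar&c=" + classifier;
    }

    // Sends a HEAD request (assumption: the server supports it) and
    // returns the final status code after following redirects.
    static int status(String classifier) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(redirectUrl(classifier)))
                .method("HEAD", HttpRequest.BodyPublishers.noBody())
                .build();
        return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
    }

    public static void main(String[] args) throws Exception {
        // Defaults to the classifier from the failing link when no argument is given.
        String[] classifiers = args.length > 0 ? args : new String[] {"windows-x86_64"};
        for (String c : classifiers) {
            System.out.println(c + " -> HTTP " + status(c));
        }
    }
}
```

A 404 for a classifier would match what we see in the Gradle build; any other status would suggest the problem is on our side.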

The mac-os-mkl.jar is there, but the extra libraries libjnimkldnn.dylib and libiomp5.dylib are missing. Is this intended? How can I find them? We rely on this task to download native dependencies automatically for users based on their platform. Please help take a look, thank you so much!
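To illustrate what "based on their platform" means here, the following is a rough sketch of mapping the JVM's os.name and os.arch properties to an artifact classifier. It is not DJL's actual Gradle task: the class name is hypothetical, and apart from windows-x86_64 (taken from the link above) the classifier spellings such as macosx and the flavor suffix are assumptions about the naming scheme.

```java
import java.util.Locale;

// Hypothetical helper; classifier names other than "windows-x86_64" are assumed.
final class PlatformClassifier {
    // Maps the os.name system property to an OS token.
    static String os(String osName) {
        String n = osName.toLowerCase(Locale.ROOT);
        if (n.startsWith("windows")) return "windows";
        if (n.startsWith("mac")) return "macosx";
        if (n.startsWith("linux")) return "linux";
        throw new IllegalArgumentException("Unsupported OS: " + osName);
    }

    // Maps the os.arch system property to an architecture token.
    static String arch(String osArch) {
        String a = osArch.toLowerCase(Locale.ROOT);
        if (a.equals("amd64") || a.equals("x86_64")) return "x86_64";
        throw new IllegalArgumentException("Unsupported arch: " + osArch);
    }

    // Joins OS, architecture, and an optional flavor (for example "mkl").
    static String classifier(String osName, String osArch, String flavor) {
        String base = os(osName) + "-" + arch(osArch);
        return (flavor == null || flavor.isEmpty()) ? base : base + "-" + flavor;
    }

    public static void main(String[] args) {
        System.out.println(classifier(
                System.getProperty("os.name"), System.getProperty("os.arch"), ""));
    }
}
```

If the jar for the resulting classifier is missing from the snapshot repository, the automatic download fails for every user on that platform, which is why the missing artifacts matter to us.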

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 29 (11 by maintainers)

Most upvoted comments

"The mac-os-mkl.jar is there, but the libjnimkldnn.dylib and libiomp5.dylib extra libraries are missing, is this intended?"

Yes, it is intended: we don’t distribute them in our artifacts. You need to either have them installed on your machine or add dependencies on other JavaCPP artifacts to pull them in via Maven/Gradle, like this one: https://github.com/bytedeco/javacpp-presets/tree/master/mkl-dnn. (@saudet, correct me if I’m wrong.)

OK, it’s unfortunate, but it looks like we cannot increase the timeout of a job on GitHub Actions hosted runners beyond 6 hours, as dictated here. Setting the timeout-minutes parameter to a higher value didn’t help.

So this build (based on TF 2.2) is the best we have gotten so far with this CI solution. I’ll go ahead and merge PR #44, and we will probably need to continue the discussion to find an alternative or a complement to it.

An update on this: I’ve added the --define build_with_mkl_dnn_v1_only=false flag to the bazel command line and now macOS + MKL passes. It looks like MKL-DNN v1 only works on Linux.

I was planning to leave it that way if everything goes fine and then we can check how to enable MKL-DNN 1.x on all platforms or only on Linux.

I patched the CMake build locally so that it adds things to the include and library paths when running on macOS, after running brew install libomp. I think their CI was already configured in some way I didn’t quite understand.

Still, it is kind of refreshing to see all these green checks on that page. Thanks, @saudet!

The reason for having MKL-DNN in a different package is that it’s over 100 MB in size (all platforms) and these binaries are used by other libraries, with which they can be shared, for example MXNet: https://github.com/bytedeco/javacpp-presets/blob/master/mxnet/platform/pom.xml#L33-L37

The same applies to most popular native libraries used by other native libraries, such as OpenBLAS, OpenCV, FFmpeg, Arrow, HDF5, LLVM, MKL, CUDA/cuDNN/NCCL/TensorRT, etc.:

  • https://github.com/bytedeco/javacpp-presets/tree/master/openblas
  • https://github.com/bytedeco/javacpp-presets/tree/master/opencv
  • https://github.com/bytedeco/javacpp-presets/tree/master/ffmpeg
  • https://github.com/bytedeco/javacpp-presets/tree/master/arrow
  • https://github.com/bytedeco/javacpp-presets/tree/master/hdf5
  • https://github.com/bytedeco/javacpp-presets/tree/master/llvm
  • https://github.com/bytedeco/javacpp-presets/tree/master/mkl
  • https://github.com/bytedeco/javacpp-presets/tree/master/cuda
  • https://github.com/bytedeco/javacpp-presets/tree/master/tensorrt

I talk about the need for a distribution that would allow us to share components like that in this post: http://bytedeco.org/news/2019/01/11/importance-of-a-distribution/ This is going to be crucial for Java applications of machine learning to succeed in the enterprise.