tfjs: running some models using webgl is 10x slower than using nodejs
The model executes in ~100ms in NodeJS using the tensorflow backend, but takes over 1sec in the browser using the WebGL backend.
That is a 10x difference in performance between NodeJS and WebGL.
I’ve tried to figure out why the model executes an order of magnitude slower than expected on the WebGL backend,
but the profiler info for it makes no sense:
const t0 = performance.now();
const res = await tf.profile(() => model.executeAsync(input));
const t1 = performance.now();
const wallTime = t1 - t0;
// sum the per-kernel timings reported by the profiler
const kernelTime = res.kernels.reduce((a, b) => a + b.kernelTimeMs, 0);
wallTime is 900-1200ms
kernelTime is ~20ms
I’m running inference on the same input twice and looking at the second run, to allow for the warmup time of the WebGL backend (shader compile, etc.).
I even tried tf.enableDebugMode() and I still don’t see anything that gets even close to the overall wall time.
So I have no idea where the time is spent.
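To make the gap between the two numbers easier to inspect, the kernels reported by the profiler can be grouped by name and compared against the measured wall time. The following is only a minimal sketch: `summarizeKernels` is a hypothetical helper (not part of TFJS), and it assumes each entry in the `kernels` array carries a `name` and a `kernelTimeMs` field, where `kernelTimeMs` may be a non-number when timing is unavailable.

```javascript
// Hypothetical helper (not a TFJS API): summarize the `kernels` array
// from a tf.profile() result and compare it against measured wall time.
function summarizeKernels(kernels, wallTimeMs) {
  const byName = new Map();
  let kernelTotalMs = 0;
  let untimed = 0;
  for (const k of kernels) {
    // Assumption: kernelTimeMs can be a non-number (e.g. an error object)
    // when the backend could not time the kernel; count those separately.
    if (typeof k.kernelTimeMs !== 'number') {
      untimed += 1;
      continue;
    }
    kernelTotalMs += k.kernelTimeMs;
    byName.set(k.name, (byName.get(k.name) || 0) + k.kernelTimeMs);
  }
  // Sort kernel names by accumulated time, slowest first.
  const slowest = [...byName.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([name, ms]) => ({ name, ms }));
  return {
    kernelTotalMs,
    // Time not attributed to any kernel (uploads, downloads, JS overhead, etc.)
    unaccountedMs: wallTimeMs - kernelTotalMs,
    untimed,
    slowest,
  };
}
```

With the measurements above, `summarizeKernels(res.kernels, wallTime)` would report how much of the wall time is not attributed to any kernel, which is the part the profiler output does not explain.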
Model in question: https://github.com/vladmandic/nanodet
Environment: TFJS 3.3.0 on Ubuntu 20.10 and Chrome 89
About this issue
- State: closed
- Created 3 years ago
- Comments: 16 (8 by maintainers)
inference time is simple:
in browser:
and in nodejs:
and i’m only measuring subsequent runs, skipping the first run so webgl has time to warm up (compile & upload shaders)
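The measurement described above (skip the first run, time the following ones) could be sketched roughly as follows. This is not the original snippet from the thread: `timeRuns` and its `warmup` and `runs` options are hypothetical names chosen for illustration, and the helper is backend-agnostic so it can wrap any async function.

```javascript
// Hypothetical helper: time an async function, discarding warmup runs.
async function timeRuns(run, { warmup = 1, runs = 5 } = {}) {
  // Use performance.now() when available, otherwise fall back to Date.now().
  const now = () =>
    (typeof performance !== 'undefined' ? performance.now() : Date.now());
  // Warmup runs are executed but not measured (e.g. shader compile & upload).
  for (let i = 0; i < warmup; i++) await run();
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const t0 = now();
    await run();
    samples.push(now() - t0);
  }
  const sorted = [...samples].sort((a, b) => a - b);
  return {
    samples,
    median: sorted[Math.floor(sorted.length / 2)],
    min: sorted[0],
    max: sorted[sorted.length - 1],
  };
}

// Possible usage with TFJS (assumes `tf`, `model` and `input` already exist):
// const stats = await timeRuns(async () => {
//   const out = await model.executeAsync(input);
//   // reading the data back forces the backend to finish before timing stops
//   await Promise.all([].concat(out).map((t) => t.data()));
//   tf.dispose(out);
// });
```

Reading the output tensors inside the timed function is a deliberate choice here: it keeps the stop timestamp from being taken before the GPU work has actually completed.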
for most models, webgl is fast and it’s the best option (since node-gpu doesn’t work due to obsolete cuda dependency from tf1). however, for some models (i’ve provided examples), webgl is 10x slower than nodejs. but running profile() shows nothing useful.