tfjs: running some models using webgl is 10x slower than using nodejs

The model executes in ~100ms in NodeJS using the tensorflow backend, but takes over 1sec in the browser using the WebGL backend. That is a 10x difference in performance between NodeJS and WebGL.

I’ve tried to figure out why the model executes an order of magnitude slower than expected on the WebGL backend,
but the profiler output for it makes no sense:

const t0 = performance.now();
const res = await tf.profile(() => model.executeAsync(input));
const t1 = performance.now();
const wallTime = t1 - t0;
const kernelTime = res.kernels.reduce((a, b) => a + b.kernelTimeMs, 0);

wallTime is 900-1200ms
kernelTime is ~20ms
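Since the summed kernel time accounts for only ~20ms of the ~1s wall time, it can still help to list the slowest kernels from the profile result to confirm nothing dominates. A minimal sketch, assuming each entry in `res.kernels` has `name` and `kernelTimeMs` as used in the reduce() above (the `mock` data here is illustrative, not from the real model):

```javascript
// Summarize a tf.profile() result: top-N kernels by self time.
function topKernels(kernels, n = 5) {
  return [...kernels]
    .sort((a, b) => b.kernelTimeMs - a.kernelTimeMs)
    .slice(0, n)
    .map((k) => `${k.name}: ${k.kernelTimeMs.toFixed(2)} ms`);
}

// Example with mock data (real data would come from res.kernels):
const mock = [
  { name: 'Conv2D', kernelTimeMs: 8.5 },
  { name: 'FusedBatchNorm', kernelTimeMs: 1.2 },
  { name: 'Relu', kernelTimeMs: 0.3 },
];
console.log(topKernels(mock, 2)); // → ['Conv2D: 8.50 ms', 'FusedBatchNorm: 1.20 ms']
```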

I’m re-running inference on the same input twice and looking at the second run only, to allow for the WebGL backend warmup time (shader compilation, etc.)
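The warmup-then-measure pattern can be factored into a small helper. A sketch, where `runFn` is a hypothetical stand-in for `() => model.executeAsync(input)`:

```javascript
// Warm up, then average wall time over several timed runs.
// `runFn` stands in for () => model.executeAsync(input).
async function timeAverage(runFn, { warmupRuns = 1, timedRuns = 5 } = {}) {
  for (let i = 0; i < warmupRuns; i++) await runFn(); // shader compile/upload
  const t0 = performance.now();
  for (let i = 0; i < timedRuns; i++) await runFn();
  return (performance.now() - t0) / timedRuns; // mean ms per run
}
```

`performance.now()` is available globally in browsers and in NodeJS 16+.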

I even tried tf.enableDebugMode() and I still don’t see anything that comes even close to the overall wall time.

So I have no idea where the time is spent.

Model in question: https://github.com/vladmandic/nanodet

Environment: TFJS 3.3.0 on Ubuntu 20.10 and Chrome 89

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 16 (8 by maintainers)

Most upvoted comments

measuring inference time is simple:

in browser:

const t0 = performance.now();
const res = await model.executeAsync(input);
const t1 = performance.now();
const inferenceTime = Math.round(t1 - t0); // elapsed time in ms

and in nodejs:

const t0 = process.hrtime.bigint();
const res = await model.executeAsync(input); // also can be model.predict(input);
const t1 = process.hrtime.bigint();
const inferenceTime = Math.round(Number(t1 - t0) / 1000 / 1000); // convert ns to ms
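one caveat with `process.hrtime.bigint()`: the difference is a BigInt in nanoseconds, and BigInt can’t be mixed with Number arithmetic (e.g. `Math.round`) directly, so it has to be converted explicitly. a small helper, as a sketch:

```javascript
// Convert a BigInt nanosecond delta (from process.hrtime.bigint())
// to milliseconds as a regular Number.
function nsToMs(deltaNs) {
  return Number(deltaNs) / 1e6;
}

const t0 = process.hrtime.bigint();
// ... work being timed ...
const t1 = process.hrtime.bigint();
console.log(`${nsToMs(t1 - t0).toFixed(2)} ms`);
```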

and I’m only measuring subsequent runs, skipping the first run so WebGL has time to warm up (compile & upload shaders)

for most models, WebGL is fast and it’s the best option (since node-gpu doesn’t work due to an obsolete CUDA dependency from TF1). however, for some models (I’ve provided examples), WebGL is 10x slower than NodeJS, and running profile() shows nothing useful.
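when wall time dwarfs kernel time on WebGL, one place the time can hide is outside the kernels themselves, e.g. in the GPU→CPU readback of the result. a hedged sketch that times graph execution and tensor download separately, assuming `model` and `input` are set up as in the snippets above and the model returns a single output tensor:

```javascript
// Separate graph execution from tensor download (GPU -> CPU readback).
// Assumes `model` and `input` are set up as above, single output tensor.
async function timeExecuteAndDownload(model, input) {
  const t0 = performance.now();
  const out = await model.executeAsync(input);
  const t1 = performance.now();
  await out.data(); // forces readback of the result to the CPU
  const t2 = performance.now();
  out.dispose();
  return { executeMs: t1 - t0, downloadMs: t2 - t1 };
}
```

if `downloadMs` dominates, the cost is in readback rather than in the kernels that profile() reports.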