GPflow: Predict call execution time degrades linearly with additional calls

  • OS Platform and Distribution: Linux Ubuntu 17.04
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version: tensorflow-gpu==1.4.0
  • GPflow version: gpflow-1.0.0 (from GitHub master source)
  • cuDNN installed: 6
  • GPU: GeForce GTX 1080 Ti
  • Python version: Python 3.5.3

Issue: Repeated calls to .predict (either .predict_f or .predict_y) take progressively longer: the first call may take around 10 ms, but later calls can take several seconds each. This is a problem when many predict calls are needed. I did not observe this behaviour in GPflow versions prior to 1.0.0. It occurs for at least the regression models (GPR, SGPR, SVGP), and possibly others.

import numpy as np
import matplotlib.pyplot as plt
import gpflow
import time

X = np.linspace(0,10, 100).reshape(-1,1)
y = np.sin(X) + np.random.normal(0, 0.25, size=100).reshape(-1,1)

k = gpflow.kernels.RBF(input_dim=1)
m = gpflow.models.gpr.GPR(X, y, kern=k)
m.compile()
gpflow.train.ScipyOptimizer().minimize(m)

print('Done optimize.')
predict_duration_log = []
for i in range(250):
    start = time.time()
    m.predict_y(X)  # repeated prediction; each call gets slower than the last
    end = time.time()
    predict_duration_log.append(end - start)

plt.plot(predict_duration_log)
plt.show()
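As an aside, time.time() is a coarse wall clock; the same measurement loop can be written with time.perf_counter(), which is monotonic and higher resolution. A sketch (the benchmark helper and the Cholesky stand-in workload are illustrative, not part of GPflow; for the real test the callable would be lambda: m.predict_y(X)):

```python
import time
import numpy as np

def benchmark(fn, n_calls=250):
    """Time n_calls successive calls to fn with a high-resolution clock."""
    durations = []
    for _ in range(n_calls):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return np.array(durations)

# Stand-in workload so the sketch is self-contained.
durations = benchmark(lambda: np.linalg.cholesky(2.0 * np.eye(50)))
print("median: %.6f s, 99.9%%: %.6f s"
      % (np.median(durations), np.percentile(durations, 99.9)))
```

Plotting the returned array the same way as above would show whether per-call cost grows with the call index.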

[Figure: per-call predict_y duration, growing roughly linearly with the number of calls]

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Comments: 18 (16 by maintainers)

Commits related to this issue

Most upvoted comments

Here are the speed results after the improvements I made:

WITHOUT checking whether variables were initialized (sec):

In [9]: woi_pd.describe(percentiles=[0.75,0.9,0.999])
Out[9]:
count    50000.000000
mean         0.002592
std          0.003511
min          0.001846
50%          0.002432
75%          0.002665
90%          0.003011
99.9%        0.005679
max          0.388551
dtype: float64

WITH checking whether variables were initialized (sec):

In [10]: wi_pd.describe(percentiles=[0.75,0.9,0.999])
Out[10]:
count    50000.000000
mean         0.004250
std          0.003326
min          0.003169
50%          0.004069
75%          0.004368
90%          0.004832
99.9%        0.008807
max          0.405657
dtype: float64
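Reading the two tables against each other, the cost of the initialization check is just the difference of the reported means (a quick check using the numbers above):

```python
# Mean per-call duration taken from the two benchmark tables above (seconds).
with_check = 0.004250
without_check = 0.002592

overhead_ms = (with_check - without_check) * 1000
print("init-check overhead: %.2f ms per call" % overhead_ms)  # ~1.66 ms
```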

Let’s vote:

  • +1 leave auto-initializing as is
  • -1 we can live w/o initializing 😃
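The trade-off being voted on is essentially "pay a small check on every call" versus "check once and trust a flag afterwards". A minimal sketch of the check-once pattern (the Predictor class is hypothetical, not GPflow's actual API):

```python
class Predictor:
    """Hypothetical model wrapper that runs its setup check only once."""

    def __init__(self):
        self._initialized = False

    def _ensure_initialized(self):
        # Imagine an expensive check here (e.g. scanning session variables
        # for uninitialized ones); the flag makes later calls a cheap no-op.
        if not self._initialized:
            self._initialized = True

    def predict(self, x):
        self._ensure_initialized()
        return 2 * x

p = Predictor()
print(p.predict(3))  # 6
```

The flag trades safety for speed: if variables are re-created behind the wrapper's back, the cached check is stale, which is exactly why auto-checking on every call is attractive despite the overhead.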

@markvdw, it is not a degradation. Sorry for the confusion - I edited the image above; my previous experiments were wrong. Here are the new graphs:

[Screenshot, 2017-11-24: updated timing graphs]

So there is only a small overhead of 2-3 ms, and it is constant.