GPflow: Issues in GP regression with the Matern Kernel. Is RBF more numerically stable than the Matern Kernel?

I am doing GP regression on some real data with dim(X) = [N, 3] and dim(Y) = [N, 1]. With the RBF kernel it works fine, but with the Matern kernel it raises an error, probably while computing the Cholesky decomposition during optimization. Here is an outline of my code:

import GPflow

m = GPflow.gpr.GPR(X, Y, kern=GPflow.kernels.RBF(3))
m.optimize()


m2 = GPflow.gpr.GPR(X, Y, kern=GPflow.kernels.Matern52(3)+ GPflow.kernels.White(input_dim=3))
m2.optimize()

This is the error that I obtain:

2017-08-21 17:49:05.142629: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-21 17:49:05.142639: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-21 17:49:05.142641: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-08-21 17:49:05.142643: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-21 17:49:05.142645: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-08-21 17:49:05.246520: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-08-21 17:49:05.246789: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties: 
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.86
pciBusID 0000:01:00.0
Total memory: 7.92GiB
Free memory: 7.16GiB
2017-08-21 17:49:05.246798: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0 
2017-08-21 17:49:05.246801: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   Y 
2017-08-21 17:49:05.246806: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
2017-08-21 17:49:06.786071: I tensorflow/core/kernels/cuda_solvers.cc:137] Creating CudaSolver handles for stream 0x510a320
2017-08-21 17:49:11.326938: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
2017-08-21 17:49:13.169500: W tensorflow/core/framework/op_kernel.cc:1158] Invalid argument: Got info = 49 for batch index 0, expected info = 0. Debug_info =potrf
Traceback (most recent call last):
  File "/homes/romeres/.local/share/umake/ide/pycharm/helpers/pydev/pydevd.py", line 1596, in <module>
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/homes/romeres/.local/share/umake/ide/pycharm/helpers/pydev/pydevd.py", line 1023, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/homes/romeres/Projects/1DModelCodes/syncinputs_v2.py", line 177, in <module>
    m2.optimize()
  File "/usr/local/lib/python2.7/dist-packages/GPflow-0.3.8-py2.7.egg/GPflow/model.py", line 223, in optimize
    return self._optimize_np(method, tol, callback, maxiter, **kw)
  File "/usr/local/lib/python2.7/dist-packages/GPflow-0.3.8-py2.7.egg/GPflow/model.py", line 308, in _optimize_np
    options=options)
  File "/usr/lib/python2.7/dist-packages/scipy/optimize/_minimize.py", line 447, in minimize
    callback=callback, **options)
  File "/usr/lib/python2.7/dist-packages/scipy/optimize/lbfgsb.py", line 330, in _minimize_lbfgsb
    f, g = func_and_grad(x)
  File "/usr/lib/python2.7/dist-packages/scipy/optimize/lbfgsb.py", line 278, in func_and_grad
    f = fun(x, *args)
  File "/usr/lib/python2.7/dist-packages/scipy/optimize/optimize.py", line 289, in function_wrapper
    return function(*(wrapper_args + args))
  File "/usr/lib/python2.7/dist-packages/scipy/optimize/optimize.py", line 63, in __call__
    fg = self.fun(x, *args)
  File "/usr/local/lib/python2.7/dist-packages/GPflow-0.3.8-py2.7.egg/GPflow/model.py", line 43, in __call__
    f, g = self._objective(x)
  File "/usr/local/lib/python2.7/dist-packages/GPflow-0.3.8-py2.7.egg/GPflow/model.py", line 161, in obj
    feed_dict=feed_dict)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 789, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 997, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1132, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1152, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Got info = 49 for batch index 0, expected info = 0. Debug_info =potrf
	 [[Node: name.build_likelihood/Cholesky = Cholesky[T=DT_DOUBLE, _device="/job:localhost/replica:0/task:0/gpu:0"](name.build_likelihood/add)]]
	 [[Node: name.build_likelihood/Cholesky/_27 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_394_name.build_likelihood/Cholesky", tensor_type=DT_DOUBLE, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Caused by op u'name.build_likelihood/Cholesky', defined at:
  File "/homes/romeres/.local/share/umake/ide/pycharm/helpers/pydev/pydevd.py", line 1596, in <module>
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/homes/romeres/.local/share/umake/ide/pycharm/helpers/pydev/pydevd.py", line 1023, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/homes/romeres/Projects/1DModelCodes/syncinputs_v2.py", line 177, in <module>
    m2.optimize()
  File "/usr/local/lib/python2.7/dist-packages/GPflow-0.3.8-py2.7.egg/GPflow/model.py", line 223, in optimize
    return self._optimize_np(method, tol, callback, maxiter, **kw)
  File "/usr/local/lib/python2.7/dist-packages/GPflow-0.3.8-py2.7.egg/GPflow/model.py", line 284, in _optimize_np
    self._compile()
  File "/usr/local/lib/python2.7/dist-packages/GPflow-0.3.8-py2.7.egg/GPflow/model.py", line 133, in _compile
    f = self.build_likelihood() + self.build_prior()
  File "/usr/local/lib/python2.7/dist-packages/GPflow-0.3.8-py2.7.egg/GPflow/scoping.py", line 43, in runnable
    return f(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/GPflow-0.3.8-py2.7.egg/GPflow/gpr.py", line 60, in build_likelihood
    L = tf.cholesky(K)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_linalg_ops.py", line 227, in cholesky
    result = _op_def_lib.apply_op("Cholesky", input=input, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Got info = 49 for batch index 0, expected info = 0. Debug_info =potrf
	 [[Node: name.build_likelihood/Cholesky = Cholesky[T=DT_DOUBLE, _device="/job:localhost/replica:0/task:0/gpu:0"](name.build_likelihood/add)]]
	 [[Node: name.build_likelihood/Cholesky/_27 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_394_name.build_likelihood/Cholesky", tensor_type=DT_DOUBLE, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]


Process finished with exit code 1

I obtained this error from the PyCharm console, but running the script from a terminal gives the same error. Line 177 corresponds to m2.optimize().

As you can see, I tried adding white noise to work around possible numerical errors. I also tried constraining the hyperparameters to a fixed range, but the result did not change:

m2.kern.matern52.lengthscales.transform = GPflow.transforms.Logistic(0.1, 10.)
m2.kern.matern52.variance.transform = GPflow.transforms.Logistic(0.1, 10.)
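
To check whether the covariance matrix itself is the problem, it can help to reproduce the factorization outside TensorFlow. Under the usual LAPACK-style convention for potrf, "info = 49" means that the leading minor of order 49 was not positive definite. The following is a minimal NumPy sketch, not GPflow code: `matern52` and `cholesky_report` are hypothetical helper names, the data is synthetic, and the hyperparameter values are assumptions chosen only for illustration.

```python
import numpy as np

def matern52(X, X2, lengthscale=1.0, variance=1.0):
    """Matern 5/2 covariance between the rows of X and X2 (plain NumPy)."""
    d2 = (np.sum(X ** 2, axis=1)[:, None]
          + np.sum(X2 ** 2, axis=1)[None, :]
          - 2.0 * X.dot(X2.T))
    r = np.sqrt(np.maximum(d2, 0.0)) / lengthscale  # clip round-off negatives
    s = np.sqrt(5.0) * r
    return variance * (1.0 + s + s ** 2 / 3.0) * np.exp(-s)

def cholesky_report(K, noise_variance):
    """Try to factor K + noise_variance * I; return (succeeded, smallest eigenvalue)."""
    A = K + noise_variance * np.eye(K.shape[0])
    min_eig = np.linalg.eigvalsh(A).min()
    try:
        np.linalg.cholesky(A)
        ok = True
    except np.linalg.LinAlgError:
        ok = False
    return ok, min_eig

rng = np.random.RandomState(0)
X = rng.rand(60, 3)   # synthetic stand-in for the real [N, 3] inputs
X[10] = X[3]          # a repeated input makes K numerically singular
K = matern52(X, X)

for noise_variance in (0.0, 1e-6, 1e-2):
    ok, min_eig = cholesky_report(K, noise_variance)
    print("noise=%g  cholesky_ok=%s  min_eig=%.3e" % (noise_variance, ok, min_eig))
```

If the smallest eigenvalue is near zero or negative for the hyperparameters at which the optimizer fails, the matrix is too ill-conditioned for the noise level in use. If it is comfortably positive, the failure more likely comes from NaN values entering the matrix during optimization.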

Am I missing something or is there a problem in the Matern kernel implementation? Thank you in advance.

About this issue

State: closed
Created 7 years ago
Reactions: 1
Comments: 23 (13 by maintainers)

Most upvoted comments

Have you tried the SafeMatern with e.g. 1e-6?

hughsalimbeni on Aug 24, 2017
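
The comment does not say what SafeMatern refers to, so the following is a guess and not GPflow's API. One plausible reading is a Matern kernel in which a small constant (here 1e-6) is added to the squared distance before taking the square root, so that the derivative of the square root stays bounded where two inputs coincide. The name `matern52_eps` and its `eps` parameter are hypothetical, and the sketch is plain NumPy.

```python
import numpy as np

def matern52_eps(X, X2, lengthscale=1.0, variance=1.0, eps=1e-6):
    """Matern 5/2 with eps added to the squared distance before the sqrt.

    Hypothetical illustration; not a GPflow class or function.
    """
    d2 = (np.sum(X ** 2, axis=1)[:, None]
          + np.sum(X2 ** 2, axis=1)[None, :]
          - 2.0 * X.dot(X2.T))
    r = np.sqrt(np.maximum(d2, 0.0) + eps) / lengthscale
    s = np.sqrt(5.0) * r
    return variance * (1.0 + s + s ** 2 / 3.0) * np.exp(-s)

def sqrt_slope_at_zero(eps):
    """Derivative of sqrt(x + eps) at x = 0, which is 1 / (2 * sqrt(eps))."""
    return 0.5 / np.sqrt(eps)

X = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0]])
K = matern52_eps(X, X)
print(K)
print(sqrt_slope_at_zero(1e-6), sqrt_slope_at_zero(1e-12))
```

The trade-off is that the diagonal of the matrix ends up slightly below the kernel variance, because the distance between a point and itself is no longer exactly zero.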