keops: Memory leak in Pykeops 2.2 not present in 2.1.2
I have a convolutional layer implementing a convolution over a point cloud using PyKeOps. With v2.1.2 this all works fine; with v2.2 it causes a memory leak that eventually crashes the training run.
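For reports like this, a quick way to confirm host-side growth (a generic sketch, not from the issue; the `step` callable stands in for one training iteration) is the stdlib `tracemalloc` module:

```python
# Hypothetical leak check (not from the issue): run the training step in a
# loop and compare Python-heap snapshots with the stdlib tracemalloc module.
import gc
import tracemalloc


def leaks(step, iters=50, warmup=5):
    """Return bytes of Python-heap growth across `iters` calls to `step`."""
    for _ in range(warmup):  # let caches and compiled kernels warm up first
        step()
    gc.collect()
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    for _ in range(iters):
        step()
    gc.collect()
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # Sum of per-line allocation deltas between the two snapshots:
    return sum(s.size_diff for s in after.compare_to(before, "lineno"))
```

For GPU-side leaks, polling `torch.cuda.memory_allocated()` across iterations serves the same purpose.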
```python
from math import sqrt

from torch import Tensor, nn
from pykeops.torch import LazyTensor


class ConvLayer(nn.Module):
    def __init__(
        self,
        in_channels: int,
        hidden_units: int,
        out_channels: int,
        radius: float,
    ):
        """
        Creates the KeOps convolution layer.

        Args:
            in_channels: dimension of input features per point.
            hidden_units: number of hidden units per point.
            out_channels: dimension of output features per point.
            radius: deviation of the Gaussian window on the
                quasi-geodesic distance `d_ij`.
        """
        super().__init__()
        self.radius = radius
        # 3D convolution filters, encoded as an MLP:
        self.conv = nn.Sequential(
            nn.Linear(3, hidden_units),
            nn.ReLU(),
            nn.Linear(hidden_units, out_channels),
        )

    def forward(
        self, points: Tensor, nuv: Tensor, features: Tensor, ranges=None, batch=None
    ):
        """
        points, local basis, in features -> out features
        (N, 3), (N, 3, 3), (N, I) -> (N, O)

        Args:
            points (Tensor): (N,3) point coordinates `x_i`.
            nuv (Tensor): (N,3,3) local coordinate systems `[n_i,u_i,v_i]`.
            features (Tensor): (N,I) input feature vectors `f_i`.

        Returns:
            (Tensor): (N,O) output feature vectors `f'_i`.
        """
        # Normalize the kernel radius:
        points = points / (sqrt(2.0) * self.radius)  # (N, 3)

        # Vertices:
        x_i = LazyTensor(points[:, None, :].contiguous())  # (N, 1, 3)
        x_j = LazyTensor(points[None, :, :].contiguous())  # (1, N, 3)

        normals = nuv[:, 0, :].contiguous().detach()
        # Local bases:
        nuv_i = LazyTensor(nuv.view(-1, 1, 9))  # (N, 1, 9)
        # Normals:
        n_i = nuv_i[:3]  # (N, 1, 3)
        n_j = LazyTensor(normals[None, :, :].contiguous())  # (1, N, 3)

        # Pseudo-geodesic squared distance:
        d2_ij = ((x_j - x_i) ** 2).sum(-1) * ((2 - (n_i | n_j)) ** 2)  # (N, N, 1)
        # Gaussian window:
        window_ij = (-d2_ij).exp()  # (N, N, 1)

        # Local coordinates:
        X_ij = nuv_i.matvecmult(x_j - x_i)

        A_1, B_1 = self.conv[0].weight, self.conv[0].bias
        A_2, B_2 = self.conv[2].weight, self.conv[2].bias
        a_1 = LazyTensor(A_1.view(1, 1, -1))  # (1, 1, C*3)
        b_1 = LazyTensor(B_1.view(1, 1, -1))  # (1, 1, C)
        a_2 = LazyTensor(A_2.view(1, 1, -1))  # (1, 1, Hd*C)
        b_2 = LazyTensor(B_2.view(1, 1, -1))  # (1, 1, Hd)

        # MLP:
        X_1 = a_1.matvecmult(X_ij) + b_1  # (N, N, C)
        X_2 = X_1.relu()  # (N, N, C)
        X_3 = a_2.matvecmult(X_2) + b_2  # (N, N, Hd)
        X_4 = X_3.relu()

        f_j = LazyTensor(features[None, :, :].contiguous())
        F_ij = window_ij * X_4 * f_j
        conv_features = F_ij.sum(dim=1)
        return conv_features
```
Memory usage over training, version 2.2 vs. version 2.1.2: [plots not reproduced here]
About this issue
- Original URL
- State: closed
- Created 5 months ago
- Comments: 18 (11 by maintainers)
Commits related to this issue
- use ctx.save_for_forward, should fix issue #353 — committed to getkeops/keops by joanglaunes 5 months ago
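The fix swaps direct attribute storage for `ctx.save_for_forward`. A hedged, PyTorch-free illustration of the failure class: when a context holds a hard reference to a buffer and the graph node holds the context back, the buffer's lifetime depends on the cyclic garbage collector rather than plain reference counting (the names here are stand-ins, not KeOps internals):

```python
# Stdlib sketch (not KeOps code) of the failure class the fix targets:
# ctx -> node -> ctx is a reference cycle, so a large buffer saved as a
# plain ctx attribute only dies when the cyclic collector runs, not when
# the last user drops it.
import gc
import weakref


class Ctx:  # stands in for an autograd Function's ctx
    pass


class Node:  # stands in for a tensor's grad_fn
    def __init__(self, ctx):
        self.ctx = ctx


def make_cycle():
    ctx = Ctx()
    ctx.saved = bytearray(10_000_000)  # "saving the tensor on ctx"
    ctx.node = Node(ctx)               # node -> ctx -> node: a cycle
    return weakref.ref(ctx)


gc.disable()
try:
    ref = make_cycle()
    alive_without_gc = ref() is not None  # cycle keeps ctx (and 10 MB) alive
    gc.collect()
    alive_after_gc = ref() is not None    # only the cyclic collector frees it
finally:
    gc.enable()
```

`save_for_backward`/`save_for_forward` hand saved tensors to the autograd engine instead of pinning them on `ctx`, which avoids this kind of cycle and lets reference counting release them promptly.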
Version v2.2.1 should fix the issue. Thank you, everyone, for your effort.
After running git-bisect, I found commit b1b304d8 to be the turning point where the leak appears, and more precisely c2d5a372.