tensorflow: Tensorflow 2.0 is much slower than pytorch for large matrix assignment
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 18.04
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
- TensorFlow installed from (source or binary): binary
- TensorFlow version (use command below): tensorflow 2.0 beta
- Python version: 3.6
- Bazel version (if compiling from source):
- GCC/Compiler version (if compiling from source):
- CUDA/cuDNN version: CUDA 10.0/cuDNN 7
- GPU model and memory: GeForce RTX 2080 Ti
Describe the current behavior
In one of my research projects, I need to assign values to large sparse matrices. Since TensorFlow tensors do not support in-place value assignment/update to parts of a matrix, I have to use a lot of tf.stack()/tf.concat() calls.
I compared the same function implemented in TensorFlow 2.0 and PyTorch 1.1.0, timing a single call of propagate() in each. The TensorFlow version was much slower, by a factor of about 48. The execution times are listed below:
- pytorch: 0.0036 secs
- tf 2.0: 0.1734 secs
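A caveat on these numbers: each one is a single call timed with time.time(), so it may include one-time start-up costs (for example device initialization on the first operation) in addition to the cost of the computation itself. A minimal sketch of a steadier measurement follows. The helper name `time_steady_state` and its parameters are hypothetical and not part of TensorFlow or PyTorch.

```python
import time


def time_steady_state(fn, warmup=3, repeats=100):
    """Return the mean wall-clock seconds per call of fn, measured after warm-up.

    Hypothetical helper for illustration only.
    """
    # Warm-up calls absorb one-time costs such as lazy initialization.
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    # Stand-in workload. In the scripts from this issue the callable would be
    # lambda: propagate(Rot, v, p, g)
    mean_secs = time_steady_state(lambda: sum(range(1000)))
    print("Mean time per call: {:.6f} secs".format(mean_secs))
```

Reporting both the first-call time and the steady-state mean would show how much of the gap comes from start-up cost and how much from the matrix construction itself.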
Describe the expected behavior
Is there a way to optimize the TensorFlow code to have comparable performance?
Code to reproduce the issue
## Tensorflow 2.0
import tensorflow as tf
import time
import numpy as np

def skew(x):
    # Build the 3x3 skew-symmetric (cross-product) matrix of a 3-vector.
    x = tf.reshape(x, [-1, 1])
    z1 = tf.stack([tf.zeros(1), -x[2], x[1]], 1)
    z2 = tf.stack([x[2], tf.zeros(1), -x[0]], 1)
    z3 = tf.stack([-x[1], x[0], tf.zeros(1)], 1)
    X = tf.concat([z1, z2, z3], 0)
    return X

def propagate(Rot, v, p, g):
    v_skew_rot = tf.matmul(skew(v), Rot)
    p_skew_rot = tf.matmul(skew(p), Rot)

    # F (21x21): assemble the top 9 rows from column blocks, then pad with zeros.
    F0 = tf.zeros([3, 3])
    F1 = tf.concat([F0, skew(g), F0], 0)
    F2 = tf.concat([F0, F0, tf.eye(3)], 0)
    F3 = tf.zeros([9, 3])
    F4 = tf.concat([-Rot, -v_skew_rot, -p_skew_rot], 0)
    F5 = tf.concat([F0, -Rot, F0], 0)
    F6 = tf.zeros([9, 6])
    F7 = tf.concat([F1, F2, F3, F4, F5, F6], 1)
    F = tf.concat([F7, tf.zeros([12, 21])], 0)

    # G (21x18): assemble from three column blocks.
    G0 = tf.zeros([3, 12])
    G1 = tf.concat([Rot, v_skew_rot, p_skew_rot, tf.zeros([12, 3])], 0)
    G2 = tf.concat([F0, Rot, F0, tf.zeros([12, 3])], 0)
    G3 = tf.concat([G0, G0, G0, tf.eye(12)], 0)
    G = tf.concat([G1, G2, G3], 1)
    return F, G

if __name__ == '__main__':
    Rot = tf.eye(3)
    v = np.array([0.5, 0, 0], dtype=np.float32)
    p = np.array([1.5, 0, 0], dtype=np.float32)
    g = np.array([0, 0, -9.80655], dtype=np.float32)
    # Time a single call of propagate().
    start = time.time()
    P = propagate(Rot, v, p, g)
    print("Propagate function takes {} secs".format(time.time() - start))
## pytorch 1.1.0
import torch
import time
import numpy as np

def skew(x):
    # Build the 3x3 skew-symmetric (cross-product) matrix of a 3-vector.
    X = torch.Tensor([[0, -x[2], x[1]],
                      [x[2], 0, -x[0]],
                      [-x[1], x[0], 0]])
    return X

def propagate(Rot_prev, v_prev, p_prev, g):
    # Allocate F and G once, then fill the non-zero blocks in place.
    F = torch.zeros(21, 21)
    G = torch.zeros(21, 18)
    v_skew_rot = skew(v_prev).mm(Rot_prev)
    p_skew_rot = skew(p_prev).mm(Rot_prev)

    F[:3, 9:12] = -Rot_prev
    F[3:6, :3] = skew(g)
    F[6:9, 3:6] = torch.eye(3)
    F[3:6, 12:15] = -Rot_prev
    F[3:6, 9:12] = -v_skew_rot
    F[6:9, 9:12] = -p_skew_rot

    G[:3, :3] = Rot_prev
    G[3:6, 3:6] = Rot_prev
    G[3:6, :3] = v_skew_rot
    G[6:9, :3] = p_skew_rot
    G[9:12, 6:9] = torch.eye(3)
    G[12:15, 9:12] = torch.eye(3)
    G[15:18, 12:15] = torch.eye(3)
    G[18:21, 15:18] = torch.eye(3)
    return F, G

if __name__ == '__main__':
    Rot = torch.eye(3)
    v = np.array([0.5, 0, 0], dtype=np.float32)
    p = np.array([1.5, 0, 0], dtype=np.float32)
    g = np.array([0, 0, -9.80655], dtype=np.float32)
    # Time a single call of propagate().
    start = time.time()
    P = propagate(Rot, v, p, g)
    print("Propagate function takes {} secs".format(time.time() - start))
About this issue
- State: closed
- Created 5 years ago
- Comments: 17 (6 by maintainers)