pyk4a: Error with transformations in recorded file (playback): color frame not BGRA32

Hello! I was trying to run a few of the examples, such as /examples/viewer_point_cloud.py, but with pre-recorded .mkv files instead of in a “live” session with a device attached. However, an exception was thrown because the .mkv file was recorded in MJPG format, whereas the transformation functions need a BGRA32-formatted color frame.

I ended up fixing this issue myself and am adding the solution here for anyone in a similar situation, because even though it was incredibly simple to solve, it did take me a little while. In my own fork (4 most recent commits), I ended up adding a bool parameter called force_bgra, and I made the pyk4a.capture.color property getter convert the color frame to BGRA32 as necessary. Now, I can create a PyK4APlayback object, passing force_bgra=True, then run the rest of viewer_point_cloud.py without error:

import numpy as np
import matplotlib.pyplot as plt
from pyk4a import PyK4APlayback

# Open recording/playback
playback = PyK4APlayback("./recording.mkv", force_bgra=True)
playback.open()

# Get first frame with depth and color
while True:
    capture = playback.get_next_capture()
    if np.any(capture.depth) and np.any(capture.color):
        break
points = capture.depth_point_cloud.reshape((-1, 3))
colors = capture.transformed_color[..., (2, 1, 0)].reshape((-1, 3))

# Plot point cloud with color
fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.scatter(
    points[:, 0], points[:, 1], points[:, 2], s=1, c=colors / 255,
)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("z")
ax.set_xlim(-2000, 2000)
ax.set_ylim(-2000, 2000)
ax.set_zlim(0, 4000)
ax.view_init(elev=-90, azim=-90)
plt.show()

# Close recording/playback
playback.close()
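One detail worth calling out in the snippet above: pyk4a returns pixels in BGRA channel order, while matplotlib expects RGB, which is what the `[..., (2, 1, 0)]` indexing handles – it reverses the first three channels and drops alpha. A minimal numpy illustration of that reordering:

```python
import numpy as np

# A single BGRA pixel: blue=10, green=20, red=30, alpha=255
bgra = np.array([[[10, 20, 30, 255]]], dtype=np.uint8)

# Take channels (2, 1, 0) -> red, green, blue; alpha is dropped
rgb = bgra[..., (2, 1, 0)]
print(rgb[0, 0])  # -> [30 20 10]
```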

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 19 (1 by maintainers)

Most upvoted comments

Thanks for looking at this, Shagren. rajkundu, thanks for taking the time to share and explain your point.

I agree with rajkundu that we should implement this. Enabling these operations on playback is good to have and can be really useful. Playback of lossless recordings is also required for many use cases; personally, I would never have recorded in a lossy format for my research.

With that being said, I did not look at the code yet and my understanding of the playback code is limited.

We should make a PR – ideally with some tests, though those can come in a future PR since you confirmed it works.

I realized after looking back at my code that it is perhaps not the best solution to this problem. While I found a flag such as force_bgra=True convenient, behind the scenes it would require pyk4a to depend on cv2/OpenCV just for this one feature – which is a downright terrible idea.

Instead, if anyone has this same issue (your recording is in MJPG format, but pyk4a functions require BGRA32 format), I would suggest manually overwriting the PyK4ACapture object’s private attributes as follows:

import cv2
import pyk4a

playback = pyk4a.PyK4APlayback('1662056463_Left.mkv')
playback.open()

while True:
	# Get next capture
	try:
		capture = playback.get_next_capture()
	except EOFError:
		break

	# Convert MJPG color image to BGRA32
	capture._color = cv2.cvtColor(cv2.imdecode(capture.color, cv2.IMREAD_COLOR), cv2.COLOR_BGR2BGRA)
	capture._color_format = pyk4a.ImageFormat.COLOR_BGRA32
	
	# Now we can use the capture object as if it were originally recorded in BGRA32 format!
	capture.transformed_color  # e.g., this property now works without error

playback.close()

Perhaps it’s worth officially documenting this workaround somewhere, or perhaps not – it seems like nobody else has really encountered these issues. Hope this helps anyone who needs it, though!

I ended up rewriting my application in C++, and everything works perfectly fine; I can record BGRA32 captures at 30 FPS with no memory leaks – I’m still not sure why I had a leak when using pyk4a.

@lpasselin Of course! Glad to help.

One question: I would also love to record raw color data, but based on the following table:

| RGB Camera Resolution (HxV) | Aspect Ratio | Format Options | Frame Rates (FPS) | Nominal FOV (HxV) (post-processed) |
| --- | --- | --- | --- | --- |
| 3840x2160 | 16:9 | MJPEG | 0, 5, 15, 30 | 90°x59° |
| 2560x1440 | 16:9 | MJPEG | 0, 5, 15, 30 | 90°x59° |
| 1920x1080 | 16:9 | MJPEG | 0, 5, 15, 30 | 90°x59° |
| 1280x720 | 16:9 | MJPEG/YUY2/NV12 | 0, 5, 15, 30 | 90°x59° |
| 4096x3072 | 4:3 | MJPEG | 0, 5, 15 | 90°x74.3° |
| 2048x1536 | 4:3 | MJPEG | 0, 5, 15, 30 | 90°x74.3° |

taken from this page, it seems to me that the Azure Kinect DK isn’t really capable of doing so – it appears that the device compresses the raw color frame as MJPG before sending it to the host computer. This compressed MJPG color frame can then be decompressed into BGRA32 if needed, but the original raw data has already been lost to the lossy MJPG compression.

A note below the table states:

The Sensor SDK can provide color images in the BGRA pixel format. This is not a native mode supported by the device and causes additional CPU load when used. The host CPU is used to convert from MJPEG images received from the device.

So, when you read a BGRA32 image from the SDK, my understanding is that it is not actually raw, lossless sensor data, but rather data that was compressed to MJPG on the device and then decompressed, with loss, back into a fixed-size BGRA32 array on the host. What do you think? What is your interpretation of this table and the rest of the device specifications?
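That reading also matches a quick back-of-the-envelope calculation (my own arithmetic, not from the spec): streaming uncompressed BGRA32 at the top 16:9 resolution would need roughly a gigabyte per second, which would presumably explain why the device compresses on-board before sending frames to the host:

```python
width, height = 3840, 2160   # top 16:9 resolution from the table
bytes_per_pixel = 4          # BGRA32: one byte each for B, G, R, A
fps = 30

bytes_per_frame = width * height * bytes_per_pixel
bandwidth_mb_s = bytes_per_frame * fps / 1e6
print(f"{bytes_per_frame / 1e6:.1f} MB/frame, {bandwidth_mb_s:.0f} MB/s")
# -> 33.2 MB/frame, 995 MB/s
```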
