cglm: Bug in glm_mat4_inv_sse2

PREFACE: I completely lack both the linear algebra and SIMD skills to fix this; I merely note the issue’s existence!

I was calculating the inverse transpose of the model matrix to properly transform normals for my lighting shader, and noted that the inverse transpose was wrong when I calculated on the CPU with cglm. I compared the results with calculating it in the shader (OpenGL) and in GLM, and found that the issue occurred in glm_mat4_inv, when using SSE (thus following the glm_mat4_inv_sse2 path.

When I manually commented out the SSE path and forced the non-SIMD path in glm_mat4_inv, the calculations came out correct, matching OpenGL shader and GLM results.

Here is the code I used to calculate a sample model matrix, and its inverse transpose, along with the code to print out resulting matrices at each step (NOTE: I’ve flipped/corrected? the mat4 printing (#288), so the matrices print out column-major, to match glm’s output which I was comparing. I also increased precision to 6 decimal places, again to match glm’s output):

mat4 model;
mat4 inverse;
mat4 transpose;

// Create an identity matrix for the model
glm_mat4_identity(model);
fprintf(stderr, "Identity matrix:\n");
glm_mat4_print(model, stderr);

// Translate the model matrix
glm_translate(model, (vec3){2.0f, 5.0f, -15.0f});
fprintf(stderr, "Translated matrix:\n");
glm_mat4_print(model, stderr);

// Apply a rotation to the model matrix in the x and y dimensions.
glm_rotate(model, glm_rad(45.0f), (vec3){1.0f, 0.5f, 0.0f});
fprintf(stderr, "Model matrix:\n");
glm_mat4_print(model, stderr);

// Invert the matrix, and store in inverse
glm_mat4_inv(model, inverse);
fprintf(stderr, "Inverse matrix:\n");
glm_mat4_print(inverse, stderr);

// Transpose the inverse matrix, and store in transpose
glm_mat4_transpose_to(inverse, transpose);
fprintf(stderr, "Transpose matrix:\n");
glm_mat4_print(transpose, stderr);

And here are the resulting values.

Identity matrix:
  |  1.000000  0.000000  0.000000  0.000000  |
  |  0.000000  1.000000  0.000000  0.000000  |
  |  0.000000  0.000000  1.000000  0.000000  |
  |  0.000000  0.000000  0.000000  1.000000  |
Translated matrix:
  |  1.000000  0.000000  0.000000   0.000000  |
  |  0.000000  1.000000  0.000000   0.000000  |
  |  0.000000  0.000000  1.000000   0.000000  |
  |  2.000000  5.000000 -15.000000   1.000000  |
Model matrix:
  |  0.941421  0.117157 -0.316228   0.000000  |
  |  0.117157  0.765685  0.632456   0.000000  |
  |  0.316228 -0.632456  0.707107   0.000000  |
  |  2.000000  5.000000 -15.000000   1.000000  |
Inverse matrix:
  |  0.967995  0.120464  0.325154  -0.000000  |
  | -0.120464 -0.787298  0.650308  -0.000000  |
  | -0.325154  0.650308  0.727066  -0.000000  |
  |  7.415617 -5.577195 -13.507222  -1.028227  |
Transpose matrix:
  |  0.967995 -0.120464  -0.325154  7.415617  |
  |  0.120464 -0.787298   0.650308 -5.577195  |
  |  0.325154  0.650308   0.727066 -13.507222  |
  | -0.000000 -0.000000  -0.000000 -1.028227  |

When I forced the non-SIMD path, I got the following results:

Identity matrix:
  |  1.000000  0.000000  0.000000  0.000000  |
  |  0.000000  1.000000  0.000000  0.000000  |
  |  0.000000  0.000000  1.000000  0.000000  |
  |  0.000000  0.000000  0.000000  1.000000  |
Translated matrix:
  |  1.000000  0.000000  0.000000   0.000000  |
  |  0.000000  1.000000  0.000000   0.000000  |
  |  0.000000  0.000000  1.000000   0.000000  |
  |  2.000000  5.000000 -15.000000   1.000000  |
Model matrix:
  |  0.941421  0.117157 -0.316228   0.000000  |
  |  0.117157  0.765685  0.632456   0.000000  |
  |  0.316228 -0.632456  0.707107   0.000000  |
  |  2.000000  5.000000 -15.000000   1.000000  |
Inverse matrix:
  |  0.941421  0.117157  0.316228  -0.000000  |
  |  0.117157  0.765685 -0.632456   0.000000  |
  | -0.316228  0.632456  0.707107  -0.000000  |
  | -7.212046  5.424091  13.136425   1.000000  |
Transpose matrix:
  |  0.941421  0.117157  -0.316228 -7.212046  |
  |  0.117157  0.765685   0.632456  5.424091  |
  |  0.316228 -0.632456   0.707107  13.136425  |
  | -0.000000  0.000000  -0.000000  1.000000  |

For comparison, here’s the glm code I used:

// Create an identity matrix for the model.
glm::mat4 model = glm::mat4(1.0f);
fprintf(stderr, "Identity matrix:\n");
fprintf(stderr, "%s\n", glm::to_string(model).c_str());

// Translate the matrix
model = glm::translate(model, glm::vec3(2.0f, 5.0f, -15.0f));
fprintf(stderr, "Translated matrix:\n");
fprintf(stderr, "%s\n", glm::to_string(model).c_str());

// Rotate the matrix along the x and y axes
model = glm::rotate(model, glm::radians(45.0f), glm::vec3(1.0f, 0.5f, 0.0f));
fprintf(stderr, "Model matrix:\n");
fprintf(stderr, "%s\n", glm::to_string(model).c_str());

// Invert the model matrix
glm::mat4 inverse = glm::inverse(model);
fprintf(stderr, "Inverse matrix:\n");
fprintf(stderr, "%s\n", glm::to_string(inverse).c_str());

// Transpose the inverted matrix.
glm::mat4 transpose = glm::transpose(inverse);
fprintf(stderr, "Transpose matrix:\n");
fprintf(stderr, "%s\n", glm::to_string(transpose).c_str());

And here are the glm results (which match the non-SIMD path of glm_mat4_inv, slight formatting differences notwithstanding):

Identity matrix:
  |  1.000000, 0.000000, 0.000000, 0.000000 |
  |  0.000000, 1.000000, 0.000000, 0.000000 |
  |  0.000000, 0.000000, 1.000000, 0.000000 |
  |  0.000000, 0.000000, 0.000000, 1.000000 |
Translated matrix:
  |  1.000000, 0.000000, 0.000000, 0.000000 |
  |  0.000000, 1.000000, 0.000000, 0.000000 |
  |  0.000000, 0.000000, 1.000000, 0.000000 |
  |  2.000000, 5.000000, -15.000000, 1.000000 |
Model matrix:
  |  0.941421, 0.117157, -0.316228, 0.000000 |
  |  0.117157, 0.765685, 0.632456, 0.000000 |
  |  0.316228, -0.632456, 0.707107, 0.000000 |
  |  2.000000, 5.000000, -15.000000, 1.000000 |
Inverse matrix:
  |  0.941421, 0.117157, 0.316228, -0.000000 |
  |  0.117157, 0.765685, -0.632456, 0.000000 |
  | -0.316228, 0.632456, 0.707107, -0.000000 |
  | -7.212046, 5.424091, 13.136425, 1.000000 |
Transpose matrix:
  |  0.941421, 0.117157, -0.316228, -7.212046 |
  |  0.117157, 0.765685, 0.632456, 5.424091 |
  |  0.316228, -0.632456, 0.707107, 13.136425 |
  | -0.000000, 0.000000, -0.000000, 1.000000 |

So pretty clearly, something goes wonky and wrong in the SSE path for glm_mat4_inv.

About this issue

Original URL
State: closed
Created a year ago
Comments: 18 (9 by maintainers)

Most upvoted comments

@deadwanderer @gottfriedleibniz many thanks, that PR is merged.

recp on Apr 22, 2023

@gottfriedleibniz I created a PR for this, it would be nice to get a review: https://github.com/recp/cglm/pull/291

@deadwanderer is it possible to test that PR with /fp:fast to validate that is working as expected (previous tests)?

thanks

recp on Apr 21, 2023