irods: Tests are failing with `RE_UNABLE_TO_READ_SESSION_VAR`

  • main
  • 4-3-stable

Bug Report

iRODS Version, OS and Version

main/4-3-stable (haven’t seen this in 4-2-stable)

What did you try to do?

Run the test suite

Expected behavior

Consistently passing/failing tests

Observed behavior (including steps to reproduce, if applicable)

Tests that should pass are failing intermittently. The following output is the common thread across all of the spurious failures:

 --- IrodsSession: icommand executed by [rods#tempZone] [iadmin mkuser otherrods rodsadmin] --- 
Assert Command: iadmin mkuser otherrods rodsadmin
Expecting EMPTY: ['']
  stdout:
    | Level 0: DEBUG: error: unable to read session variable $rodsZoneProxy.
    | line 23, col 27, rule base core
    |   acCreateCollByAdmin("/"++$rodsZoneProxy++"/home", $otherUserName);
    |                            ^
    | 
    | ==========
    | error: unable to read session variable $otherUserName.
    | line 13, col 4, rule base core
    |  ON($otherUserName == "anonymous") {
    |     ^
    | 
    | 
  stderr:
    | remote addresses: 172.23.0.6 ERROR: rcGeneralAdmin failed with error -1213000 RE_UNABLE_TO_READ_SESSION_VAR 
Unexpected output on stderr

FAILED TESTING ASSERTION

Please investigate.

About this issue

  • State: open
  • Created a year ago
  • Comments: 16 (12 by maintainers)

Most upvoted comments

I haven’t run the test suite in a while, so I also haven’t seen this in a while. I’ll report back once I run it through a few times. I’m not convinced that this issue is gone.

The error appears to be caused by the /dev/shm partition filling up in the testing environment's containers, because the files in /dev/shm are sometimes not cleaned up. When the new RE cache file is created, there is not enough space left, so the result is a 0-length file, and this error occurs. I was able to reliably reproduce that part of the issue.
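To check whether a container is in the state described above, something like the following could be run inside it. This is only a minimal sketch: the function name `inspect_shm` is made up for illustration, and it does not assume anything about how iRODS names its shared memory files. It reports the free space on the filesystem holding the directory and lists any zero-length regular files in it.

```python
import os
import shutil


def inspect_shm(path="/dev/shm"):
    """Return (free_bytes, zero_length_file_names) for `path`.

    Hypothetical helper, not part of iRODS or its test suite.
    """
    # Free space on the filesystem that contains `path`.
    free_bytes = shutil.disk_usage(path).free

    # Names of regular files directly under `path` whose size is 0.
    zero_length = []
    with os.scandir(path) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue
            if entry.stat(follow_symlinks=False).st_size == 0:
                zero_length.append(entry.name)

    return free_bytes, sorted(zero_length)
```

If `inspect_shm()` reports little or no free space together with zero-length files, that would be consistent with the explanation above, though it does not by itself show why the old files were left behind.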

I am still unsure why the removal of the shared memory fails in the first place, so I am still investigating.

I’ve captured a container which is exhibiting this behavior…

As in https://github.com/irods/irods/issues/4675 and https://github.com/irods/irods/issues/6521, /dev/shm is sometimes not cleaned up on server shutdown. When the contents of that directory are cleared and the server is started, everything works fine again. So I think all of these are caused by the same underlying issue.
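As a stopgap in a test container, the manual workaround described above (clear the directory, then start the server) could be scripted roughly as follows. This is a sketch under assumptions: `clear_directory` is a hypothetical helper, it removes every regular file directly under the given path rather than only iRODS files, and it defaults to a dry run because /dev/shm can also hold shared memory belonging to other processes. It should only be pointed at /dev/shm while the server is stopped.

```python
import os


def clear_directory(path, dry_run=True):
    """Remove regular files directly under `path`.

    Returns the sorted names of the files that were removed, or that
    would be removed when dry_run is True. Hypothetical helper, not
    part of iRODS or its test suite.
    """
    # Collect the targets first so nothing is deleted while iterating.
    with os.scandir(path) as entries:
        targets = [e.path for e in entries if e.is_file(follow_symlinks=False)]

    if not dry_run:
        for target in targets:
            os.unlink(target)

    return sorted(os.path.basename(t) for t in targets)
```

Calling `clear_directory("/dev/shm")` first shows what would be deleted; passing `dry_run=False` performs the deletion. This only works around the symptom and does not address why the cleanup on shutdown is being skipped.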

More to follow Soon™