rook: Ceph mgr test intermittently fails with no nodes or devices available
Is this a bug report or feature request?
- Bug Report
Deviation from expected behavior: The CephMgrSuite intermittently fails with errors such as the following:
--- FAIL: TestCephMgrSuite (238.55s)
--- FAIL: TestCephMgrSuite/TestCreateOSD (0.85s)
ceph_mgr_test.go:195:
Error Trace: ceph_mgr_test.go:195
Error: Expected nil, but got: &exec.ExitError{ProcessState:(*os.ProcessState)(0xc00074d060), Stderr:[]uint8{0x45, 0x72, 0x72, 0x6f, 0x72, 0x20, 0x45, 0x4e, 0x4f, 0x45, 0x4e, 0x54, 0x3a, 0x20, 0x4e, 0x6f, 0x20, 0x6f, 0x72, 0x63, 0x68, 0x65, 0x73, 0x74, 0x72, 0x61, 0x74, 0x6f, 0x72, 0x20, 0x63, 0x6f, 0x6e, 0x66, 0x69, 0x67, 0x75, 0x72, 0x65, 0x64, 0x20, 0x28, 0x74, 0x72, 0x79, 0x20, 0x60, 0x63, 0x65, 0x70, 0x68, 0x20, 0x6f, 0x72, 0x63, 0x68, 0x20, 0x73, 0x65, 0x74, 0x20, 0x62, 0x61, 0x63, 0x6b, 0x65, 0x6e, 0x64, 0x60, 0x29, 0xa, 0x63, 0x6f, 0x6d, 0x6d, 0x61, 0x6e, 0x64, 0x20, 0x74, 0x65, 0x72, 0x6d, 0x69, 0x6e, 0x61, 0x74, 0x65, 0x64, 0x20, 0x77, 0x69, 0x74, 0x68, 0x20, 0x65, 0x78, 0x69, 0x74, 0x20, 0x63, 0x6f, 0x64, 0x65, 0x20, 0x32, 0xa}}
Test: TestCephMgrSuite/TestCreateOSD
ceph_mgr_test.go:201:
Error Trace: ceph_mgr_test.go:201
Error: Expected nil, but got: &json.SyntaxError{msg:"invalid character '.' looking for beginning of value", Offset:1}
Test: TestCephMgrSuite/TestCreateOSD
ceph_mgr_test.go:217:
Error Trace: ceph_mgr_test.go:217
Error: Should not be: ""
Test: TestCephMgrSuite/TestCreateOSD
Messages: No devices available to create test OSD
ceph_mgr_test.go:218:
Error Trace: ceph_mgr_test.go:218
Error: Should not be: ""
Test: TestCephMgrSuite/TestCreateOSD
Messages: No nodes available to create test OSD
See the example failure here.
Expected behavior: Integration tests should pass consistently.
About this issue
- State: closed
- Created 4 years ago
- Comments: 29 (27 by maintainers)
Commits related to this issue
- ceph: disable ceph mgr test temporarily The CI keeps failing intermittently due to https://github.com/rook/rook/issues/5877 and the fix has not merged yet so disabling until fixed. Signed-off-by: Sé... — committed to leseb/rook by leseb 4 years ago
- ci: re-enable ceph manager suite The discovery daemon must be enabled to work rook ceph-mgr module. Closes: https://github.com/rook/rook/issues/5877 Signed-off-by: Satoru Takeuchi <satoru.takeuchi@... — committed to cybozu-go/rook by satoru-takeuchi 3 years ago
Sorry, I haven’t had time to investigate this issue for several weeks. I’ve now resumed working on it.
No problem, thanks for looking into this 😃
Please see my comment here: https://github.com/rook/rook/issues/5877#issuecomment-713639078
They are not related to this issue.
@varshar16 Yes, any hints are welcome.
@satoru-takeuchi I confirmed with the Ceph dashboard team that there is still a scenario for starting Rook clusters through the Ceph mgr, so these tests are still worth getting running again. Thanks!
What errors are you hitting now when running the mgr test?
@varshar16 I’m working on a fix for this problem. Please bear with me; I’ll probably send a PR next week.
I reproduced this problem in my local environment many times and found that it happens even when the rook module appears to be loaded properly. I’ll investigate whether #5884 is the proper fix for this problem.