rook: Ceph mgr test intermittently fails with no nodes or devices available

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior: The CephMgrSuite intermittently fails with errors such as the following:

--- FAIL: TestCephMgrSuite (238.55s)
    --- FAIL: TestCephMgrSuite/TestCreateOSD (0.85s)
        ceph_mgr_test.go:195: 
            	Error Trace:	ceph_mgr_test.go:195
            	Error:      	Expected nil, but got: &exec.ExitError{ProcessState:(*os.ProcessState)(0xc00074d060), Stderr:[]uint8{0x45, 0x72, 0x72, 0x6f, 0x72, 0x20, 0x45, 0x4e, 0x4f, 0x45, 0x4e, 0x54, 0x3a, 0x20, 0x4e, 0x6f, 0x20, 0x6f, 0x72, 0x63, 0x68, 0x65, 0x73, 0x74, 0x72, 0x61, 0x74, 0x6f, 0x72, 0x20, 0x63, 0x6f, 0x6e, 0x66, 0x69, 0x67, 0x75, 0x72, 0x65, 0x64, 0x20, 0x28, 0x74, 0x72, 0x79, 0x20, 0x60, 0x63, 0x65, 0x70, 0x68, 0x20, 0x6f, 0x72, 0x63, 0x68, 0x20, 0x73, 0x65, 0x74, 0x20, 0x62, 0x61, 0x63, 0x6b, 0x65, 0x6e, 0x64, 0x60, 0x29, 0xa, 0x63, 0x6f, 0x6d, 0x6d, 0x61, 0x6e, 0x64, 0x20, 0x74, 0x65, 0x72, 0x6d, 0x69, 0x6e, 0x61, 0x74, 0x65, 0x64, 0x20, 0x77, 0x69, 0x74, 0x68, 0x20, 0x65, 0x78, 0x69, 0x74, 0x20, 0x63, 0x6f, 0x64, 0x65, 0x20, 0x32, 0xa}}
            	Test:       	TestCephMgrSuite/TestCreateOSD
        ceph_mgr_test.go:201: 
            	Error Trace:	ceph_mgr_test.go:201
            	Error:      	Expected nil, but got: &json.SyntaxError{msg:"invalid character '.' looking for beginning of value", Offset:1}
            	Test:       	TestCephMgrSuite/TestCreateOSD
        ceph_mgr_test.go:217: 
            	Error Trace:	ceph_mgr_test.go:217
            	Error:      	Should not be: ""
            	Test:       	TestCephMgrSuite/TestCreateOSD
            	Messages:   	No devices available to create test OSD
        ceph_mgr_test.go:218: 
            	Error Trace:	ceph_mgr_test.go:218
            	Error:      	Should not be: ""
            	Test:       	TestCephMgrSuite/TestCreateOSD
            	Messages:   	No nodes available to create test OSD

See the example failure here.

Expected behavior: Integration tests should pass consistently.

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 29 (27 by maintainers)

Most upvoted comments

Sorry, I haven’t had time to investigate this issue for several weeks. I have now resumed working on it.

No problem, thanks for looking into this 😃

@varshar16 Yes, any hints are welcome.

Please see my comment here: https://github.com/rook/rook/issues/5877#issuecomment-713639078

@varshar16 Do you have an opinion on whether the bugs listed at the following link are related to this issue?

#6270 (comment)

They are not related to this issue.

@satoru-takeuchi I confirmed with the Ceph dashboard team that there is still a scenario for starting Rook clusters based on the Ceph mgr, so these tests are still worth getting working again. Thanks!

What errors are you hitting now when running the mgr test?

@varshar16 I’m working on a fix for this problem. Please bear with me; I’ll probably send a PR next week.

I reproduced this problem many times in my local environment and found that it happens even when the rook module appears to be loaded properly. I’ll investigate whether #5884 is the proper fix for this problem.