rook: Dashboard 500. Rook v1.3.0, Ceph 15.2.0
Kubernetes 1.16.8, CentOS 7, kernel 5.5.13-1.el7.elrepo.x86_64
After upgrading to Rook v1.3.0 and Ceph 15.2.0, the dashboard is partially unavailable and returns HTTP 500 errors. Manager logs:
debug 2020-04-09T08:36:13.902+0000 7fa4d6d00700 0 [rook ERROR orchestrator._interface] _Promise failed
Traceback (most recent call last):
File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 271, in _finalize
next_result = self._on_complete(self._value)
File "/usr/share/ceph/mgr/rook/module.py", line 52, in <lambda>
return RookCompletion(on_complete=lambda _: f(*args, **kwargs))
File "/usr/share/ceph/mgr/rook/module.py", line 316, in describe_service
placement=PlacementSpec(count=active),
File "/lib/python3.6/site-packages/ceph/deployment/service_spec.py", line 338, in __init__
assert service_type in ServiceSpec.KNOWN_SERVICE_TYPES, service_type
AssertionError: mds.core-rook
debug 2020-04-09T08:36:13.903+0000 7fa4d6d00700 0 [dashboard ERROR request] [10.32.9.136:57600] [GET] [500] [0.414s] [admin] [513.0B] /api/health/minimal
debug 2020-04-09T08:36:13.903+0000 7fa4d6d00700 0 [dashboard ERROR request] [b'{"status": "500 Internal Server Error", "detail": "The server encountered an unexpected condition which prevented it from fulfilling the request.", "request_id": "15acf2e5-68b1-43dd-a460-4a275d164bdf"} ']
10.32.7.185 - - [09/Apr/2020:08:36:14] "GET / HTTP/1.1" 200 176 "" "kube-probe/1.16"
debug 2020-04-09T08:36:14.822+0000 7fa4f242a700 0 log_channel(cluster) log [DBG] : pgmap v16509: 97 pgs: 97 active+clean; 76 GiB data, 229 GiB used, 6.3 TiB / 6.5 TiB avail; 1.2 KiB/s rd, 1.6 MiB/s wr, 155 op/s
10.32.11.5 - - [09/Apr/2020:08:36:15] "GET /metrics HTTP/1.1" 200 230652 "" "Prometheus/2.16.0"
debug 2020-04-09T08:36:16.823+0000 7fa4f242a700 0 log_channel(cluster) log [DBG] : pgmap v16510: 97 pgs: 97 active+clean; 76 GiB data, 229 GiB used, 6.3 TiB / 6.5 TiB avail; 852 B/s rd, 1.0 MiB/s wr, 99 op/s
debug 2020-04-09T08:36:18.823+0000 7fa4f242a700 0 log_channel(cluster) log [DBG] : pgmap v16511: 97 pgs: 97 active+clean; 76 GiB data, 229 GiB used, 6.3 TiB / 6.5 TiB avail; 853 B/s rd, 1.0 MiB/s wr, 99 op/s
debug 2020-04-09T08:36:19.170+0000 7fa4d7d02700 0 [rook ERROR orchestrator._interface] _Promise failed
Traceback (most recent call last):
File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 271, in _finalize
next_result = self._on_complete(self._value)
File "/usr/share/ceph/mgr/rook/module.py", line 52, in <lambda>
return RookCompletion(on_complete=lambda _: f(*args, **kwargs))
File "/usr/share/ceph/mgr/rook/module.py", line 316, in describe_service
placement=PlacementSpec(count=active),
File "/lib/python3.6/site-packages/ceph/deployment/service_spec.py", line 338, in __init__
assert service_type in ServiceSpec.KNOWN_SERVICE_TYPES, service_type
AssertionError: mds.core-rook
debug 2020-04-09T08:36:19.171+0000 7fa4d7d02700 0 [dashboard ERROR request] [10.32.9.136:57850] [GET] [500] [0.679s] [admin] [513.0B] /api/health/minimal
debug 2020-04-09T08:36:19.171+0000 7fa4d7d02700 0 [dashboard ERROR request] [b'{"status": "500 Internal Server Error", "detail": "The server encountered an unexpected condition which prevented it from fulfilling the request.", "request_id": "c56bd8b8-786e-405c-91c4-d82a04cae23b"}
About this issue
- State: closed
- Created 4 years ago
- Reactions: 10
- Comments: 30 (11 by maintainers)
Updated Ceph to 15.2.2 but still getting this issue:
FWIW, this is the PR that fixes it: https://github.com/ceph/ceph/pull/34061
The Dashboard's minimal health API gets iSCSI services from the orchestrator, which eventually invokes the orchestrator's describe_service() function. In 15.2.0, describe_service() asserts because it lists MDS services with an mds.<namespace> service type. A PR that was merged a few days ago should fix this issue.

Saw the same issue, and the steps in this comment got rid of the errors. If you need iSCSI in the dashboard, however, the PR is probably going to be a better option.
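The failure mode can be sketched in a few lines of Python. This is not the real Ceph code, just an illustration of the assertion from the traceback above; the KNOWN_SERVICE_TYPES set here is an abridged, assumed subset (the real list lives in ceph/deployment/service_spec.py).

```python
# Abridged, assumed subset of the service types Ceph considers valid.
KNOWN_SERVICE_TYPES = {"mon", "mgr", "osd", "mds", "rgw", "nfs", "iscsi"}

def validate(service_type):
    # The Rook mgr module reported "mds.<namespace>" (e.g. "mds.core-rook")
    # instead of the bare type "mds", so this assertion fired.
    assert service_type in KNOWN_SERVICE_TYPES, service_type

validate("mds")  # bare type passes
try:
    validate("mds.core-rook")  # namespaced type trips the assertion
except AssertionError as exc:
    print(f"AssertionError: {exc}")  # AssertionError: mds.core-rook
```

The fix linked above makes the orchestrator report a service type that passes this check instead of the namespaced name.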
Ceph v15.2.4 / v15.2.4-20200630 Docker images are out, could you please check if they fix the MGR dashboard issue for you? Thanks!
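For anyone on Rook who wants to try the new image: one way is to point the CephCluster CR at it and let Rook roll the daemons. The cluster name and namespace below ("rook-ceph") are assumptions; adjust to your deployment. This is a sketch, not the full upgrade procedure from the Rook docs.

```shell
# Assumes CephCluster "rook-ceph" in namespace "rook-ceph".
kubectl -n rook-ceph patch cephcluster rook-ceph --type merge \
  -p '{"spec":{"cephVersion":{"image":"ceph/ceph:v15.2.4"}}}'

# Watch the mgr pod restart with the new image, then recheck the dashboard.
kubectl -n rook-ceph get pods -l app=rook-ceph-mgr -w
```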
If you enable dashboard debug mode you'll get more information about the exact failure (e.g. a Python traceback):
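A sketch of how that looks with the Ceph CLI (run from a node or toolbox pod with access to the cluster; remember to turn it back off afterwards, since debug mode exposes internal details in error responses):

```shell
# Turn on dashboard debug mode, reproduce the 500, then read the
# traceback from the active mgr's log or the HTTP error response.
ceph dashboard debug enable

# ... reproduce the failing request, e.g. reload /api/health/minimal ...

ceph dashboard debug disable
```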
Same for me!