rook: ceph dashboard pops "500 error internal server error" after upgrade from v0.8.3 to v0.9.3
Similar to issues 2492 and 2523.
Is this a bug report or feature request?
- Bug Report
Deviation from expected behavior: The dashboard periodically pops a "500 Internal Server Error" after the upgrade. The issue appears to be the same as issues 2492/2523. A fresh v0.9.3 deployment does not have the issue.
Expected behavior: After the upgrade, the dashboard should work the same as a fresh deployment.
How to reproduce it (minimal and precise):
1). Fresh deployment of v0.8.3.
2). Follow the upgrade procedure (https://rook.io/docs/rook/v0.9/ceph-upgrade.html) to upgrade Rook to v0.9.3 and Ceph to v13.2.4. During the Ceph upgrade, the dashboard section of the cluster spec is configured with the updated port and ssl as below.
```yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  finalizers:
  - cephcluster.ceph.rook.io
  generation: 1
  name: rook-ceph083
  selfLink: /apis/ceph.rook.io/v1/namespaces/rook-ceph/cephclusters/rook-ceph083
spec:
  cephVersion:
    image: csf-docker-delivered.repo.lab.pl.alcatel-lucent.com/ceph/ceph:v13.2.4-20190109
  dashboard:
    enabled: true
    port: 8443
    ssl: true
  dataDirHostPath: /data0/rook
  mon:
    allowMultiplePerNode: true
    count: 3
  network:
    hostNetwork: false
  placement:
    all:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: is_control
              operator: In
              values:
              - "true"
      tolerations:
      - key: is_control
        operator: Exists
  rbdMirroring:
    workers: 0
  resources:
    limits: {}
    requests: {}
  storage:
    config:
      databaseSizeMB: "1024"
      journalSizeMB: "1024"
      storeType: bluestore
    devices:
    - FullPath: ""
      config: null
      name: vdf
    useAllDevices: false
    useAllNodes: true
status:
  state: Created
```
3). The dashboard can be accessed and viewed, but a "500" error pops up periodically. The ceph-mgr pod has the corresponding logs:
```
2019-03-23 22:27:34.850 7fd962c00700  0 mgr[dashboard] [10.76.47.200:62820] [GET] [500] [0.006s] [admin] [1.3K] /api/summary
2019-03-23 22:27:34.850 7fd962c00700  0 mgr[dashboard] ['{"status": "500 Internal Server Error", "version": "3.2.2", "detail": "The server encountered an unexpected condition which prevented it from fulfilling the request.", "traceback": "Traceback (most recent call last):\n  File \"/usr/lib/python2.7/site-packages/cherrypy/_cprequest.py\", line 656, in respond\n    response.body = self.handler()\n  File \"/usr/lib/python2.7/site-packages/cherrypy/lib/encoding.py\", line 188, in __call__\n    self.body = self.oldhandler(*args, **kwargs)\n  File \"/usr/lib/python2.7/site-packages/cherrypy/lib/jsontools.py\", line 61, in json_handler\n    value = cherrypy.serving.request._json_inner_handler(*args, **kwargs)\n  File \"/usr/lib/python2.7/site-packages/cherrypy/_cpdispatch.py\", line 34, in __call__\n    return self.callable(*self.args, **self.kwargs)\n  File \"/usr/lib64/ceph/mgr/dashboard/controllers/summary.py\", line 68, in __call__\n    'rbd_mirroring': self._rbd_mirroring(),\n  File \"/usr/lib64/ceph/mgr/dashboard/controllers/summary.py\", line 37, in _rbd_mirroring\n    _, data = get_daemons_and_pools()\n  File \"/usr/lib64/ceph/mgr/dashboard/tools.py\", line 212, in wrapper\n    return rvc.run(fn, args, kwargs)\n  File \"/usr/lib64/ceph/mgr/dashboard/tools.py\", line 194, in run\n    raise self.exception\nUnboundLocalError: local variable 'mirror_mode' referenced before assignment\n"}']
10.76.47.200 - - [23/Mar/2019:22:27:34] "GET /api/summary HTTP/1.1" 500 1353 "https://10.76.47.200:31731/" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36"
2019-03-23 22:27:36.785 7fd978241700  1 mgr send_beacon active
2019-03-23 22:27:38.788 7fd978241700  1 mgr send_beacon active
10.76.47.200 - - [23/Mar/2019:22:27:39] "GET /api/dashboard/health HTTP/1.1" 200 61770 "https://10.76.47.200:31731/" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36"
2019-03-23 22:27:40.491 7fd95db76700 -1 librbd::api::Mirror: mode_get: failed to retrieve mirror mode: (1) Operation not permitted
2019-03-23 22:27:40.491 7fd95db76700 -1 librbd::api::Mirror: mode_get: failed to retrieve mirror mode: (1) Operation not permitted
2019-03-23 22:27:40.491 7fd95db76700  0 mgr[dashboard] Failed to query mirror mode replicapool83
Traceback (most recent call last):
  File "/usr/lib64/ceph/mgr/dashboard/controllers/rbd_mirroring.py", line 94, in get_pools
    mirror_mode = rbdctx.mirror_mode_get(ioctx)
  File "rbd.pyx", line 1195, in rbd.RBD.mirror_mode_get (/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.4/rpm/el7/BUILD/ceph-13.2.4/build/src/pybind/rbd/pyrex/rbd.c:8543)
PermissionError: [errno 1] error getting mirror mode
2019-03-23 22:27:40.491 7fd95db76700  0 mgr[dashboard] Error while calling fn=<function get_daemons_and_pools at 0x7fd96bd65f50> ex=local variable 'mirror_mode' referenced before assignment
Traceback (most recent call last):
  File "/usr/lib64/ceph/mgr/dashboard/tools.py", line 120, in run
    val = self.fn(*self.args, **self.kwargs)
  File "/usr/lib64/ceph/mgr/dashboard/controllers/rbd_mirroring.py", line 152, in get_daemons_and_pools
    'pools': get_pools(daemons)
  File "/usr/lib64/ceph/mgr/dashboard/controllers/rbd_mirroring.py", line 99, in get_pools
    if mirror_mode == rbd.RBD_MIRROR_MODE_DISABLED:
UnboundLocalError: local variable 'mirror_mode' referenced before assignment
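The two tracebacks point at the same chain of events: `mirror_mode_get()` fails with `PermissionError` (the log's "Operation not permitted" suggests the upgraded mgr key lacks the needed caps, consistent with issues 2492/2523), and `get_pools()` in `rbd_mirroring.py` only logs that failure, leaving `mirror_mode` unbound when it is compared a few lines later. Below is a minimal, self-contained sketch of that failure pattern; all names are stand-ins for illustration, not the actual Ceph dashboard code:

```python
class MirrorPermissionError(Exception):
    """Stand-in for the PermissionError raised by librbd's mirror_mode_get()."""

RBD_MIRROR_MODE_DISABLED = 0  # stand-in for rbd.RBD_MIRROR_MODE_DISABLED

def mirror_mode_get(pool):
    # Simulates librbd failing when the mgr key lacks the required caps.
    raise MirrorPermissionError("error getting mirror mode")

def get_pool_mode_buggy(pool):
    # Pattern as in 13.2.4: mirror_mode is only assigned inside the try
    # block, and the except branch merely logs, so the later comparison
    # hits an unbound local when the call fails.
    try:
        mirror_mode = mirror_mode_get(pool)
    except MirrorPermissionError:
        print("Failed to query mirror mode %s" % pool)
    if mirror_mode == RBD_MIRROR_MODE_DISABLED:  # raises UnboundLocalError
        return None
    return mirror_mode

def get_pool_mode_defensive(pool):
    # Defensive variant: bind a default before the try so a permission
    # failure degrades gracefully instead of raising.
    mirror_mode = None
    try:
        mirror_mode = mirror_mode_get(pool)
    except MirrorPermissionError:
        print("Failed to query mirror mode %s" % pool)
    if mirror_mode is None or mirror_mode == RBD_MIRROR_MODE_DISABLED:
        return None
    return mirror_mode

if __name__ == "__main__":
    try:
        get_pool_mode_buggy("replicapool83")
    except UnboundLocalError as e:
        print("buggy path raised: %s" % e)  # same error type as in the mgr log
    print("defensive path returned:", get_pool_mode_defensive("replicapool83"))
```

This is why the dashboard keeps working overall while `/api/summary` intermittently returns 500: only the code path that queries mirroring status trips over the unbound variable.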
Environment:
- OS (e.g. from /etc/os-release): CentOS
- Kernel (e.g. `uname -a`): Linux rook-control-01 3.10.0-862.14.4.el7.x86_64 #1 SMP Fri Sep 21 09:07:21 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
- Cloud provider or hardware configuration: OpenStack
- Rook version (use `rook version` inside of a Rook Pod): v0.9.3 after upgrade
- Kubernetes version (use `kubectl version`): v1.12.3
- Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): Tectonic
- Storage backend status (e.g. for Ceph, use `ceph health` in the [Rook Ceph toolbox](https://rook.io/docs/Rook/master/toolbox.html)): HEALTH_OK
About this issue
- State: closed
- Created 5 years ago
- Reactions: 1
- Comments: 15 (5 by maintainers)
@sebastian-philipp
I'm repeatedly getting (with every page refresh) a red toaster message on the dashboard's start page showing:
Most of the menus and contents work well, though. Only "Cluster -> CRUSH Map" shows the same error.
Thanks
@nathanmartins In my case, it is fixed in 14.2.2
I deployed my Ceph cluster in Kubernetes using image ceph/ceph:v14.2.1-20190430. My log in the ceph-mgr pod:
This still affects us on v1.0.1.