rook: ceph dashboard pops "500 error internal server error" after upgrade from v0.8.3 to v0.9.3

Similar to tickets 2492 and 2523.

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior: After the upgrade, the dashboard periodically pops up a "500 error internal server error" message. The issue appears to be the same as in tickets 2492 and 2523. A fresh v0.9.3 deployment does not have the issue.

Expected behavior: After the upgrade, the dashboard should work the same as on a fresh deployment.

How to reproduce it (minimal and precise):

1). Deploy a fresh v0.8.3 cluster.

2). Follow the upgrade procedure ( https://rook.io/docs/rook/v0.9/ceph-upgrade.html ) to upgrade Rook to v0.9.3 and Ceph to v13.2.4. During the Ceph upgrade, the "dashboard" section of the cluster is configured with the updated "port" and "ssl" settings, as shown below.

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  finalizers:
  - cephcluster.ceph.rook.io
  generation: 1
  name: rook-ceph083
  selfLink: /apis/ceph.rook.io/v1/namespaces/rook-ceph/cephclusters/rook-ceph083
spec:
  cephVersion:
    image: csf-docker-delivered.repo.lab.pl.alcatel-lucent.com/ceph/ceph:v13.2.4-20190109
  dashboard:
    enabled: true
    port: 8443
    ssl: true
  dataDirHostPath: /data0/rook
  mon:
    allowMultiplePerNode: true
    count: 3
  network:
    hostNetwork: false
  placement:
    all:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: is_control
              operator: In
              values:
              - "true"
      tolerations:
      - key: is_control
        operator: Exists
  rbdMirroring:
    workers: 0
  resources:
    limits: {}
    requests: {}
  storage:
    config:
      databaseSizeMB: "1024"
      journalSizeMB: "1024"
      storeType: bluestore
    devices:
    - FullPath: ""
      config: null
      name: vdf
    useAllDevices: false
    useAllNodes: true
status:
  state: Created

3). The dashboard can be accessed and viewed without problems, but the "500" error pops up periodically. The ceph-mgr pod has the corresponding logs:

2019-03-23 22:27:34.850 7fd962c00700 0 mgr[dashboard] [10.76.47.200:62820] [GET] [500] [0.006s] [admin] [1.3K] /api/summary
2019-03-23 22:27:34.850 7fd962c00700 0 mgr[dashboard] ['{"status": "500 Internal Server Error", "version": "3.2.2", "detail": "The server encountered an unexpected condition which prevented it from fulfilling the request.", "traceback": "Traceback (most recent call last):\n File \"/usr/lib/python2.7/site-packages/cherrypy/_cprequest.py\", line 656, in respond\n response.body = self.handler()\n File \"/usr/lib/python2.7/site-packages/cherrypy/lib/encoding.py\", line 188, in __call__\n self.body = self.oldhandler(*args, **kwargs)\n File \"/usr/lib/python2.7/site-packages/cherrypy/lib/jsontools.py\", line 61, in json_handler\n value = cherrypy.serving.request._json_inner_handler(*args, **kwargs)\n File \"/usr/lib/python2.7/site-packages/cherrypy/_cpdispatch.py\", line 34, in __call__\n return self.callable(*self.args, **self.kwargs)\n File \"/usr/lib64/ceph/mgr/dashboard/controllers/summary.py\", line 68, in __call__\n 'rbd_mirroring': self._rbd_mirroring(),\n File \"/usr/lib64/ceph/mgr/dashboard/controllers/summary.py\", line 37, in _rbd_mirroring\n _, data = get_daemons_and_pools()\n File \"/usr/lib64/ceph/mgr/dashboard/tools.py\", line 212, in wrapper\n return rvc.run(fn, args, kwargs)\n File \"/usr/lib64/ceph/mgr/dashboard/tools.py\", line 194, in run\n raise self.exception\nUnboundLocalError: local variable 'mirror_mode' referenced before assignment\n"}']
10.76.47.200 - - [23/Mar/2019:22:27:34] "GET /api/summary HTTP/1.1" 500 1353 "https://10.76.47.200:31731/" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36"
2019-03-23 22:27:36.785 7fd978241700 1 mgr send_beacon active
2019-03-23 22:27:38.788 7fd978241700 1 mgr send_beacon active
10.76.47.200 - - [23/Mar/2019:22:27:39] "GET /api/dashboard/health HTTP/1.1" 200 61770 "https://10.76.47.200:31731/" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36"
2019-03-23 22:27:40.491 7fd95db76700 -1 librbd::api::Mirror: mode_get: failed to retrieve mirror mode: (1) Operation not permitted
2019-03-23 22:27:40.491 7fd95db76700 -1 librbd::api::Mirror: mode_get: failed to retrieve mirror mode: (1) Operation not permitted
2019-03-23 22:27:40.491 7fd95db76700 0 mgr[dashboard] Failed to query mirror mode replicapool83
Traceback (most recent call last):
  File "/usr/lib64/ceph/mgr/dashboard/controllers/rbd_mirroring.py", line 94, in get_pools
    mirror_mode = rbdctx.mirror_mode_get(ioctx)
  File "rbd.pyx", line 1195, in rbd.RBD.mirror_mode_get (/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.4/rpm/el7/BUILD/ceph-13.2.4/build/src/pybind/rbd/pyrex/rbd.c:8543)
PermissionError: [errno 1] error getting mirror mode
2019-03-23 22:27:40.491 7fd95db76700 0 mgr[dashboard] Error while calling fn=<function get_daemons_and_pools at 0x7fd96bd65f50> ex=local variable 'mirror_mode' referenced before assignment
Traceback (most recent call last):
  File "/usr/lib64/ceph/mgr/dashboard/tools.py", line 120, in run
    val = self.fn(*self.args, **self.kwargs)
  File "/usr/lib64/ceph/mgr/dashboard/controllers/rbd_mirroring.py", line 152, in get_daemons_and_pools
    'pools': get_pools(daemons)
  File "/usr/lib64/ceph/mgr/dashboard/controllers/rbd_mirroring.py", line 99, in get_pools
    if mirror_mode == rbd.RBD_MIRROR_MODE_DISABLED:
UnboundLocalError: local variable 'mirror_mode' referenced before assignment
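
Reading the two tracebacks together, the sequence looks like a common Python pitfall: the call on line 94 of rbd_mirroring.py raises PermissionError, the error is logged ("Failed to query mirror mode"), execution continues, and line 99 then reads mirror_mode, which was never assigned. The sketch below only reproduces that pattern. It is not the Ceph source, and mirror_mode_get, get_pool_mode_buggy, get_pool_mode_guarded and MIRROR_MODE_DISABLED are hypothetical stand-ins.

```python
import logging

logger = logging.getLogger("dashboard-sketch")

MIRROR_MODE_DISABLED = 0  # hypothetical stand-in for rbd.RBD_MIRROR_MODE_DISABLED


def mirror_mode_get(pool_name):
    """Hypothetical stand-in for the librbd call; always fails, as in the log."""
    raise PermissionError(1, "error getting mirror mode")


def get_pool_mode_buggy(pool_name):
    try:
        mirror_mode = mirror_mode_get(pool_name)
    except Exception:
        # The error is logged but execution continues, so mirror_mode
        # is still unassigned when the comparison below runs.
        logger.exception("Failed to query mirror mode %s", pool_name)
    if mirror_mode == MIRROR_MODE_DISABLED:
        return "disabled"
    return "enabled"


def get_pool_mode_guarded(pool_name):
    try:
        mirror_mode = mirror_mode_get(pool_name)
    except Exception:
        logger.exception("Failed to query mirror mode %s", pool_name)
        return "unknown"  # bail out instead of falling through
    if mirror_mode == MIRROR_MODE_DISABLED:
        return "disabled"
    return "enabled"
```

Calling get_pool_mode_buggy("replicapool83") raises UnboundLocalError, while the guarded variant returns a placeholder value. If this reading is right, the underlying trigger would still be the "Operation not permitted" failure from librbd, which the sketch does not address.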

Environment:

  • OS (e.g. from /etc/os-release): centos
  • Kernel (e.g. uname -a): Linux rook-control-01 3.10.0-862.14.4.el7.x86_64 #1 SMP Fri Sep 21 09:07:21 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Cloud provider or hardware configuration: openstack
  • Rook version (use rook version inside of a Rook Pod): v0.9.3 after upgrade
  • Kubernetes version (use kubectl version): v1.12.3
  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): Tectonic
  • Storage backend status (e.g. for Ceph use ceph health in the [Rook Ceph toolbox] (https://rook.io/docs/Rook/master/toolbox.html)): HEALTH_OK

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 1
  • Comments: 15 (5 by maintainers)

Most upvoted comments

@sebastian-philipp

I'm repeatedly getting (with every page refresh) a red toaster message on the dashboard's start page showing:

500 - OK
The server encountered an unexpected condition which prevented it from fulfilling the request.
<current date and time>

Most of the menus and contents work well, though. Only the “Cluster -> CRUSH-Map” shows the same error.

@yizhishang check the available versions here and set it in the CephCluster definition. For example, to use the latest, greatest and unstable(est) version, do:

apiVersion: ceph.rook.io/v1
kind: CephCluster
spec:
  cephVersion:
    image: ceph/ceph:v14.2.3-20190904

In production you should pin the version to a specific tag (with the date) and not to a rolling tag such as v14.
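
As a rough illustration of the pinning advice, the following sketch checks whether an image reference ends in a dated tag. The tag shape (vX.Y.Z-YYYYMMDD) is an assumption inferred from the examples in this thread, and is_pinned is a hypothetical helper, not part of Rook or Ceph.

```python
import re

# Assumed tag shape, based on the examples in this thread: vX.Y.Z-YYYYMMDD
PINNED_TAG = re.compile(r"^v\d+\.\d+\.\d+-\d{8}$")


def is_pinned(image):
    """Return True if the image reference ends in a dated, fully specified tag."""
    # Split on the last colon; a registry port without a tag will not match.
    _, sep, tag = image.rpartition(":")
    return bool(sep) and PINNED_TAG.match(tag) is not None
```

For example, is_pinned("ceph/ceph:v14.2.3-20190904") is true, while is_pinned("ceph/ceph:v14") is false.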

Thanks

@nathanmartins In my case, it is fixed in 14.2.2

I deployed my ceph cluster in kubernetes using image ceph/ceph:v14.2.1-20190430. My log in ceph-mgr pod:

TypeError: 'NoneType' object is not iterable
::ffff:10.167.226.145 - - [12/Jul/2019:21:41:53] "GET /api/health/minimal HTTP/1.1" 500 1646 "https://10.254.193.33:8443/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:67.0) Gecko/20100101 Firefox/67.0"
debug 2019-07-12 21:41:53.759 7f78d9b10700  0 mgr[dashboard] [12/Jul/2019:21:41:53] HTTP Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/cherrypy/_cprequest.py", line 656, in respond
    response.body = self.handler()
  File "/usr/lib/python2.7/site-packages/cherrypy/lib/encoding.py", line 188, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/cherrypy/_cptools.py", line 221, in wrap
    return self.newhandler(innerfunc, *args, **kwargs)
  File "/usr/share/ceph/mgr/dashboard/services/exception.py", line 88, in dashboard_exception_handler
    return handler(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/cherrypy/_cpdispatch.py", line 34, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "/usr/share/ceph/mgr/dashboard/controllers/__init__.py", line 649, in inner
    ret = func(*args, **kwargs)
  File "/usr/share/ceph/mgr/dashboard/controllers/health.py", line 197, in minimal
    return self.health_minimal.all_health()
  File "/usr/share/ceph/mgr/dashboard/controllers/health.py", line 62, in all_health
    result['iscsi_daemons'] = self.iscsi_daemons()
  File "/usr/share/ceph/mgr/dashboard/controllers/health.py", line 126, in iscsi_daemons
    gateways = IscsiGatewaysConfig.get_gateways_config()['gateways']
  File "/usr/share/ceph/mgr/dashboard/services/iscsi_config.py", line 93, in get_gateways_config
    for instance in instances:
TypeError: 'NoneType' object is not iterable
debug 2019-07-12 21:41:53.760 7f78d9b10700  0 mgr[dashboard] [::ffff:10.167.226.145:38010] [GET] [500] [0.052s] [admin] [1.6K] /api/health/minimal
debug 2019-07-12 21:41:53.760 7f78d9b10700  0 mgr[dashboard] ['{"status": "500 Internal Server Error", "version": "3.2.2", "detail": "The server encountered an unexpected condition which prevented it from fulfilling the request.", "traceback": "Traceback (most recent call last):\\n  File \\"/usr/lib/python2.7/site-packages/cherrypy/_cprequest.py\\", line 656, in respond\\n    response.body = self.handler()\\n  File \\"/usr/lib/python2.7/site-packages/cherrypy/lib/encoding.py\\", line 188, in __call__\\n    self.body = self.oldhandler(*args, **kwargs)\\n  File \\"/usr/lib/python2.7/site-packages/cherrypy/_cptools.py\\", line 221, in wrap\\n    return self.newhandler(innerfunc, *args, **kwargs)\\n  File \\"/usr/share/ceph/mgr/dashboard/services/exception.py\\", line 88, in dashboard_exception_handler\\n    return handler(*args, **kwargs)\\n  File \\"/usr/lib/python2.7/site-packages/cherrypy/_cpdispatch.py\\", line 34, in __call__\\n    return self.callable(*self.args, **self.kwargs)\\n  File \\"/usr/share/ceph/mgr/dashboard/controllers/__init__.py\\", line 649, in inner\\n    ret = func(*args, **kwargs)\\n  File \\"/usr/share/ceph/mgr/dashboard/controllers/health.py\\", line 197, in minimal\\n    return self.health_minimal.all_health()\\n  File \\"/usr/share/ceph/mgr/dashboard/controllers/health.py\\", line 62, in all_health\\n    result[\'iscsi_daemons\'] = self.iscsi_daemons()\\n  File \\"/usr/share/ceph/mgr/dashboard/controllers/health.py\\", line 126, in iscsi_daemons\\n    gateways = IscsiGatewaysConfig.get_gateways_config()[\'gateways\']\\n  File \\"/usr/share/ceph/mgr/dashboard/services/iscsi_config.py\\", line 93, in get_gateways_config\\n    for instance in instances:\\nTypeError: \'NoneType\' object is not iterable\\n"}']
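
The traceback ends in "for instance in instances:" with instances being None. The sketch below reproduces that failure mode and shows one defensive alternative. load_instances, gateways_config_strict and gateways_config_tolerant are hypothetical names, and this is not a claim about how the issue was fixed upstream.

```python
def load_instances():
    """Hypothetical lookup that returns None when no iSCSI gateways are configured."""
    return None


def gateways_config_strict():
    instances = load_instances()
    gateways = {}
    for instance in instances:  # raises TypeError when instances is None
        gateways[instance] = {}
    return {"gateways": gateways}


def gateways_config_tolerant():
    instances = load_instances() or []  # treat None as "no gateways"
    gateways = {}
    for instance in instances:
        gateways[instance] = {}
    return {"gateways": gateways}
```

The strict variant fails with the same TypeError as in the log, while the tolerant variant returns an empty gateways mapping.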

Affects us still on v1.0.1