cluster-monitoring-operator: Timezone problem with kube-state-metrics

Hi, yesterday I updated my cluster with openshift-ansible. The update included a commit that changed the timezone in the api, controller and etcd pods: https://github.com/openshift/openshift-ansible/commit/8c77207289a9ae8b1c3f565aba45d662e62a9fb3

The kube-state-metrics pod is still in the UTC timezone, and I get exactly this issue: https://github.com/kubernetes/kube-state-metrics/issues/500

What can I do? Is it possible to set the timezone in the kube-state-metrics pod as well?

OpenShift version:

oc version
oc v3.11.0+b6db8e6-107
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://s-cp-lb-01.cloud.mycompany.de:443
openshift v3.11.0+d0c29df-98
kubernetes v1.11.0+d4cacc0

If you need more information let me know.

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 18 (14 by maintainers)

Most upvoted comments

In 2019, a timezone should not be an issue. What is the problem with setting the TZ environment variable? It could be done with an Ansible fact, as described above.

From my side, it should be documented that the timezone must be the same across the whole cluster. But the timezone should be managed by the user.

You might get an ideal solution, but it is not a real-world solution.

OpenShift is the enterprise version of Kubernetes. It is mainly used inside on-premise datacenters. Supporting only UTC is bogus and breaks a lot of IT processes in (German) datacenters.

The worst case would be that Red Hat officially supports UTC only.

In my opinion, for 3.x we should host-mount /etc/localtime into the kube-state-metrics container, just like we do with the api and etcd containers.

For 4.x we should use UTC everywhere. We should not continue down this path.
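
As a rough illustration of the 3.x suggestion, the sketch below adds a read-only hostPath mount of /etc/localtime to a Deployment manifest held as a Python dict. The trimmed manifest, the volume name `host-localtime` and the helper `with_host_localtime` are assumptions made for this example; it is not what cluster-monitoring-operator or openshift-ansible actually does.

```python
import copy


def with_host_localtime(deployment, container_name, volume_name="host-localtime"):
    """Return a copy of a Deployment manifest with the node's /etc/localtime mounted read-only.

    volume_name is an arbitrary, hypothetical name chosen for this sketch.
    """
    patched = copy.deepcopy(deployment)  # leave the caller's manifest untouched
    pod_spec = patched["spec"]["template"]["spec"]
    # Pod-level volume that points at the file on the node.
    pod_spec.setdefault("volumes", []).append(
        {"name": volume_name, "hostPath": {"path": "/etc/localtime"}}
    )
    # Mount that volume only into the requested container.
    for container in pod_spec["containers"]:
        if container["name"] == container_name:
            container.setdefault("volumeMounts", []).append(
                {"name": volume_name, "mountPath": "/etc/localtime", "readOnly": True}
            )
    return patched


# Trimmed, hypothetical manifest; a real Deployment has many more fields.
manifest = {
    "kind": "Deployment",
    "metadata": {"name": "kube-state-metrics", "namespace": "openshift-monitoring"},
    "spec": {"template": {"spec": {"containers": [{"name": "kube-state-metrics"}]}}},
}

patched = with_host_localtime(manifest, "kube-state-metrics")
print(patched["spec"]["template"]["spec"]["volumes"])
print(patched["spec"]["template"]["spec"]["containers"][0]["volumeMounts"])
```

Note that the test3 evidence in this thread stops cluster-monitoring-operator before changing the deployment, presumably because the operator would otherwise revert manual edits, so a change like this would likely have to be made in the operator itself.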

FYI @Reamer @brancz @sdodson

I’ve verified how the timezone influences CronJobs, as follows.

My conclusion: the CronJob start time depends on the control plane (controller) timezone, not on the kube-state-metrics timezone. The kube_cronjob_next_schedule_time value, however, depends on the kube-state-metrics timezone. See the test2 section, where the reported value is wrong.

  • test1>
    • api, controller, etcd timezone: UTC
    • kube-state-metrics: UTC
    • CronJob Schedule: 5 9 * * *
    • kube_cronjob_next_schedule_time:
    # TZ=UTC date -d @1553245500
    Fri Mar 22 09:05:00 UTC 2019

    # date -d @1553245500
    Fri Mar 22 18:05:00 JST 2019
  • test2> This pattern is buggy: the next schedule time is returned as 18:50 in UTC, even though the CronJob is scheduled at 18:50 JST. The wall-clock time is the same but the timezone is different.
    • api, controller, etcd timezone: JST (UTC+9)
    • kube-state-metrics: UTC
    • CronJob Schedule: 50 18 * * *
    • kube_cronjob_next_schedule_time:
    # TZ=UTC date -d @1553194200
    Thu Mar 21 18:50:00 UTC 2019

    # date -d @1553194200
    Fri Mar 22 03:50:00 JST 2019
  • test3>
    • api, controller, etcd timezone: JST (UTC+9)
    • kube-state-metrics: JST (UTC+9)
    • CronJob Schedule: 0 19 * * *
    • kube_cronjob_next_schedule_time:
    # TZ=UTC date -d @1553248800
    Fri Mar 22 10:00:00 UTC 2019

    # date -d @1553248800
    Fri Mar 22 19:00:00 JST 2019
  • See the following test evidence.

test1>

  # for ctr in $(oc get pod -o name -n kube-system); do echo "$ctr : $(oc rsh -n kube-system $ctr date)"; done
  pod/master-api-all.ocp311.example.com : Thu Mar 21 08:57:01 UTC 2019
  pod/master-controllers-all.ocp311.example.com : Thu Mar 21 08:57:02 UTC 2019
  pod/master-etcd-all.ocp311.example.com : Thu Mar 21 08:57:03 UTC 2019

  # oc rsh -n openshift-monitoring -c kube-state-metrics deployment/kube-state-metrics date
  Thu Mar 21 08:57:17 UTC 2019

  # oc create -f - <<EOF
  apiVersion: batch/v1beta1
  kind: CronJob
  metadata:
    name: testcronjob
  spec:
    jobTemplate:
      spec:
        template:
          spec:
            containers:
            - command:
              - date
              image: busybox
              imagePullPolicy: Always
              name: test
            restartPolicy: OnFailure
    schedule: '5 9 * * *'
    successfulJobsHistoryLimit: 3
    suspend: false
  EOF

  # date
  Thu Mar 21 18:03:40 JST 2019
  # TZ=UTC date
  Thu Mar 21 09:04:33 UTC 2019

  # oc describe cj testcronjob 
  Name:                       testcronjob
  Namespace:                  test
  Labels:                     <none>
  Annotations:                <none>
  Schedule:                   5 9 * * *
  ...
  Last Schedule Time:  Thu, 21 Mar 2019 18:05:00 +0900
  Active Jobs:         <none>
  Events:
    Type    Reason            Age   From                Message
    ----    ------            ----  ----                -------
    Normal  SuccessfulCreate  23s   cronjob-controller  Created job testcronjob-1553159100
    Normal  SawCompletedJob   3s    cronjob-controller  Saw completed job: testcronjob-1553159100


  # oc exec -n openshift-monitoring -c prometheus prometheus-k8s-0 -- curl -s \
            'http://localhost:9090/api/v1/query?query=kube_cronjob_next_schedule_time' | python -m json.tool
  {
      "data": {
          "result": [
              {
                  "metric": {
                      "__name__": "kube_cronjob_next_schedule_time",
                      "cronjob": "testcronjob",
                      "endpoint": "https-main",
                      "instance": "10.128.1.88:8443",
                      "job": "kube-state-metrics",
                      "namespace": "test",
                      "pod": "kube-state-metrics-75b9b8dcc4-wmkrm",
                      "service": "kube-state-metrics"
                  },
                  "value": [
                      1553159458.714,
                      "1553245500"
                  ]
              }
          ],
          "resultType": "vector"
      },
      "status": "success"
  }

  # TZ=UTC date -d @1553245500
  Fri Mar 22 09:05:00 UTC 2019
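
The manual steps above (query Prometheus, then run `date -d @...`) can also be done in one go. The following is a minimal sketch that assumes the instant-query response shape shown in the output above, where each `value` is a pair of evaluation timestamp and sample value as a string. The helper name `next_schedule_times` and the trimmed sample response are made up for the example, and JST is modelled as a fixed UTC+9 offset.

```python
import json
from datetime import datetime, timedelta, timezone

JST = timezone(timedelta(hours=9), "JST")  # Japan has no DST, so a fixed offset is enough


def next_schedule_times(response_text):
    """Map each cronjob label to its next-schedule epoch from an instant-query response."""
    body = json.loads(response_text)
    times = {}
    for sample in body["data"]["result"]:
        _evaluated_at, value = sample["value"]  # [evaluation timestamp, value as string]
        times[sample["metric"]["cronjob"]] = int(float(value))
    return times


# Trimmed copy of the test1 response shown above.
sample_response = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "__name__": "kube_cronjob_next_schedule_time",
                    "cronjob": "testcronjob",
                    "namespace": "test",
                },
                "value": [1553159458.714, "1553245500"],
            }
        ],
    },
})

for name, epoch in next_schedule_times(sample_response).items():
    as_utc = datetime.fromtimestamp(epoch, timezone.utc).isoformat()
    as_jst = datetime.fromtimestamp(epoch, JST).isoformat()
    print(f"{name}: {epoch} = {as_utc} = {as_jst}")
```

For the test1 value this prints the same instant as 09:05 UTC and 18:05 JST on 22 March 2019, matching the `date` output above.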

After changing the timezone from UTC to JST for the control plane only:

test2>

  # for ctr in $(oc get pod -o name -n kube-system); do echo "$ctr : $(oc rsh -n kube-system $ctr date)"; done
  pod/master-api-all.ocp311.example.com : Thu Mar 21 18:42:43 JST 2019
  pod/master-controllers-all.ocp311.example.com : Thu Mar 21 18:42:47 JST 2019
  pod/master-etcd-all.ocp311.example.com : Thu Mar 21 18:42:49 JST 2019

  # oc rsh -n openshift-monitoring -c kube-state-metrics deployment/kube-state-metrics date
  Thu Mar 21 09:43:39 UTC 2019

  # date
  Thu Mar 21 18:44:13 JST 2019
  # TZ=UTC date
  Thu Mar 21 09:44:18 UTC 2019

  # oc edit cj/testcronjob
  ...
    schedule: 50 18 * * *
  ...

  # oc describe cj/testcronjob
  Name:                       testcronjob
  Namespace:                  test
  Labels:                     <none>
  Annotations:                <none>
  Schedule:                   50 18 * * *
  ...
  Last Schedule Time:  Thu, 21 Mar 2019 18:50:00 +0900
  Active Jobs:         <none>
  Events:
    Type    Reason            Age   From                Message
    ----    ------            ----  ----                -------
    Normal  SuccessfulCreate  45m   cronjob-controller  Created job testcronjob-1553159100
    Normal  SawCompletedJob   45m   cronjob-controller  Saw completed job: testcronjob-1553159100
    Normal  SuccessfulCreate  25s   cronjob-controller  Created job testcronjob-1553161800
    Normal  SawCompletedJob   5s    cronjob-controller  Saw completed job: testcronjob-1553161800

  # oc exec -n openshift-monitoring -c prometheus prometheus-k8s-0 -- curl -s \
             'http://localhost:9090/api/v1/query?query=kube_cronjob_next_schedule_time' | python -m json.tool
  {
      "data": {
          "result": [
              {
                  "metric": {
                      "__name__": "kube_cronjob_next_schedule_time",
                      "cronjob": "testcronjob",
                      "endpoint": "https-main",
                      "instance": "10.128.1.91:8443",
                      "job": "kube-state-metrics",
                      "namespace": "test",
                      "pod": "kube-state-metrics-75b9b8dcc4-wmkrm",
                      "service": "kube-state-metrics"
                  },
                  "value": [
                      1553161897.961,
                      "1553194200"
                  ]
              }
          ],
          "resultType": "vector"
      },
      "status": "success"
  }

  # TZ=UTC date -d @1553194200
  Thu Mar 21 18:50:00 UTC 2019

  # date -d @1553194200
  Fri Mar 22 03:50:00 JST 2019

After stopping cluster-monitoring-operator and prometheus-operator, change the timezone of kube-state-metrics to JST (UTC+9):

test3>

  # oc set env deployment/kube-state-metrics TZ=Asia/Tokyo -n openshift-monitoring
  deployment.extensions/kube-state-metrics updated

  # oc rsh -n openshift-monitoring -c kube-state-metrics deployment/kube-state-metrics date
  Thu Mar 21 18:57:44 JST 2019

  # oc edit cj/testcronjob
  ...
    schedule: 0 19 * * *
  ...

  # date
  Thu Mar 21 18:59:28 JST 2019
  # TZ=UTC date
  Thu Mar 21 09:59:34 UTC 2019

  # oc describe cj/testcronjob
  Name:                       testcronjob
  Namespace:                  test
  Labels:                     <none>
  Annotations:                <none>
  Schedule:                   0 19 * * *
  ...
  Last Schedule Time:  Thu, 21 Mar 2019 19:00:00 +0900
  Active Jobs:         <none>
  Events:
    Type    Reason            Age   From                Message
    ----    ------            ----  ----                -------
    Normal  SuccessfulCreate  55m   cronjob-controller  Created job testcronjob-1553159100
    Normal  SawCompletedJob   55m   cronjob-controller  Saw completed job: testcronjob-1553159100
    Normal  SuccessfulCreate  10m   cronjob-controller  Created job testcronjob-1553161800
    Normal  SawCompletedJob   10m   cronjob-controller  Saw completed job: testcronjob-1553161800
    Normal  SuccessfulCreate  23s   cronjob-controller  Created job testcronjob-1553162400
    Normal  SawCompletedJob   3s    cronjob-controller  Saw completed job: testcronjob-1553162400

  # oc exec -n openshift-monitoring -c prometheus prometheus-k8s-0 -- curl -s \
             'http://localhost:9090/api/v1/query?query=kube_cronjob_next_schedule_time' | python -m json.tool
  {
      "data": {
          "result": [
              {
                  "metric": {
                      "__name__": "kube_cronjob_next_schedule_time",
                      "cronjob": "testcronjob",
                      "endpoint": "https-main",
                      "instance": "10.128.1.120:8443",
                      "job": "kube-state-metrics",
                      "namespace": "test",
                      "pod": "kube-state-metrics-6484658f69-576sd",
                      "service": "kube-state-metrics"
                  },
                  "value": [
                      1553162486.08,
                      "1553248800"
                  ]
              }
          ],
          "resultType": "vector"
      },
      "status": "success"
  }

  # TZ=UTC date -d @1553248800
  Fri Mar 22 10:00:00 UTC 2019

  # date -d @1553248800
  Fri Mar 22 19:00:00 JST 2019
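
The three results can be reproduced with a simplified model: start from the last schedule time, read the schedule's hour and minute as wall-clock time in some timezone, and return the next matching instant. The sketch below is only an illustration under those assumptions, not the kube-state-metrics or controller code. `next_daily_run` is a hypothetical helper that handles only schedules of the form `M H * * *`, JST is modelled as a fixed UTC+9 offset, and the "last schedule" epochs are taken from the job names in the `oc describe` output above.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
JST = timezone(timedelta(hours=9), "JST")  # Japan has no DST, so a fixed offset is enough


def next_daily_run(minute, hour, after_epoch, tz):
    """Epoch of the next hour:minute wall-clock time in tz, strictly after after_epoch."""
    after = datetime.fromtimestamp(after_epoch, tz)
    candidate = after.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= after:
        candidate += timedelta(days=1)  # today's slot has passed, use tomorrow's
    return int(candidate.timestamp())


# test1: schedule "5 9 * * *", everything in UTC
test1 = next_daily_run(5, 9, 1553159100, UTC)

# test2: schedule "50 18 * * *", controller in JST, kube-state-metrics in UTC
test2_as_utc = next_daily_run(50, 18, 1553161800, UTC)  # kube-state-metrics' view
test2_as_jst = next_daily_run(50, 18, 1553161800, JST)  # controller's view

# test3: schedule "0 19 * * *", everything in JST
test3 = next_daily_run(0, 19, 1553162400, JST)

print(test1, test2_as_utc, test2_as_jst, test3)
```

Under this model, test1 and test3 give 1553245500 and 1553248800, the values Prometheus returned. For test2, reading the schedule in UTC gives 1553194200, which is the reported metric, while reading it in JST gives 1553248200, exactly 24 hours after the job the controller actually started. The gap between the two is the mismatch described in the conclusion.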

Hi @brancz, I just updated cluster-monitoring-operator. Your change works. Thank you.

I feel like this should be brought up on a broader level (probably at least on aos-devel), but yes I agree with this.

I think it’d be much better if everything were UTC than having components respecting different timezones.

We could implement a new ansible-playbook variable, openshift_logging_kube_state_metrics_timezone: "Europe/Paris". The default value would be generated from facts on the master.

This value would be set as an env var or config parameter… on the cluster-monitoring-operator deployment.

If the cluster-monitoring-operator detects this variable / config value, it adds a TZ env var to the kube-state-metrics deployment.

kube-state-metrics can then translate the Kubernetes metrics timezone to UTC, or export the time values with a +x offset for its timezone so that Prometheus can convert them to UTC.
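
The operator-side step of this proposal might be sketched as follows. This is a hypothetical illustration only: `apply_timezone`, the trimmed manifest and the `sidecar` container are invented for the example and do not reflect the real cluster-monitoring-operator code. The only behaviour modelled is "if a timezone is configured, set TZ on the kube-state-metrics container; otherwise leave the manifest alone".

```python
import copy


def apply_timezone(deployment, tz_name, container_name="kube-state-metrics"):
    """Return a copy of the manifest with TZ set on the named container.

    If tz_name is None or empty, the copy is returned unchanged.
    """
    patched = copy.deepcopy(deployment)
    if not tz_name:
        return patched
    for container in patched["spec"]["template"]["spec"]["containers"]:
        if container["name"] != container_name:
            continue
        # Drop any existing TZ entry so the variable is never listed twice.
        env = [entry for entry in container.get("env", []) if entry["name"] != "TZ"]
        env.append({"name": "TZ", "value": tz_name})
        container["env"] = env
    return patched


# Trimmed, hypothetical manifest with one unrelated container.
manifest = {
    "kind": "Deployment",
    "metadata": {"name": "kube-state-metrics", "namespace": "openshift-monitoring"},
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {"name": "sidecar"},
                    {"name": "kube-state-metrics", "env": [{"name": "TZ", "value": "UTC"}]},
                ]
            }
        }
    },
}

configured = apply_timezone(manifest, "Europe/Paris")
unconfigured = apply_timezone(manifest, None)
print(configured["spec"]["template"]["spec"]["containers"][1]["env"])
```

The effect on the deployment is the same as the `oc set env deployment/kube-state-metrics TZ=Asia/Tokyo` command used in test3, except that the operator would own the setting instead of it being applied by hand.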

@brancz Can I get your opinion on whether we should revert the changes that were made in the linked pull requests for openshift-ansible? We made them because people complained when the api/controllers/etcd processes moved from host services to static pods without access to /etc/localtime, which meant their log timestamps were different from the rest of the system.

Hi @bysnupy,

Steps to reproduce:

  • Install OpenShift 3.11 with ansible-playbook on machines that are not in the UTC time zone. My machines are in time zone Europe/Berlin.
    • The version of ansible-playbook must include your time zone change. Your changes are in branch release-3.11.
  • The project openshift-monitoring should be created by default.
    • cluster-monitoring-operator will set up prometheus (with configuration and rules), grafana, node-exporter and kube-state-metrics.
  • Also install the logging components with the ansible playbook.
    • This will create the curator cron job.

With a time zone set in the kube-api pod, the time for the next cron job is no longer reported in UTC. The kube-state-metrics application, however, still calculates with UTC. Now I have the same issue as the one described here: https://github.com/kubernetes/kube-state-metrics/issues/500