origin: Console install failed when upgrading from 3.7 to 3.9 via ansible

While upgrading a cluster from 3.7 to 3.9 using the release-3.9 branch, the web console refuses to start and the upgrade stops. The webconsole pod is in the CrashLoopBackOff state with the error “Error: unable to load server certificate: open /var/serving-cert/tls.crt: permission denied”.

Version

oc v3.9.0+ba7faec-1
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://oomaster1.in.ac-caen.fr:8443
openshift v3.9.0+ba7faec-1
kubernetes v1.9.1+a0ce1bc657

Steps To Reproduce
  1. git clone -b release-3.9 https://github.com/openshift/openshift-ansible.git
  2. ansible-playbook openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_9/upgrade.yml

Current Result

TASK [openshift_web_console : debug] **********************************************************************************************************************************************************
ok: [oomaster1.in.ac-caen.fr] => {
    "msg": [
        "W0502 07:51:08.109897       1 start.go:93] Warning: config.clusterInfo.loggingPublicURL: Invalid value: \"\": required to view aggregated container logs in the console, web console start will continue.",
        "Error: unable to load server certificate: open /var/serving-cert/tls.crt: permission denied",
        "Usage:",
        "  origin-web-console [flags]",
        "",
        "Flags:",
        "      --alsologtostderr                                log to standard error as well as files",
        "      --audit-log-format string                        Format of saved audits. \"legacy\" indicates 1-line text format for each event. \"json\" indicates structured json format. Requires the 'AdvancedAuditing' feature gate. Known formats are legacy,json. (default \"json\")",
        "      --audit-log-maxage int                           The maximum number of days to retain old audit log files based on the timestamp encoded in their filename.",
        "      --audit-log-maxbackup int                        The maximum number of old audit log files to retain.",
        "      --audit-log-maxsize int                          The maximum size in megabytes of the audit log file before it gets rotated.",
        "      --audit-log-path string                          If set, all requests coming to the apiserver will be logged to this file.  '-' means standard out.",
        "      --audit-policy-file string                       Path to the file that defines the audit policy configuration. Requires the 'AdvancedAuditing' feature gate. With AdvancedAuditing, a profile is required to enable auditing.",
        "      --audit-webhook-batch-buffer-size int            The size of the buffer to store events before batching and sending to the webhook. Only used in batch mode. (default 10000)",
        "      --audit-webhook-batch-initial-backoff duration   The amount of time to wait before retrying the first failed requests. Only used in batch mode. (default 10s)",
        "      --audit-webhook-batch-max-size int               The maximum size of a batch sent to the webhook. Only used in batch mode. (default 400)",
        "      --audit-webhook-batch-max-wait duration          The amount of time to wait before force sending the batch that hadn't reached the max size. Only used in batch mode. (default 30s)",
        "      --audit-webhook-batch-throttle-burst int         Maximum number of requests sent at the same moment if ThrottleQPS was not utilized before. Only used in batch mode. (default 15)",
        "      --audit-webhook-batch-throttle-qps float32       Maximum average number of requests per second. Only used in batch mode. (default 10)",
        "      --audit-webhook-config-file string               Path to a kubeconfig formatted file that defines the audit webhook configuration. Requires the 'AdvancedAuditing' feature gate.",
        "      --audit-webhook-mode string                      Strategy for sending audit events. Blocking indicates sending events should block server responses. Batch causes the webhook to buffer and send events asynchronously. Known modes are batch,blocking. (default \"batch\")",
        "      --config string                                  filename containing the WebConsoleConfig",
        "      --contention-profiling                           Enable lock contention profiling, if profiling is enabled",
        "      --enable-swagger-ui                              Enables swagger ui on the apiserver at /swagger-ui",
        "      --log-flush-frequency duration                   Maximum number of seconds between log flushes (default 5s)",
        "      --log_backtrace_at traceLocation                 when logging hits line file:N, emit a stack trace (default :0)",
        "      --log_dir string                                 If non-empty, write log files in this directory",
        "      --logtostderr                                    log to standard error instead of files (default true)",
        "      --profiling                                      Enable profiling via web interface host:port/debug/pprof/ (default true)",
        "      --stderrthreshold severity                       logs at or above this threshold go to stderr (default 2)",
        "  -v, --v Level                                        log level for V logs",
        "      --vmodule moduleSpec                             comma-separated list of pattern=N settings for file-filtered logging",
        "",
        "F0502 07:51:08.126954       1 console.go:35] unable to load server certificate: open /var/serving-cert/tls.crt: permission denied"
    ]
}

TASK [openshift_web_console : Remove temp directory] ******************************************************************************************************************************************
ok: [oomaster1.in.ac-caen.fr]

TASK [openshift_web_console : Report console errors] ******************************************************************************************************************************************
fatal: [oomaster1.in.ac-caen.fr]: FAILED! => {"changed": false, "msg": "Console install failed."}
        to retry, use: --limit @/root/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_9/upgrade.retry

PLAY RECAP ************************************************************************************************************************************************************************************
localhost                  : ok=25   changed=0    unreachable=0    failed=0
oomaster1.in.ac-caen.fr    : ok=591  changed=92   unreachable=0    failed=1



Failure summary:


  1. Hosts:    oomaster1.in.ac-caen.fr
     Play:     Upgrade web console
     Task:     Report console errors
     Message:  Console install failed.

Expected Result

Console starts and upgrade continues.

Additional Information

The nodes remain at v3.7 (kubelet v1.7.6), while the master already reports v1.9.1:

[root@oomaster1 ~]# oc get nodes
NAME                      STATUS    ROLES     AGE       VERSION
oomaster1.in.ac-caen.fr   Ready     master    102d      v1.9.1+a0ce1bc657
oonode1.in.ac-caen.fr     Ready     <none>    102d      v1.7.6+a08f5eeb62
oonode2.in.ac-caen.fr     Ready     <none>    102d      v1.7.6+a08f5eeb62
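
The version mismatch in the listing above can also be checked mechanically. The following is a minimal Python sketch, assuming the default column layout of `oc get nodes` shown here; the helper names `nodes_by_minor` and `lagging_nodes` are made up for illustration and are not part of any OpenShift tooling.

```python
import re

# Output copied from the `oc get nodes` listing in this report.
SAMPLE = """\
NAME                      STATUS    ROLES     AGE       VERSION
oomaster1.in.ac-caen.fr   Ready     master    102d      v1.9.1+a0ce1bc657
oonode1.in.ac-caen.fr     Ready     <none>    102d      v1.7.6+a08f5eeb62
oonode2.in.ac-caen.fr     Ready     <none>    102d      v1.7.6+a08f5eeb62
"""


def nodes_by_minor(oc_output):
    """Map node NAME to the 'major.minor' part of its VERSION column."""
    lines = [line for line in oc_output.splitlines() if line.strip()]
    header = lines[0].split()
    name_idx = header.index("NAME")
    version_idx = header.index("VERSION")
    versions = {}
    for line in lines[1:]:
        cols = line.split()
        match = re.match(r"v(\d+)\.(\d+)", cols[version_idx])
        if match:
            versions[cols[name_idx]] = "{}.{}".format(match.group(1), match.group(2))
    return versions


def lagging_nodes(oc_output, expected_minor):
    """Return the sorted names of nodes whose version differs from expected_minor."""
    return sorted(
        name
        for name, minor in nodes_by_minor(oc_output).items()
        if minor != expected_minor
    )


if __name__ == "__main__":
    for name in lagging_nodes(SAMPLE, "1.9"):
        print("not upgraded:", name)
```

Feeding it the listing above with an expected version of "1.9" reports the two nodes that were never reached because the playbook stopped at the web console step.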

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 18 (5 by maintainers)

Most upvoted comments

PR to change the file permissions to avoid this problem:

https://github.com/openshift/openshift-ansible/pull/8558
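
As a rough illustration of how a file mode alone can produce "permission denied" for a process that is not the file's owner, here is a minimal Python sketch of the classic owner/group/other read check. The modes, owners, and UID/GID values used below are hypothetical and not taken from this cluster, and the function `can_read` is an illustrative helper, not what the PR implements. The check deliberately ignores SELinux, ACLs, and dropped capabilities, any of which could also play a role.

```python
import stat


def can_read(mode, owner_uid, owner_gid, uid, gids):
    """Decide read access from permission bits only (simplified model).

    mode      -- file mode bits, e.g. 0o600
    owner_uid -- UID that owns the file
    owner_gid -- GID that owns the file
    uid       -- UID of the process trying to read
    gids      -- collection of GIDs the process belongs to
    """
    if uid == 0:
        # Simplification: root normally bypasses mode-bit checks.
        return True
    if uid == owner_uid:
        return bool(mode & stat.S_IRUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


if __name__ == "__main__":
    # Hypothetical case: file owned by root:root, read by a non-root UID in group 0.
    for mode in (0o600, 0o640, 0o644):
        print(oct(mode), can_read(mode, 0, 0, 1000100000, {0}))
```

On a real host, the inputs could come from `os.stat(path)` (`st_mode`, `st_uid`, `st_gid`) for the certificate file in question.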