kops: apiserver not starting with audit logging (STDOUT)
- What kops version are you running? Version 1.8.0 (git-5099bc5)
- What Kubernetes version are you running?
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.0", GitCommit:"6e937839ac04a38cac63e6a7a306c5d035fe7b0a", GitTreeState:"clean", BuildDate:"2017-09-28T22:46:41Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
- What cloud provider are you using? aws
- What commands did you run? What is the simplest way to reproduce this issue?
kops edit cluster, adding under spec:
  kubeAPIServer:
    auditLogPath: '-'
    auditPolicyFile: /srv/kubernetes/audit-policy.yaml
kops update cluster
kops rolling-update cluster \
  --name=$(CLUSTER_NAME) \
  --fail-on-validate-error="$(KOPS_VERIFY)" \
  --instance-group masters \
  --force \
  --yes
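For context, the two fields live under spec.kubeAPIServer in the cluster spec (shown in full in the manifest below); kops maps them to the --audit-policy-file and --audit-log-path=- flags on the apiserver command line:
# Excerpt of the edited cluster spec; see the full manifest below.
spec:
  kubeAPIServer:
    auditLogPath: '-'                                  # '-' sends audit events to STDOUT
    auditPolicyFile: /srv/kubernetes/audit-policy.yaml # delivered to the master via the fileAssets entry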
- What happened after the commands executed? The kube-apiserver pod does not start; the following appears in syslog on the master:
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: I0105 17:22:56.225033 2882 kube_boot.go:171] kubelet systemd service already running
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: I0105 17:22:56.225044 2882 channels.go:31] checking channel: "s3://xxx/k8s/k8s-cluster01.dev.xxx.yy/addons/bootstrap-channel.yaml"
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: I0105 17:22:56.225143 2882 channels.go:45] Running command: channels apply channel s3://xxx/k8s/k8s-cluster01.dev.xxx.yy/addons/bootstrap-channel.yaml --v=4 --yes
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: I0105 17:22:56.314835 2882 channels.go:48] error running channels apply channel s3://xxx/k8s/k8s-cluster01.dev.xxxx.yy/addons/bootstrap-channel.yaml --v=4 --yes:
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: I0105 17:22:56.314901 2882 channels.go:49] Error: error querying kubernetes version: Get https://127.0.0.1/version: dial tcp 127.0.0.1:443: getsockopt: connection refused
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: Usage:
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: channels apply channel [flags]
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: Flags:
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: -f, --filename stringSlice Apply from a local file
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: --yes Apply update
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: Global Flags:
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: --alsologtostderr log to standard error as well as files
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: --config string config file (default is $HOME/.channels.yaml)
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: --log_backtrace_at traceLocation when logging hits line file:N, emit a stack trace (default :0)
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: --log_dir string If non-empty, write log files in this directory
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: --logtostderr log to standard error instead of files (default false)
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: --stderrthreshold severity logs at or above this threshold go to stderr (default 2)
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: -v, --v Level log level for V logs (default 0)
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: --vmodule moduleSpec comma-separated list of pattern=N settings for file-filtered logging
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: error querying kubernetes version: Get https://127.0.0.1/version: dial tcp 127.0.0.1:443: getsockopt: connection refused
Jan 5 17:22:56 ip-172-24-15-90 docker[2826]: I0105 17:22:56.314939 2882 channels.go:34] apply channel output was: Error: error querying kubernetes version: Get https://127.0.0.1/version: dial tcp 127.0.0.1:443: getsockopt: connection refused
- What did you expect to happen? A running apiserver and audit logs on STDOUT.
- Please provide your cluster manifest.
apiVersion: kops/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: 2017-03-15T17:28:12Z
  name: k8s-cluster01.dev.xxx.de
spec:
  additionalPolicies:
    master: |
      [
        {
          "Action": [
            "autoscaling:Describe*",
            "cloudwatch:*",
            "logs:*",
            "sns:*"
          ],
          "Effect": "Allow",
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": "route53:ListHostedZonesByName",
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": "elasticloadbalancing:DescribeLoadBalancers",
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": "route53:ChangeResourceRecordSets",
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": [
            "es:ESHttpGet",
            "es:ESHttpHead",
            "es:ESHttpPost",
            "es:ESHttpPut"
          ],
          "Resource": [
            "arn:aws:es:eu-central-1:xxx:domain/xxxx-ek/*"
          ]
        }
      ]
    node: |
      [
        {
          "Action": [
            "autoscaling:Describe*",
            "cloudwatch:*",
            "logs:*",
            "sns:*"
          ],
          "Effect": "Allow",
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": "route53:ListHostedZonesByName",
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": "elasticloadbalancing:DescribeLoadBalancers",
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": "route53:ChangeResourceRecordSets",
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": "elasticfilesystem:Describe*",
          "Resource": "*"
        }
      ]
  api:
    loadBalancer:
      type: Public
  authorization:
    alwaysAllow: {}
  channel: stable
  cloudLabels:
    service: kubernetes
    team: k8s
    vertical: k8s
  cloudProvider: aws
  configBase: s3://xxxnger-dev-production-configuration/k8s/k8s-cluster01.dev.xxx.de
  dnsZone: dev.xxx.de
  etcdClusters:
  - etcdMembers:
    - instanceGroup: master-eu-central-1b-1
      name: b-1
    - instanceGroup: master-eu-central-1a-1
      name: a-1
    - instanceGroup: master-eu-central-1a-2
      name: a-2
    name: main
  - etcdMembers:
    - instanceGroup: master-eu-central-1b-1
      name: b-1
    - instanceGroup: master-eu-central-1a-1
      name: a-1
    - instanceGroup: master-eu-central-1a-2
      name: a-2
    name: events
  fileAssets:
  - content: |
      # Log all requests at the Metadata level.
      apiVersion: audit.k8s.io/v1beta1 # This is required.
      kind: Policy
      # Don't generate audit events for all requests in RequestReceived stage.
      omitStages:
      - "RequestReceived"
      rules:
      # Log pod changes at RequestResponse level
      - level: RequestResponse
        resources:
        - group: ""
          # Resource "pods" doesn't match requests to any subresource of pods,
          # which is consistent with the RBAC policy.
          resources: ["pods"]
      # Log "pods/log", "pods/status" at Metadata level
      - level: Metadata
        resources:
        - group: ""
          resources: ["pods/log", "pods/status"]
      # Don't log requests to a configmap called "controller-leader"
      - level: None
        resources:
        - group: ""
          resources: ["configmaps"]
          resourceNames: ["controller-leader"]
      # Don't log watch requests by the "system:kube-proxy" on endpoints or services
      - level: None
        users: ["system:kube-proxy"]
        verbs: ["watch"]
        resources:
        - group: "" # core API group
          resources: ["endpoints", "services"]
      # Don't log authenticated requests to certain non-resource URL paths.
      - level: None
        userGroups: ["system:authenticated"]
        nonResourceURLs:
        - "/api*" # Wildcard matching.
        - "/version"
      # Log the request body of configmap changes in kube-system.
      - level: Request
        resources:
        - group: "" # core API group
          resources: ["configmaps"]
        # This rule only applies to resources in the "kube-system" namespace.
        # The empty string "" can be used to select non-namespaced resources.
        namespaces: ["kube-system"]
      # Log configmap and secret changes in all other namespaces at the Metadata level.
      - level: Metadata
        resources:
        - group: "" # core API group
          resources: ["secrets", "configmaps"]
      # Log all other resources in core and extensions at the Request level.
      - level: Request
        resources:
        - group: "" # core API group
        - group: "extensions" # Version of group should NOT be included.
      # A catch-all rule to log all other requests at the Metadata level.
      - level: Metadata
        # Long-running requests like watches that fall under this rule will not
        # generate an audit event in RequestReceived.
        omitStages:
        - "RequestReceived"
    name: audit-policy-file
    path: /srv/kubernetes/audit-policy.yaml
    roles:
    - Master
  iam:
    legacy: true
  kubeAPIServer:
    authorizationMode: RBAC
    auditLogPath: '-'
    auditPolicyFile: /srv/kubernetes/audit-policy.yaml
  kubernetesApiAccess:
  - 0.0.0.0/0
  kubernetesVersion: 1.8.0
  masterInternalName: api.internal.k8s-cluster01.dev.xxx.yy
  networkCIDR: 172.24.0.0/16
  networkID: vpc-yyyyyyy
  networking:
    flannel:
      backend: udp
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - 0.0.0.0/0
  subnets:
  - cidr: 172.24.0.0/19
    id: subnet-yyyyyy
    name: eu-central-1a
    type: Private
    zone: eu-central-1a
  - cidr: 172.24.32.0/19
    id: subnet-yyyyyyy
    name: eu-central-1b
    type: Private
    zone: eu-central-1b
  - cidr: 172.24.128.0/19
    id: subnet-yyyyyy
    name: utility-eu-central-1a
    type: Utility
    zone: eu-central-1a
  - cidr: 172.24.160.0/19
    id: subnet-yyyyyy
    name: utility-eu-central-1b
    type: Utility
    zone: eu-central-1b
  topology:
    dns:
      type: Public
    masters: private
    nodes: private
- Anything else do we need to know? I have a dirty workaround: I deleted
  auditLogPath: '-'
  auditPolicyFile: /srv/kubernetes/audit-policy.yaml
from the cluster config and activated audit logging by editing the kube-apiserver manifest directly:
- adding the arguments --audit-policy-file=/srv/kubernetes/audit-policy.yaml --audit-log-path=- to the /usr/local/bin/kube-apiserver call
- removing the redirection of STDERR to STDOUT (2>&1)
The working manifest is then:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    dns.alpha.kubernetes.io/internal: api.internal.k8s-cluster01.dev.xxxx.yy
    scheduler.alpha.kubernetes.io/critical-pod: ""
  creationTimestamp: null
  labels:
    k8s-app: kube-apiserver
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - /bin/sh
    - -c
    - /usr/local/bin/kube-apiserver --address=127.0.0.1 --admission-control=Initializers,NamespaceLifecycle,LimitRanger,ServiceAccount,PersistentVolumeLabel,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,Priority,ResourceQuota
      --allow-privileged=true --anonymous-auth=false --apiserver-count=3 --authorization-mode=RBAC
      --basic-auth-file=/srv/kubernetes/basic_auth.csv --client-ca-file=/srv/kubernetes/ca.crt
      --audit-policy-file=/srv/kubernetes/audit-policy.yaml --audit-log-path=-
      --cloud-provider=aws --etcd-servers-overrides=/events#http://127.0.0.1:4002
      --etcd-servers=http://127.0.0.1:4001 --insecure-port=8080 --kubelet-preferred-address-types=InternalIP,Hostname,ExternalIP
      --proxy-client-cert-file=/srv/kubernetes/apiserver-aggregator.cert --proxy-client-key-file=/srv/kubernetes/apiserver-aggregator.key
      --requestheader-allowed-names=aggregator --requestheader-client-ca-file=/srv/kubernetes/apiserver-aggregator-ca.cert
      --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group
      --requestheader-username-headers=X-Remote-User --secure-port=443 --service-cluster-ip-range=100.64.0.0/13
      --storage-backend=etcd2 --tls-cert-file=/srv/kubernetes/server.cert --tls-private-key-file=/srv/kubernetes/server.key
      --token-auth-file=/srv/kubernetes/known_tokens.csv --v=2 | /bin/tee -a
      /var/log/kube-apiserver.log
    image: gcr.io/google_containers/kube-apiserver:v1.8.0
    livenessProbe:
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 8080
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: kube-apiserver
    ports:
    - containerPort: 443
      hostPort: 443
      name: https
    - containerPort: 8080
      hostPort: 8080
      name: local
    resources:
      requests:
        cpu: 150m
    volumeMounts:
    - mountPath: /etc/ssl
      name: etcssl
      readOnly: true
    - mountPath: /etc/pki/tls
      name: etcpkitls
      readOnly: true
    - mountPath: /etc/pki/ca-trust
      name: etcpkica-trust
      readOnly: true
    - mountPath: /usr/share/ssl
      name: usrsharessl
      readOnly: true
    - mountPath: /usr/ssl
      name: usrssl
      readOnly: true
    - mountPath: /usr/lib/ssl
      name: usrlibssl
      readOnly: true
    - mountPath: /usr/local/openssl
      name: usrlocalopenssl
      readOnly: true
    - mountPath: /var/ssl
      name: varssl
      readOnly: true
    - mountPath: /etc/openssl
      name: etcopenssl
      readOnly: true
    - mountPath: /var/log/kube-apiserver.log
      name: logfile
    - mountPath: /srv/kubernetes
      name: srvkube
      readOnly: true
    - mountPath: /srv/sshproxy
      name: srvsshproxy
      readOnly: true
  hostNetwork: true
  tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
  volumes:
  - hostPath:
      path: /etc/ssl
    name: etcssl
  - hostPath:
      path: /etc/pki/tls
    name: etcpkitls
  - hostPath:
      path: /etc/pki/ca-trust
    name: etcpkica-trust
  - hostPath:
      path: /usr/share/ssl
    name: usrsharessl
  - hostPath:
      path: /usr/ssl
    name: usrssl
  - hostPath:
      path: /usr/lib/ssl
    name: usrlibssl
  - hostPath:
      path: /usr/local/openssl
    name: usrlocalopenssl
  - hostPath:
      path: /var/ssl
    name: varssl
  - hostPath:
      path: /etc/openssl
    name: etcopenssl
  - hostPath:
      path: /var/log/kube-apiserver.log
    name: logfile
  - hostPath:
      path: /srv/kubernetes
    name: srvkube
  - hostPath:
      path: /srv/sshproxy
    name: srvsshproxy
status: {}
Now I have audit logs on STDOUT of the pod and fluentd is able to ship them.
PS:
I see the same error when logging to a file like /tmp/foo with 2>&1 in place. If I delete 2>&1, I see audit logs in /tmp/foo inside the container.
spec:
  containers:
  - command:
    - /bin/sh
    - -c
    - /usr/local/bin/kube-apiserver --address=127.0.0.1 --admission-control=Initializers,NamespaceLifecycle,LimitRanger,ServiceAccount,PersistentVolumeLabel,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,Priority,ResourceQuota
      --allow-privileged=true --anonymous-auth=false --apiserver-count=3 --authorization-mode=RBAC
      --basic-auth-file=/srv/kubernetes/basic_auth.csv --client-ca-file=/srv/kubernetes/ca.crt
      --audit-policy-file=/srv/kubernetes/audit-policy.yaml --audit-log-path=/tmp/foo
      --audit-log-maxage=10 --audit-log-maxbackup=1 --audit-log-maxsize=100
      --cloud-provider=aws --etcd-servers-overrides=/events#http://127.0.0.1:4002
      --etcd-servers=http://127.0.0.1:4001 --insecure-port=8080 --kubelet-preferred-address-types=InternalIP,Hostname,ExternalIP
      --proxy-client-cert-file=/srv/kubernetes/apiserver-aggregator.cert --proxy-client-key-file=/srv/kubernetes/apiserver-aggregator.key
      --requestheader-allowed-names=aggregator --requestheader-client-ca-file=/srv/kubernetes/apiserver-aggregator-ca.cert
      --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group
      --requestheader-username-headers=X-Remote-User --secure-port=443 --service-cluster-ip-range=100.64.0.0/13
      --storage-backend=etcd2 --tls-cert-file=/srv/kubernetes/server.cert --tls-private-key-file=/srv/kubernetes/server.key
      --token-auth-file=/srv/kubernetes/known_tokens.csv --v=2 | /bin/tee -a
      /var/log/kube-apiserver.log
    image: gcr.io/google_containers/kube-apiserver:v1.8.0
...
About this issue
- State: closed
- Created 6 years ago
- Comments: 24 (9 by maintainers)
Commits related to this issue
- Don't mount volume for auditLog when STDOUT is configured as path Fixes #4202 — committed to kampka/kops by kampka 6 years ago
I’ve opened a pull request with a potential fix for this issue, as far as I can reproduce it. Additional testing is appreciated though.
When testing this issue, please keep in mind that this happens in nodeup, which can be tricky to test. I advise taking the long way for testing.
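For readers skimming this later, here is a rough sketch of what the referenced fix amounts to, based only on the commit message above; the volume name and file path are illustrative assumptions, not the actual nodeup template:
# Illustrative sketch only. With a file-based auditLogPath, the generated static
# pod needs a hostPath mount for the log file, along these lines:
volumeMounts:
- mountPath: /var/log/kube-apiserver-audit.log
  name: auditlog              # name and path chosen here for illustration
volumes:
- hostPath:
    path: /var/log/kube-apiserver-audit.log
  name: auditlog
# With auditLogPath: '-' the audit events go to STDOUT, so there is no file to
# mount; the fix skips this volume/volumeMount pair and only passes --audit-log-path=-.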