cilium: Intermittent drop reports from hubble for allowed traffic.
Hello, we’ve been noticing some unusual behavior with cilium/hubble and we’re curious if anyone else has seen anything similar. Essentially, we’ve written network policies that allow traffic to a particular CIDR block and port, but we’re seeing intermittent drop logs from hubble with drop reason 133 (policy denied) for traffic that we know should be allowed (and usually is allowed). However, we haven’t noticed any service interruption due to those drops, so we’re unsure whether these are real intermittent drops by cilium or whether hubble might be falsely reporting drops. Has anyone seen anything similar, and if so, was the cause false reporting by hubble or real intermittent drops by cilium? We’re running cilium v1.8.6 on kubernetes 1.18.
Here’s an example of one of the drops:
"attributes": {
"source": {
"namespace": "studio-live",
"identity": 42150,
"ID": 37,
"labels": [
"k8s:app.kubernetes.io/managed-by=spinnaker",
"k8s:app.kubernetes.io/name=baas",
"k8s:app=baas-config",
"k8s:io.cilium.k8s.namespace.labels.field.cattle.io/projectId=p-v9rth",
"k8s:io.cilium.k8s.policy.cluster=default",
"k8s:io.cilium.k8s.policy.serviceaccount=default",
"k8s:io.kubernetes.pod.namespace=studio-live",
"k8s:studio-context=live"
],
"pod_name": "baas-config-d84bc885f-8s564"
},
"IP": {
"ipVersion": "IPv4",
"destination": "10.8.2.59",
"source": "10.8.1.121"
},
"l4": {
"TCP": {
"destination_port": 26257,
"source_port": 48794,
"flags": {
"PSH": true,
"ACK": true
}
}
},
"destination": {
"identity": 2,
"labels": [
"reserved:world"
]
},
"node_name": "ip-10-8-1-11.us-east-2.compute.internal",
"traffic_direction": "EGRESS",
"ethernet": {
"destination": "2e:cf:00:b3:44:3c",
"source": "86:4f:19:2d:9b:ef"
},
"Type": "L3_L4",
"event_type": {
"sub_type": 133,
"type": 1
},
"verdict": "DROPPED",
"Summary": "TCP Flags: ACK, PSH",
"time": "2021-03-19T18:08:08.011207109Z",
"drop_reason": 133
}
Here’s the policy:
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  annotations:
    artifact.spinnaker.io/location: studio-live
    artifact.spinnaker.io/name: network-config
    artifact.spinnaker.io/type: kubernetes/CiliumNetworkPolicy.cilium.io
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"cilium.io/v2","kind":"CiliumNetworkPolicy", [...]}
    moniker.spinnaker.io/application: baas
    moniker.spinnaker.io/cluster: CiliumNetworkPolicy.cilium.io network-config
  creationTimestamp: "2021-01-24T15:04:42Z"
  generation: 4
  labels:
    app.kubernetes.io/managed-by: spinnaker
    app.kubernetes.io/name: baas
  managedFields:
  - apiVersion: cilium.io/v2
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:artifact.spinnaker.io/location: {}
          f:artifact.spinnaker.io/name: {}
          f:artifact.spinnaker.io/type: {}
          f:kubectl.kubernetes.io/last-applied-configuration: {}
          f:moniker.spinnaker.io/application: {}
          f:moniker.spinnaker.io/cluster: {}
        f:labels:
          .: {}
          f:app.kubernetes.io/managed-by: {}
          f:app.kubernetes.io/name: {}
      f:spec:
        .: {}
        f:egress: {}
        f:endpointSelector:
          .: {}
          f:matchLabels:
            .: {}
            f:app: {}
    manager: kubectl
    operation: Update
    time: "2021-03-06T23:00:22Z"
  name: network-config
  namespace: studio-live
  resourceVersion: "64708762"
  selfLink: /apis/cilium.io/v2/namespaces/studio-live/ciliumnetworkpolicies/network-config
  uid: af4b7347-921f-436d-ab8f-102933da6bf7
spec:
  egress:
  - toCIDRSet:
    - cidr: 10.0.0.0/8
  - toCIDR:
    - 169.254.169.254/32
    - 184.32.0.0/16
  - toFQDNs:
    - matchName: bondtech-docker-local.jfrog.io
    - matchPattern: baas-*.*.*.rds.amazonaws.com
    [...]
  - toPorts:
    - ports:
      - port: "53"
        protocol: ANY
      rules:
        dns:
        - matchPattern: '*'
    - ports:
      - port: "8125"
        protocol: UDP
    - ports:
      - port: "6000"
        protocol: TCP
      - port: "7000"
        protocol: TCP
  - toEndpoints:
    - matchLabels:
        app: baas-auth
    - matchLabels:
        k8s:app: cockroachdb
        k8s:io.kubernetes.pod.namespace: crdb-dev3
  [...]
  endpointSelector:
    matchLabels:
      app: baas-config
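A sketch of agent-side checks that can help distinguish real drops from reporting artifacts, assuming shell access to the cilium pod on the node that reported the drop (ip-10-8-1-11.us-east-2.compute.internal); endpoint ID 37 is taken from the drop event above:

# Endpoint and its policy enforcement status as the agent sees it
cilium endpoint list
cilium endpoint get 37
# Policy actually loaded into the agent
cilium policy get
# Watch drops straight from the datapath, independent of hubble
cilium monitor --type drop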
How to reproduce the issue
- Intermittent. This seems to happen randomly, so we do not have a concrete way to reproduce it at will. We asked in the Slack, and after initial triage with @pchaigno and @aditighag the recommendation was to gather a sysdump and open a bug report for further review. Sysdumps: cilium-sysdump-prod2-20210324-012008.zip, cilium-sysdump-prod3-20210324-012151.zip
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Reactions: 1
- Comments: 29 (13 by maintainers)
@joaoubaldo Could you please open a new issue with the same information as above, plus a full dump of those Hubble drops (-o jsonpb)? A Cilium sysdump of the cluster would also help. If the drops are intermittent, they might not cause noticeable connectivity issues (TCP retransmissions can mask them). That seems more likely than Hubble getting packet drop notifications out of nowhere 🙂