orleans: Orleans.Hosting.Kubernetes monitoring IOException

This problem is similar to #6759, as I’ve configured the serviceId, clusterId, and environment variables in the same way. I’m working with an RBAC-enabled cluster.

When using UseKubernetesHosting, my log shows “Error monitoring Kubernetes pods” with the following exception:

System.Threading.Tasks.TaskCanceledException: The operation was canceled.
 ---> System.IO.IOException: Unable to read data from the transport connection: Operation canceled.
 ---> System.Net.Sockets.SocketException (125): Operation canceled
   --- End of inner exception stack trace ---
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.GetResult(Int16 token)
   at System.Net.Security.SslStream.<FillBufferAsync>g__InternalFillBufferAsync|215_0[TReadAdapter](TReadAdapter adap, ValueTask`1 task, Int32 min, Int32 initial)
   at System.Net.Security.SslStream.ReadAsyncInternal[TReadAdapter](TReadAdapter adapter, Memory`1 buffer)
   at System.Net.Http.HttpConnection.FillAsync()
   at System.Net.Http.HttpConnection.ChunkedEncodingReadStream.ReadAsyncCore(Memory`1 buffer, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.HttpConnection.ChunkedEncodingReadStream.ReadAsyncCore(Memory`1 buffer, CancellationToken cancellationToken)
   at k8s.WatcherDelegatingHandler.CancelableStream.ReadAsync(Byte[] buffer, Int32 offset, Int32 count, CancellationToken cancellationToken)
   at System.IO.StreamReader.ReadBufferAsync(CancellationToken cancellationToken)
   at System.IO.StreamReader.ReadLineAsyncInternal()
   at k8s.WatcherDelegatingHandler.PeekableStreamReader.PeekLineAsync()
   at k8s.WatcherDelegatingHandler.LineSeparatedHttpContent.SerializeToStreamAsync(Stream stream, TransportContext context)
   at System.Net.Http.HttpContent.LoadIntoBufferAsyncCore(Task serializeToStreamTask, MemoryStream tempBuffer)
   at System.Net.Http.HttpClient.FinishSendAsyncBuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
   at k8s.Kubernetes.ListNamespacedPodWithHttpMessagesAsync(String namespaceParameter, Nullable`1 allowWatchBookmarks, String continueParameter, String fieldSelector, String labelSelector, Nullable`1 limit, String resourceVersion, Nullable`1 timeoutSeconds, Nullable`1 watch, String pretty, Dictionary`2 customHeaders, CancellationToken cancellationToken)
   at Orleans.Hosting.Kubernetes.KubernetesClusterAgent.MonitorKubernetesPods()

This is my current deployment.yaml, in which I added the core API group to apiGroups and pods to resources:

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: clusterversions.orleans.dot.net
spec:
  group: orleans.dot.net
  version: v1
  scope: Namespaced
  names:
    plural: clusterversions
    singular: clusterversion
    kind: OrleansClusterVersion
    shortNames:
    - ocv
---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: silos.orleans.dot.net
spec:
  group: orleans.dot.net
  version: v1
  scope: Namespaced
  names:
    plural: silos
    singular: silo
    kind: OrleansSilo
    shortNames:
    - oso

---

apiVersion: v1
kind: ServiceAccount
metadata:
  name: conversational-orleans-service-account
  
---

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: conversational-orleans-service-role
subjects:
  - kind: ServiceAccount
    name: conversational-orleans-service-account
    namespace: apps
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: orleans-service-role

---

apiVersion: v1
kind: Service
metadata:
  name: conversational-orleans-silo
spec:
  selector:
    app: conversational-orleans-silo
  ports:
    - port: 80
---

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: orleans-service-role
rules:
- apiGroups: ["orleans.dot.net", ""] # "" indicates the core API group
  resources: ["silos", "clusterversions", "pods"]
  verbs: ["create", "get", "list", "watch", "update", "patch", "delete"]

I also tried running it with the exact example RBAC config from the docs, to no avail.

# The same CRD's + ServiceAccount as above is used here

apiVersion: v1
kind: Service
metadata:
  name: conversational-orleans-silo
spec:
  selector:
    app: conversational-orleans-silo
  ports:
    - port: 80
---

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pod-reader
rules:
- apiGroups: [ "" ]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pod-reader-binding
subjects:
- kind: ServiceAccount
  name: default
  apiGroup: ''
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: ''

Maybe this has to do with the fact that my silos run in the apps namespace rather than default?
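
For what it’s worth, Role and RoleBinding are namespaced objects: a Role only grants access to resources in its own namespace, and a RoleBinding can only bind subjects to a Role in its own namespace. If the silo pods and their ServiceAccount live in the apps namespace, a sketch of the same RBAC objects pinned to that namespace would look like this (names taken from the config above; the namespace fields are the assumption being tested):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: orleans-service-role
  namespace: apps   # Role must live where the pods are
rules:
- apiGroups: ["orleans.dot.net", ""] # "" indicates the core API group
  resources: ["silos", "clusterversions", "pods"]
  verbs: ["create", "get", "list", "watch", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: conversational-orleans-service-role
  namespace: apps   # RoleBinding must be in the same namespace as the Role
subjects:
  - kind: ServiceAccount
    name: conversational-orleans-service-account
    namespace: apps
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: orleans-service-role

If the manifests above were applied without an explicit namespace, they would have landed in whatever namespace the kubectl context defaulted to, which would explain the watch failing against pods in apps.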

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 27 (10 by maintainers)

Most upvoted comments

After cloning the package, referencing it from our project, and debugging, I found out that the clusterId deviated from what I expected: its value was replaced by the one in appSettings. Kubernetes didn’t reveal this, since a describe on the pod returned a clusterId of (v1:metadata.labels['orleans/clusterId']), which I assumed was filled with the hardcoded value from the deployment.yaml.

Everything seems to work now, including clustering. In hindsight the exception can be explained, but more concrete pointers would have been useful.
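
For anyone hitting the same mismatch: the (v1:metadata.labels['orleans/clusterId']) value seen in the describe output is a downward-API fieldRef, so the env var only reflects the pod label, not whatever the app reads from appSettings at runtime. A sketch of the container env section that wires the pod labels through to the silo (the env var names follow the Orleans Kubernetes hosting convention; adjust if your version differs):

env:
  - name: ORLEANS_SERVICE_ID
    valueFrom:
      fieldRef:
        fieldPath: metadata.labels['orleans/serviceId']
  - name: ORLEANS_CLUSTER_ID
    valueFrom:
      fieldRef:
        fieldPath: metadata.labels['orleans/clusterId']
  - name: POD_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  - name: POD_IP
    valueFrom:
      fieldRef:
        fieldPath: status.podIP

If application configuration (e.g. appSettings.json) also sets a clusterId and is applied after these env vars, it silently wins, which matches what the debugging above uncovered.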

Maybe you have a similar problem @worldspawn