temporal: Prevent incorrect service discovery with multiple Temporal clusters

Is your feature request related to a problem? Please describe.

I run multiple separate Temporal clusters within a single k8s cluster. Each Temporal cluster has its own separate set of frontend, history, and matching services as well as persistence. Let’s say I am running two Temporal clusters called “A” and “B” in a single k8s cluster. Note that in my setup, there are no networking restrictions on pods within the k8s cluster – any pod may connect to any other pod if the IP address is known.

I recently encountered a problem where a frontend service from Temporal cluster A appeared to be talking to a matching service from Temporal cluster B. This happened during a period when the pods in both Temporal clusters were being cycled frequently by some AZ-balancing automation. It also happens that this particular k8s cluster is configured in a way that makes pod IP address reuse more likely than usual.

Temporal clusters A and B each run three matching nodes. However, I saw this log line on cluster A’s frontend service:

{"level":"info","ts":"2021-01-27T00:34:15.414Z","msg":"Current reachable members","service":"frontend","component":"service-resolver","service":"matching","addresses":"[100.123.207.80:7235 100.123.65.65:7235 100.123.120.28:7235 100.123.60.187:7235 100.123.17.255:7235 100.123.203.172:7235]","logging-call-at":"rpServiceResolver.go:266"}

This shows cluster A’s frontend service seeing six matching nodes: three from A and three from B. Yikes.

I believe the sequence of events that led to this was something like:

  1. A matching pod in cluster A gets replaced, releasing its IP address. This IP address remains in cluster A’s cluster_metadata table.
  2. A matching pod is created in cluster B reusing this IP address.
  3. An event occurs that causes a frontend in cluster A to re-read the node membership for its matching nodes. It finds the original matching node’s IP address still in the table and can still connect to it, even though that address now belongs to a matching node in cluster B.
  4. Through this cluster B matching node, the other cluster B matching nodes are discovered.

My fix is to give each Temporal cluster its own set of membership ports for each service. This would have prevented the discovery process in cluster A from seeing the pods in cluster B, since it would be connecting on a different port.
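For concreteness, in the Temporal server YAML config the membership port for each service lives under services.<service>.rpc.membershipPort. A minimal sketch of what cluster B’s config might look like (the port numbers are illustrative; cluster A would use a different, non-overlapping set):

```yaml
# Illustrative Temporal server config fragment for cluster B.
# Cluster A would use a disjoint set of membershipPort values so that
# cluster A's gossip can never join cluster B's ring, even if a pod IP
# released by one cluster is later reused by the other.
services:
  frontend:
    rpc:
      grpcPort: 7233
      membershipPort: 16933
      bindOnIP: "0.0.0.0"
  history:
    rpc:
      grpcPort: 7234
      membershipPort: 16934
      bindOnIP: "0.0.0.0"
  matching:
    rpc:
      grpcPort: 7235
      membershipPort: 16935
      bindOnIP: "0.0.0.0"
  worker:
    rpc:
      grpcPort: 7239
      membershipPort: 16939
      bindOnIP: "0.0.0.0"
```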

Describe the solution you’d like

It may be possible to prevent this by including a check that a given node is indeed part of the correct cluster before adding it to the ring.
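To make the idea concrete, here is a rough, hypothetical sketch (not actual Temporal code; the Member type and filterByCluster helper are invented for illustration): each node would advertise the name of the Temporal cluster it belongs to, for example as gossip metadata, and the service resolver would drop any discovered member whose advertised name differs from the local cluster’s before adding it to the hash ring.

```go
// Hypothetical sketch of a cluster-membership check, not actual Temporal code.
package membership

// Member is a discovered peer as seen by the service resolver.
type Member struct {
	Address     string // host:port of the peer's membership endpoint
	ClusterName string // cluster name the peer claims to belong to
}

// filterByCluster keeps only the members that claim to belong to localCluster,
// so an IP address reused by a different Temporal cluster never ends up in the ring.
func filterByCluster(localCluster string, discovered []Member) []Member {
	same := make([]Member, 0, len(discovered))
	for _, m := range discovered {
		if m.ClusterName == localCluster {
			same = append(same, m)
		}
	}
	return same
}
```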

Describe alternatives you’ve considered

I don’t believe our k8s environment has an easy way to prevent this using networking restrictions.
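For reference, if pods could be labeled per Temporal cluster, the kind of restriction in question would look roughly like the NetworkPolicy below; the temporal-cluster label and the port numbers are purely illustrative, and a matching policy would be needed for cluster B. It also only helps if the cluster’s CNI plugin actually enforces NetworkPolicy.

```yaml
# Illustrative NetworkPolicy: only pods labeled as part of Temporal cluster "a"
# may reach cluster A's membership ports.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: temporal-a-membership
spec:
  podSelector:
    matchLabels:
      temporal-cluster: a
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              temporal-cluster: a
      ports:
        - protocol: TCP
          port: 6933
        - protocol: TCP
          port: 6934
        - protocol: TCP
          port: 6935
        - protocol: TCP
          port: 6939
```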

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 4
  • Comments: 16 (8 by maintainers)

Most upvoted comments

The documentation update is a welcome move. Thanks. We were hit hard by this a couple of days ago. We have had two Temporal instances (staging and production) running in a single Kubernetes cluster for almost a year without any issues. An automated GKE upgrade changed something and triggered this issue. For some reason, pods from the other environment were still listed in the matching/frontend logs even after we changed the ports. NetworkPolicy wouldn’t have made much sense for us because of how we interact with Temporal.