cortex: the ring never removes the old ingester even after the ingester pod is evicted
I have a similar problem to #1502: when my ingester pod was evicted, a new ingester pod was created. Now the ring has two ingesters, but only one (the new one) is healthy. The old one is never removed from the ring, even after I delete the evicted pod manually. The ring information is as follows:
| Ingester | State | Address | Last Heartbeat | Tokens | Ownership |
|---|---|---|---|---|---|
| ingester-7fc8759d7f-nzb6g | ACTIVE | 172.16.0.62:9095 | 2019-07-19 03:33:32 +0000 UTC | 128 | 45.739077787319914% |
| ingester-7fc8759d7f-wmnms | Unhealthy | 172.16.0.93:9095 | 2019-07-18 14:46:18 +0000 UTC | 128 | 54.260922212680086% |
The ingester's status is always unready, and the distributor logs this error:
level=warn ts=2019-07-19T03:41:45.413839063Z caller=server.go:1995 traceID=daf4028f530860f msg="POST /api/prom/push (500) 727.847µs Response: \"at least 1 live ingesters required, could only find 0\\n\" ws: false; Connection: close; Content-Encoding: snappy; Content-Length: 3742; Content-Type: application/x-protobuf; User-Agent: Prometheus/2.11.0; X-Forwarded-For: 172.16.0.17; X-Forwarded-Host: perf.monitorefk.huawei.com; X-Forwarded-Port: 443; X-Forwarded-Proto: https; X-Original-Uri: /api/prom/push; X-Prometheus-Remote-Write-Version: 0.1.0; X-Real-Ip: 172.16.0.17; X-Request-Id: 62a470dc6de7a83c8974e3411fa63e40; X-Scheme: https; X-Scope-Orgid: custom; "
I wonder if there is any way to handle this situation automatically, maybe by checking the replication factor and removing excess unhealthy ingesters from the ring?
About this issue
- State: open
- Created 5 years ago
- Reactions: 8
- Comments: 45 (18 by maintainers)
The `ingester.autoforget_unhealthy` configuration item has existed in Loki since https://github.com/grafana/loki/pull/3919 was merged. Would it be possible to add the same functionality to Cortex? Or is there another way to get the same behaviour as Loki's `ingester.autoforget_unhealthy`?

I would be happy to take a stab at writing `ingester.autoforget_unhealthy` based on the Loki implementation if the maintainers think it makes sense.

We ended up adding these Kubernetes resources for an automatic cleanup of unhealthy ingesters:
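The actual manifests from that comment are not reproduced in this excerpt. Purely as a minimal sketch of the idea (the resource name, schedule, image, and the ring page host/port/path are assumptions to adjust for your deployment), a CronJob could scrape the distributor's ring page and submit the same `forget` form that the UI button above submits:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cortex-ring-cleanup
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: forget-unhealthy
              image: alpine:3.19   # busybox wget/awk are enough for this
              env:
                - name: RING_URL
                  # Path as used later in this thread; older Cortex versions may
                  # serve the page under /ring instead of /ingester/ring.
                  value: http://cortex-distributor:8080/ingester/ring
              command:
                - /bin/sh
                - -c
                - |
                  # The ring page renders one <td> per line (see the HTML above):
                  # the first cell of a row is the ingester ID and the second is
                  # its state, so remember the previous cell and print it whenever
                  # the current cell says "Unhealthy".
                  wget -qO- "$RING_URL" | awk '
                    /<td>/ {
                      cell = $0
                      gsub(/.*<td>|<\/td>.*/, "", cell)
                      if (cell == "Unhealthy" && prev != "") print prev
                      prev = cell
                    }' | while read -r ingester; do
                      echo "forgetting ${ingester}"
                      # Same form POST the "Forget" button sends.
                      wget -qO- --post-data "forget=${ingester}" "$RING_URL" > /dev/null
                    done
```

Anything like this needs careful testing: it will happily forget an ingester that is only briefly unhealthy, for example during a rolling restart, so a real version should probably only forget entries whose last heartbeat is older than a generous threshold.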
Hi, FYI: I've found this `/ready` behavior plays badly with StatefulSets that use an "Ordered" Pod Management Policy (the default). I believe the fix is easy (use a "Parallel" policy), but documenting the problematic scenario: suppose you have 3 StatefulSet replicas with the "Ordered" policy.

- pod-1 and pod-2 are both lost at the same time (for example, their nodes are preempted).
- With the "Ordered" policy the StatefulSet recreates pod-1 first and waits for it to pass `/ready` before it starts pod-2.
- pod-1 never becomes ready, because pod-2 is still registered in the ring as Unhealthy, so pod-2 is never started and the rollout stays deadlocked until someone forgets pod-2 manually.
I experienced this running with preemptible nodes (I know, I know) and confirmed with manual testing. If the “Parallel” policy is used instead then pod-1 & pod-2 start in parallel and pick up their former places in the ring.
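To make the suggested fix concrete, here is a minimal, hypothetical StatefulSet sketch (names, image tag, flags, and the probe port are illustrative, not taken from this thread) showing the one field that changes:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: ingester
spec:
  replicas: 3
  # Default is OrderedReady; Parallel lets pod-1 and pod-2 come back at the
  # same time instead of waiting on each other's readiness.
  podManagementPolicy: Parallel
  serviceName: ingester
  selector:
    matchLabels:
      app: ingester
  template:
    metadata:
      labels:
        app: ingester
    spec:
      containers:
        - name: ingester
          image: quay.io/cortexproject/cortex:v1.0.0  # illustrative tag
          args: ["-target=ingester"]                  # config flags omitted
          readinessProbe:
            httpGet:
              path: /ready
              port: 80   # match your -server.http-listen-port
```

Note that `podManagementPolicy` is immutable on an existing StatefulSet, so switching an already-deployed ingester StatefulSet to Parallel generally means deleting and recreating the object (for example `kubectl delete statefulset ingester --cascade=orphan` on newer kubectl, so the pods keep running) rather than a plain `kubectl apply`.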
In a Kubernetes & Helm based scenario, these Helm values could be a workaround (see the sketch after this comment). Please be aware that you need to change the two URLs to match your Helm release name. Here it is `cortex`, so the URL is `http://{{ .Release.Name }}-distributor:8080/ingester/ring`. Please test thoroughly and contribute your enhancements.

Nobody has coded one for Cortex, to my knowledge.
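The values themselves are not reproduced above. Purely as a hypothetical illustration of the URL point (nothing here comes from a published Cortex chart), a Helm template for such a cleanup job could derive the ring URL from the release name, so renaming the release keeps the URLs consistent:

```yaml
# Hypothetical fragment of a Helm template (e.g. templates/ring-cleanup-cronjob.yaml
# in a chart of your own); with a release named "cortex" the value renders to
# http://cortex-distributor:8080/ingester/ring
containers:
  - name: forget-unhealthy
    image: alpine:3.19
    env:
      - name: RING_URL
        value: http://{{ .Release.Name }}-distributor:8080/ingester/ring
```

The cleanup CronJob sketched earlier in this thread could read `RING_URL` from such an environment variable instead of hard-coding the distributor address.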
We tell you not to do this in the docs.
`ingester.autoforget_unhealthy` would be amazing when deploying to AWS with spot instances, where ingesters get destroyed and spun up again all the time. Exposing the Cortex ring status web interface just to manually remove unhealthy ingesters is not practical, and it is a security concern.
Got bitten by this terribly several times now, and lost a lot of time and data 😦; would really love to see `ingester.autoforget_unhealthy` support in Cortex.

I've read through this issue and the linked issues, and it's still unclear to me whether there is a way to have the ingester ring self-heal after unclean shutdowns. Not needing human operator intervention would be extremely valuable to us, as we lose much more data from ingesters being down than we would lose by auto-forgetting unhealthy ingesters from the ring.
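For reference, and assuming the YAML key mirrors the `ingester.autoforget_unhealthy` name used throughout this thread (check the Loki docs for your version), the Loki-side option that commenters are asking to port looks roughly like this:

```yaml
# Loki configuration sketch: drop ring members that stay unhealthy instead of
# waiting for a manual "Forget". Cortex has no equivalent setting as of this
# discussion.
ingester:
  autoforget_unhealthy: true
```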
Now that chunks storage is deprecated and we use blocks storage, we no longer “hand-over” from one ingester to another. So one justification for this behaviour has disappeared.
Happy to hear experience reports from people who did automate it.