operator: traefik errors on cos layer with error: hook failed: "metrics-endpoint-relation-joined"
Bug Description
The Solutions QA team has two runs in which traefik errors on the cos layer with: hook failed: "metrics-endpoint-relation-joined"
The cos layer is built on top of kubernetes-aws.
From the logs:
2022-08-21 02:20:42 ERROR juju-log metrics-endpoint:10: Uncaught exception while in charm code:
Traceback (most recent call last):
File "./src/charm.py", line 678, in <module>
main(TraefikIngressCharm)
File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/main.py", line 431, in main
_emit_charm_event(charm, dispatcher.event_name)
File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/main.py", line 142, in _emit_charm_event
event_to_emit.emit(*args, **kwargs)
File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/framework.py", line 316, in emit
framework._emit(event)
File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/framework.py", line 784, in _emit
self._reemit(event_path)
File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/framework.py", line 857, in _reemit
custom_handler(event)
File "/var/lib/juju/agents/unit-traefik-0/charm/lib/charms/prometheus_k8s/v0/prometheus_scrape.py", line 1545, in _set_scrape_job_spec
self._set_unit_ip(event)
File "/var/lib/juju/agents/unit-traefik-0/charm/lib/charms/prometheus_k8s/v0/prometheus_scrape.py", line 1576, in _set_unit_ip
unit_ip = str(self._charm.model.get_binding(relation).network.bind_address)
File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/model.py", line 679, in network
self._network = self._network_get(self.name, self._relation_id)
File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/model.py", line 672, in _network_get
return Network(self._backend.network_get(name, relation_id))
File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/model.py", line 724, in __init__
self.ingress_addresses.append(ipaddress.ip_address(address))
File "/usr/lib/python3.8/ipaddress.py", line 53, in ip_address
raise ValueError('%r does not appear to be an IPv4 or IPv6 address' %
ValueError: 'acb396cf5563d429f9ebc8aa23ae47ed-1516332866.us-east-1.elb.amazonaws.com' does not appear to be an IPv4 or IPv6 address
Failed runs:
https://solutions.qa.canonical.com/testruns/testRun/0b35ee4c-efcf-4f50-a393-696803b31ac9
https://solutions.qa.canonical.com/testruns/testRun/ef230601-4af3-4d6f-89af-4e578985e666
Logs can be found at the bottom of each page, in the artifacts repository.
To Reproduce
These errors were found on the fkb-master-kubernetes-focal-aws SKU of our automated test suite.
Environment
charmed-kubernetes 1.24, with the stable channel for the cos charms.
Relevant log output
Traceback (most recent call last):
File "./src/charm.py", line 678, in <module>
main(TraefikIngressCharm)
File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/main.py", line 431, in main
_emit_charm_event(charm, dispatcher.event_name)
File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/main.py", line 142, in _emit_charm_event
event_to_emit.emit(*args, **kwargs)
File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/framework.py", line 316, in emit
framework._emit(event)
File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/framework.py", line 784, in _emit
self._reemit(event_path)
File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/framework.py", line 857, in _reemit
custom_handler(event)
File "/var/lib/juju/agents/unit-traefik-0/charm/lib/charms/prometheus_k8s/v0/prometheus_scrape.py", line 1545, in _set_scrape_job_spec
self._set_unit_ip(event)
File "/var/lib/juju/agents/unit-traefik-0/charm/lib/charms/prometheus_k8s/v0/prometheus_scrape.py", line 1576, in _set_unit_ip
unit_ip = str(self._charm.model.get_binding(relation).network.bind_address)
File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/model.py", line 679, in network
self._network = self._network_get(self.name, self._relation_id)
File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/model.py", line 672, in _network_get
return Network(self._backend.network_get(name, relation_id))
File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/model.py", line 724, in __init__
self.ingress_addresses.append(ipaddress.ip_address(address))
File "/usr/lib/python3.8/ipaddress.py", line 53, in ip_address
raise ValueError('%r does not appear to be an IPv4 or IPv6 address' %
ValueError: 'acb396cf5563d429f9ebc8aa23ae47ed-1516332866.us-east-1.elb.amazonaws.com' does not appear to be an IPv4 or IPv6 address
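The last two frames can be reproduced outside the charm. This is a minimal sketch using only the Python standard library; `is_ip` is a hypothetical helper introduced here for illustration, not part of ops or the charm. It shows that the AWS ELB hostname returned as an ingress address is rejected by `ipaddress.ip_address`, while a literal IP is accepted.

```python
import ipaddress

# The value from the traceback above: an AWS ELB hostname, not an IP.
ELB_HOSTNAME = "acb396cf5563d429f9ebc8aa23ae47ed-1516332866.us-east-1.elb.amazonaws.com"


def is_ip(value: str) -> bool:
    """Return True if `value` parses as an IPv4 or IPv6 address.

    Hypothetical helper for illustration only.
    """
    try:
        ipaddress.ip_address(value)
    except ValueError:
        # Same exception type that surfaces in the hook failure.
        return False
    return True


if __name__ == "__main__":
    print(is_ip(ELB_HOSTNAME))
    print(is_ip("10.0.0.1"))
```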
Additional context
No response
About this issue
- State: closed
- Created 2 years ago
- Comments: 24 (19 by maintainers)
I’m following up with juju folks to find a solution to this, but for now it looks like it’s not on our side.
Just FYI, the juju version in the failed runs and in the passing run is the same: 2.9.33 (the traefik version is different, as we already discussed).
I agree there may be some issues in juju. However, after chatting with juju folks, it sounds like getting a non-ip from juju needs to be allowed by ops - even if it’s not necessarily the actual full fix that’s needed to resolve this issue.
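As a rough sketch of what "allowing a non-ip" could look like: try to parse each address as an IP and keep the raw string when that fails. The names `parse_address` and `parse_addresses` are hypothetical and only illustrate the idea; this is not the ops API and not necessarily how a fix would be implemented there.

```python
import ipaddress
from typing import List, Union

Address = Union[ipaddress.IPv4Address, ipaddress.IPv6Address, str]


def parse_address(value: str) -> Address:
    """Parse `value` as an IP address, falling back to the raw string.

    Hypothetical helper: a hostname such as an ELB DNS name is returned
    unchanged instead of raising ValueError.
    """
    try:
        return ipaddress.ip_address(value)
    except ValueError:
        return value


def parse_addresses(values: List[str]) -> List[Address]:
    """Apply parse_address to every entry, preserving order."""
    return [parse_address(v) for v in values]


if __name__ == "__main__":
    raw = [
        "10.1.2.3",
        "acb396cf5563d429f9ebc8aa23ae47ed-1516332866.us-east-1.elb.amazonaws.com",
    ]
    for addr in parse_addresses(raw):
        print(type(addr).__name__, addr)
```

Callers would then need to handle both types, for example by calling `str()` on the result, as `_set_unit_ip` already does in the traceback.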
Thank you all for your efforts on this. From what I understand, you are able to reproduce it; otherwise we are happy to give you access to an environment demonstrating this, or to be more specific about the scenario, if that helps. As for the workaround, we have no problem trying it out in the short term, unless a fix is underway.
I’ll get started on Monday; I’m in meetings until EOD today.
@PietroPasotti mind taking a look? From what @marosg42 says, it sounds like this might be a regression in the charm after all.
Can’t help you with transferring the issue, but I’m sure @PietroPasotti or @jnsgruk should be able to.