email-oauth2-proxy: sometimes doesn't recover from network errors on Linux
I’m running the proxy as a systemd service. If I unplug my router, for example, I get:
Dec 15 23:13:17 rent python3[481687]: Email OAuth 2.0 Proxy: Caught network error in IMAP server at [::]:1993 (unsecured) proxying outlook.office365.com:993 (SSL/TLS) - is there a network connection? Error type <class 'socket.gaierror'> with message: [Errno -2] Name or service not known
That’s fine, of course, but the proxy never recovers from this state when networking is up again. I have to restart it.
About this issue
- State: open
- Created 7 months ago
- Comments: 24 (8 by maintainers)
Commits related to this issue
- Try all `getaddrinfo` results when connecting to a server - see #215 — committed to simonrob/email-oauth2-proxy by simonrob 4 months ago
Yes, the fix for the exception loop should be made regardless; it’s not related to this ticket, but it does look like a bug.
Re the proposed change, this would probably not help as all name resolution is broken at this point in glibc. I spent some time in its bowels and got this far:
https://sourceware.org/pipermail/libc-alpha/2024-March/155234.html
To my eyes this is a plain glibc bug but we’ll see.
I’m running a chattier, fatal version of the above (since I want to catch it to prove the theory). It may take a long time to see it though.
I think you still need to re-raise the exception if you’ve run out of “a” to try, BTW.
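For anyone following along, here is a rough sketch of the pattern being discussed: try each `getaddrinfo` result in turn and re-raise the last error once the candidates run out. This is not the proxy's actual code. The function name `connect_any` and its `timeout` parameter are made up for illustration, and the behaviour is essentially what the standard library's `socket.create_connection` already does.

```python
import socket


def connect_any(host, port, timeout=10.0):
    """Hypothetical helper: connect to the first getaddrinfo result that works.

    If every candidate address fails, the last connection error is re-raised
    so the caller still sees a failure instead of a silent None.
    """
    last_error = None
    # getaddrinfo itself can raise socket.gaierror (an OSError subclass);
    # that is deliberately left to propagate to the caller.
    candidates = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for family, socktype, proto, _canonname, sockaddr in candidates:
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock
        except OSError as e:
            # Remember the failure and move on to the next address
            last_error = e
            if sock is not None:
                sock.close()
    if last_error is not None:
        raise last_error
    raise OSError("getaddrinfo returned no addresses for %s:%s" % (host, port))
```

Note that this only helps when some of the returned addresses are unreachable. If `getaddrinfo` itself fails, as in the log above, there are no candidates to iterate over and the `gaierror` propagates unchanged.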
Nope! Feel free to close. It’s either your theory above, or I’ve just been lucky. Either way I can file a new one if needed. Thanks!
I’ve been watching like a hawk for problems since filing this ticket and predictably it’s behaved flawlessly. Nonetheless, I’ve updated to include that change. If I can reproduce any issue again I’ll update.
I don’t know offhand of any such API on Linux. Python itself shouldn’t be caching DNS, I don’t think.
Since this only happens sometimes, I’m going to try strace to see if I can winkle out the circumstances. It’s definitely related to VPN teardown, but happens (sometimes) even when DNS is pointing at the right server again.
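Alongside strace, a small Python-level probe might help show whether resolution has actually recovered. The sketch below is only an illustration: `probe_resolution` and its `resolver` parameter are hypothetical names, and the parameter exists purely so the function can be exercised without a network. It makes a single `getaddrinfo` call and reports the outcome instead of raising.

```python
import socket


def probe_resolution(host, port, resolver=socket.getaddrinfo):
    """Hypothetical diagnostic: resolve once and report (ok, detail)."""
    try:
        results = resolver(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as e:
        # e.errno holds the EAI_* code, e.g. socket.EAI_NONAME
        # ("Name or service not known", reported as -2 in the log above)
        return False, "gaierror %s: %s" % (e.errno, e.strerror)
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6
    addresses = sorted({r[4][0] for r in results})
    return True, ", ".join(addresses)
```

If the problem really is state held inside the process's resolver, this would presumably need to be called from within the long-running proxy process (for example, from its error handler). A fresh `python3` invocation starts with a clean resolver and may well succeed even while the proxy is still failing.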