go: proposal: fmt: don't recover panics except for dereferencing nil fmt.Stringer receivers
When a type passed to a fmt function implements fmt.Stringer, the fmt package recovers any panics that the String method produces.
That’s helpful for the common case of nil pointers (https://play.golang.org/p/YbmlVGcfPvO), but in other cases masks real, important bugs — and can lead to unexpected deadlocks (https://play.golang.org/p/GjOv8M75GqG; see also #25245).
I propose that, for Go 2, the fmt package should only recover a panic from a fmt.Stringer if:
- the value underlying the
fmt.Stringeris anilvalue of a pointer type, and - the panic occurred as a result of dereferencing the receiver (not due to an explicit
paniccall or a dereference of some other value).
Enforcing the latter condition may require deeper runtime support.
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Reactions: 21
- Comments: 21 (21 by maintainers)
The fmt package tries hard to do something always. The idea is that if you’re printing something, and there’s a problem, the problem is likely due to code that hasn’t run before and/or is hard to get to run at all, so it’s better to print something than crash or return error.
I’d like to keep it that way. It’s worth not crashing a program because a log statement has a bad argument, for example. I admit this can cause problems sometimes, but all decisions do, and your proposal adds complexity and removes utility in the aid of relatively rare cases. The tradeoff doesn’t feel right.
Also worth noting that
text/templatenow recovers panics encountered while calling template functions: https://github.com/golang/go/issues/28242One of the big points in favor was that
fmtdid something very similar. So iffmtchanges in Go2, perhaps other pieces of the standard library liketext/templateshould be reconsidered too.There’s one big difference, however. If
fmtrecovers a panic, it can be easy for it to go unnoticed, whereasTemplate.Executewill simply return the recovered panic as an error. That should be much more obvious.If the problem is “due to code that hasn’t run before and/or is hard to get to run at all,” that seems to make it even more important to preserve the stack trace that led to it — which is exactly the information that
recoverdestroys.I know I’m unlikely to sway anyone, but I’ll leave the obligatory “I don’t think we should recover any panics” comment.
I do occasionally get frustrated having to debug a panic in a
Stringerorerrorimplementation, because of the lack of context available, so I would welcome iffmtwould leave them alone. And I don’t like the idea of inspecting the dynamic value of interface implementations - especially not the idea ofnilpointers being treated differently. I realizefmtis probably already doing that, but to me, it’s not an exception to this rule. I also don’t really see a difference between, say, a nil-pointer dereference and an out-of-bounds slice access, except maybe that slice-receiver are less common than pointer-receiver.So if it where up to me,
fmt(andtext/template) would just leave panics alone.E.g. this code (https://play.golang.org/p/g0_AlCKKOa3) right now produces
Along with the error message, it’d include a full stack trace to make it easier to debug.
I know what you’re trying to say, but I disagree. I’d rather see bogus output from a rare log statement than crash a production job.
I do not want to change this behavior in the fmt package.
I looked into Google bug 8658139, which turns out to have been about html/template, not fmt. Specifically, a method was being called in a template to create a web page, and it panicked while holding a lock, which made future attempts to render that specific page block forever. Now, it may be that in this specific instance html/template had called fmt, which ate the panic, but if fmt hadn’t eaten the panic, then html/template would have instead (via text/template). And if html/template hadn’t, then net/http would have. That is, there were three different packages stacked up, all of which would have recovered the panic. The code in question was only reading a data structure, and it should have used defer to unlock its lock. (This was in the bad old days when people avoided defer due to perceived performance problems. Those were never very large, and now they are completely gone.)
The fundamental question is whether it is permissible for Go packages to defend against problems in the code they call by recovering from panics. There are good arguments on both sides.
Against: It is possible that recovering a panic will mean that a deferred lock is unlocked with the data it protects in an inconsistent state. Or it is possible, as in the deadlock above, that a lock won’t be unlocked or some other cleanup won’t be done. (Of course, if we decide that recovering is OK, then not deferring the unlock is a bug, so the main concern is the inconsistent data.)
In favor: Recovering most panics leads to more stable, reliable programs. Not every program runs in a context where it will be automatically restarted when it crashes. Even those that do pay a restart cost that can be quite high. Keeping the program running is what we already do today, and despite the potential for inconsistency, empirically it works very well most of the time.
Personally I think the arguments come out in favor of “in favor”, although I could be convinced they are merely balanced. Either way, that suggests we should preserve the status quo rather than make a very significant semantic change in the many packages that recover from panics: not just fmt, but also text/template, net/http, and probably others I am forgetting.
I admit the technical incorrectness of recovering, but pragmatically it seems to be much better than not.
I am not in favor of adding stack traces automatically, as I always groan when I get dumped a stack trace by some program I’m not responsible for. Perhaps under a flag or something, but not by default, please.
Also, how would you do that? The
runtime/debugpackage already depends onfmt. Thefmtpackage is a critical piece of the core and most of the library already uses it. You can’t havefmtcall out to much of the library without creating a circular dependency.I’ve been wondering if this means that this program https://go.dev/play/p/Cg-49ZzTcjn should not crash (even though the crash is implied by the documentation of fmt).
No change in consensus, so declined. — rsc for the proposal review group
I’m more sympathetic to that argument for functions that are specific to logging, such as in the
logandlog/slogpackages; however,fmt.Fprintfand similar are not specific to logging, and are used in packages liketext/templatethat are mostly not used in a logging context.I have definitely seen deadlocks caused by
fmtrecovering panics in real systems. (Google bug # 8658139 is one such example, which motivated me to file issue #5350 over ten years ago. 😅)I don’t have any handy examples of data corruption, although I’m also not entirely sure how to go about looking for those. (Although I have been maintaining Go programs for a long time, I haven’t maintained many production systems in Go that produced long-lived or user-facing data using
fmtor packages such astext/templatethat build on it.)The counterargument is that a lot of the time when you are debugging a mysterious problem, new prints get exercised. The last thing you want is to lose the info being printed - or for the whole server to crash - just because of a bad print. It is almost always better to format something and see a weird-looking print in a log than to lose the trace of the problem as a whole. I’ve seen that happen in other systems (not Go, since we guarded against this).
Have you seen the kind of corruption you are hypothesizing caused by fmt recovering panics in real systems?
Stack trace includes internal identifiers and paths, so putting them into
fmtoutput might inadvertently expose them to the clients/users/other parties who should not have access to this information, unliketemplatewhere recovered panics do not end up in the rendered output. Even panic message is arguably too much information revealed.Why don’t we make fmt show the stack trace as well instead of just the panic value? That would be a reasonable middle ground.
The
recoverinfmtis especially insidious because — unlike therecoveradded for #28242 — it does not result in a user-visible error.(Users often omit error checks for
fmtcalls, and even if they did check, thefmtfunctions don’t return an error on panic anyway: https://play.golang.org/p/x92v-MWuzPF)