go: proposal: cmd/vet: check for missing Err calls for bufio.Scanner and sql.Rows

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.7.3 linux/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/matt"
GORACE=""
GOROOT="/home/matt/.gvm/gos/go1.7.3"
GOTOOLDIR="/home/matt/.gvm/gos/go1.7.3/pkg/tool/linux_amd64"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build698545690=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"

What did you do?

If possible, provide a recipe for reproducing the error. A complete runnable program is good. A link on play.golang.org is best.

https://play.golang.org/p/zvrWRVgw4M

Used a bufio.Scanner without checking bufio.Scanner.Err() for error. The same mistake occurs frequently with sql.Rows.Err().

What did you expect to see?

go vet should warn that s.Err() was not called, and that it should be checked to determine if an error occurred while scanning.

This frequently occurs in our internal codebase at work. Arguably, go vet could check if any type has a method Err() error and report if the error is not checked or returned, instead of just checking the special cases with bufio.Scanner and sql.Rows. There may also be more types that use this pattern in the standard library.

What did you see instead?

No warnings from go vet.

About this issue

  • Original URL
  • State: open
  • Created 8 years ago
  • Reactions: 9
  • Comments: 44 (36 by maintainers)

Most upvoted comments

It sounds like it would be reasonable to write a checker for bufio.Scanner and sql.Rows not having their Err methods called. We can see what the precision is on the latter, but the former seems OK.

Does anyone object to moving forward with this?

bufio.Scanner can return errors no matter what kind of reader is passed into it.

That’s an excellent point that I hadn’t realised (the specific error it can return is ErrTooLong). In the light of that, I support this proposal - I’ve definitely left out calls to Err in the past that I should have made.

@Merovius, I don’t think the analogy to Close quite fits. An error from Close means “something you wrote might not have been flushed”, whereas an error from an Err method means “you might not have read everything”. (They’re dual.)

The Err methods on bufio.Scanner and sql.Rows are pretty clearly the latter, not the former. (Note that Rows also already has a Close method, which releases the resources in the event of an early return.)

It’s safe to ignore an error from Close if you didn’t write anything (for example, if you only read things), but in general you may have to call it in order to release resources.

It’s safe to ignore an error from Err() if you saw everything you needed and the rest of the contents aren’t relevant. (But if that’s the case, you probably shouldn’t have called Scan after you found what you were looking for, either…)

If purposely ignoring the error is your intent, I see nothing wrong with the following to make that intent clear:

s := bufio.NewScanner(r)
for s.Next() {
   // ...
}
_ = s.Err()
return nil

I’ve repeatedly ignored the error in the past when, if my scanner errors, I don’t care and either exit the program (more common) or accept the best-effort of what was processed (less common). It’d be a little annoying to be forced to write:

if scanner.Err() != nil {
        // don't care
}

in these patterns.

To me, considering it is sometimes ok to ignore this error, forcing the error to be checked seems like a job better suited for an external linter. The documentation on Scan mentions Err twice. I’m not sure how scanner.Err() is more special than any other function that returns errors – even though the error checking is one extra function, I can just as easily accidentally ignore errors on other far more important function calls.

We recently discovered that missing calls to bufio.Scanner.Err() and sql.Rows.Err() were widespread in our codebase (appeared in ~10% of uses). Auditing our use of those types in our codebase, in all ~100 cases in which we called bufio.Scanner.Scan()/sql.Rows.Next(), it was correct for us to check the result of Err() afterward.

This check (or something similar) would be very helpful for us.

For what it’s worth, I still think this should be applied to sql.Rows at very least as well. That and bufio.Scanner are the most common misuses I see in stdlib.

But as for the generalizing to any Err() error after iteration, I’d still be +1 as I have mimicked the scanner APIs from the stdlib and require an error check for those APIs in my own libraries in the same way.

I would like to collect more evidence before we make a decision here. I can volunteer to do this. Specifically I want to look at a few dozen more scanner.Err() cases for false positive prevalence and a couple dozen sql.Rows’s Err cases. It may be a couple of weeks before I get to it though.

Based on the discussion above, this proposal seems like a likely accept. — rsc for the proposal review group

@twnb This has been suggested before, but there are too many instances of functions whose errors are not idomiatically checked (Printf is the canonical example) and others where the error is impossible but is present to satisfy an interface. It was deemed too messy in the implementation of the checker to keep programmer style consistent, or else too messy for programmers to put the underscores everywhere. No one wants the equivalent of the ugly (void) casts the C linter foisted on everyone in the 1980s.

I have not yet managed to collect the evidence I was hoping to. This may still take a while as we working on better ways of systematically testing such questions.

@bcmills Hm. I’m not sure I agree. I quickly grepped through the stdlib and there are three Err() error methods: In addition to the two we talked about, conforming to the “it’s for reading” heuristic, there’s scanner.ErrorList. But, importantly, there are many examples for Close() error which have nothing to do with writing. For example: io.PipeReader, syscall.Token (on windows), http.Response.Body and golang.org/x/exp/mmap.ReaderAt.

To me, “Close is something that must be called” seems a pretty consistent pattern (though not universal - e.g. PipeReader.Close doesn’t have to be called). It’s definitely firmly lodged into my brain. Err() error doesn’t really have enough data-points to suggest a pattern (at least not in the stdlib), but at least the existence of scanner.ErrorList.Err seems to suggest that calling Err isn’t really required.

Maybe the rule-ish is “Close has side-effects on the state of the type, whereas Err purely inspects it”.

In any case. I still don’t like this check. Personally, I just really dislike having to write _ = x.Err() or other somesuch patterns, just to pacify a linter.

To me, the difference between Close() error and Err() error is exactly that the former says “you need to call this and check for errors”, while the latter says “you can call this to get an error”. So, IMO, if we need a checker that Err was called, the method should’ve been called Close to begin with.

I’d be far more happy with a generic io.Closer check and adding a Close method to sql.Rows. If we don’t want to pollute the API with a second method (which has to do the same thing, for backwards compatibility), a check for specific known types might be fine.

But a generic check for Err() error is IMO counterproductive, as we already have a well-known way to signal that a method must be called for error checking: Call it Close.

@timothy-king, it sounds like we should investigate the effect of checking for sql.Rows’s Err method as well. That would be a nice second instance that we could treat as evidence for a general checker (but still limited to known types).

I’ve repeatedly ignored the error in the past when, if my scanner errors, I don’t care and either exit the program (more common) or accept the best-effort of what was processed (less common)

@twmb [or anyone else] Can you point to a real world example where you made either of these decisions? (Or provide an abstraction if it is proprietary?) The evidence I have looked at so far is a few dozen examples. All of this is code I did not write. I cannot see into the minds of what the authors were aware of or thinking. My best guess is that all of the ones I have looked at have a rare bug that is sneaking by. A real world example where the discussed check on bufio.Scanner would have a false positive would be evidence against going forward with it in cmd/vet.

bufio.Scanner can return errors no matter what kind of reader is passed into it. That is, I don’t think we need to know anything about the type passed to bufio.NewScanner.

It still sounds like this is doable with bottom-up summaries, since we have no interfaces with Err() error methods, so there’s very little uncertainty about what is going on in any given function.

The question remains: how many error reports would this produce, and are they useful?

This missing Err() checker is more complicated though, since the buffer object may be passed around to many places including transitive callees, while the explicit cancel() function usually will not.

The small sample of bufio.Scanner examples I took a look at do not feature a Scanner that can syntactically escape. Do we have evidence about the frequency of Scanners that are passed around/might escape?

There are legitimate cases in which you might elide the s.Err check, such as when reading lines from a strings.Reader; we know a priori that that cannot fail. False positives from vet are painful.

This frequently occurs in our internal codebase at work.

How frequently is frequently? How bad are the resulting bugs?

cc @robpike