go: bytes: bytes.Reader returns EOF on zero-byte Read, which doesn't conform with io.Reader interface documentation

What version of Go are you using (go version)?

$ go version

go version go1.14.4 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env

GOARCH=“amd64” GOHOSTARCH=“amd64” GOHOSTOS=“linux”

What did you do?

I was deserialising binary data and I hit an unexpected io.EOF when reading zero bytes. Here is a minimal example that illustrates the problem.

package main

import (
	"fmt"
	"bytes"
	"encoding/binary"
)

func main() {
    r := bytes.NewReader([]byte{0,0,0,0})
    
    var length uint32                                                               
    err := binary.Read(r, binary.BigEndian, &length)                   
    if err != nil{
        panic(err)
    }   
    fmt.Printf("length = %d\n", length)
    
    rest := make([]byte, length)
    _, err = r.Read(rest)
    fmt.Printf("error = %s\n", err)
}

What did you expect to see?

length = 0
error = %!s(<nil>)

What did you see instead?

length = 0
error = EOF

On conformity with io.Reader and other standards

An excerpt from the documentation of io.Reader with the important parts emboldened:

type Reader interface {
   Read(p []byte) (n int, err error)
}

Implementations of Read are discouraged from returning a zero byte count with a nil error, except when len(p) == 0. Callers should treat a return of 0 and nil as indicating that nothing happened; in particular it does not indicate EOF.

An excerpt from the man page of read(3):

This volume of IEEE Std 1003.1-2001 requires that no action be taken for read() or write() when nbyte is zero. This is not intended to take precedence over detection of errors (such as invalid buffer pointers or file descriptors). This is consistent with the rest of this volume of IEEE Std 1003.1-2001, but the phrasing here could be misread to require detection of the zero case before any other errors. A value of zero is to be considered a correct value, for which the semantics are a no-op.

Related issues

It looks like this issue is in stark contrast to Go2 proposal issue #27531. There is also an interesting discussion (#5310) about returning (0, nil) from a Read(p []byte).

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 34 (15 by maintainers)

Commits related to this issue

Most upvoted comments

The io.Reader interface is like no other in the Go ecosystem. Read is the only method where the caller must examine the other values returned from a function/method call before examining the error value.

Right now, today, the behavior of bytes.Reader is permitted according to the docs. If we change the docs as suggested in option 1 of https://github.com/golang/go/issues/40385#issuecomment-670142267, then bytes.Reader will be broken. This is not a matter of code that checks #bytes != 0. This is a matter of bytes.Reader itself. So I don’t agree that approach 1 will break already incorrect code. It will break already correct code. If that code is not changed to conform to the new requirements, then it will break future code that assumes that the documented requirements are implemented by existing readers.

I vote for 2. /cc @minux who spent a lot of time arguing for this a few years back.

@dashjay There is no expectation that all Readers behave the same way, even all Readers in the standard library. Different Readers are free to employ different buffering and error handling strategies.

Since the input is a byte slice, and not a stream, it doesn’t make much sense to waste CPU cycles on io.ReadFull(), if bytes.Reader.Read() would do the same job. However, it seems that io.ReadFull actually fixes the issue.

This argument is specious; if you’re worried about the overhead of io.ReadFull when you have a []byte slice then you’re probably also worried about the overhead of an interface call over just scrobbling in the []byte slice directly.

You are probably referring to this paragraph, which is either ambiguous or it doesn’t cover the case n = 0.

Callers should always process the n > 0 bytes returned before considering the error err. Doing so correctly handles I/O errors that happen after reading some bytes and also both of the allowed EOF behaviors.

Since the input is a byte slice, and not a stream, it doesn’t make much sense to waste CPU cycles on io.ReadFull(), if bytes.Reader.Read() would do the same job. However, it seems that io.ReadFull actually fixes the issue.

@davecheney I am not sure what you want, but lets take those two cases:

// Deserialisation using bytes.Reader https://play.golang.org/p/3qRu_LlDa6h

// Deserialisation using bytes.Buffer… the same code, single line changed. https://play.golang.org/p/0Lr0_9rFi1O

The deserialisation of fields follow the structure, read field length, if necessary, then read content. If there is an error or the length is different, return an error. The first example fails when the last fields is length-prefixed with zero length and the second example succeeds just because we are using bytes.Buffer, instead of bytes.Reader.

@davecheney Yes, indeed. I am not arguing that bytes.Reader is broken according to io.Reader contract, but it feels inconsistent.

About the warning / notice, I am thinking of something like: The bytes.Reader returns io.EOF as soon as it reaches the end of the byte slice, after which a read of zero bytes will yield an error. This may pose an issue in cases where you are deserialising data and the last field is length-prefixed with length equals zero.

Is there a reason why there is no EOF() bool method in Reader structs in the standard library? I think it is why zero-length Read would return an io.EOF, because it’s the only way to check if you are at the end, except for pos, _ := Seek(0, current); pos == Len() or Size(), which is probably not available for all Reader structs.

Add. I will reluctantly agree that the documentation does not prohibit (0, io.EOF) in this case, but it was my take on “except when len(p) == 0”, which made me think that it’s just natural to have a (0, nil) on a zero-byte Read().