go: image/gif: gif decoding fails `with too much image data`
Please answer these questions before submitting your issue. Thanks!
- What version of Go are you using (
go version
)? go version devel +1f44643 Wed Jun 22 00:12:55 2016 +0000 darwin/amd64 - What operating system and processor architecture are you using (
go env
)? GOARCH=“amd64” GOBIN=“/Users/emmanuelodeke/go/bin” GOEXE=“” GOHOSTARCH=“amd64” GOHOSTOS=“darwin” GOOS=“darwin” GOPATH=“/Users/emmanuelodeke/go” GORACE=“” GOROOT=“/Users/emmanuelodeke/go/src/go.googlesource.com/go” GOTOOLDIR=“/Users/emmanuelodeke/go/src/go.googlesource.com/go/pkg/tool/darwin_amd64” CC=“clang” GOGCCFLAGS=“-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/kn/s30g1srd4h50bh6sd322tppm0000gn/T/go-build038095302=/tmp/go-build -gno-record-gcc-switches -fno-common” CXX=“clang++” CGO_ENABLED=“1” - What did you do? Run program at https://play.golang.org/p/NqqvAkiIao with source of http://ualberta.ca/~odeke/tx.gif
$ go run NqqvAkiIao.go --source http://ualberta.ca/~odeke/tx.gif
If possible, provide a recipe for reproducing the error. A complete runnable program is good. A link on play.golang.org is best.
- What did you expect to see? Successful decoding of a gif with no errors
- What did you see instead?
gif: too much image data
/cc @montanaflynn who reported this issue first.
Code inlined
/*
After running this gif
https://67.media.tumblr.com//b02659b27081314a412215f4eb31dacf/tumblr_nu2x4whJgy1udf6d3o1_400.gif
through ffmpeg, Go fails to decode the gif with error:
`gif: too much image data`
Command:
$ ffmpeg -i https://67.media.tumblr.com//b02659b27081314a412215f4eb31dacf/tumblr_nu2x4whJgy1udf6d3o1_400.gif outf.gif
$ go run main.go --source outf.gif
- Repro URL
http://ualberta.ca/~odeke/tx.gif
Interestingly:
- The outfile is 1.0MB.
- The original file is 1.4MB.
*/
package main
import (
"flag"
"image/gif"
"log"
"net/http"
"path/filepath"
"strings"
)
func localAndNetBasedClient() *http.Client {
transport := new(http.Transport)
transport.RegisterProtocol("file", http.NewFileTransport(http.Dir("/")))
client := new(http.Client)
client.Transport = transport
return client
}
var httpPrefixes = []string{"http", "https"}
func ensureFitForFetch(source string) string {
for _, prefix := range httpPrefixes {
if strings.HasPrefix(source, prefix) {
return source
}
}
absSource, err := filepath.Abs(source)
if err == nil {
source = absSource
}
return strings.Join([]string{"file://", source}, "/")
}
func main() {
var source string
flag.StringVar(&source, "source",
"https://67.media.tumblr.com//b02659b27081314a412215f4eb31dacf/tumblr_nu2x4whJgy1udf6d3o1_400.gif",
"the URI of the gif you want to ensure properly decodes in Go")
flag.Parse()
client := localAndNetBasedClient()
source = ensureFitForFetch(source)
res, err := client.Get(source)
if err != nil {
log.Fatal(err)
}
defer res.Body.Close()
if _, err := gif.DecodeAll(res.Body); err != nil {
log.Fatal(err)
}
}
About this issue
- Original URL
- State: closed
- Created 8 years ago
- Comments: 15 (9 by maintainers)
I’ve debugged this. I have an explanation and a recommendation.
The problem gifs in question (tx.gif from @odeke-em and Z6KoKJ.gif from @montanaflynn) indeed have extra/too much data. In both cases, the extra data is a data sub-block in the image data with size 1 containing 0x00. In both cases, there is an LZW End of Information Code, then this extra sub-block, and then the Block Terminator.
In tx.gif this happens in the second image in the gif. In Z6KoKJ.gif it happens in the 11th image in the gif.
In well formed gifs, the last sub-block would not be present. It does not need to be there since we completed decompression.
Currently the decoder fully reads/decompresses the image’s data, and then uses the block reader to check whether there is more. It sees there is more data (in this case, there is a block of 1 byte found) and raises this error.
If we don’t decompress the data and simply look at the image data as a series of data sub-blocks, it is not extra. That is, the data sub-blocks are well formed. It is extra only when we take into account reaching the LZW End of Information code.
The question is what we should be doing. Other decoders (GIFLIB, browsers) apparently accept this. As far as the specification goes, it does not say one way or the other. In Appendix F it says “the code that terminates the LZW compressed data must appear before Block Terminator”. It does not state what to do if there is data after the LZW data ends, nor that the code must be the last piece of data. Section 22 states that “[t]he sequence of indices is encoded using the LZW Algorithm […]” so I think this data should not be treated as an index. But it is not useful and presumably is technically invalid.
I looked at what the most recent version of GIFLIB (5.1.4) does. After it fully decompresses an image, it silently reads and discards any extra data such as this. It stops when it reads the Block Terminator. You can see this in its
DGifGetLine()
function (thedo while
loop). I think doing this would work for Go’s decoder too. Once we read and decompress the image, any extra data in its sub-blocks can be discarded. There should not be any, but if there are, ignore them.As for where these problem gifs are coming from, given the transformation using
ffmpeg
produced one with the defect, its gif encoder may have a bug!Note I didn’t analyze @vsiv’s extra_large_cropped_wino.gif. It appears corrupted to me in Firefox and is showing a different error, so possibly it is either truly corrupted or a different issue.
I wrote a patch with a possible solution. I realize this will need to be submitted differently, but I thought I would include it and get feedback here before doing so. https://github.com/horgh/go/commit/ab144b8f336c0b99eec9b416a7577a305190dbdc
Thanks for any feedback!
Hi @horgh, nice debugging work. I suggest you submit a CL. We cannot review code without a signed CLA, so a Gerrit CL is the only format for discussing patches.