go: encoding/xml: Unmarshalling empty XML attribute to int throws error

Go version: 1.8 (regression from 1.7.x --> 1.8.0)

Environment: Dockerized Alpine (golang:1.8-alpine and golang:1.7-alpine)

Description: In Go 1.7, an XML entity like <Object foo=""></Object> could be unmarshalled into a struct where Foo is an int value without throwing an error. When upgrading to Go 1.8.0, the same code throws an error due to strconv.ParseInt being called with the empty string as an argument.

This seems to be a behavioral regression between Go 1.7.x and Go 1.8.0.

Example code: https://play.golang.org/p/GIsONzXQQQ

Expected behavior: Example code should return without error, printing Foo: 0

Actual behavior: An error is thrown, indicating that ParseInt was called with an empty string.

Compiling this same code with Go 1.7.x results in the expected behavior.

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Reactions: 5
  • Comments: 18 (16 by maintainers)

Commits related to this issue

Most upvoted comments

Here is a test program. It covers the cases in this issue (attributes) and the ones in #13417 (elements):

package main

import (
	"encoding/xml"
	"fmt"
)

func main() {
	type X struct {
		XMLName xml.Name `xml:"X"`
		A       int      `xml:",attr"`
		C       int
	}

	var tests = []struct {
		input string
		ok    bool
	}{
		{`<X></X>`, true},
		{`<X A=""></X>`, true},
		{`<X A="bad"></X>`, false},
		{`<X></X>`, true},
		{`<X><C></C></X>`, true},
		{`<X><C/></X>`, true},
		{`<X><C>bad</C></X>`, false},
	}

	for _, tt := range tests {
		err := xml.Unmarshal([]byte(tt.input), new(X))
		if err != nil {
			fmt.Printf("%-20s ERROR %v\n", tt.input, err)
		} else {
			fmt.Printf("%-20s ok\n", tt.input)
		}
	}
}

Go 1.2 through Go 1.7 were consistent: attributes unchecked, children strictly checked:

$ go1.7 run /tmp/x.go
<X></X>              ok
<X A=""></X>         ok
<X A="bad"></X>      ok
<X></X>              ok
<X><C></C></X>       ERROR strconv.ParseInt: parsing "": invalid syntax
<X><C/></X>          ERROR strconv.ParseInt: parsing "": invalid syntax
<X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax
$

Go 1.8 made attributes strictly checked, matching children:

$ go1.8 run /tmp/x.go
<X></X>              ok
<X A=""></X>         ERROR strconv.ParseInt: parsing "": invalid syntax
<X A="bad"></X>      ERROR strconv.ParseInt: parsing "bad": invalid syntax
<X></X>              ok
<X><C></C></X>       ERROR strconv.ParseInt: parsing "": invalid syntax
<X><C/></X>          ERROR strconv.ParseInt: parsing "": invalid syntax
<X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax
$ 

Go 1.9 will (at least right now) relax things so that only non-empty bad inputs are checked:

$ go run /tmp/x.go  # Go 1.9 development
<X></X>              ok
<X A=""></X>         ok
<X A="bad"></X>      ERROR strconv.ParseInt: parsing "bad": invalid syntax
<X></X>              ok
<X><C></C></X>       ok
<X><C/></X>          ok
<X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax
$ 

For Go 1.8.1, I will revert the checking behavior back to Go 1.7 for now. We can try another attempt at attribute checking, with @adams-sarah’s strictness relaxation applied both to attributes and children, in Go 1.9.

$ go run /tmp/x.go  # Go 1.8.1 development
<X></X>              ok
<X A=""></X>         ok
<X A="bad"></X>      ok
<X></X>              ok
<X><C></C></X>       ERROR strconv.ParseInt: parsing "": invalid syntax
<X><C/></X>          ERROR strconv.ParseInt: parsing "": invalid syntax
<X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax

Recently tried upgrading to 1.8 and ran into this.

Many of our XML-based 3rd parties leave attributes empty for prices, price="". I think unmarshalling to the type default is expected here.

Like @SamWhited said, it makes the most sense to special case attr="" as the Go type’s zero value.

I created the original issue because I tried to marshal a single letter into a byte and instead of returning an error saying, “you can’t do that” it just silently left the field as 0.

The alternative (reverting https://github.com/golang/go/commit/2c58cb36f971aed484e880769eb2b0a21654459a) means if the client sends an invalid XML request (e.g., attr="foobar" into an int64) the server has no way to determine whether it was a bad request or the field was meant to be 0.

I believe that behavior is definitely a bug in the XML package.

No worries @adams-sarah, it’s all very confusing around point release milestones. We hope to have a bot help with this soon.

OK but the bug was marked Go 1.8.1 and closed with a link to that commit. That commit is an inappropriate fix for Go 1.8.1 (and not cherry-picked yet anyway), so reopening.