go: encoding/xml: incompatible changes in Go 1.21.0

What version of Go are you using (go version)?

Go1.21.0

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

darwin/amd64

What did you do?

The Go 1.21.0 encoding/xml library contains some incompatible changes. Please take a look at the following example:

package main

import (
    "encoding/xml"
    "fmt"
)

func main() {
    type B struct {
        XMLName xml.Name `xml:"b"`
        C       bool     `xml:"c"`
    }

    type A struct {
        XMLName xml.Name `xml:"http://example.com a"`
        B       B
    }

    var a A

    input := `<a xmlns="http://example.com"><b><c>true</c></b></a>`

    if err := xml.Unmarshal([]byte(input), &a); err != nil {
        fmt.Print(err)
    }

    output, err := xml.Marshal(&a)
    if err != nil {
        fmt.Println(err)
    }

    fmt.Printf("input:  %s\r\noutput: %s", input, string(output))
}

What did you expect to see?

Output by Go 1.20.7 and earlier releases

input:  <a xmlns="http://example.com"><b><c>true</c></b></a>
output: <a xmlns="http://example.com"><b><c>true</c></b></a>

What did you see instead?

Output by Go 1.21.0

input:  <a xmlns="http://example.com"><b><c>true</c></b></a>
output: <a xmlns="http://example.com"><b xmlns=""><c>true</c></b></a>

The serialized output now contains a new empty XML namespace attribute (xmlns=""), so the input and output XML are inconsistent even though the data was not modified.

I created a patch for it, as a follow-up to https://go.dev/cl/466295, and am still waiting for a reply. Relates to #58401, and to the external Excelize project issues #1465, #1595 and #1603.

Thanks.

About this issue

  • State: closed
  • Created a year ago
  • Reactions: 46
  • Comments: 15 (5 by maintainers)

Most upvoted comments

The new behavior is considered incorrect, and we will fix it in the next minor release.

Is the Go 1.22 milestone going to be kept? Shouldn’t this be fixed in a 1.21 minor version?

The new behavior is considered incorrect, and we will fix it in the next minor release.

“Considered” or “is”?

My understanding, though, is that from the point of view of the standard the two “different” outputs can be equivalent, depending on the requirements.

https://stackoverflow.com/questions/1587891/is-xmlns-a-valid-xml-namespace

As I understand it, including an empty namespace attribute is not the same as omitting it. The accepted answer in the link says as much: “It is legal, and this is the way to bring an element into the global namespace.”

The new behaviour which is considered incorrect is the change to the output when encoding Go data structures into XML. It is no longer possible to define what the element name will be without also overriding the default namespace, i.e. the one from the parent element.

For example, this struct defines a name without a namespace:

type Item struct {
	XMLName xml.Name `xml:"item"`
	Value   string   `xml:"value,attr"`
}

It used to be possible to include this in another struct, which has a namespace definition, and have it inherit that one.

type Names struct {
	XMLName xml.Name `xml:"http://some.ns.org/names names"`
	Values  []Item
}

var names = Names{Values: []Item{{Value: "a"}, {Value: "b"}}}

I would expect names to marshal as:

<names xmlns="http://some.ns.org/names">
  <item value="a"></item>
  <item value="b"></item>
</names>

Note there is no xmlns attribute on each item, so they’re considered to be in http://some.ns.org/names by default. But instead what we now see in Go 1.21 is:

<names xmlns="http://some.ns.org/names">
  <item value="a" xmlns=""></item>
  <item value="b" xmlns=""></item>
</names>

Which means those 2 items are no longer in http://some.ns.org/names, they’re in the global namespace.

You can see this in the playground if you switch between Go 1.20 and 1.21: https://go.dev/play/p/i1pHuben06z

So it seems this isn’t so much about XML standards, but about being able to write Go code which produces whichever XML representation is needed in any given context. Before the bug fix for #7113, it wasn’t possible to override the default namespace by including an empty namespace, which is a valid requirement. Now it’s no longer possible not to, which is also valid.

This has me in a bind. 😦 I’m having to advise people using my code to copy go1.20’s encoding/xml on top of go1.21 as a stopgap.

Extra annoying this wasn’t in the release notes.

Thank you for the detailed explanation.

@gopherbot, please backport to Go 1.21. Per #56986 this should at least have a release note and a GODEBUG guard.

Per #56986, though, it seems like this change in behavior should at least have been guarded by a new GODEBUG value. 🤔

You can explicitly set the children’s namespace to the parent’s. But this is awkward, and it leads to a verbose XML serialisation because the marshaller does not detect that the namespace is already in scope.