serde-xml-rs: Deserializing Vec fails if there's something in between
Using this code…
#[macro_use] extern crate serde_derive;
extern crate serde_xml_rs;

use std::io::stdin;

#[derive(Debug, Deserialize)]
struct Root {
    foo: Vec<String>,
    bar: Vec<String>,
}

fn main() {
    // Read the XML document from standard input and deserialize it into `Root`.
    let res: Root = match serde_xml_rs::deserialize(stdin()) {
        Ok(r) => r,
        Err(e) => panic!("{:?}", e),
    };
    println!("{:?}", res);
}
…to deserialize this file…
<root>
    <foo>abc</foo>
    <foo>def</foo>
    <bar>lmn</bar>
    <bar>opq</bar>
    <foo>ghi</foo>
</root>
…gives this error…
thread 'main' panicked at 'duplicate field `foo`', src/bin/bug.rs:15:18
This doesn’t happen if the elements in the root are contiguous, like…
<root>
    <foo>abc</foo>
    <foo>def</foo>
    <foo>ghi</foo>
    <bar>lmn</bar>
    <bar>opq</bar>
</root>
Expected output in both cases is:
Root { foo: ["abc", "def", "ghi"], bar: ["lmn", "opq"] }
Am I doing something wrong here? Is there any way around this? The data I’m working with is formatted like the first example.
About this issue
- State: closed
- Created 7 years ago
- Reactions: 12
- Comments: 31 (1 by maintainers)
It actually means your approach is unusable for XML, because the OP gave absolutely valid XML and any valid parser should parse it. If this design doesn’t work, the parser should be redesigned. It’s not some “extra feature”; it’s basic functionality.
FYI, I am happy that this wasn’t solved: thanks to this error I realised that I have a dependency on the order of the XML children, and if this had “worked” I would have spent hours trying to find the root cause! So the enum solution here is actually what I really wanted, thanks!
The best workaround I found at the moment is to use an enum with #[serde(other)] and #[serde(deserialize_with)] to deserialize unknown tags:
You can use enums for this:
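The code that originally followed these two comments is not preserved here, so what follows is only a rough sketch of the enum idea, not the commenters' actual code. It assumes each child element is collected, in document order, into one variant of a hypothetical `Child` enum, and that the variants are regrouped afterwards by a hypothetical `regroup` helper. With serde-xml-rs the enum would also derive `Deserialize`, and the collecting field would probably be renamed to `$value`; that part is an assumption and is left out so the sketch compiles with the standard library alone.

```rust
// Hypothetical enum with one variant per child element of <root>.
// Assumption: with serde-xml-rs this would also derive Deserialize, and the
// field collecting the children would carry #[serde(rename = "$value")].
#[derive(Debug, PartialEq)]
enum Child {
    Foo(String),
    Bar(String),
}

// The shape the original question wanted to end up with.
#[derive(Debug, PartialEq)]
struct Root {
    foo: Vec<String>,
    bar: Vec<String>,
}

// Regroup children that arrive in document order into per-tag vectors.
fn regroup(children: Vec<Child>) -> Root {
    let mut root = Root { foo: Vec::new(), bar: Vec::new() };
    for child in children {
        match child {
            Child::Foo(text) => root.foo.push(text),
            Child::Bar(text) => root.bar.push(text),
        }
    }
    root
}

fn main() {
    // Stand-in for what deserializing the interleaved example would yield.
    let children = vec![
        Child::Foo("abc".to_string()),
        Child::Foo("def".to_string()),
        Child::Bar("lmn".to_string()),
        Child::Bar("opq".to_string()),
        Child::Foo("ghi".to_string()),
    ];
    println!("{:?}", regroup(children));
}
```

Because the children are kept as a single ordered sequence, no field is ever visited twice, which is what triggers the duplicate field error with the struct-of-Vecs layout.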
This is really hard to do and, if implemented, would completely mess up how serde operates and would require heap allocations.
In addition to considering solutions on the serde end, please consider asking the provider of your data whether they can produce a format that doesn’t require a lot of state during deserialization.
Just as a heads up, I’ve started work on a potential solution to this issue. I hope to have a PR in the coming weeks.
Thank you @lovasoa.
IMHO, this isn’t a workaround but a solution.
@tobz1000 https://github.com/ralpha/serde_deserializer_best_effort I created this some time ago, it works but is not clean by any means. But maybe it helps. If you actually solve this problem that would be wonderful! 😃
@sapessi, actually the only problem is maybe the error message.
The solution is to modify the type definitions to the following.
Changing the output from the original program probably isn’t going to happen: it’s old, and the order represents something from the underlying data.
For now I’m using a separate processing pass to group children together; I guess that’s the best option at the moment. Thanks!
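For anyone wondering what such a grouping pass might look like, here is a minimal sketch. It assumes the children of the root have already been read into (tag, text) pairs by whatever XML reader is at hand; the `group_by_tag` name and the pair representation are made up for illustration and are not the poster's actual code. It makes same-tag elements contiguous while keeping the relative order inside each tag.

```rust
use std::collections::HashMap;

// Reorder (tag, text) pairs so that elements with the same tag are contiguous.
// Tags are ordered by first appearance; order within a tag is preserved.
fn group_by_tag(children: Vec<(String, String)>) -> Vec<(String, String)> {
    // Remember the position at which each tag was first seen.
    let mut first_seen: HashMap<String, usize> = HashMap::new();
    for (tag, _) in &children {
        let next = first_seen.len();
        first_seen.entry(tag.clone()).or_insert(next);
    }

    // `sort_by_key` is a stable sort, so elements sharing a tag keep their order.
    let mut grouped = children;
    grouped.sort_by_key(|(tag, _)| first_seen[tag]);
    grouped
}

fn main() {
    // The interleaved children from the first example in this issue.
    let children: Vec<(String, String)> = [
        ("foo", "abc"),
        ("foo", "def"),
        ("bar", "lmn"),
        ("bar", "opq"),
        ("foo", "ghi"),
    ]
    .iter()
    .map(|(tag, text)| (tag.to_string(), text.to_string()))
    .collect();

    // Print the regrouped children as XML elements.
    for (tag, text) in group_by_tag(children) {
        println!("<{}>{}</{}>", tag, text, tag);
    }
}
```

The regrouped elements can then be written back out and fed to the struct-based deserializer. Note that this throws away the interleaving, so it only fits when the order across tags doesn't need to be preserved downstream.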