pyyaml: Why does pyyaml disallow trailing spaces in block scalars?
Block style (`|`, `>`) is overruled with quoted style (`"`) if the string contains trailing spaces. I don’t see any reason for this, and the fact that I can’t force a certain style (even if it means losing information) has caused me a lot of grief.
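For readers who want to reproduce the behaviour, here is a minimal sketch. It assumes PyYAML is installed; `BlockDumper` and `represent_multiline_str` are hypothetical names chosen for this illustration, not part of PyYAML.

```python
import yaml


class BlockDumper(yaml.SafeDumper):
    """SafeDumper subclass (hypothetical name) that asks for '|' style."""


def represent_multiline_str(dumper, data):
    # Request literal block style only for strings that contain a line break.
    style = "|" if "\n" in data else None
    return dumper.represent_scalar("tag:yaml.org,2002:str", data, style=style)


BlockDumper.add_representer(str, represent_multiline_str)

# No trailing spaces: the requested block style should be honoured.
clean = yaml.dump({"text": "line one\nline two\n"}, Dumper=BlockDumper)

# A space before the line break: the emitter falls back to a quoted scalar.
trailing = yaml.dump({"text": "line one \nline two\n"}, Dumper=BlockDumper)

print(clean)
print(trailing)
```

The second dump still loads back to the original string; only the requested style is lost.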
About this issue
- State: open
- Created 6 years ago
- Reactions: 13
- Comments: 22 (8 by maintainers)
Commits related to this issue
- Remove trailing spaces from partition tables cf https://github.com/yaml/pyyaml/issues/121 — committed to achevalet/foreman-yml by deleted user 4 years ago
- Sanitize lines for clean YAML output when generating profiles There is an issue with how pyyaml dumps lines, where if the line ends with whitespace, styling won't matter. https://github.com/yaml/p... — committed to rhmdnd/content by rhmdnd a year ago
- fix: Handle multiline strings in yaml serialization. (#935) **Pull Request Checklist** - [x] Tests added - [x] [Good commit messages](https://cbea.ms/git-commit/) and/or PR title **Description ... — committed to argoproj-labs/hera by DanCardin 5 months ago
It works for me:
Thank you @perlpunk for these examples, they perfectly illustrate what I mean, and thank you both for looking into this.
My use case was that I had a large number of dictionaries that I wanted to store as YAML records, in `|` block style. Some of these happened to have trailing spaces and were output wrongly, as double-quoted scalars. This was frustrating because […] `|` block style. This really boggles the mind: pyyaml is needlessly deviating from the YAML spec. So this is a bug that should be fixed. […] `|` block style. The way every other Python function that I know of works is that the option I set is obeyed, possibly leading to warnings and/or errors, and that the consequences (like losing information in edge cases) are for me to deal with. This is programming, I should be able to control things. For pyyaml to override my option due to style considerations really is not acceptable.
My ‘solution’ was to strip trailing spaces before emitting so I would get `|` block style across the board, so ironically, this led me to lose information I wouldn’t have lost otherwise.
@ingydotnet, I was trying to give examples, to every reader of this issue, where I can understand the wish for block style to be preserved. I was not writing this to imply that you don’t know how it works.
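The strip-before-emitting workaround mentioned in this thread might look roughly like the sketch below. `strip_trailing_spaces` is a hypothetical helper, not a PyYAML function, and it deliberately discards information.

```python
def strip_trailing_spaces(text: str) -> str:
    """Remove spaces at the end of every line (lossy by design)."""
    # Splitting on "\n" keeps a final empty piece, so a trailing
    # newline in the input survives the join.
    return "\n".join(line.rstrip(" ") for line in text.split("\n"))


original = "line one \nline two  \n"
cleaned = strip_trailing_spaces(original)
print(repr(cleaned))
```

Because the spaces are gone before the dumper ever sees the string, the original value cannot be recovered from the emitted YAML.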
I think we’re all agreed this is an old implementation issue, not a spec issue. I’m willing to spend some time trying to make it work in the next release, but as @perlpunk mentioned earlier, I’d want to integrate the more comprehensive YAML test suite first to have some confidence we didn’t break anything in the process.
I’ve added this issue to the PyYAML 6.1 planning project.
This is not a solution as it does not preserve the spaces. You cannot round-trip this!
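The round-trip objection can be checked with a short sketch, assuming PyYAML is installed. `round_trips` is a hypothetical helper name, and the stripping step is inlined here only for illustration.

```python
import yaml


def round_trips(value) -> bool:
    """True if dumping and loading gives back an equal value."""
    return yaml.safe_load(yaml.safe_dump(value)) == value


original = {"text": "line one \nline two\n"}

# Strip trailing spaces per line, as in the workaround discussed here.
stripped = {
    "text": "\n".join(l.rstrip(" ") for l in original["text"].split("\n"))
}

# The stripped data round-trips to itself, but not to the original.
reloaded = yaml.safe_load(yaml.safe_dump(stripped))
print(round_trips(original), reloaded == original)
```

The quoted fallback preserves the data at the cost of readability, while stripping keeps the style and loses the spaces.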
Identifying that this was the issue I was facing just killed a few hours for me.
At the very least, can pyyaml please throw a warning that says “cannot use style | on content with trailing spaces”?
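Until something like that exists, a caller-side check is one option. This is only a sketch: both function names are hypothetical, and the check covers only the trailing-space conditions discussed in this issue (a space before a line break or at the end of the string), not every reason the emitter may reject block style.

```python
import warnings


def has_trailing_spaces(text: str) -> bool:
    """True if any line of the text ends with a space."""
    return any(line.endswith(" ") for line in text.split("\n"))


def warn_if_block_style_unavailable(text: str, style: str = "|") -> bool:
    """Warn when a block style is requested for text with trailing spaces."""
    if style in ("|", ">") and has_trailing_spaces(text):
        warnings.warn(
            f"cannot use style {style} on content with trailing spaces",
            stacklevel=2,
        )
        return True
    return False


warn_if_block_style_unavailable("line one \nline two\n")
```

Calling this before dumping at least turns the silent style change into a visible warning.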
@ingydotnet I agree loss of information is not an option.
For literal `|` block style it’s easy to imagine use cases IMHO. If you have text data in a specific format that requires trailing spaces, you still might want to dump it as a block scalar for readability.
Folded style is used when you have long lines. Imagine you have a long input line, with words separated by different amounts of spaces.
The emitter now tries to break this up into several lines to get below a limit of characters per line.
I think there are only two possibilities to do that, double quoted and folded block. libyaml, pyyaml and ruamel all emit this as a double quoted scalar, overruling the requested block scalar style:
Which I think is not very readable. The only advantage is that there are no trailing spaces.
I’m another person who spent a day trying to work out why my long strings weren’t being styled as block literals -.-’
I agree that trailing spaces shouldn’t be a reason to disallow block scalars. Can’t say what changes are needed to allow this, though. If we could integrate https://github.com/yaml/yaml-test-suite and check that parsing the output again returns the same parsing events we could make sure that we don’t break anything.
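The event-comparison idea can be sketched with PyYAML's own `yaml.parse` and `yaml.emit`. This is not the yaml-test-suite integration itself, just an illustration of the check; `event_signature` and `emit_round_trips` are hypothetical names, and the comparison deliberately ignores scalar style.

```python
import yaml


def event_signature(text):
    """Reduce a YAML stream to (event type, scalar value) pairs."""
    return [
        (type(event).__name__, getattr(event, "value", None))
        for event in yaml.parse(text)
    ]


def emit_round_trips(text) -> bool:
    """Re-emit the parsed events and compare the resulting event streams."""
    emitted = yaml.emit(yaml.parse(text))
    return event_signature(emitted) == event_signature(text)


# A literal block scalar whose first line ends with a space.
source = "text: |\n  line one \n  line two\n"
print(emit_round_trips(source))
```

A check of this shape, run over a larger corpus, would flag any emitter change that alters the parsed content rather than just the presentation.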