symfony: [Serializer][XmlEncoder] Don't wrap content in CDATA
| Q | A |
|---|---|
| Bug report? | no |
| Feature request? | yes |
| BC Break report? | unsure |
| RFC? | yes |
| Symfony version | all |
It would be good to improve XmlEncoder so that it does not automatically wrap content in a CDATA section, but instead provides some means for the user to direct the encoder to wrap specified content in a CDATA section.
The following function currently determines whether or not to wrap:-
```php
/**
 * Checks if a value contains any characters which would require CDATA wrapping.
 *
 * @param string $val
 *
 * @return bool
 */
private function needsCdataWrapping($val)
{
    return 0 < preg_match('/[<>&]/', $val);
}
```
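To see how broadly this check fires, the same regular expression can be run against a few ordinary strings. This is only a standalone sketch: the free function below copies the body of the private method, and the sample inputs are made up for illustration.

```php
<?php

// Standalone copy of the check, for illustration only;
// the real method is private to XmlEncoder.
function needsCdataWrapping(string $val): bool
{
    return 0 < preg_match('/[<>&]/', $val);
}

// Hypothetical sample values: mostly ordinary text that merely
// contains one of the three characters.
$samples = [
    'Tom & Jerry',
    '1 < 2',
    'plain text',
    '<greeting>Hello, world!</greeting>',
];

foreach ($samples as $sample) {
    // Prints the value and whether the heuristic would pick CDATA or plain text.
    printf("%-40s %s\n", $sample, needsCdataWrapping($sample) ? 'CDATA' : 'text');
}
```

Only the last sample is markup-like content of the kind a CDATA section is meant for, yet the first two trigger the check as well.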
The problem with this function is that the “<”, “>” and “&” characters are a poor signal that the content should be wrapped in a CDATA. The xml-spec is somewhat clear about this:-
The ampersand character (&) and the left angle bracket (<) MUST NOT appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they MUST be escaped using either numeric character references or the strings "&amp;" and "&lt;" respectively.
We should interpret this as meaning that the aforementioned characters appearing in textual content of an element must be escaped as entity or character references. We should not interpret it as meaning that such content should be wrapped in a CDATA.
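A minimal sketch of the difference, using PHP's DOM extension directly rather than XmlEncoder: a plain text node is escaped by the serializer, whereas a CDATA section stores the text verbatim. The helper name `renderGreeting` and the sample string are assumptions made for this illustration.

```php
<?php

// Hypothetical helper: renders <greeting> with either a text node or a CDATA section.
function renderGreeting(string $content, bool $useCdata): string
{
    $dom = new DOMDocument('1.0', 'UTF-8');
    $element = $dom->createElement('greeting');

    // createTextNode() leaves escaping to the serializer;
    // createCDATASection() keeps the text verbatim inside <![CDATA[ ... ]]>.
    $node = $useCdata
        ? $dom->createCDATASection($content)
        : $dom->createTextNode($content);

    $element->appendChild($node);
    $dom->appendChild($element);

    // Passing the element to saveXML() omits the XML declaration.
    return $dom->saveXML($element);
}

echo renderGreeting('Tom & Jerry <3', false), "\n"; // <greeting>Tom &amp; Jerry &lt;3</greeting>
echo renderGreeting('Tom & Jerry <3', true), "\n";  // <greeting><![CDATA[Tom & Jerry <3]]></greeting>
```

Both outputs are well-formed and parse back to the same character data; the first is what the quoted passage of the spec describes.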
That same doc has a good example of when to use a CDATA section:-
```xml
<![CDATA[<greeting>Hello, world!</greeting>]]>
```
There should be a way to serialise to a CDATA section for such a use case as that example, but this decision should not be taken by the encoder in the manner done currently.
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Comments: 23 (13 by maintainers)
How about:-
Provide a serialisation context key that XmlEncoder can use to decide whether or not to wrap content in a CDATA section. BC could be maintained in v3 and v4 by treating the context key as `true` by default:-

In v3 to v5, allow a custom Normalizer to control which content should be wrapped in CDATA, perhaps by using an object that wraps content

so that XmlEncoder.selectNodeType can do:-

Finally, in v5, treat `xml_cdata = false` by default so that XmlEncoder creates normal, encoded content unless instructed otherwise.

All of this means that:-
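One way the proposal in this comment might be sketched, written against PHP's DOM extension rather than the real XmlEncoder internals. The `xml_cdata` key comes from the comment itself; the `CdataValue` wrapper class and the `appendValue` and `render` helpers are hypothetical names introduced only for this illustration, and the default of `true` mirrors the suggested v3/v4 behaviour.

```php
<?php

// Hypothetical wrapper a custom normalizer could return to request CDATA explicitly.
final class CdataValue
{
    private $content;

    public function __construct(string $content)
    {
        $this->content = $content;
    }

    public function getContent(): string
    {
        return $this->content;
    }
}

// Hypothetical stand-in for the node-type decision inside the encoder.
function appendValue(DOMElement $element, $value, array $context = []): void
{
    $dom = $element->ownerDocument;

    // An explicit wrapper always produces a CDATA section.
    if ($value instanceof CdataValue) {
        $element->appendChild($dom->createCDATASection($value->getContent()));

        return;
    }

    $value = (string) $value;

    // Assumed BC default: keep the current heuristic unless 'xml_cdata' is false.
    $wrap = ($context['xml_cdata'] ?? true) && 0 < preg_match('/[<>&]/', $value);

    $element->appendChild(
        $wrap ? $dom->createCDATASection($value) : $dom->createTextNode($value)
    );
}

// Small driver that renders a single <item> element.
function render($value, array $context = []): string
{
    $dom = new DOMDocument('1.0', 'UTF-8');
    $element = $dom->createElement('item');
    appendValue($element, $value, $context);

    return $dom->saveXML($element);
}

echo render('x & y'), "\n";                                        // heuristic still applies
echo render('x & y', ['xml_cdata' => false]), "\n";                // escaped text node
echo render(new CdataValue('<b>hi</b>'), ['xml_cdata' => false]), "\n"; // explicit CDATA
```

With this shape, flipping the default in a later major version would only change the fallback value of the context key, while explicit `CdataValue` instances keep working.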
Thank you @Simperfit and @nicolas-grekas for the reviews. Here are some references for the notion that XmlEncoder should only ever create a CDATA section when explicitly instructed to do so:-
stackoverflow.com