react-native-render-html: Whitespace handling differs from HTML significantly (no collapsing, newlines ignored)
Is this a bug report or a feature request?
Bug report. Though fixing this might change the behavior of this lib too much for users, so the solutions are probably to write documentation and/or provide an opt-in fix.
Have you read the guidelines regarding bug report?
Yes.
Have you read the documentation in its entirety?
Yes.
Have you made sure that your issue hasn’t already been reported/solved?
Yes, to the best of my abilities. 😃
Is the bug specific to iOS or Android? Or can it be reproduced on both platforms?
Both platforms.
Is the bug reproducible in a production environment (not a debug one)?
Sorry, have not tried yet in a production build. I expect the same result though.
Have you been able to reproduce the bug in the provided example?
Have not tried, but the issue has a really simple setup so it shouldn’t differ.
Environment
Environment:
- React: 16.0.0
- React native: 0.51.0
- react-native-render-html: 3.9.0
Target Platform:
- Android (7.0)
- iOS (11.2)
Steps to Reproduce
Render this JSX:
<HTML html={' <div> foo\n\nbar baz </div> <div>zzz</div> '} />
Expected Behavior
I expected react-native-render-html to handle whitespace collapsing similarly to what HTML does. Replacing each rendered space character (U+0020 SPACE) with •, that would be:
foo•bar•baz
zzz
Actual Behavior
react-native-render-html (3.9.0 and master) renders:
••foobar••baz
zzz
What seems to work:
- Removing text outside of block tags if it only contains whitespace
- Removing whitespace at the end of a block tag’s content
What seems broken:
- Removing whitespace at the beginning of a block tag’s content
- Collapsing multiple spaces (and other whitespace characters) to a single rendered space, in the middle of text content
- Collapsing newlines to a single space character, in the middle of text content (newlines seem to be removed altogether)
I suspect this lib is limited by React Native’s Text component and errs on the side of not manipulating text too much, only removing newlines?
If that is the case, I can think of two possible improvements:
- Document this behavior and how HTML strings containing a lot of whitespace (which is common with some sources or JS editors) can show extra spaces before or between words.
- If it makes sense, maybe provide an option for performing more HTML collapsing?
I’m going to implement some fixes on our side using simple regexps on our HTML. I can post what I come up with here if that’s useful. Or maybe I should do it in alterData for more fine-grained control?
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Reactions: 6
- Comments: 47 (2 by maintainers)
Based on @djpetenice’s genius idea, I created this function:
Now this:
Is replaced by this:
I’ve made great progress with the new release! Given this snippet:
We now have:
whiteSpace: pre;
whiteSpace: normal;
You will be able to control the whitespace behavior with the special whiteSpace style property in any of the places you could previously customize styles (baseFontStyles, tagsStyles, etc.).
I am currently developing this behavior as part of a service for Expensify. The new engine following the whitespace RFC is being implemented here: https://github.com/native-html/core. An early release should be available in the upcoming week.
This pre-release will be part of the 6.x release cycle. If you are wondering why we’re jumping from 4 to 6, the reason is that 6.x will require more recent versions of React Native, and we want all users to benefit from the 5.x enhancements already available in alpha. Also, the new engine changes the structure of nodes available with onParsed, and the renderers prop will probably look different.
I’ve done some research to see where those collapsing rules are specified. It appears to be the CSS property white-space. The complete reference algorithm is defined in the CSS Text Module Level 3, sections 3 and 4.
Full Reference
Glossary
But this reference considers multiple contexts that we can ignore in a minimal compatibility approach. It describes the required behavior for multiple values of the white-space CSS property: normal, pre-line, nowrap … We can keep the focus on the normal value, since this is the default behavior reported by @fvsch. Also, bidirectional layouts for RTL can be considered later, because they add complexity and are limited by React Native’s own support of these features. Moreover, there seem to be other kinds of subtleties depending on localization. Here are the highlights of the spec I have identified.
white-space: normal;:
The W3C also provides a gigantic test suite, and one folder is specifically dedicated to CSS whitespace, which can be a source of inspiration. In the meantime, I have started to implement some basic tests regarding whitespace, see 53b8679ddc74badb486348a7404fc835527cb7f4 and d76f99d9d44b3bb38fe92a7403b1165e6b10e765. A majority of them fail, of course, which is the point of this issue!
I did a hack by adding a span with a single character and styling it the same colour as my background.
This issue has been fixed in the Foundry release. Try it out now! See #430 for instructions.
One possible workaround I’ve found, although I can’t vouch for it as I haven’t tested it fully, is tricking the library into thinking that the text node is not fully whitespace. This seems to prevent the node from being collapsed. Here I add a zero-width character to whitespace data:
In the cases I’ve seen, this seems to stop the whitespace from being lost. However I’m not sure if there’s other impact.
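The snippet for this workaround did not survive in the thread, so what follows is only a minimal sketch of the idea, not the commenter’s code. It assumes that "fully whitespace" means a string made only of spaces, tabs, and line breaks, and that U+200B ZERO WIDTH SPACE is an acceptable zero-width character; the helper name protectWhitespaceData is hypothetical and is not part of react-native-render-html.

```javascript
// Hypothetical helper, not part of react-native-render-html.
// Assumption: the thread does not say which zero-width character was used.
const ZERO_WIDTH_SPACE = '\u200B';

function protectWhitespaceData(data) {
  // Only touch strings made entirely of spaces, tabs, and line breaks.
  if (/^[\t\r\n ]+$/.test(data)) {
    // Appending an invisible character means the string is no longer whitespace-only.
    return data + ZERO_WIDTH_SPACE;
  }
  // Any other text is returned unchanged.
  return data;
}

console.log(JSON.stringify(protectWhitespaceData(' ')));
console.log(JSON.stringify(protectWhitespaceData('foo')));
```

Where such a helper gets called (for instance on text data before it reaches the renderer) depends on your setup and is left open here.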
@djpetenice same problem here.
If the content is (with space between a):
It renders (without space):
Ok!
I’ve used the space after the tag closure because of this scenario: if I have
<strong>hello</strong><em>world</em>
I want to show helloworld. But if I have
<strong>hello</strong> <em>world</em>
I want to show hello world.
So I don’t want to have a space between strings if the 2 tags are attached. The one you’ve written (html = html.replace(/<[/]strong>/g, " ");) will remove the strong tag closure and replace it with a space.
@Draccan yup, these tags use the same logic as the ones fvsch pointed out.
@fvsch your solution in your gist looks pretty clean. It feels like a solid improvement, even if it’s obviously not matching an actual browser rendering. I’ll try it out more, and figure out whether this can be added to the codebase by default. I don’t feel like documenting your gist and asking people to copy and paste a hundred lines into their own project just to add this feature, and I don’t want to keep bloating the module with additional props either.
I’ll measure the performance impact and the potential regressions (all help is welcome here!) and if everything goes smoothly, let’s add it to the project. What do you think?
@Exilz The current whitespace handling has bugs too:
Shows:
You probably have a rule that a whitespace-only text node between two tag siblings can be dropped, but if the tags are both inline it should be kept (and collapsed to a single space if needed).
If you can point me to the right source files, I can have a look and maybe do a PR. 😃
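The rule suggested above can be sketched as a small decision function. This is a hedged illustration only: the list of inline tags and the function name resolveWhitespaceNode are assumptions made for the example, and it does not reflect how the library’s source is actually organized.

```javascript
// Assumption: a short, non-exhaustive list of inline tags for illustration.
const INLINE_TAGS = new Set(['a', 'b', 'strong', 'em', 'i', 'u', 's', 'span', 'code']);

// Decide what to do with a text node sitting between two sibling tags.
// Returns the text to render, or null when the node can be dropped.
function resolveWhitespaceNode(text, prevTag, nextTag) {
  // Text with real content is never dropped by this rule.
  if (!/^[\t\r\n ]+$/.test(text)) {
    return text;
  }
  // Between two inline siblings, keep the whitespace but collapse it to one space.
  if (INLINE_TAGS.has(prevTag) && INLINE_TAGS.has(nextTag)) {
    return ' ';
  }
  // Otherwise (at least one block sibling) the whitespace-only node can go.
  return null;
}

console.log(JSON.stringify(resolveWhitespaceNode(' \n ', 'strong', 'em')));
console.log(JSON.stringify(resolveWhitespaceNode(' \n ', 'div', 'div')));
```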
Applying the following replace function to data before passing it to the component works for all our content so far:
.replace(/[\t\r\n ]+/g, ' ')
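As a minimal sketch, the replace call above can be wrapped in a helper and applied to the HTML string before it is handed to the component. The function name collapseWhitespace and the sample input are made up for this example.

```javascript
// Collapse every run of tabs, carriage returns, newlines, and spaces to one space.
function collapseWhitespace(html) {
  return html.replace(/[\t\r\n ]+/g, ' ');
}

// Sample input with doubled spaces and blank lines (illustrative only).
const input = ' <div>  foo\n\nbar  baz </div> <div>zzz</div> ';
const output = collapseWhitespace(input);

console.log(JSON.stringify(output));
```

Note that this works on the raw string, so it leaves one space at the start of the first div’s content, and it would also collapse whitespace inside tags where it is meaningful, such as pre.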
Are there any plans to resolve this issue any time soon? Or are workarounds the recommended solution at the moment?
Thanks
Not sure that is working for the whitespace issue.
I made a function to fix the whitespace between tags.
It is working fine. I hope it is helpful for someone.
Anyone having issues with spaces between adjacent links? I’ve tried all spacing methods and all are being ignored:
For the problem where react-native-render-html suppresses newline characters (instead of rendering them as a space), did you try replacing your newline characters first?
life is strange … 😃
now works!
the problem was the regex:
it is: html = html.replace(/<[/]strong>/g, " ");
and NOT html = html.replace(/<[/]strong> /g, " ");
without the space before /g it works fine…
I think that now the best solution is to understand what kind of HTML you receive.
We are lucky because in the app we are developing the admin panel gives the possibility to enter text with bold, italic, paragraphs, etc.
So the HTML editor is our own component and we know how it puts HTML tags. If you’re lucky like us you can use something like:
` code:
`
Our editor puts spaces after tag closure and we put them before the tag closure in our app.
If you’re taking HTML pages from the web, I don’t recommend this approach, and especially avoid regex!
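The commenter’s snippet is missing from the thread, so here is only a rough sketch of the approach described: a space that follows a closing inline tag is moved to just before that closing tag. The tag list and the name moveSpacesInsideTags are assumptions for the example, and, as noted above, this only makes sense when you control how the HTML is produced.

```javascript
// Assumption: the inline tags the editor can emit.
const EDITOR_INLINE_TAGS = ['strong', 'em', 'u', 's'];

// Turn "</tag> " into " </tag>" so the space ends up inside the inline element.
function moveSpacesInsideTags(html) {
  const pattern = new RegExp('</(' + EDITOR_INLINE_TAGS.join('|') + ')> ', 'g');
  return html.replace(pattern, ' </$1>');
}

const spaced = '<p><strong>A</strong> <em>B</em> <u>C</u> <s>D</s></p>';
const attached = '<strong>hello</strong><em>world</em>';

console.log(moveSpacesInsideTags(spaced));
console.log(moveSpacesInsideTags(attached)); // no space after the tag, so nothing moves
```

Tags written directly next to each other are left alone, which keeps the "helloworld" case from the earlier comment intact.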
I have seen that the space is removed not only for the strong tag but also for em, s and u. For example:
<p><strong>A</strong> <em>B</em> <u>C</u> <s>D</s></p>
It prints ABCD without spaces 😦
These parts in HTML.js are probably wrong:
Use case:
Use case:
This is the result I’m getting for an HTML string with lots of whitespace, first with no special processing and then with a custom alterData function that tries to mimic the HTML algorithm for whitespace collapsing:
The result is not as complete as what HTML does, for instance in the case of <p><a>foo </a> , bar</p>, web browsers will remove the space before the comma, but I’m not sure what the algorithm is and I’m not going that far.
The render part for this test:
Where collapseHtmlText is a separate module (around 90 lines). I’m still working on it so I’ll test with more content before sharing.
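The collapseHtmlText module itself is not shown in the thread. Purely as an illustration of what a much smaller alterData-style callback might look like, here is a sketch that assumes the callback receives a text node exposing data, parent.name, and prev (as htmlparser2-style nodes do) and returns the replacement string. The function name collapseTextNode and both tag lists are hypothetical.

```javascript
// Assumptions for this sketch: tags whose text is left untouched, and a
// non-exhaustive list of block tags.
const PRESERVED_TAGS = new Set(['pre', 'textarea']);
const BLOCK_TAGS = new Set(['div', 'p', 'li', 'blockquote', 'h1', 'h2', 'h3']);

function collapseTextNode(node) {
  const { data, parent, prev } = node;
  // Leave preformatted content alone.
  if (parent && PRESERVED_TAGS.has(parent.name)) {
    return data;
  }
  // Collapse runs of spaces, tabs, and line breaks to a single space.
  let text = data.replace(/[\t\r\n ]+/g, ' ');
  // Drop the leading space when this text is the first child of a block element.
  if (!prev && parent && BLOCK_TAGS.has(parent.name)) {
    text = text.replace(/^ /, '');
  }
  return text;
}

const sample = { data: '  foo\n\nbar  baz ', parent: { name: 'div' }, prev: null };
console.log(JSON.stringify(collapseTextNode(sample)));
```

This does not handle whitespace across element boundaries (the space-before-comma case mentioned above), which is part of why a fuller module would be considerably longer.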