pagedjs: Significant whitespace removed after page breaks
Thank you for filling in this gap that browser vendors don’t seem to care about. ❤️
When a block is split across two pages, it looks like its content is copied without preserving white space.
This becomes a problem if that block has the white-space: pre style, for example when formatting code.

About this issue
- State: open
- Created 3 years ago
- Comments: 15 (2 by maintainers)
Commits related to this issue
- XWIKI-20553: The whitespace between two highlighted code tokens is lost when exporting to PDF * Apply workaround until Paged.js issue is fixed ( https://github.com/pagedjs/pagedjs/issues/45 ) — committed to xwiki/xwiki-platform by mflorea a year ago
- XWIKI-20553: The whitespace between two highlighted code tokens is lost when exporting to PDF * Apply workaround until Paged.js issue is fixed ( https://github.com/pagedjs/pagedjs/issues/45 ) (cherry... — committed to xwiki/xwiki-platform by mflorea a year ago
- XWIKI-20553: The whitespace between two highlighted code tokens is lost when exporting to PDF * Apply workaround until Paged.js issue is fixed ( https://github.com/pagedjs/pagedjs/issues/45 ) (cherry... — committed to xwiki/xwiki-platform by mflorea a year ago
Hey @jods4, the 2.0 release was mostly about footnotes and long-overdue merge requests. (All the work we manage to put into Paged.js is not enough to fix everything as quickly as we’d like.)
We have a meeting scheduled with @fchasen to check what we’re gonna work on next. This issue is part of that talk.
(We really wish we could be faster, but we’re doing our best here.)
I debugged this issue a bit and my conclusion is this:
- Layout#renderTo() skips, by default, text nodes that (1) contain only white-space and (2) are direct child nodes of a block (container) node. See nextSignificantNode, isIgnorable and isContainer.
- PRE elements are deep copied by Layout#renderTo() (because PRE is not considered a “container”). This ensures the white-space is preserved as long as the PRE element doesn’t have to be split between print pages.
- When a PRE element is split between print pages, the break token ends up inside the PRE element (in the source content). Layout#renderTo() then continues from the break token, handling each child node of the PRE element individually, as if they were outside of the PRE, thus skipping white-space-only text nodes.

The break token points to the source content, where there is a single PRE element, unlike the rendered content, which has multiple (as a result of the split). This is why we can’t put the break token for the next print page "before the PRE". It needs to be inside the PRE. Thus the only option I see is to modify isIgnorable to take into account whether the parent element preserves white-space.
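
The proposed change could look roughly like the sketch below. This is not the Paged.js source: the function names, the injectable getWhiteSpace lookup, and the list of white-space values treated as preserving are all assumptions made for illustration. How the parent's white-space value is actually obtained (computed style, inline style, tag name) is left open and passed in as a parameter.

```javascript
// Hypothetical sketch; names and signatures are illustrative, not Paged.js source.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;
const COMMENT_NODE = 8;

// white-space values under which white-space-only text still matters (assumed list).
const PRESERVING_VALUES = new Set(["pre", "pre-wrap", "pre-line", "break-spaces"]);

// Default lookup: the computed style of an element, when a browser window exists.
function computedWhiteSpace(element) {
  if (typeof window === "undefined" || !element || element.nodeType !== ELEMENT_NODE) {
    return "normal";
  }
  return window.getComputedStyle(element).whiteSpace;
}

// True when the node's text consists only of tabs, newlines, carriage returns or spaces.
function isAllWhitespace(node) {
  return !/[^\t\n\r ]/.test(node.textContent);
}

// Comments are always ignorable. White-space-only text nodes are ignorable
// only if their parent does not preserve white-space.
function isIgnorableRespectingWhiteSpace(node, getWhiteSpace = computedWhiteSpace) {
  if (node.nodeType === COMMENT_NODE) return true;
  if (node.nodeType !== TEXT_NODE || !isAllWhitespace(node)) return false;
  return !PRESERVING_VALUES.has(getWhiteSpace(node.parentNode));
}

// Hypothetical helper: the text that survives when ignorable children are skipped,
// mimicking a renderer that handles each child of a split block individually.
function keptText(children, getWhiteSpace) {
  return children
    .filter((child) => !isIgnorableRespectingWhiteSpace(child, getWhiteSpace))
    .map((child) => child.textContent)
    .join("");
}

// Usage with plain objects standing in for DOM nodes (no browser needed).
const lookup = (element) => element.whiteSpace;
const makeTokens = (parentNode) => [
  { nodeType: ELEMENT_NODE, textContent: "const", parentNode },
  { nodeType: TEXT_NODE, textContent: " ", parentNode },
  { nodeType: ELEMENT_NODE, textContent: "x", parentNode },
];

const pre = { nodeType: ELEMENT_NODE, whiteSpace: "pre" };
const div = { nodeType: ELEMENT_NODE, whiteSpace: "normal" };

console.log(JSON.stringify(keptText(makeTokens(pre), lookup))); // "const x"
console.log(JSON.stringify(keptText(makeTokens(div), lookup))); // "constx"
```

The second call reproduces the reported symptom: the white-space between two highlighted tokens disappears when the parent is treated as an ordinary container. The first call shows the same nodes kept intact once the parent's white-space handling is consulted.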