etherpad-lite: PDF export using AbiWord fails due to bad intermediate HTML
Describe the bug
When exporting to PDF using AbiWord / AbiCommand, the export fails because the intermediate HTML is slightly invalid and AbiWord is very picky about it.
$ abiword --plugin AbiCommand
Unable to init server: Could not connect: Connection refused
** (abiword:1325175): WARNING **: 18:22:41.983: clutter failed 0, get a life.
Unable to init server: Could not connect: Connection refused
AbiWord command line plugin: Type "quit" to exit
AbiWord:> convert /tmp/etherpad_export_2796631100.html /tmp/etherpad_export_2796631100.pdf pdf
AbiWord: could not open the file [/tmp/etherpad_export_2796631100.html]
error -1
AbiWord:> quit
$ tidy -xml /tmp/etherpad_export_2796631100.html
line 43 column 1 - Error: unexpected </head> in <meta>
line 125 column 43 - Warning: unescaped & which should be written as &
line 127 column 16 - Warning: unescaped & which should be written as &
line 253 column 6 - Warning: unescaped & which should be written as &
line 374 column 1 - Error: unexpected </body> in <br>
line 375 column 1 - Error: unexpected </html> in <br>
Tidy found 3 warnings and 3 errors!
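The same kind of well-formedness check can be reproduced without tidy. The following is a minimal sketch in Python, not part of Etherpad; the helper name `xml_error` is hypothetical, and it assumes that strict XML parsing is a reasonable stand-in for what `tidy -xml` reports above.

```python
import xml.etree.ElementTree as ET


def xml_error(markup: str):
    """Return the parser's error message if markup is not well-formed XML, else None."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        # The message includes the line and column of the first problem.
        return str(exc)
    return None


# Unclosed <meta>/<br> and a bare '&' are rejected, as in the tidy output.
bad = '<html><head><meta charset="utf-8"></head><body>a & b<br></body></html>'
good = '<html><head><meta charset="utf-8"/></head><body>a &amp; b<br/></body></html>'
print(xml_error(bad))
print(xml_error(good))
```

Unlike tidy, the parser stops at the first error, so it reports one problem per run.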
If I edit the HTML and replace <meta ....> with <meta .../>, change "& " to "&amp; ", and replace <br> with <br/>, then AbiWord no longer has issues.
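The three manual edits can be sketched as a small post-processing step. This is only an illustration in Python, not Etherpad's actual export code; the function name `make_xml_friendly` and the `VOID_TAGS` list are hypothetical, and the regular expressions assume no `>` appears inside attribute values.

```python
import re

# Void elements named in this report; other void tags may need adding (assumption).
VOID_TAGS = ("meta", "br")

# '&' that does not already start an entity such as &amp;, &#160; or &#xA0;
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)")

# <meta ...> or <br>, with or without an existing trailing slash
VOID_TAG = re.compile(r"<(%s)\b([^<>]*?)\s*/?>" % "|".join(VOID_TAGS), re.IGNORECASE)


def make_xml_friendly(html: str) -> str:
    """Escape bare ampersands and self-close <meta>/<br> tags."""
    html = BARE_AMP.sub("&amp;", html)
    # Rewrite the tag with a trailing slash; already self-closed tags stay as they are.
    return VOID_TAG.sub(r"<\1\2/>", html)


print(make_xml_friendly('<meta charset="utf-8">a & b<br>'))
```

A regex pass like this is fragile compared with a real HTML serializer, so it is best read as a description of the required changes rather than a recommended fix.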
This is basically the same issue as #3732. I attached a sample intermediate HTML file to that issue.
About this issue
- State: closed
- Created 8 months ago
- Comments: 17 (9 by maintainers)
Thanks for the research. I checked the code, and it seems like tidy only runs for AbiWord and SOffice, and that causes more problems than needed. So I don’t really see a reason why we should keep tidy as is.