zola: HTML minification fails with CJK text

Bug Report

Zola sometimes fails to build or serve content with minify_html = true when source files contain CJK characters, depending on the number of characters around them.

Environment

Zola version: 0.13.0

Expected Behavior

Zola should compile and minify all files without issue.

Current Behavior

The site sometimes fails to build, and in other cases fails to be served, when the content contains CJK characters. Interestingly, zola serve and zola build behave differently (I haven’t found content on which both fail), but I guess this has to do with the injected reload script or the different base_url.

Building site...
-> Creating 0 pages (0 orphan), 0 sections, and processing 0 images
Failed to build the site
Error: Failed to convert bytes to string : invalid utf-8 sequence of 1 bytes from index 18

(the third line doesn’t show up when running zola serve)

Steps to reproduce

As it’s not a consistent issue, I’ve prepared a small example site: minify-cjk-bug.zip

Zola builds this example site successfully if the number of bytes (ASCII characters, newlines) before the CJK character is a multiple of three, including zero.

I haven’t been able to create a minimal site that reproduces the failure during zola serve, but my own website currently fails under zola serve.

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 21 (1 by maintainers)

Most upvoted comments

Yep, it should be fixed in the next release (0.14), but it is an issue on 0.13.

It seems that I have the same problem 😂

In my case, with minify_html enabled, zola serve always runs fine, while zola build fails with the error Failed to convert bytes to string : invalid utf-8 sequence of 1 bytes from index. Both work well after disabling minify_html.

Reproduce: enable minify && disable minify

Using Zola 0.13.0 downloaded from the release URL.

Update: I’ve tested the next branch at commit https://github.com/getzola/zola/commit/534174ae78e36def4ac3cd98c2c2f0b683870a15, and zola build works fine with minify_html enabled. I also found that the front matter regex requires an additional \n in 0.14.0: https://github.com/getzola/zola/blob/534174ae78e36def4ac3cd98c2c2f0b683870a15/components/front_matter/src/lib.rs#L18

Sorry about that 😦