commonmarker: Error "incompatible character encodings: UTF-8 and ASCII-8BIT" when combined with a rails app
I think this might not be a commonmarker problem, BUT the error is not raised when using pandoc-ruby nor redcarpet, so it has something to do with commonmarker.
Here you can see a test run from the command line with both cmark and commonmarker and there’s no problem:
$ cat test-curly-quotes.md
This curly quote “makes commonmarker throw an exception”.
$ cmark --version
cmark 0.20.0 - CommonMark converter
(C) 2014, 2015 John MacFarlane
$ cmark test-curly-quotes.md
<p>This curly quote “makes commonmarker throw an exception”.</p>
$ gem list --local commonmarker
*** LOCAL GEMS ***
commonmarker (0.2.0)
$ cat test-curly-quotes.md | ruby -r commonmarker -e "puts CommonMarker.render_html(gets)"
<p>This curly quote “makes commonmarker throw an exception”.</p>
That said, I’m testing different markdown parsers/renderers for our rails 4.1.12 (ruby 2.2.2) based app and I’m getting the following error:
ActionView::Template::Error (incompatible character encodings: UTF-8 and ASCII-8BIT):
12: - if user_signed_in?
13: .outline-content
14: = commonmarker_markdown(@quimbee_outline.source)
app/views/outlines/show.html.slim:15:in `_app_views_outlines_show_html_slim___3317075370232322437_70158621096300'
Rendered /Users/oboxodo/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/actionpack-4.1.12/lib/action_dispatch/middleware/templates/rescues/_trace.html.erb (2.9ms)
Rendered /Users/oboxodo/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/actionpack-4.1.12/lib/action_dispatch/middleware/templates/rescues/_request_and_response.html.erb (1.7ms)
Rendered /Users/oboxodo/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/actionpack-4.1.12/lib/action_dispatch/middleware/templates/rescues/template_error.html.erb within rescues/layout (69.1ms)
I have these helpers:
# encoding: UTF-8
module ApplicationHelper
  def commonmarker_markdown(text)
    CommonMarker.render_html(text, :smart).html_safe
  end

  def pandoc_markdown(text)
    converter = PandocRuby.new(text, from: :markdown, to: :html)
    converter.convert.html_safe
  end

  def redcarpet_markdown(text)
    # ...
  end
end
Changing the call to commonmarker_markdown to either pandoc_markdown or redcarpet_markdown renders the expected result with no errors.
It’s not a DB (postgresql) encoding problem either as hardcoding the test phrase in place of the text variable (no DB involved) causes the same problem.
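One way to narrow this down is to print how Ruby has tagged each string before it reaches the template. The sketch below is only an illustration: encoding_report and retag_as_utf8 are hypothetical helpers, and the rendered string is a stand-in that assumes the renderer returns UTF-8 bytes tagged as ASCII-8BIT, which has not been confirmed here.

```ruby
# Hypothetical helper: describe how Ruby has tagged a string.
def encoding_report(label, str)
  "#{label}: #{str.encoding} valid=#{str.valid_encoding?} ascii_only=#{str.ascii_only?}"
end

# Hypothetical workaround: retag binary-tagged bytes as UTF-8.
# force_encoding changes only the tag, never the underlying bytes.
def retag_as_utf8(str)
  return str if str.encoding == Encoding::UTF_8
  str.dup.force_encoding(Encoding::UTF_8)
end

# The test phrase, with the curly quotes written as \u escapes.
source = "This curly quote \u201Cmakes commonmarker throw an exception\u201D."

# Stand-in for renderer output (assumption: same bytes, tagged ASCII-8BIT).
rendered = "<p>#{source}</p>".force_encoding(Encoding::ASCII_8BIT)

puts encoding_report("source", source)
puts encoding_report("rendered", rendered)

fixed = retag_as_utf8(rendered)
puts encoding_report("fixed", fixed)
```

If the rendered string reports ASCII-8BIT while the rest of the template is UTF-8, retagging it before calling html_safe may be enough as a temporary workaround.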
Any ideas about what could be happening?
About this issue
- State: closed
- Created 9 years ago
- Comments: 28 (16 by maintainers)
Due to https://github.com/gjtorikian/commonmarker/pull/186, walking over nodes has been removed in v1.0.0. Users can use https://github.com/gjtorikian/html-pipeline if they wish to iterate over HTML after the fact.
Oh shoot, I do. Ok. I’ll make time for this today.
Got it. In Ruby the convention is to use \u to indicate a Unicode hexadecimal escape. I can now reproduce the problem; now we’re getting somewhere.
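For reference, the same error message can be triggered in plain Ruby, without Rails or commonmarker. This is a minimal sketch, assuming one of the two strings ends up tagged ASCII-8BIT; concat_error is a hypothetical helper, and the binary string only simulates what a renderer might return.

```ruby
# U+201C and U+201D are the curly quotes from the test file, written as \u escapes.
utf8 = "This curly quote \u201Cmakes commonmarker throw an exception\u201D."

# Simulated renderer output (assumption): identical bytes, tagged ASCII-8BIT.
binary = utf8.dup.force_encoding(Encoding::ASCII_8BIT)

# Hypothetical helper: return the error message if concatenation fails, else nil.
def concat_error(left, right)
  left + right
  nil
rescue Encoding::CompatibilityError => e
  e.message
end

# Both operands contain non-ASCII bytes under different encodings, so Ruby refuses to join them.
message = concat_error(utf8, binary)
puts message

# Pure ASCII content is compatible with either encoding, so this returns nil.
puts concat_error(utf8, "plain ascii".dup.force_encoding(Encoding::ASCII_8BIT)).inspect
```

This would also explain why the failure only shows up with non-ASCII input such as curly quotes: ASCII-only strings concatenate cleanly regardless of their encoding tag.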
So you can absolutely walk the AST tree: https://github.com/gjtorikian/commonmarker#example-walking-the-ast
But that’s very slow/time-consuming, and ideally shouldn’t be necessary. Are you able to share your markdown doc or create a small (failing) test to show the error?