i18n: [BUG] UTF-8 YAML files with accents in version 1.9.0 raise incompatible character encodings: UTF-8 and ASCII-8BIT
For a long time I have had code like this:
STATES = I18n.t('states').with_indifferent_access.freeze
The YAML file is in UTF-8 and some of its keys contain accents:
pt-BR:
  states:
    Acre: AC
    Amapá: AP
    Ceará: CE
    Piauí: PI
    Paraná: PR
With the new 1.9.0 release, it started failing in our CI in a lot of places:
ActionView::Template::Error incompatible character encodings: UTF-8 and ASCII-8BIT
Failure/Error: = f.select(:state, City::STATES,
ActionView::Template::Error:
incompatible character encodings: UTF-8 and ASCII-8BIT
./app/views/customers/_form.html.slim:86:in `block in
Downgrading to 1.8.11 makes everything work again.
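For context on the error message itself: Ruby raises it when it has to join two strings that both contain non-ASCII bytes but carry different encoding tags. The sketch below is plain Ruby with no Rails or i18n involved; the helper `join_error` and the constants are made up for illustration. It may explain why only entries such as Paraná fail while Acre renders fine.

```ruby
# frozen_string_literal: true

# Hypothetical helper: returns the name of the error raised when joining
# the two strings, or nil when Ruby can join them.
def join_error(left, right)
  left + right
  nil
rescue Encoding::CompatibilityError => e
  e.class.name
end

UTF8_LABEL   = "S\u00E3o Paulo"   # UTF-8 string with a non-ASCII character
BINARY_LABEL = "Paran\u00E1".b    # same kind of bytes, tagged ASCII-8BIT
ASCII_LABEL  = "Acre".b           # ASCII-8BIT too, but ASCII-only bytes

puts join_error(UTF8_LABEL, BINARY_LABEL).inspect
puts join_error(UTF8_LABEL, ASCII_LABEL).inspect
```

The first call hits the incompatible pair, the second does not, because Ruby treats ASCII-only strings as compatible with any ASCII-compatible encoding.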
With 1.9.0, the Rails console loads it like this:
rails c
Running via Spring preloader in process 4964
Loading development environment (Rails 6.1.4.4)
[1] pry(main)> I18n.t('states').with_indifferent_access.freeze
=> {"Acre"=>"AC",
"Alagoas"=>"AL",
"Amazonas"=>"AM",
"Amap\xC3\xA1"=>"AP",
"Bahia"=>"BA",
"Cear\xC3\xA1"=>"CE",
"Distrito Federal"=>"DF",
"Esp\xC3\xADrito Santo"=>"ES",
"Goi\xC3\xA1s"=>"GO",
"Maranh\xC3\xA3o"=>"MA",
"Minas Gerais"=>"MG",
"Mato Grosso do Sul"=>"MS",
"Mato Grosso"=>"MT",
"Par\xC3\xA1"=>"PA",
"Para\xC3\xADba"=>"PB",
"Pernambuco"=>"PE",
"Piau\xC3\xAD"=>"PI",
"Paran\xC3\xA1"=>"PR",
"Rio de Janeiro"=>"RJ",
"Rio Grande do Norte"=>"RN",
"Rond\xC3\xB4nia"=>"RO",
"Roraima"=>"RR",
"Rio Grande do Sul"=>"RS",
"Santa Catarina"=>"SC",
"Sergipe"=>"SE",
"S\xC3\xA3o Paulo"=>"SP",
"Tocantins"=>"TO"}
With 1.8.11, the Rails console loads it like this:
rails c
Running via Spring preloader in process 3411
Loading development environment (Rails 6.1.4.4)
[1] pry(main)> I18n.t('states').with_indifferent_access.freeze
=> {"Acre"=>"AC",
"Alagoas"=>"AL",
"Amazonas"=>"AM",
"Amap\xC3\xA1"=>"AP",
"Bahia"=>"BA",
"Cear\xC3\xA1"=>"CE",
"Distrito Federal"=>"DF",
"Esp\xC3\xADrito Santo"=>"ES",
"Goi\xC3\xA1s"=>"GO",
"Maranh\xC3\xA3o"=>"MA",
"Minas Gerais"=>"MG",
"Mato Grosso do Sul"=>"MS",
"Mato Grosso"=>"MT",
"Par\xC3\xA1"=>"PA",
"Para\xC3\xADba"=>"PB",
"Pernambuco"=>"PE",
"Piau\xC3\xAD"=>"PI",
"Paran\xC3\xA1"=>"PR",
"Rio de Janeiro"=>"RJ",
"Rio Grande do Norte"=>"RN",
"Rond\xC3\xB4nia"=>"RO",
"Roraima"=>"RR",
"Rio Grande do Sul"=>"RS",
"Santa Catarina"=>"SC",
"Sergipe"=>"SE",
"S\xC3\xA3o Paulo"=>"SP",
"Tocantins"=>"TO"}
The output looks exactly the same in both versions. Is this a bug in the new version?
It is probably something between the new version and how Rails loads the file. This is just a sample; I have other files with accents, and all of them raise the same exception.
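The console output may not show the difference, since what gets printed depends on the console's own encoding settings. Checking the encoding tag of each key directly is more reliable. A minimal sketch in plain Ruby; `key_encodings` and the two sample hashes are illustrations, not part of i18n:

```ruby
# frozen_string_literal: true

# Hypothetical helper: lists the distinct encodings found among the keys.
# Works for String and Symbol keys, since both respond to #encoding.
def key_encodings(hash)
  hash.keys.map(&:encoding).uniq
end

# Same bytes in both hashes; only the encoding tag of one key differs.
GOOD = { "Acre" => "AC", "Amap\u00E1" => "AP" }.freeze
BAD  = { "Acre" => "AC", "Amap\u00E1".b => "AP" }.freeze

puts key_encodings(GOOD).inspect
puts key_encodings(BAD).inspect
```

In a Rails console, something like key_encodings(I18n.t('states')) should reveal whether any key came back tagged as binary.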
Versions of i18n, rails, and anything else you think is necessary
ruby: 3.0.3
i18n: 1.9.0
rails: 6.1.4.3
rspec-rails: 5.1.0
rspec: 3.10.0
and # frozen_string_literal: true in Ruby files.
About this issue
- State: open
- Created 2 years ago
- Reactions: 4
- Comments: 21
Commits related to this issue
- Feature test for issue #606 and disable the optimization if the bug is present — committed to Shopify/i18n by byroot 2 years ago
msgpack 1.4.5 was released a few hours ago and should solve this issue: https://rubygems.org/gems/msgpack/versions/1.4.5
Ok, so the bug is actually in msgpack, I opened a PR here: https://github.com/msgpack/msgpack-ruby/pull/246
You can apply the patch with:
Alternatively, if you’d rather not run a gem branch, you can disable Bootsnap YAML caching. Sorry for the bug 😕
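Since the thread reports msgpack 1.4.5 as the release carrying the fix, a project could guard against running an older version. This is only a sketch: `msgpack_fixed?` is a hypothetical helper, and where to run the check (for example an initializer) is an assumption.

```ruby
# frozen_string_literal: true

require "rubygems"

# First msgpack release reported in this thread to contain the fix.
FIXED_MSGPACK_VERSION = Gem::Version.new("1.4.5")

# Hypothetical helper: true when the given version string is new enough.
def msgpack_fixed?(version)
  Gem::Version.new(version) >= FIXED_MSGPACK_VERSION
end

# Gem.loaded_specs only lists gems that are already activated,
# so this is a no-op when msgpack is not loaded.
spec = Gem.loaded_specs["msgpack"]
if spec && !msgpack_fixed?(spec.version.to_s)
  warn "msgpack #{spec.version} predates 1.4.5; consider upgrading"
end
```

In a Gemfile the equivalent constraint would be gem "msgpack", ">= 1.4.5". For the other workaround, Bootsnap.setup accepts a compile_cache_yaml option that can be set to false.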
Reviewing this again, I think we’ll just wait for a new msgpack release to happen, and then advise people who encounter this issue to upgrade to that new version.
I’ll be leaving this issue open until that new version is out.
He said early next week hopefully.
Ah damn it, I know what the problem is. It’s because Bootsnap uses msgpack to accelerate YAML parsing, and MessagePack uses an API that doesn’t preserve the encoding of symbols properly. See an issue I opened a while ago: https://github.com/msgpack/msgpack-ruby/pull/211
Let me go over my old research to see how we could sidestep this in Bootsnap. I’ll update here ASAP.
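To illustrate what losing a symbol's encoding means, here is a plain Ruby sketch. It does not use msgpack or Bootsnap; `rebuild_without_encoding` only simulates a deserializer that rebuilds a symbol from raw bytes, and both helper names are made up for this example.

```ruby
# frozen_string_literal: true

# Simulates a deserializer that rebuilds a symbol from raw bytes and
# drops the original encoding tag (an assumption for illustration).
def rebuild_without_encoding(sym)
  sym.to_s.b.to_sym
end

# Re-tags the bytes as UTF-8; only correct if the bytes really are UTF-8.
def retag_as_utf8(sym)
  sym.to_s.dup.force_encoding(Encoding::UTF_8).to_sym
end

ORIGINAL = "Amap\u00E1".to_sym
REBUILT  = rebuild_without_encoding(ORIGINAL)
REPAIRED = retag_as_utf8(REBUILT)

puts ORIGINAL.encoding
puts REBUILT.encoding
puts REPAIRED == ORIGINAL
```

The rebuilt symbol holds the same bytes as the original but is tagged as binary, which is the kind of value that later collides with UTF-8 template output.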
The commit that breaks this behaviour is 0fda789ea745cd462658a8948ee085201aba5c6f, as discovered through a git bisect:
Thank you @joergschiller, I am not alone 🙏🏻 I was going to make a repo and you saved me. I believed this was intended behaviour in the new version, but now I think we have a bug.