ktor: application/json default charset is not UTF-8 when parsing request
This is a follow-up to #80. Apparently responses were fixed, but requests still have issues.
A request with `Content-Type: application/json` is not decoded as UTF-8, while a request with `Content-Type: application/json; charset=utf-8` is decoded correctly. Both have to behave the same and be decoded as UTF-8.
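The effect of the wrong default can be reproduced without Ktor at all. The sketch below uses only the Kotlin/JVM standard library and assumes the fallback charset is ISO-8859-1, as the comments in this thread suggest; `decodeBoth` is a hypothetical name chosen for illustration.

```kotlin
// Decode the same request body with both charsets to show the difference.
// `decodeBoth` is a hypothetical helper, not part of Ktor.
fun decodeBoth(body: ByteArray): Pair<String, String> {
    val asUtf8 = String(body, Charsets.UTF_8)
    val asLatin1 = String(body, Charsets.ISO_8859_1)
    return asUtf8 to asLatin1
}

fun main() {
    // "\u00e9" (e with acute accent) is two bytes in UTF-8: 0xC3 0xA9.
    val body = "{\"name\":\"caf\u00e9\"}".toByteArray(Charsets.UTF_8)
    val (asUtf8, asLatin1) = decodeBoth(body)
    println(asUtf8)   // the original text, accent intact
    println(asLatin1) // each UTF-8 byte becomes its own character: "cafÃ©"
}
```

Pure-ASCII payloads decode identically under both charsets, which is why the problem only shows up for some requests.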
About this issue
- State: closed
- Created 6 years ago
- Reactions: 15
- Comments: 30 (7 by maintainers)
Commits related to this issue
- Add tests for Gson and Jackson features charset encodings related to issue #384 — committed to ktorio/ktor by deleted user 6 years ago
- Add tests for Gson and Jackson features charset encodings related to issue #384 — committed to schleinzer/ktor by deleted user 6 years ago
- Fix reading of the PubSubHubbub payload https://github.com/ktorio/ktor/issues/384 — committed to LorittaBot/Loritta by MrPowerGamerBR 4 years ago
Use this function if you just want to “have it work”:
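The idea behind such a helper, going by the name `receiveTextWithCorrectEncoding()` used in the replies, is to honor an explicit `charset` parameter and otherwise fall back to UTF-8 for JSON. What follows is a minimal sketch of that selection logic in plain Kotlin/JVM, not the function from the original comment and not Ktor's API; `charsetFor` and `decodeBody` are hypothetical names, and the ISO-8859-1 fallback for other media types is an assumption based on this thread.

```kotlin
import java.nio.charset.Charset

// Hypothetical helper: choose a charset from a raw Content-Type header value.
// An explicit charset parameter wins; JSON defaults to UTF-8; everything else
// falls back to ISO-8859-1 (assumed here to mirror the behavior discussed).
fun charsetFor(contentType: String?): Charset {
    val parts = contentType.orEmpty().split(';').map { it.trim() }
    val mediaType = parts.first().lowercase()
    val explicit = parts.drop(1)
        .firstOrNull { it.startsWith("charset=", ignoreCase = true) }
        ?.substringAfter('=')
        ?.trim('"', ' ')
    return when {
        // Charset.forName throws if the name is unknown or malformed.
        !explicit.isNullOrEmpty() -> Charset.forName(explicit)
        mediaType == "application/json" -> Charsets.UTF_8
        else -> Charsets.ISO_8859_1
    }
}

// Hypothetical helper: decode raw body bytes using the charset chosen above.
fun decodeBody(body: ByteArray, contentType: String?): String =
    String(body, charsetFor(contentType))
```

In a Ktor handler, the bytes and the header value would come from the incoming call; how to obtain them depends on the Ktor version in use.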
@cy6erGn0m Maybe you want to change the default implementation of `receiveText` to the implementation of `receiveTextWithCorrectEncoding()`?
I have run into this problem as well. The `receiveTextWithCorrectEncoding()` workaround solved it for now, but it seems like it is still an issue.
Oh, I just got what @cy6erGn0m means in his comment. So it's expected behavior. But it still doesn't sound good or user-friendly, even if it follows the HTTP standard. By the RFC, JSON must be encoded in UTF-8. Maybe it makes sense to open another issue and reconsider the current behavior. Even for non-JSON content types, I can hardly imagine ISO-8859-1 as the default encoding on the modern web; in most cases it will just cause bugs.
@MOZGIII Actually, `receiveText` also uses `call.receive<String>()`; with Gson/Jackson the behavior is different because they assume UTF-8 encoding for JSON.
Yes, I completely agree with you. I solved my problem by reading the body as a ByteArray and converting it to a String with the encoding specified explicitly (in Kotlin, `String(bytes)` and `ByteArray.decodeToString()` use UTF-8 by default), but the current behavior is very error-prone and it is sometimes hard to understand what went wrong: in my case the signature check failed sporadically, for events containing non-ASCII characters.
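The byte-array approach described in the last comment can be sketched as follows. This is an illustration only: the thread does not say which signature scheme was used, so HMAC-SHA256 is an assumption, and `hmacSha256` and `verifyAndDecode` are hypothetical names. The point is that the signature is checked over the raw bytes, and the text is decoded as UTF-8 only afterwards.

```kotlin
import java.security.MessageDigest
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Compute an HMAC-SHA256 over the raw payload bytes (scheme assumed for illustration).
fun hmacSha256(secret: ByteArray, payload: ByteArray): ByteArray {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(SecretKeySpec(secret, "HmacSHA256"))
    return mac.doFinal(payload)
}

// Verify the signature on the untouched bytes, then decode explicitly as UTF-8.
// Returns null when the signature does not match.
fun verifyAndDecode(body: ByteArray, secret: ByteArray, expected: ByteArray): String? {
    val actual = hmacSha256(secret, body)
    // MessageDigest.isEqual compares the arrays in constant time.
    if (!MessageDigest.isEqual(actual, expected)) return null
    return body.decodeToString() // decodeToString() always uses UTF-8
}
```

If the body is first decoded with the wrong charset and then re-encoded, non-ASCII characters produce different bytes and the check fails, while ASCII-only payloads still pass. That matches the sporadic failures described above.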