subliminal: Subtitle encoding detection fails
Hello,
subliminal converts subtitles to UTF-8, which is not a good idea for some players. For example, XBMC uses the user interface language to determine the subtitle language. In my case XBMC is set to Turkish, and the subtitles downloaded by subliminal show the wrong characters.
Debug of subliminal: http://pastebin.com/raw.php?i=FDKCxZzw
I downloaded the same subtitle both ways. Here is a comparison of the subliminal download and the opensubtitles website download:
$ charade downloaded_by_subliminal.srt downloaded_from_opensubtitles_website.srt
downloaded_by_subliminal.srt: utf-8 with confidence 0.99
downloaded_from_opensubtitles_website.srt: ISO-8859-2 with confidence 0.874910791543
$
Here is a sneak peek at the two subtitles, where you can see the differences: http://pastebin.com/raw.php?i=5bE7YZJJ
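To illustrate how a wrong encoding guess produces the wrong characters, here is a minimal sketch using only the Python standard library. It assumes the original subtitle is Turkish text stored as ISO-8859-9 (Latin-5), which the report does not confirm, and the sample sentence is made up. It is not subliminal's code.

```python
# Assumption: the subtitle on the website is Turkish text in ISO-8859-9.
# The sample sentence is invented for illustration.
original = "Teşekkürler, günaydın, ışık"
raw = original.encode("iso-8859-9")

# Decoding with the detector's guess (ISO-8859-2) does not raise an error,
# because every byte is valid in both charsets. It just maps the
# Turkish-specific bytes to different letters.
wrong = raw.decode("iso-8859-2")

# Decoding with the assumed real encoding restores the text.
right = raw.decode("iso-8859-9")

# Once `wrong` is written out as UTF-8, the damage is permanent:
# the player receives valid UTF-8 that contains the wrong letters.
print(wrong)
print(right)
```

Letters shared by both charsets (such as ü and ö) survive, so the text looks mostly right with only a few characters off, which matches the symptom described above.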
About this issue
- State: closed
- Created 10 years ago
- Comments: 49 (23 by maintainers)
Commits related to this issue
- Fix wrong Turkish subtitle encoding detection #315 Because of chardet Turkish encoding detect issue subliminal can't recognize Turkish encoding and convert correctly to utf-8 #315 — committed to queeup/subliminal by queeup 10 years ago
- Fix wrong Turkish subtitle encoding detection #315 Because of chardet Turkish encoding detect issue subliminal can't recognize Turkish encoding and convert correctly to utf-8 #315 — committed to queeup/subliminal by deleted user 10 years ago
- Merge pull request #327 from queeup/master Fix wrong Turkish subtitle encoding detection #315 — committed to Diaoul/subliminal by deleted user 10 years ago
- Fix wrong Turkish subtitle encoding detection #315 Because of chardet Turkish encoding detect issue subliminal can't recognize Turkish encoding and convert correctly to utf-8 #315 — committed to h3llrais3r/subliminal by deleted user 10 years ago
Subliminal doesn't use any encoding detection by default. It is only triggered when using the -e argument of the CLI.
In previous versions, subliminal used the detected encoding to decode the subtitle and then wrote it to the filesystem as UTF-8. Because many bug reports arrived without the information needed to improve the detection algorithm, I decided to deactivate detection by default and leave encoding detection to media players.
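For anyone who knows the real encoding of a subtitle and still wants UTF-8 on disk, the decode-then-write step described above can be sketched by hand. This is a rough sketch, not subliminal's implementation: convert_to_utf8 is a hypothetical helper, the file names are placeholders, and ISO-8859-9 is an assumed source encoding.

```python
import tempfile
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding):
    """Hypothetical helper: decode src with a known encoding, write dst as UTF-8."""
    # Work on raw bytes so no platform default encoding gets involved.
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
    return text


# Usage with a throwaway file; names and sample text are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "movie.tr.srt"
    dst = Path(tmp) / "movie.tr.utf8.srt"
    sample = "1\n00:00:01,000 --> 00:00:02,000\nGünaydın, ışık\n"
    src.write_bytes(sample.encode("iso-8859-9"))

    convert_to_utf8(src, dst, "iso-8859-9")
    round_trip = dst.read_bytes().decode("utf-8")

print(round_trip == sample)
```

Passing the encoding explicitly avoids relying on a detector, at the cost of having to know the encoding in advance.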