terminal: Korean IME does not work as expected
Environment
Windows build number: Windows 10 2004 19041.21
Windows Terminal version (if applicable): 0.8.10091.0
Any other software?
Steps to reproduce
- Open a Windows PowerShell tab
- Type the following text:
$x = '한글'
Expected behavior
- The resulting text should be:
$x = '한글'
Actual behavior
- But the resulting text was:
$x = '한그ㄱ글'
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 16
- Comments: 24 (8 by maintainers)
@guswns0528 and I tried https://github.com/microsoft/terminal/pull/4796 and it worked perfectly! We still don’t know why the 2 TextUpdate events were fired, but it looks like your patch fixed it anyway. 👍
@simnalamburt Yup, not clearing the whole text buffer, but leaving unfinished characters in is key! Luckily, I think I’m close to getting the fix for this out! 🎉 I’m specifically testing by typing out this sequence: 안녕하세요, which was provided earlier in this thread.
It seems to work as expected, but before I have a PR out for this fix, I’ll need to make sure I haven’t messed up any other IME input modes, so hang tight! 😃
One thing I would like your help on is letting me know of other sample character sequences that might possibly break the way I’m handling Korean IME! I don’t know Korean at all, so having the sequence laid out in English characters like it was above with “dkssudgktpdy” (which comes out as 안녕하세요) really helped!
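For anyone who wants to produce more test sequences like “dkssudgktpdy”, here is a minimal sketch of the key-to-syllable mapping, assuming the standard two-set (Dubeolsik) layout. The helper name `keys_to_hangul` is hypothetical and this is not how the Windows IME is implemented; it only uses the Unicode composition formula for Hangul syllables, and it deliberately ignores compound vowels (such as ㅘ) and compound finals (such as ㄳ).

```python
# Two-set (Dubeolsik) layout: QWERTY key -> Hangul compatibility jamo.
KEY_TO_JAMO = dict(zip(
    "qwertyuiopasdfghjklzxcvbnmQWERTOP",
    "ㅂㅈㄷㄱㅅㅛㅕㅑㅐㅔㅁㄴㅇㄹㅎㅗㅓㅏㅣㅋㅌㅊㅍㅠㅜㅡㅃㅉㄸㄲㅆㅒㅖ",
))

# Jamo in the order used by the Unicode Hangul syllable block.
CHO = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")        # 19 initials
JUNG = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")     # 21 vowels
JONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 finals


def keys_to_hangul(keys: str) -> str:
    """Hypothetical helper: turn QWERTY keystrokes into Hangul syllables.

    Simplification: compound vowels and compound finals are not merged.
    """
    jamo = [KEY_TO_JAMO.get(k, k) for k in keys]  # unmapped keys pass through
    out, i = [], 0
    while i < len(jamo):
        cur = jamo[i]
        nxt = jamo[i + 1] if i + 1 < len(jamo) else ""
        if cur in CHO and nxt in JUNG:
            cho, jung, jong = CHO.index(cur), JUNG.index(nxt), 0
            i += 2
            # A consonant is taken as the final only when no vowel follows
            # it; otherwise it is the initial of the next syllable.
            if i < len(jamo) and jamo[i] in JONG[1:]:
                after = jamo[i + 1] if i + 1 < len(jamo) else ""
                if after not in JUNG:
                    jong = JONG.index(jamo[i])
                    i += 1
            # Unicode composition formula for precomposed Hangul syllables.
            out.append(chr(0xAC00 + (cho * 21 + jung) * 28 + jong))
        else:
            out.append(cur)
            i += 1
    return "".join(out)


print(keys_to_hangul("dkssudgktpdy"))  # 안녕하세요
print(keys_to_hangul("gksrmf"))        # 한글
```

The second call corresponds to the repro in this issue: typing `gksrmf` on a Korean layout should yield 한글.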
@rkttu Nope, when there is progress, someone will make sure to chime in on this thread. It’s been triaged as a P1 bug for 1.0, so we won’t be shipping 1.0 without a fix for this, so stay tuned. If anyone is particularly passionate about this bug, we’d be happy to review a PR. Until someone’s been assigned to this bug, you can be sure you won’t be stepping on our toes ☺️
@leonMSFT Wow thanks for your detailed explanation! Now I understand what was going on in my development environment.
Currently I’m trying to leave some unfinished characters in the text buffer instead of totally clearing it in CompositionCompletedHandler.
Please share any information or updates, and let me know if there is anything I can help with! There are actually lots of people waiting for this issue to be resolved, since Korean developers don’t have many options to choose from on Windows. Anything you share will be helpful, and the whole Korean developer community will be grateful to you! 😄
@simnalamburt thanks a lot for the investigation!
I tried to see if I can repro the two vs three composition completed events issue, but I only see two composition completed events. Maybe you could provide a screenshot of the debugging logs you used to find this?
Update: I just found this out, but you’re likely seeing multiple composition completed events after the first character is finished because when we call `NotifyTextChanged` to reset the text server’s buffer, it’ll fire another composition completed event. However, since we have the if-statement to check if there’s anything in `_inputBuffer`, we don’t do anything on our side.
I was also taking a look at this bug earlier, and perhaps my findings could explain why you’re seeing the bug you’re seeing.
Here’s the behavior I’m observing and why I think we’re messing up Korean IME input:
So, going through your example key sequence, pressing <kbd>ㅎ</kbd><kbd>ㅏ</kbd><kbd>ㄴ</kbd> would result in three `TextUpdated` events being received, with the `_inputBuffer` and the `_textBlock` having the character `한`.
Now the user presses <kbd>ㄱ</kbd>, and what will happen is the following:
- A CompositionCompleted event fires for `한`. `한` is the finished composition.
- We send the `_inputBuffer` (which is 한) to the terminal and reset the `_inputBuffer` and `_textBlock`. Then we also notify the text server that they should also make their “buffer” empty as well. (This is the `NotifyTextChanged` call that you mentioned.)
What should happen now is that we should receive a `TextUpdated` event with the text as `한ㄱ`. However, we’re actually not getting any `TextUpdated` events after our CompositionCompletedHandler is finished because we’re telling the text server to reset their text buffer. Since their buffer is empty, it won’t tell us that we should update our text to be anything.
This is why you’ll run into the weird issue where you’ll be pressing <kbd>ㅎ</kbd><kbd>ㅏ</kbd><kbd>ㄴ</kbd>, which works fine, and you’ll see `한` on the screen, but once you make another input, like <kbd>ㄱ</kbd>, nothing happens. The <kbd>ㄱ</kbd> keypress triggers the CompositionCompleted event for the previous character 한, which tells the text server to clear their buffer. So, you will need to press <kbd>ㄱ</kbd> again to make ㄱ show up on the screen.
So, the core of the problem is that we need to send the IME input to the terminal when we believe composition is finished, and we naturally also need to clear our buffer whenever we send some input to the terminal. We also need to keep the text server’s buffer and our `_inputBuffer` in sync, so whenever we clear our buffer, we tell the text server to clear theirs as well.
As a small test, I’ve tried commenting out the code where we’re telling the text server to reset their text buffer, and lo and behold, text comes out as you would expect, without having to double-press any characters. The only problem here is that if we don’t reset our text buffer, every CompositionCompleted event will cause us to send the whole `_inputBuffer` (which includes literally everything you’ve ever typed while in IME mode) to the terminal, resulting in lots of duplicate input.
I’m currently trying to think of a way around this, but I’m giving you a summary of my findings so maybe you can also repro and investigate further to see if I’ve missed something! 😄
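The three behaviors described in this comment can be reproduced with a toy model. This is only a sketch under stated assumptions: `ToyTextServer`, `Handler`, the `mode` names, and the scripted keystrokes are all hypothetical and are not Windows Terminal or Core Text code. The `"track_offset"` mode is one possible way to avoid both the lost keystroke and the duplicate input; it is not necessarily what the eventual fix does.

```python
class ToyTextServer:
    """Toy stand-in for the platform text service (NOT the real API)."""

    def __init__(self, handler):
        self.handler = handler
        self.text = ""       # the server's own edit buffer
        self.committed = 0   # text[:committed] is finished, the rest is in progress
        self._was_reset = False

    def reset(self):
        # Models the effect of the handler asking the server to empty its buffer.
        self.text, self.committed, self._was_reset = "", 0, True

    def press(self, ends_previous, composing):
        """ends_previous: this key cannot join the syllable in progress.
        composing: the in-progress syllable after this key."""
        in_progress = self.text[self.committed:]
        if ends_previous and in_progress:
            pending = self.text + composing
            new_committed = len(self.text)
            self._was_reset = False
            self.handler.on_composition_completed(self)
            if self._was_reset:
                return  # buffer emptied: the pending update is never delivered
            self.text, self.committed = pending, new_committed
        else:
            self.text = self.text[:self.committed] + composing
        self.handler.on_text_updated(self.text)


class Handler:
    def __init__(self, mode):
        self.mode = mode        # "reset", "no_reset", or "track_offset"
        self.input_buffer = ""  # plays the role of _inputBuffer
        self.sent_upto = 0      # used only by "track_offset"
        self.terminal = ""      # everything sent to the terminal so far

    def on_text_updated(self, text):
        self.input_buffer = text

    def on_composition_completed(self, server):
        if not self.input_buffer:
            return
        if self.mode == "track_offset":
            # Send only what has not been sent yet; leave the server alone.
            self.terminal += self.input_buffer[self.sent_upto:]
            self.sent_upto = len(self.input_buffer)
            return
        self.terminal += self.input_buffer  # sends the whole buffer
        if self.mode == "reset":
            self.input_buffer = ""
            server.reset()

    def preview(self):
        """Unsent text the user still sees being composed."""
        return self.input_buffer[self.sent_upto:]


def run(mode, keys):
    handler = Handler(mode)
    server = ToyTextServer(handler)
    for ends_previous, composing in keys:
        server.press(ends_previous, composing)
    return handler


# Keys ㅎ ㅏ ㄴ ㄱ, then ㅡ ㄹ and a final commit for the full word.
HAN_THEN_G = [(False, "ㅎ"), (False, "하"), (False, "한"), (True, "ㄱ")]
HANGUL = HAN_THEN_G + [(False, "그"), (False, "글"), (True, "")]

print(repr(run("reset", HAN_THEN_G).preview()))         # '' (the ㄱ is lost)
print(repr(run("track_offset", HAN_THEN_G).preview()))  # 'ㄱ'
print(run("no_reset", HANGUL).terminal)                 # 한한글 (duplicate input)
print(run("track_offset", HANGUL).terminal)             # 한글
```

In this model, `"track_offset"` never clears its buffer, so a real implementation along these lines would still need a safe point at which to trim it.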
@guswns0528 discovered that this bug was first introduced by https://github.com/microsoft/terminal/commit/dfa7b4a1. That commit itself is correct, so we can’t simply revert it.
Looks like this issue is caused by strange behavior of the Core Text APIs. Normally, the composition completed event should only be fired when the composition of a letter is completed, like this: `한`, and `글`.
But here it’s fired before that, like this: `한`, `그`, and `글`.
It might be a bug in the Core Text APIs, but the Core Text API is just a wrapper around the Text Services Framework, which is a super stable framework inherited from the Windows XP era. So I need to investigate further.
I’m currently debugging this issue and I’ll comment here if I find something. Please feel free to share any information if you find something. Thanks
https://github.com/microsoft/vscode/issues/89853 Possibly similar issue happening in VS Code
Thanks for the bug report! @rkttu /@yjh0502 Does this repro in a legacy console (`pwsh.exe`) window? Or does this only happen in the Windows Terminal?
The bug breaks my daily workflow 😦