terminal: Wrong character when pasting into Windows Terminal with a Windows Subsystem for Linux profile
Windows Terminal version
1.11.3471.0
Windows build number
10.0.19044.0
Other Software
Windows Subsystem for Linux (version 2), Debian 5.10.60.1 (according to [Environment]::OSVersion in PowerShell inside WSL)
PowerShell 7.2.1 (in Debian)
Steps to reproduce
First, the case that I seem to be able to reproduce most easily.
- First the general Windows Subsystem for Linux that’s a bit hard to reproduce.
- Have Windows Subsystem for Linux installed and install Debian from the Windows store.
- Use the profile in Windows Terminal to open a command prompt for the WSL Debian machine.
- In Windows Terminal, have a hotkey for paste (such as CTRL+v or Shift+Insert) that’s handled by Windows Terminal instead of the command line session.
- Install PowerShell using the instructions at https://docs.microsoft.com/en-us/powershell/scripting/install/installing-powershell-on-linux?view=powershell-7.2 .
- In the PowerShell profile (in my case, ~/.config/powershell/Microsoft.PowerShell_profile.ps1), put the following:
  ```powershell
  # Register an event when PowerShell is closing cleanly to save the current
  # history.
  #
  # Note: Closing PowerShell via the X button may not allow this to save. But,
  # typing "exit" should work.
  Register-EngineEvent -SourceIdentifier PowerShell.Exiting -SupportEvent -Action {
      # Save-PSHistory
      # Check if any jobs are still running.
      If (Get-Job -State Running | Where-Object {$PSItem.Name -iNe 'PowerShell.Exiting'}) {
          Get-Job -State Running | Where-Object {$PSItem.Name -iNe 'PowerShell.Exiting'} | Wait-Job
      }
      # If we have Linux "sync", run it.
      #
      # For now, we'll only call sync if it's '/bin/sync'. Add a delay just in case it doesn't run.
      If (Test-Path -Path '/bin/sync') {
          If (Get-Command -Type Application '/bin/sync') {
              & /bin/sync
          }
      }
      Start-Sleep -Seconds 2
  }
  ```
- In a text editor, such as Notepad, put many Box Drawings Double Horizontal characters (“═”, Unicode character 0x2550). In my case, I had 80 (═══════════════════════════════════════════════════════════════════════════════).
- In Windows Terminal, start a Debian session, and start PowerShell with “pwsh”.
- Copy the line of Box Drawings Double Horizontal characters, not including any newline characters.
- Paste them into the Debian session in Windows Terminal via CTRL+v shortcut.
It has occurred for me in Bash, but it’s not consistent. I’ve only had it happen a few times by pressing and holding CTRL+v to paste the text many times.
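For anyone who would rather generate the test line than type it, here is a minimal sketch in Python. It is an illustration only and not part of the original report; the helper name `make_line` is made up here. It also shows that each U+2550 character takes three bytes in UTF-8, which is relevant to the split-write discussion in the comments.

```python
import unicodedata

BOX = "\u2550"  # BOX DRAWINGS DOUBLE HORIZONTAL, the character used in the report


def make_line(count=80):
    """Return `count` copies of U+2550 with no trailing newline."""
    return BOX * count


line = make_line()
encoded = line.encode("utf-8")  # each U+2550 becomes the bytes E2 95 90

print(unicodedata.name(BOX))
print(len(line), "characters,", len(encoded), "UTF-8 bytes")
print(line, end="")  # copy this output, without a newline, to the clipboard
```

Running it prints the character name, the character and byte counts, and then the line itself, which can be selected and copied for the paste step.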
Expected Behavior
The same text that was copied should be pasted into the command prompt.
Actual Behavior
I see “═” (Unicode character 0x2550) and “�” (Unicode character 0xFFFD) pasted.
From my Windows Terminal:
PS /home/cdenn/.config> '════════���������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������'
════════���������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������
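To compare a captured line against what was copied, it can help to count code points instead of eyeballing the output. The following is a small sketch under assumptions: the function name `summarize` is hypothetical, and `sample` is a short made-up string standing in for a real capture, not the actual output above.

```python
from collections import Counter
import unicodedata


def summarize(text):
    """Count each distinct code point in `text`, keyed as 'U+XXXX NAME'."""
    counts = Counter(text)
    return {
        f"U+{ord(ch):04X} {unicodedata.name(ch, '?')}": n
        for ch, n in counts.items()
    }


# Illustrative stand-in for a captured line: some intact characters
# followed by replacement characters.
sample = "\u2550" * 8 + "\ufffd" * 6

for label, n in summarize(sample).items():
    print(f"{n:4d}  {label}")
```

Pasting the real terminal output into `sample` would show how many characters survived intact and how many were replaced by U+FFFD.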
About this issue
- State: open
- Created 3 years ago
- Comments: 25 (1 by maintainers)
When you type in a window, the window procedure gets `WM_KEYDOWN` and `WM_KEYUP` messages that contain virtual key codes. These key codes are typically translated to UTF-16 `WM_CHAR` messages. For a paste, an app such as Terminal likely gets the clipboard text as UTF-16 via `GetClipboardData(CF_UNICODETEXT)`.

For a ConPTY host such as a terminal, text input has to be encoded to UTF-8 and sent over a pipe to the console host (i.e. conhost or openconsole). But apparently the console host still uses UTF-16 internally for the input buffer. If you try to read the input buffer as UTF-8 via `SetConsoleCP(CP_UTF8)` and `ReadFile()`, `ReadConsoleA()`, or `ReadConsoleInputA()`, the console simply replaces each non-ASCII character (i.e. code points above 0x7F) with a null byte.

Anyway, this bug doesn’t appear to be a problem with what Linux PowerShell reads from the terminal, but rather what it echoes back. The echoed text is UTF-8, at least until it gets to the console host. If it gets decoded to UTF-16 in the console host, then splitting a character sequence across multiple writes is potentially a problem. I thought that problem was resolved by internal buffering in the console. There could still be a problem with overlapped writes, when a partial sequence gets mistakenly completed by an unrelated write.
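The split-write concern can be modeled outside the console. The sketch below uses Python's own UTF-8 decoders purely as a stand-in; it is not the console host's code, and the function names and the `b"PS> "` chunk are assumptions made for illustration. It contrasts decoding each write independently with a buffered decoder that carries an incomplete sequence over to the next write, then shows that even the buffered decoder cannot recover when an unrelated write lands in the middle of a sequence.

```python
import codecs

BOX_BYTES = "\u2550".encode("utf-8")  # b"\xe2\x95\x90"


def decode_each_write(chunks):
    """Decode every chunk on its own, as a stateless consumer would."""
    return "".join(c.decode("utf-8", errors="replace") for c in chunks)


def decode_buffered(chunks):
    """Decode with state, so an incomplete sequence waits for the next chunk."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    text = "".join(decoder.decode(c) for c in chunks)
    return text + decoder.decode(b"", final=True)


# One character split across two writes.
split = [BOX_BYTES[:1], BOX_BYTES[1:]]

# The same split, with an unrelated write (a made-up prompt) in between.
interleaved = [BOX_BYTES[:1], b"PS> ", BOX_BYTES[1:]]

print(repr(decode_each_write(split)))    # replacement characters only
print(repr(decode_buffered(split)))      # the original character
print(repr(decode_buffered(interleaved)))  # replacement characters around the prompt
```

In this model, buffering fixes the plain split but not the interleaved case, which is consistent with the symptom appearing only when pasted text and prompt output overlap in time. Whether the console host fails in this way is not confirmed here.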
I do not know what these artifacts are. Judging by the screenshot, they are the parts of the `0x2550` symbols that the text on the left did not paint over.

This is a script to take the screenshot :) It just emulates the `Alt+PrtSc` or `PrtSc` hotkeys and saves the result. I did not find `scrot` for Windows.

The decoding problem that displays replacement characters (U+FFFD) only occurs for me in Linux PowerShell under WSL when I repeatedly paste `══════════` fast enough that the paste occurs before the prompt has finished displaying. At least for me, it never occurs if I wait a second between pastes.