vscode: Unicode bug in characters that move to the previous glyph

“Please search existing issues to avoid creating duplicates.” There are too many bugs for me to do an accurate search. Sorry.

I tested with the standard product only. I am busy, sorry.

  • VSCode Version: 1.39.2
  • OS Version: Windows 10 Home

Steps to Reproduce:

1. Enter the string “nūta॑nairu॒ta” into a new VSC file.
2. See that garbage is displayed (more than 11 characters).
3. See that the cursor does not move through the garbage characters one character at a time.
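For anyone triaging this, the following Python sketch may help show why the string should display as 11 characters even though it contains more code points. It assumes the “ū” in the sample is the precomposed code point U+016B (the report does not say), and the helper names `is_mark` and `approximate_clusters` are made up for illustration. It only attaches combining marks to the preceding character, which is a simplification of Unicode grapheme segmentation and not a description of how VS Code itself works.

```python
import unicodedata
from typing import List

# Assumption: "ū" is the precomposed U+016B; the two Devanagari stress signs
# (U+0951 udatta, U+0952 anudatta) are the marks that attach to the previous glyph.
SAMPLE = "n\u016bta\u0951nairu\u0952ta"


def is_mark(ch: str) -> bool:
    """Return True for combining marks (general categories Mn, Mc, Me)."""
    return unicodedata.category(ch).startswith("M")


def approximate_clusters(text: str) -> List[str]:
    """Group each combining mark with the character before it.

    Hypothetical helper: a rough stand-in for full grapheme segmentation.
    """
    clusters: List[str] = []
    for ch in text:
        if clusters and is_mark(ch):
            clusters[-1] += ch  # mark joins the previous cluster
        else:
            clusters.append(ch)  # base character starts a new cluster
    return clusters


if __name__ == "__main__":
    for ch in SAMPLE:
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
    print("code points:", len(SAMPLE))
    print("approximate clusters:", len(approximate_clusters(SAMPLE)))
```

Under the stated assumption, the sample has 13 code points but 11 clusters, which matches the "more than 11 characters" symptom in step 2 and suggests the cursor should stop 11 times, not 13.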

Note that this bug reporting system can display the above string correctly. Also, the UTF-8 validator tool https://onlineutf8tools.com/validate-utf8 displays the string correctly.

Unless you disagree that this is a bug, please fix it. I can’t use VSC until the worst of the bugs are fixed. Thanks!

Does this issue occur when all extensions are disabled?: Yes. I do not use any extensions, since I’m evaluating VSC to be my new editor. (I can’t use it until the autoindentation disabling bug is fixed and until it handles Unicode correctly.)

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 30 (10 by maintainers)

Most upvoted comments

No problem, I understand your constraints. I subscribed to this feed because of a different bug (https://github.com/microsoft/vscode-remote-release/issues/1087), and I’ve been astounded by how many bug reports you have to deal with. I just wanted to point out that the boundary between a code editor and a text editor is a fuzzy one.

As for implementing a fix myself, I’ve worked with a lot of programming languages in my life, but I do have limitations…

@createdbyjurand: “our goal is to be a very good code editor and if it so happens to be a good text editor, then so be it.” This is just an FYI: I’m sure I’m a tiny minority, but FWIW, I deal nearly every day with coding in multiple scripts, most recently Devanagari (for Hindi) and Arabic (for Urdu), along with Latin/English. I do most of the coding in XML, which gets converted to sfst, xfst and lexc (finite state transducer programming languages), and I sometimes have to edit the latter when something goes wrong. I also write mixed-script Python. While I could be a minority of 1, I suppose there might be other people in similar boats, who program NLP (Natural Language Processing) applications that involve multiple scripts.

The first “possible duplicate” in your list has nothing to do with Unicode at all.

My bug report stands and I do want it fixed, not ignored as a duplicate.