vscode: Git: Dirty diff doesn't recognize manual encoding change

  • VSCode Version: 1.44.0
  • OS Version: macOS Mojave 10.14

Steps to Reproduce:

1. Open a file in a Git repository that contains Simplified Chinese text and is encoded as GB18030.
2. Reopen it with encoding GB18030.
3. Git now reports all of the Simplified Chinese text as changed, because the editor decodes the file as GB18030 while Git still decodes it as UTF-8.
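The mismatch in step 3 can be reproduced outside the editor. The following is a minimal sketch, not VS Code code: it assumes a runtime whose `TextDecoder` supports the `gb18030` label (browsers, or Node.js built with full ICU), and the helper name `decodeAs` is made up for illustration.

```typescript
// GB18030 bytes for the two characters "中文" (0xD6D0 and 0xCEC4).
const gb18030Bytes = new Uint8Array([0xd6, 0xd0, 0xce, 0xc4]);

// Hypothetical helper: decode one byte sequence with a caller-chosen encoding.
function decodeAs(bytes: Uint8Array, encoding: string): string {
  return new TextDecoder(encoding).decode(bytes);
}

// What the editor shows after "Reopen with Encoding" (GB18030).
const asGb18030 = decodeAs(gb18030Bytes, "gb18030");

// What a UTF-8 reader of the same bytes produces: the bytes are not valid
// UTF-8, so they turn into replacement characters instead of "中文".
const asUtf8 = decodeAs(gb18030Bytes, "utf-8");

console.log("decoded as GB18030:", asGb18030);
console.log("same text when decoded as UTF-8?", asGb18030 === asUtf8);
```

The bytes on disk are identical in both cases; only the decoding differs, which is enough for a string-based comparison to report a change.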

Does this issue occur when all extensions are disabled?: Yes/No

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 19 (17 by maintainers)

Most upvoted comments

@joaomoreno it seems to me that we have 2 issues that maybe need different solutions:

Dirty Diff Decorator

When I look into the dirtyDiffDecorator, I believe it still uses the text model resolver service, which dates back to the time when Git registered itself as a content provider rather than as an FS provider. Now that you have changed that, you can instead call ITextFileService.files.resolve(uri, { encoding }) to get a text file model with the encoding of your choice (something the text model resolver lacks).

What I think is still missing here is a way for you to know which encoding the user picked for the editor. Since the dirty diff only seems to operate on code editor models, I fear this information is lost at that point. The workbench model, however, has a property for getting at its encoding: https://github.com/microsoft/vscode/blob/997b4c863a45f171570d5167fee31e74b018ba06/src/vs/workbench/services/textfile/common/textFileEditorModel.ts#L837
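To make the suggestion concrete, here is a rough sketch of passing the working copy's encoding along when resolving the original content. Everything in it is a hypothetical stand-in: `TextFilesLike`, `TextFileModelLike`, `InMemoryTextFiles` and `resolveOriginalFor` are invented for illustration and are not the real VS Code interfaces, the resolver is synchronous for brevity, and it assumes a `TextDecoder` that supports `gb18030`.

```typescript
// Hypothetical stand-ins for illustration; these are NOT the real VS Code interfaces.
interface ResolveOptionsLike {
  encoding: string;
}

interface TextFileModelLike {
  readonly uri: string;
  readonly text: string;
  getEncoding(): string;
}

interface TextFilesLike {
  // Synchronous for brevity; a real resolver would presumably be asynchronous.
  resolve(uri: string, options: ResolveOptionsLike): TextFileModelLike;
}

// In-memory fake that decodes stored bytes with whatever encoding is requested.
class InMemoryTextFiles implements TextFilesLike {
  private readonly store: Map<string, Uint8Array>;

  constructor(store: Map<string, Uint8Array>) {
    this.store = store;
  }

  resolve(uri: string, options: ResolveOptionsLike): TextFileModelLike {
    const bytes = this.store.get(uri);
    if (bytes === undefined) {
      throw new Error(`No content stored for ${uri}`);
    }
    const text = new TextDecoder(options.encoding).decode(bytes);
    return { uri, text, getEncoding: () => options.encoding };
  }
}

// Resolve the original (for example, HEAD) content with the working copy's
// encoding, so both sides of the dirty diff are decoded the same way.
function resolveOriginalFor(
  files: TextFilesLike,
  workingCopy: TextFileModelLike,
  originalUri: string
): TextFileModelLike {
  return files.resolve(originalUri, { encoding: workingCopy.getEncoding() });
}

// The same GB18030 bytes ("中文") back both the working copy and the original.
const sampleBytes = new Uint8Array([0xd6, 0xd0, 0xce, 0xc4]);
const textFiles = new InMemoryTextFiles(
  new Map<string, Uint8Array>([
    ["file:///demo.txt", sampleBytes],
    ["git:/demo.txt", sampleBytes],
  ])
);

const workingCopy = textFiles.resolve("file:///demo.txt", { encoding: "gb18030" });
const original = resolveOriginalFor(textFiles, workingCopy, "git:/demo.txt");
console.log("texts match:", original.text === workingCopy.text);
```

The point of the sketch is only the data flow: the encoding is read from the model the user is editing and handed to whatever resolves the original side.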

I understand too little of the dirty diff lifecycle to make more suggestions, so that might be something to learn during debt week.

Diff Editor

The diff editor is directly affected when the user changes the encoding from the status bar. The flow is as follows:

  • user changes encoding in opened diff editor
  • we apply the encoding to the models in both sides of the diff editor
  • this ends up re-reading the file from disk using the specified encoding

I just did a quick test comparing two cp1252-encoded files in the inline diff editor, and I was able to change the encoding from the status bar and have the content shown correctly:

[Screen recording: Kapture 2020-04-30 at 14 22 02]

Given that, I wonder whether something goes wrong in the Git FS provider along the way, but I do not see why yours would behave any differently, unless perhaps you do some conversion in your layer.

@joaomoreno computeDirtyDiff is invoked with two text models. In our codebase, editor text models are instantiated with strings.

Strings are created from bytes. The code that creates the strings from bytes is responsible for interpreting the bytes with a specific character encoding. That code is somewhere in the FS provider, or perhaps in the FS service that sits on top of the FS provider.

In any case, the editor text model can only be constructed with strings, and not with bytes.
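The layering described here can be sketched as follows. This is an illustration under stated assumptions, not the editor's implementation: `SimpleTextModel`, `decodeBytes` and `changedLines` are hypothetical names, the comparison is a naive line-by-line check rather than a real diff algorithm, and the runtime's `TextDecoder` is assumed to support `gb18030`.

```typescript
// Minimal stand-in for a text model: it is built from a string and never sees bytes.
class SimpleTextModel {
  readonly lines: string[];

  constructor(text: string) {
    this.lines = text.split("\n");
  }
}

// The bytes-to-string step lives outside the model (in the FS layer, per the text).
function decodeBytes(bytes: Uint8Array, encoding: string): string {
  return new TextDecoder(encoding).decode(bytes);
}

// Naive positional comparison of two models; returns 1-based changed line numbers.
function changedLines(original: SimpleTextModel, modified: SimpleTextModel): number[] {
  const count = Math.max(original.lines.length, modified.lines.length);
  const changed: number[] = [];
  for (let i = 0; i < count; i++) {
    if (original.lines[i] !== modified.lines[i]) {
      changed.push(i + 1);
    }
  }
  return changed;
}

// "id=1" and a newline in ASCII, followed by the GB18030 bytes for "中文".
const fileBytes = new Uint8Array([0x69, 0x64, 0x3d, 0x31, 0x0a, 0xd6, 0xd0, 0xce, 0xc4]);

// Editor side decoded as GB18030, original side decoded as UTF-8.
const editorModel = new SimpleTextModel(decodeBytes(fileBytes, "gb18030"));
const utf8Original = new SimpleTextModel(decodeBytes(fileBytes, "utf-8"));
const gbOriginal = new SimpleTextModel(decodeBytes(fileBytes, "gb18030"));

const mismatched = changedLines(utf8Original, editorModel);
const matched = changedLines(gbOriginal, editorModel);
console.log("changed lines with mismatched encodings:", mismatched);
console.log("changed lines with matching encodings:", matched);
```

Because the models hold only strings, the comparison cannot tell a real edit from a decoding difference. In this sketch the ASCII line compares equal either way, and only the line with Chinese text is flagged when the encodings differ.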