vscode-kubernetes-tools: extension is crashing vscode

The problem

Whenever the Kubernetes extension is enabled, it freezes VS Code and can take down all other extensions along with it, to the point where I can’t even create a new file or search for things in VS Code. Once it is disabled, there are no more errors in the Extension Host output and all extensions run as usual.

In debug output

[error] Error: spawn EBADF
    at ChildProcess.spawn (node:internal/child_process:413:11)
    at spawn (node:child_process:817:9)
    at /Users/michael/.vscode/extensions/ms-kubernetes-tools.vscode-kubernetes-tools-1.3.14/dist/extension.js:165:2130921
    at c (/Users/michael/.vscode/extensions/ms-kubernetes-tools.vscode-kubernetes-tools-1.3.14/dist/extension.js:165:2130331)
    at /Users/michael/.vscode/extensions/ms-kubernetes-tools.vscode-kubernetes-tools-1.3.14/dist/extension.js:165:2130528
    at Array.forEach (<anonymous>)
    at ChildProcess.<anonymous> (/Users/michael/.vscode/extensions/ms-kubernetes-tools.vscode-kubernetes-tools-1.3.14/dist/extension.js:165:2130462)
    at ChildProcess.emit (node:events:513:28)
    at ChildProcess.emit (node:domain:489:12)
    at maybeClose (node:internal/child_process:1091:16)
    at Socket.<anonymous> (node:internal/child_process:449:11)
    at Socket.emit (node:events:513:28)
    at Socket.emit (node:domain:489:12)
    at Pipe.<anonymous> (node:net:322:12)
2023-09-27 13:14:38.847 [info] Extension host with pid 77205 exiting with code 0
2023-09-27 13:14:39.888 [error] ReferenceError: Cannot access 'T' before initialization
    at d (/Applications/Visual Studio Code.app/Contents/Resources/app/out/vs/workbench/api/node/extensionHostProcess.js:122:56312)
    at /Applications/Visual Studio Code.app/Contents/Resources/app/out/vs/workbench/api/node/extensionHostProcess.js:122:54858
    at Set.forEach (<anonymous>)
    at process.<anonymous> (/Applications/Visual Studio Code.app/Contents/Resources/app/out/vs/workbench/api/node/extensionHostProcess.js:122:54847)
    at process.emit (node:events:525:35)
    at processEmit (/Users/michael/.vscode/extensions/ms-vsliveshare.vsliveshare-1.0.5883/extension.js:394:51404)
    at process.Oce.process.emit (/Users/michael/.vscode/extensions/github.copilot-1.117.443/node_modules/source-map-support/source-map-support.js:516:21)
    at process.emit (/Users/michael/.vscode/extensions/github.copilot-1.117.443/prompt/node_modules/source-map-support/source-map-support.js:516:21)
    at process.apply (/Users/michael/.vscode/extensions/github.copilot-chat-0.7.1/dist/extension.js:2:960749)
    at process.v [as emit] (/Users/michael/.vscode/extensions/node_modules/signal-exit/index.js:191:37)
    at process.exit (node:internal/process/per_thread:192:15)
    at Object.exit (/Applications/Visual Studio Code.app/Contents/Resources/app/out/vs/workbench/api/node/extensionHostProcess.js:135:9594)
    at /Applications/Visual Studio Code.app/Contents/Resources/app/out/vs/workbench/api/node/extensionHostProcess.js:118:8073
    at <anonymous>
    at runNextTicks (node:internal/process/task_queues:60:5)
    at listOnTimeout (node:internal/timers:538:9)
    at processTimers (node:internal/timers:512:7)
[error] [redhat.vscode-yaml] provider FAILED
2023-09-27 13:14:39.897 [error] Error: Connection got disposed.
    at Object.dispose (/Users/michael/.vscode/extensions/redhat.vscode-yaml-1.14.0/dist/extension.js:2:308383)
    at Object.dispose (/Users/michael/.vscode/extensions/redhat.vscode-yaml-1.14.0/dist/extension.js:2:388535)
    at /Users/michael/.vscode/extensions/redhat.vscode-yaml-1.14.0/dist/extension.js:2:385018

Environment

Version: 1.82.2
Commit: abd2f3db4bdb28f9e95536dfa84d8479f1eb312d
Date: 2023-09-14T06:00:27.244Z (1 wk ago)
Electron: 25.8.1
ElectronBuildId: 23779380
Chromium: 114.0.5735.289
Node.js: 18.15.0
V8: 11.4.183.29-electron.0
OS: Darwin arm64 22.6.0

settings:

{
  "vs-kubernetes": {
    "disable-linters": [
      "resource-limits"
    ]
  }
}

About this issue

  • State: closed
  • Created 9 months ago
  • Comments: 17 (1 by maintainers)

Most upvoted comments

Sigh; the last thing I wanted to do was have a major impact with #1204. I really appreciate y’all putting the time and effort into documenting the issue. Looking at these symptoms, I definitely wouldn’t be surprised if #1204 is causing these issues.

It does look like, on macOS, tree-kill spawns a pgrep -P <parent-pid>: https://github.com/pkrumins/node-tree-kill/blob/master/index.js#L33. If that ends up in a loop for some reason, it could start having OS-level effects by spawning too many processes (as seen by @mfilipe-te). I could see the OS failing to spawn anything if it gets bad enough, potentially causing the crash seen by @michaellzc.
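To make the cost of that pattern concrete, here is a minimal sketch of a process-tree walk in which every visited pid costs one child lookup (on macOS, one pgrep -P spawn). This is not the tree-kill source: walkProcessTree, ListChildren, and the maxLookups cap are hypothetical names introduced for illustration, and the lookup is injected as a function so the sketch runs without spawning anything.

```typescript
// Hypothetical sketch, NOT the tree-kill implementation. It models the pattern
// described above: each process in the tree costs one child lookup, which on
// macOS would be one `pgrep -P <pid>` spawn.
type ListChildren = (pid: number) => number[];

interface WalkResult {
  pids: number[];      // every pid discovered, root first (breadth-first order)
  lookups: number;     // number of child lookups performed (i.e. would-be pgrep spawns)
  truncated: boolean;  // true if the lookup budget ran out, so the tree may be incomplete
}

function walkProcessTree(
  root: number,
  listChildren: ListChildren,
  maxLookups: number = 64, // assumed safety cap; the value is arbitrary
): WalkResult {
  const pids: number[] = [];
  const seen = new Set<number>(); // guards against revisiting a pid (e.g. pid reuse)
  const queue: number[] = [root];
  let lookups = 0;
  let truncated = false;

  while (queue.length > 0) {
    const pid = queue.shift() as number;
    if (seen.has(pid)) {
      continue;
    }
    seen.add(pid);
    pids.push(pid);

    if (lookups >= maxLookups) {
      // Budget exhausted: record the pid but stop spawning further lookups.
      truncated = true;
      continue;
    }
    lookups++;
    for (const child of listChildren(pid)) {
      queue.push(child);
    }
  }
  return { pids, lookups, truncated };
}
```

Without a cap and a visited set, the number of lookups grows with the size of the tree and with every repeated invocation, which is one way a kill routine called in a loop could exhaust the per-user process limit.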

My M1 Mac is indisposed for a couple of weeks, but in the event the VSIX provided by @tatsinnit doesn’t reproduce the issue, I’d be able to iteratively reimplement #1204 so that it is tested on the three major OSes rather than just Windows (which, in hindsight, I really should have done in the first place; it was a rookie mistake, especially for something making OS process calls).

@Tatsinnit

In order to understand this more: are there specific steps I can follow on my MacBook to re-create it? I have this extension installed and I cannot replicate this at all. (Are you doing anything like watching pod logs?)

Nothing in particular. I opened the editor with zero interaction whatsoever (besides checking the extension logs), and it started crashing.

Tested vscode-kubernetes-tools-1.3.14-kill-process-test.vsix out and it also resolved the problem for me.

❤️🙏 Thank you so much, all, for this discussion, very helpful!! We had a quick look at the difference between the last release and this one (https://github.com/vscode-kubernetes-tools/vscode-kubernetes-tools/compare/1.3.13...1.3.14), and the only thing that stands out is the treeKill command. I am not familiar with it, and although I am on a Mac, I cannot replicate this issue at all.

  • In order to understand this more: are there specific steps I can follow on my MacBook to re-create it? I have this extension installed and I cannot replicate this at all. (Are you doing anything like watching pod logs?)

Possible Solutions Moving Forward

A couple of things we could do to rule this out:

  • VSIX which reverts the treeKill commit - One course of action: here is a local VSIX with that commit reverted. @michaellzc and @mfilipe-te, could either of you please test it and see whether it fixes the issue on your machine? If this works:

    • I wonder if @mikeseese and we can work out how this could cause the issue at all. Please share any thoughts on how it could be related (also, for more ideas, cc: @peterbom, @lstocchi). I also wonder whether this is a macOS-only issue or affects Windows as well.

The local VSIX copy is here (with the treeKill commit removed for testing; @michaellzc or @mfilipe-te, please test it if you have time): vscode-kubernetes-tools-1.3.14-kill-process-test.vsix.zip

Steps to install local vsix

Tested it, and it fixes the issue.

You are awesome, @mikeseese, thank you for the quick reply. It is possible that testing would not have caught this, and it definitely seems like a combination of circumstances, given that I cannot replicate it (or I am likely missing the steps to re-create it).

We can wait for the other folks’ kind input on how the new VSIX shared above performs. If it fixes the issue, I really like your idea of doing this incrementally and releasing the code without the treeKill change. We can work with you on this and engage in any way possible, if it’s cool with you. cc: @peterbom. Thanks heaps, all.

In my case, I don’t think it’s just killing the extension, because my VS Code starts saying things like “can’t fork process”, “can’t find git”, etc. It also affects things outside VS Code, like tab completion in zsh (also with fork-related messages). The security software starts taking more than 100% of CPU (normal value 0.9) and more than 5 GB of memory (normal value 66 MB).

Yeah, I see the same symptoms. Everything stopped working: I could not create files, saving files hung forever, etc. The only workaround is disabling the k8s extension.

The previous extension version works.

I don’t have the exact same issue, but the latest version of the extension for some reason triggers the company endpoint protection (which takes over CPU and memory) to the point that VS Code becomes unusable, with logs like this in the extension host and window output logs. Maybe related to #1204 (https://github.com/vscode-kubernetes-tools/vscode-kubernetes-tools/pull/1204)?

2023-09-28 12:14:12.133 [error] Error: spawn pgrep EAGAIN
    at Process.ChildProcess._handle.onexit (node:internal/child_process:283:19)
    at onErrorNT (node:internal/child_process:476:16)
    at processTicksAndRejections (node:internal/process/task_queues:82:21)
2023-09-28 12:14:12.133 [error] Error: spawn pgrep EAGAIN
    at Process.ChildProcess._handle.onexit (node:internal/child_process:283:19)
    at onErrorNT (node:internal/child_process:476:16)
    at processTicksAndRejections (node:internal/process/task_queues:82:21)
2023-09-28 12:14:12.133 [error] Error: spawn pgrep EAGAIN
    at Process.ChildProcess._handle.onexit (node:internal/child_process:283:19)
    at onErrorNT (node:internal/child_process:476:16)
    at processTicksAndRejections (node:internal/process/task_queues:82:21)
2023-09-28 12:14:12.133 [error] Error: spawn pgrep EAGAIN
    at Process.ChildProcess._handle.onexit (node:internal/child_process:283:19)
    at onErrorNT (node:internal/child_process:476:16)
    at processTicksAndRejections (node:internal/process/task_queues:82:21)
2023-09-28 12:14:12.133 [error] Error: spawn pgrep EAGAIN
    at Process.ChildProcess._handle.onexit (node:internal/child_process:283:19)
    at onErrorNT (node:internal/child_process:476:16)
    at processTicksAndRejections (node:internal/process/task_queues:82:21)
2023-09-28 12:14:12.133 [error] Error: spawn pgrep EAGAIN
    at Process.ChildProcess._handle.onexit (node:internal/child_process:283:19)
    at onErrorNT (node:internal/child_process:476:16)
    at processTicksAndRejections (node:internal/process/task_queues:82:21)
2023-09-28 12:14:12.134 [error] Error: spawn pgrep EAGAIN
    at Process.ChildProcess._handle.onexit (node:internal/child_process:283:19)
    at onErrorNT (node:internal/child_process:476:16)
    at processTicksAndRejections (node:internal/process/task_queues:82:21)
2023-09-28 12:14:12.134 [error] Error: spawn pgrep EAGAIN
    at Process.ChildProcess._handle.onexit (node:internal/child_process:283:19)
    at onErrorNT (node:internal/child_process:476:16)
    at processTicksAndRejections (node:internal/process/task_queues:82:21)
2023-09-28 12:14:12.134 [error] Error: spawn pgrep EAGAIN
    at Process.ChildProcess._handle.onexit (node:internal/child_process:283:19)
    at onErrorNT (node:internal/child_process:476:16)
    at processTicksAndRejections (node:internal/process/task_queues:82:21)
2023-09-28 12:14:12.134 [error] Error: spawn pgrep EAGAIN
    at Process.ChildProcess._handle.onexit (node:internal/child_process:283:19)
    at onErrorNT (node:internal/child_process:476:16)
    at processTicksAndRejections (node:internal/process/task_queues:82:21)
ProductName:		macOS
ProductVersion:		13.6
BuildVersion:		22G120
Version: 1.82.2
Commit: abd2f3db4bdb28f9e95536dfa84d8479f1eb312d
Date: 2023-09-14T06:00:27.244Z
Electron: 25.8.1
ElectronBuildId: 23779380
Chromium: 114.0.5735.289
Node.js: 18.15.0
V8: 11.4.183.29-electron.0
OS: Darwin arm64 22.6.0
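
For readers unfamiliar with the error codes in these logs: EAGAIN on a spawn means the OS temporarily refused to create another process, which is consistent with a process limit being hit, while EBADF means a bad file descriptor. The sketch below shows one way an 'error' handler could classify such failures. SpawnError, explainSpawnError, and isResourceExhaustion are hypothetical names used for illustration; they are not part of the extension or of tree-kill.

```typescript
// Minimal shape of the error object Node.js reports when a spawn fails.
// Only the fields used here are declared.
interface SpawnError {
  code?: string;    // errno name, e.g. "EAGAIN" or "EBADF"
  syscall?: string; // e.g. "spawn pgrep"
}

// Hypothetical helper: maps the errno codes seen in this thread to a short hint.
function explainSpawnError(err: SpawnError): string {
  switch (err.code) {
    case "EAGAIN":
      return "resource temporarily unavailable: the OS refused to create another process";
    case "EBADF":
      return "bad file descriptor";
    case "ENOENT":
      return "executable not found";
    default:
      return "unclassified spawn failure";
  }
}

// True for codes that indicate exhausted OS resources. Retrying immediately in
// a loop on these would only add more pressure; backing off is the safer choice.
function isResourceExhaustion(err: SpawnError): boolean {
  return err.code === "EAGAIN" || err.code === "EMFILE" || err.code === "ENFILE";
}
```

A handler attached with child.on("error", ...) could call isResourceExhaustion and stop issuing further lookups instead of spawning again.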

Yes, I see the exact same message after the Extension Host has died (2023-09-28 12:14:12.133 [error] Error: spawn pgrep EAGAIN), and VS Code is no longer usable.

~Funny thing, we also have endpoint software; maybe it’s the endpoint software killing the k8s extension somehow? Let me check with our IT folks.~ No, endpoint software is not the problem.

@Tatsinnit

I tried to replicate with the settings mentioned, but it works fine. I do see you have a spawn error, EBADF (bad file descriptor), and a “provider FAILED” error from redhat-vscode.

Hi, I believe the logs from redhat-vscode are a red herring. To clarify: only when the Kubernetes extension is enabled do other extensions start failing at random, and this time it happened to be the Red Hat one (I have seen the built-in Git extension fail as well). Once the k8s extension is disabled or uninstalled, no such failure is observed.
