vscode-csharp: Slow step-through and watch variable expansion when debugging in a Docker container

There’s a performance difference between native debugging and Docker container debugging. The problem can be observed when expanding variables in VS Code. In this test project https://github.com/jamiegs/dotnet-debugging-performance the difference between native and docker is small but still visible. On a real code base, it is much more significant. See this video https://youtu.be/V_jWtOHjvOg for a demonstration.

The details here are from macOS + VS Code, but we have observed the same on Windows + VS Code and Windows + VS. The tests were run on high-spec MacBook Pros and Dell XPS hardware. Changing the resources allocated to the container does not seem to make a difference.

Environment data

.NET Core SDK (reflecting any global.json):
 Version:   2.1.300
 Commit:    adab45bf0c

Runtime Environment:
 OS Name:     Mac OS X
 OS Version:  10.13
 OS Platform: Darwin
 RID:         osx.10.13-x64
 Base Path:   /usr/local/share/dotnet/sdk/2.1.300/

Host (useful for support):
  Version: 2.1.2
  Commit:  811c3ce6c0

.NET Core SDKs installed:
  1.0.1 [/usr/local/share/dotnet/sdk]
  1.0.4 [/usr/local/share/dotnet/sdk]
  1.1.4 [/usr/local/share/dotnet/sdk]
  2.0.0 [/usr/local/share/dotnet/sdk]
  2.0.3 [/usr/local/share/dotnet/sdk]
  2.1.4 [/usr/local/share/dotnet/sdk]
  2.1.300-preview2-008533 [/usr/local/share/dotnet/sdk]
  2.1.300 [/usr/local/share/dotnet/sdk]

.NET Core runtimes installed:
  Microsoft.AspNetCore.All 2.1.0-preview2-final [/usr/local/share/dotnet/shared/Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.1.0 [/usr/local/share/dotnet/shared/Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.1.0-preview2-final [/usr/local/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.1.0 [/usr/local/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 1.0.4 [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 1.0.5 [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 1.1.1 [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 1.1.2 [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 1.1.4 [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 1.1.9 [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.0.0 [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.0.3 [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.0.5 [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.0-preview2-26406-04 [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.0 [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.2 [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]

To install additional .NET Core runtimes or SDKs:
  https://aka.ms/dotnet-download

VS Code version:

1.25.1

C# Extension version:

1.15.2

Steps to reproduce

Based on code in https://github.com/jamiegs/dotnet-debugging-performance

There are two Launch Methods, Docker and Native.

  • Add a breakpoint on the ‘return View()’ line in the Home Controller (HomeController.cs).
  • Run the application.
  • When it hits the breakpoint, expand the ‘this’ variable. Natively it takes about 3 seconds; in Docker it takes about 5.
    • While this difference isn’t that big, it grows dramatically with larger applications.
  • Also hitting ‘Step over’ or ‘Step into’ is slower in Docker vs Native.

Expected behavior

  • I would expect docker and native to have very little difference in debugging performance.

Actual behavior

  • The difference is noticeable and increases with application size.

Demo video comparison using https://github.com/jamiegs/dotnet-debugging-performance https://youtu.be/lkDnFPGOumM

Demo video comparison using a larger, 20,000 line application. https://youtu.be/V_jWtOHjvOg

About this issue

  • State: open
  • Created 6 years ago
  • Reactions: 8
  • Comments: 27 (18 by maintainers)

Most upvoted comments

Hi Everybody,

I may have found some explanations for the performance issues. Recently I decided to do development in an environment that is as production-like as possible, using the same containers used in production and just docker exec-ing vsdbg inside them for debugging. I immediately hit the issue described here - watches, stepping and the like suddenly took seconds (running on Docker Desktop for Mac).

The first thing I noticed was that during the expression evaluation “hang”, the com.docker.hyperkit process pegs one CPU core, while running top inside the docker VM shows just a few percent for the vsdbg process.

The other one was the following in Console:

ptrace attach of "dotnet <project>.dll"[2451] was attempted by "/vsdbg/vsdbg --interpreter=vscode"[5554]

So it seems like vsdbg is not able to use ptrace at all! Playing around with it some more, I’ve found that the docker VM has kernel.yama.ptrace_scope = 1 set in sysctl, which effectively means that only descendants of a process may be ptrace-d (https://www.kernel.org/doc/Documentation/security/Yama.txt). Changing this to 0 is probably not practical (does not survive a reboot of the docker host VM which is locked down pretty tight with a read-only boot volume), but using docker exec --privileged for the vsdbg process is actually quite enough. After doing this, the above warning disappeared and debugging is (at least subjectively) even faster than regular local debugging on a Mac!
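
To check whether the same restriction applies in a given container, the Yama setting can be read directly from procfs. The following is only a minimal sketch, not part of the original fix: the function names and the container name are hypothetical, the `-i` flag is an assumption about how the debugger's stdin is piped, and the vsdbg path is the one that appears in the console message above. The meanings of the values follow the kernel's Yama documentation.

```python
from pathlib import Path
from typing import List, Optional

# Meanings of kernel.yama.ptrace_scope, per the kernel's Yama documentation.
PTRACE_SCOPE_MEANINGS = {
    0: "classic: a process may ptrace any process running under the same uid",
    1: "restricted: only descendants (or explicitly declared tracers) may be ptrace-d",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "no attach: ptrace attach is disabled entirely",
}


def describe_ptrace_scope(value: int) -> str:
    """Return a human-readable description of a ptrace_scope value."""
    return PTRACE_SCOPE_MEANINGS.get(value, "unknown value")


def read_ptrace_scope(path: str = "/proc/sys/kernel/yama/ptrace_scope") -> Optional[int]:
    """Read the current setting, or return None if Yama is not exposed here."""
    scope_file = Path(path)
    if not scope_file.exists():
        return None
    return int(scope_file.read_text().strip())


def privileged_vsdbg_command(container: str) -> List[str]:
    """Build the argv for running vsdbg through `docker exec --privileged`.

    `container` is a hypothetical container name; `-i` (keep stdin open)
    is an assumption, not something stated in the thread.
    """
    return ["docker", "exec", "-i", "--privileged", container,
            "/vsdbg/vsdbg", "--interpreter=vscode"]


if __name__ == "__main__":
    scope = read_ptrace_scope()
    if scope is None:
        print("ptrace_scope is not available on this system")
    else:
        print(f"ptrace_scope = {scope} ({describe_ptrace_scope(scope)})")
    print(" ".join(privileged_vsdbg_command("my-app-container")))
```

Run inside the container (or the Docker host VM), a value of 1 would match the behaviour described above, where vsdbg cannot attach to a process it did not start.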

Not sure if my fix is applicable to other scenarios here, but I definitely hope it will help to move this issue forward.

@janaka @jamiegs One thing that I meant to address is how this relates to Visual Studio vs VS Code. When debugging on docker, both VS and VS Code run the cross platform C# debugger “vsdbg” inside the container. The performance of both frontends should be similar for docker scenarios; they are both dependent on vsdbg inside the container.

Windows Data

           Windows Local        Docker
iteration  Engine   Adapter     Engine    Adapter
1          479288   479536      4851175   4852634
2          512262   512498      4858059   4859198
3          488561   488953      4797241   4798359
4          484271   484535      4899063   4900303
5          509923   510162      4785430   4786509
6          503807   504068      4688882   4690064
7          482689   482995      4859941   4861161
8          498618   498891      4861770   4863226
9          498642   499017      4702413   4703720
10         517726   518042      4706336   4707903


On my machine, docker is about 10x slower than windows local. It may be worth noting that there is a larger difference between the windows and linux builds of the debugger than there is between linux and mac. Both linux and mac are built with clang, Windows is built with MSVC. Linux and Mac run on top of coreclr’s Platform Adaptation Layer, Windows has no such abstraction. Etc.
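
As a quick sanity check on the "about 10x" figure, the ratio of the mean timings can be computed from the ten iterations listed above. This is only an illustrative sketch: the variable and function names are made up here, and the units of the measurements are not stated in the thread, so only the ratio is meaningful.

```python
from statistics import mean

# Timings copied from the table in this thread (units are not stated there).
WINDOWS_LOCAL_ENGINE = [479288, 512262, 488561, 484271, 509923,
                        503807, 482689, 498618, 498642, 517726]
WINDOWS_LOCAL_ADAPTER = [479536, 512498, 488953, 484535, 510162,
                         504068, 482995, 498891, 499017, 518042]
DOCKER_ENGINE = [4851175, 4858059, 4797241, 4899063, 4785430,
                 4688882, 4859941, 4861770, 4702413, 4706336]
DOCKER_ADAPTER = [4852634, 4859198, 4798359, 4900303, 4786509,
                  4690064, 4861161, 4863226, 4703720, 4707903]


def slowdown(baseline, candidate):
    """Return the mean of `candidate` divided by the mean of `baseline`."""
    return mean(candidate) / mean(baseline)


if __name__ == "__main__":
    engine_ratio = slowdown(WINDOWS_LOCAL_ENGINE, DOCKER_ENGINE)
    adapter_ratio = slowdown(WINDOWS_LOCAL_ADAPTER, DOCKER_ADAPTER)
    print(f"Engine:  Docker is {engine_ratio:.1f}x slower than Windows local")
    print(f"Adapter: Docker is {adapter_ratio:.1f}x slower than Windows local")
```

Comparing means rather than single iterations smooths out the run-to-run variation visible in the table.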

Machine Specs

Windows:

OS Name Microsoft Windows 10 Enterprise
Processor Intel® Core™ i7-6700 CPU @ 3.40GHz, 3401 Mhz, 4 Core(s), 8 Logical Processor(s)
Installed Physical Memory (RAM) 32.0 GB

Docker:

CPUs 2
Memory 2048 MB
Swap 1024 MB

I tried increasing all Docker specs and saw only a negligible increase in performance.

Update: I have not had time to test this, but there was a recent fix in the CoreCLR debugging interfaces that improves performance on XPLAT. I would expect this to improve Expression Evaluation performance both on Linux/Mac as well as Docker.

The specific commit is here. This is currently in coreclr master, meaning that it is only available in the latest .NET Core 3.0 Preview. I have confirmed that the fix is in the latest 3.0 SDK Shared Runtime available here.

I will do my best to answer your questions, but until I can do the actual perfview analysis, these are best guesses.

  1. It all started on Windows. Both coreclr and vsdbg are ports of the Windows .NET Runtime and VS Debugger, respectively. These had been worked on and tuned for years and years on Windows, and the Windows versions of the ports benefit from this. I do not know exactly how much pure perf work has been done for coreclr on *nix systems, but it is certainly more than has been done for vsdbg. That is why we’re here now.

  2. The primary difference between the Windows and *nix builds of vsdbg is that the *nix versions use the coreclr Platform Adaptation Layer, or PAL. The Windows build does not use this layer because it builds directly against Windows APIs. I have no proof that the PAL has such a high perf impact, but it likely has some.

  3. The .NET Debugging Interfaces likely have a higher cost on *nix than on Windows. Vsdbg is built on top of the .NET Debugger Interface. These are implemented by the runtime of the target application, not by vsdbg (specifically, they ship in mscordbi, part of the platform-specific Microsoft.NETCore.App packages). These APIs are what vsdbg uses to read memory out of the target process, among other things. It is likely that these code paths are not as performant on *nix as they are on Windows.

  4. Expression Evaluation requires a lot of ReadProcessMemory. It is my understanding that reading from another process’s virtual memory space on *nix boils down to ptrace. I have heard anecdotally that this is far slower on *nix than it is on Windows. I hope to confirm/deny this once I do my analysis.

  5. Expression Evaluation should not be I/O bound. We load the symbol files (.pdb’s) when the module instance loads. By the time we are expanding variables, we shouldn’t be hitting the disk for anything. We are likely bound by reading memory from the target process, as described above.

I want to let you guys know that I’m looking into this, but I haven’t had a tremendous amount of time yet. I am trying to instrument our code to get some timing data on either side of the pipe (docker.exe in this case). I also have installed Ubuntu side by side on my macbook so that I can try to get some mac/linux numbers on the same hardware.

My guess upfront is that vsdbg is just slower running on the virtualized linux kernel in docker on mac. I will respond to this when I have some actual numbers. Thanks for reporting this!

@chuckries In doing some more testing and troubleshooting with other devs it seems that the debug slowness can be improved by collapsing the “Variables” pane in vscode. Task<T> objects specifically seem to exacerbate the performance issue, even crashing vsdbg entirely in some scenarios.

Just an FYI: I won’t have a chance to do a more in depth perf analysis on this in the next week or two. I will add to this issue when I have more info/results.

@chuckries Nice analysis! One suggestion: I think the original post notes that this issue also reproduces if you debug a Docker container from Windows. If you’re more familiar with perf tools on Windows, that might be a good way to get an actionable trace.