runtime: `Directory.EnumerateFiles(dir, pat)` is sometimes 45-50x slower on .NET Core 3.0 compared to .NET Framework 4.8 / .NET Core 2.0

`Directory.EnumerateFiles(dir, pat)` is sometimes 45-50x slower on .NET Core 3.0 than on .NET Framework 4.8 or .NET Core 2.0.

Discovered by @iSazonov, see https://github.com/PowerShell/PowerShell/issues/6577#issuecomment-543194018

Methods:

  • PS1(): 2000 x Directory.EnumerateFiles(@"C:\WINDOWS\system32\", "ping.*")
  • Blog(): 2000 x Directory.EnumerateFiles(@"d:\repos\corefx\src\System.IO.FileSystem\", "*.cs", SearchOption.AllDirectories)

Runtime               PS1()      Blog()
.NET Framework 4.8      78 ms    4393 ms
.NET Core 1.0           93 ms    5298 ms
.NET Core 1.1           91 ms    4712 ms
.NET Core 2.0           76 ms    4729 ms
.NET Core 2.1         4314 ms    1585 ms
.NET Core 2.2         4286 ms    1599 ms
.NET Core 3.0         3788 ms    1538 ms
.NET Core 3.1         3753 ms    1538 ms

You can also reproduce this with the following PowerShell command:

Measure-Command { for ($i=0; $i -lt 2000; $i++) { Get-Command ping > $null } }

PS       TFM             Time (s)
PS 5.1   net48           0.5251987
PS 7.0   netcoreapp3.0   4.1898758

Most expensive calls:

.NET Framework 4.8 and .NET Core 2.0

Name                                                                                            Inc %   Inc
 consoleapp1!ConsoleApp1.Program.PS1(class System.String)                                         2.2    90
+ mscorlib.ni!Directory.EnumerateFiles                                                            1.5    62
|+ mscorlib.ni!System.IO.FileSystemEnumerableIterator`1[System.__Canon]..ctor(...)                1.5    61
||+ mscorlib.ni!System.IO.FileSystemEnumerableIterator`1[System.__Canon].CommonInit()             1.2    49
|| + mscorlib.ni!DomainNeutralILStubClass.IL_STUB_PInvoke(System.String, WIN32_FIND_DATA ByRef)   1.2    48
||  + kernelbase!FindFirstFileW                                                                   1.2    47
||   + kernelbase!FindFirstFileExW                                                                1.2    47

.NET Core 3.1 (3.1.100-preview1-014459)

Name                                                                                                     Inc %      Inc
 consoleapp1!ConsoleApp1.Program.PS1(class System.String)                                                 62.6    2,884
+ system.io.filesystem.il!System.IO.Enumeration.FileSystemEnumerator`1[System.__Canon].MoveNext()         53.8    2,479
|+ system.io.filesystem.il!System.IO.Enumeration.FileSystemEnumerator`1[System.__Canon].FindNextEntry()   47.2    2,175
||+ system.io.filesystem.il!dynamicClass.IL_STUB_PInvoke(...)                                             45.4    2,093
|||+ ntdll!NtQueryDirectoryFile                                                                           45.3    2,086

Repro

  • Update the path to corefx repo in the source code below
  • Compile the code below
  • Run the executable on Windows

using System;
using System.Diagnostics;
using System.IO;
using System.Runtime.CompilerServices;

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            const string psPattern = "ping.*";
            const string blogPattern = "*.cs";
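            // Warm up both code paths so JIT compilation is excluded from the timings.
            for (int i = 0; i < 100; i++) {
                PS1(psPattern);
                Blog(blogPattern);
            }
            // Time 2000 single-directory enumerations with a selective pattern.
            var swPS1 = Stopwatch.StartNew();
            for (int i = 0; i < 2000; i++)
                PS1(psPattern);
            swPS1.Stop();
            // Time 2000 recursive enumerations of the corefx source tree.
            var swBlog = Stopwatch.StartNew();
            for (int i = 0; i < 2000; i++)
                Blog(blogPattern);
            swBlog.Stop();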
            Console.WriteLine($"{IntPtr.Size * 8}-bit");
#if NETCOREAPP1_0
            Console.WriteLine("netcoreapp1.0");
#elif NETCOREAPP1_1
            Console.WriteLine("netcoreapp1.1");
#else
            Console.WriteLine(typeof(object).Assembly.Location);
#endif
            Console.WriteLine($"PS1 : {swPS1.ElapsedMilliseconds} ms");
            Console.WriteLine($"Blog: {swBlog.ElapsedMilliseconds} ms");
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        static void Blog(string pattern)
        {
            foreach (string path in Directory.EnumerateFiles(@"d:\repos\corefx\src\System.IO.FileSystem\", pattern, SearchOption.AllDirectories))
            {
                Use(path);
            }
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        static void PS1(string pattern)
        {
            var directories = new string[]
            {
                @"C:\WINDOWS\system32\",
            };
            foreach (var directory in directories)
            {
                foreach (var f in Directory.EnumerateFiles(directory, pattern))
                {
                    Use(f);
                }
            }
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        static void Use(string s) { }
    }
}

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFrameworks>netcoreapp3.1;netcoreapp3.0;netcoreapp2.2;netcoreapp2.1;netcoreapp2.0;netcoreapp1.1;netcoreapp1.0;net48</TargetFrameworks>
  </PropertyGroup>

</Project>

dotnet build -c Release -f net48
dotnet publish -c Release -f netcoreapp1.0 -r win7-x64 --self-contained
dotnet publish -c Release -f netcoreapp1.1 -r win7-x64 --self-contained
dotnet publish -c Release -f netcoreapp2.0 -r win-x64 --self-contained
dotnet publish -c Release -f netcoreapp2.1 -r win-x64 --self-contained
dotnet publish -c Release -f netcoreapp2.2 -r win-x64 --self-contained
dotnet publish -c Release -f netcoreapp3.0 -r win-x64 --self-contained
dotnet publish -c Release -f netcoreapp3.1 -r win-x64 --self-contained
.\ConsoleApp1\bin\Release\net48\ConsoleApp1.exe
echo ''
.\ConsoleApp1\bin\Release\netcoreapp1.0\win7-x64\publish\ConsoleApp1.exe
echo ''
.\ConsoleApp1\bin\Release\netcoreapp1.1\win7-x64\publish\ConsoleApp1.exe
echo ''
.\ConsoleApp1\bin\Release\netcoreapp2.0\win-x64\publish\ConsoleApp1.exe
echo ''
.\ConsoleApp1\bin\Release\netcoreapp2.1\win-x64\publish\ConsoleApp1.exe
echo ''
.\ConsoleApp1\bin\Release\netcoreapp2.2\win-x64\publish\ConsoleApp1.exe
echo ''
.\ConsoleApp1\bin\Release\netcoreapp3.0\win-x64\publish\ConsoleApp1.exe
echo ''
.\ConsoleApp1\bin\Release\netcoreapp3.1\win-x64\publish\ConsoleApp1.exe
echo ''

About this issue

  • State: open
  • Created 5 years ago
  • Reactions: 4
  • Comments: 16 (13 by maintainers)

Most upvoted comments

Is the reason for invoking the NT API directly, instead of Win32's FindFirstFileW, to support UNC paths and mitigate the PathTooLong problem?

There were a few reasons:

  1. It cuts out the extra layer of translation FindFirst/Next does
  2. It allows us to reuse the buffer from directory to directory

Long paths didn’t enter into this particular one.

I might be off base here, but I just got lost looking at the source code for EnumerateFiles() in response to a StackOverflow question.

Could some of the performance issues here be due to the fact that null is always passed as the FileName parameter to NtQueryDirectoryFile? So the pattern you pass to Directory.EnumerateFiles() isn’t actually used in the query. Results are discarded after they’re retrieved. On a local file system, the cost may not be huge, but it would be substantial if you’re querying a remote file system (assuming NtQueryDirectoryFile sends the file name filter to the remote computer).

From FileSystemEnumerator.Windows.cs:

int status = Interop.NtDll.NtQueryDirectoryFile(
    FileHandle: _directoryHandle,
    Event: IntPtr.Zero,
    ApcRoutine: IntPtr.Zero,
    ApcContext: IntPtr.Zero,
    IoStatusBlock: &statusBlock,
    FileInformation: _buffer,
    Length: (uint)_bufferLength,
    FileInformationClass: Interop.NtDll.FILE_INFORMATION_CLASS.FileFullDirectoryInformation,
    ReturnSingleEntry: Interop.BOOLEAN.FALSE,
    FileName: null,
    RestartScan: Interop.BOOLEAN.FALSE);
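
To make the "retrieve everything, then discard" behavior concrete, here is a minimal sketch built on the public System.IO.Enumeration API (available since .NET Core 2.1). It is not the actual implementation of Directory.EnumerateFiles(); the names ManagedFilterSketch and FindFiles are hypothetical, and the temporary directory with three files is only there so the example runs anywhere. The predicate counts every entry the OS hands back, which shows that the pattern is applied in managed code after the entries have already been returned.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Enumeration;

static class ManagedFilterSketch
{
    // Hypothetical helper: returns full paths of files in `directory` whose
    // names match `pattern`. The enumerator yields every entry in the
    // directory; the pattern is only checked here, in managed code.
    public static List<string> FindFiles(string directory, string pattern, out int entriesSeen)
    {
        int seen = 0;
        var enumerable = new FileSystemEnumerable<string>(
            directory,
            (ref FileSystemEntry entry) => entry.ToFullPath(),
            new EnumerationOptions { RecurseSubdirectories = false })
        {
            ShouldIncludePredicate = (ref FileSystemEntry entry) =>
            {
                seen++; // counts each entry passed to the filter, matching or not
                return !entry.IsDirectory
                    && FileSystemName.MatchesSimpleExpression(pattern, entry.FileName, ignoreCase: true);
            }
        };

        // Materialize the lazy sequence so the counter is final before returning.
        var results = new List<string>(enumerable);
        entriesSeen = seen;
        return results;
    }

    static void Main()
    {
        // Assumption for the demo: a scratch directory stands in for system32.
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        try
        {
            foreach (var name in new[] { "ping.exe", "ping.txt", "other.dll" })
                File.WriteAllText(Path.Combine(dir, name), "");

            var matches = FindFiles(dir, "ping.*", out int seen);
            Console.WriteLine($"entries passed to the filter: {seen}, matches kept: {matches.Count}");
        }
        finally
        {
            Directory.Delete(dir, recursive: true);
        }
    }
}
```

In a directory the size of system32, the first number would be in the thousands while the second stays tiny, which is consistent with the time spent under NtQueryDirectoryFile in the profile above.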

This is linked to https://github.com/PowerShell/PowerShell/issues/6577 but that seems to be about Linux, which has a significantly different implementation.