wpf: Startup cost of XamlServices parsing is 27% slower on .NETCore than on .NET Framework

  • .NET Core Version: 3.0 Preview1
  • Windows version: 1803
  • Does the bug reproduce also in WPF for .NET Framework 4.8?: No

Problem description: I expected (or at least hoped) that, with XmlReader presumably being Span<T>-based, the XAML parser would be faster on .NET Core than on the .NET Framework equivalent. However, that’s not what I’m seeing with this benchmark:

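    using BenchmarkDotNet.Attributes;

    // The namespace and assembly name must match the clr-namespace mapping used in xamlString.
    namespace XamlBenchmark
    {
        public class BenchmarkTests
        {
            static readonly string xamlString = @"<MyObject xmlns=""clr-namespace:XamlBenchmark;assembly=XamlBenchmark"" StringProperty=""Hello World"" Int32Property=""1234"" DoubleProperty=""123.4567890"" FloatProperty=""-0.9876"" />";

            [Benchmark]
            public object Test1()
            {
                // Parse a single-element XAML string into a MyObject instance.
                object instance = System.Xaml.XamlServices.Parse(xamlString);
                return instance;
            }
        }

        // Simple target type with one property per primitive kind being parsed.
        public class MyObject
        {
            public string StringProperty { get; set; }
            public int Int32Property { get; set; }
            public double DoubleProperty { get; set; }
            public float FloatProperty { get; set; }
        }
    }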

Here are the results:


BenchmarkDotNet=v0.11.3, OS=Windows 10.0.17763.1 (1809/October2018Update/Redstone5)
Intel Xeon CPU E5-1620 v3 3.50GHz, 1 CPU, 8 logical and 4 physical cores
  [Host]     : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3190.0
  Job-FQKGZY : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3190.0
  Job-YTBJBT : .NET Core 3.0.0-preview-27122-01 (CoreCLR 4.6.27121.03, CoreFX 4.7.18.57103), 64bit RyuJIT


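| Method | Runtime | Toolchain     | Mean     | Error    | StdDev   | Ratio | RatioSD |
|--------|---------|---------------|----------|----------|----------|-------|---------|
| Test1  | Clr     | net472        | 363.3 us | 7.139 us | 7.639 us | 1.00  | 0.00    |
| Test1  | Core    | netcoreapp3.0 | 461.6 us | 9.078 us | 8.048 us | 1.27  | 0.03    |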

About this issue

  • State: open
  • Created 6 years ago
  • Reactions: 21
  • Comments: 39 (24 by maintainers)

Most upvoted comments

I love seeing this kind of issue. Great to see the focus on fundamentals.

Thanks @dotMorten. This is likely to be expected, as Core is not currently ngened. We’ll address before Core3 ships.

18k calls from Parse -> XamlXmlReader.Initialize -> NodeStreamSorter.ctor -> StartNewNodeStreamWithSettingsPreamble become 14M calls to FilterCustomAttributeRecord, and that’s 44% of the parsing time.

@danmosemsft It’s also 24 loaded assemblies (.NET Core 3.0 latest daily build) vs 5 loaded assemblies (.NET Framework 4.7.2).

If you patch System.Xaml to only check the first 5 returned assemblies (.NET Framework only loads 5 assemblies), then .NET Core 3.0 code executes faster.

So it seems the roughly 5x larger number of loaded assemblies is causing the slowdown: the parser has to execute more code to examine them.

Original .NET Core 3.0 daily build:

| Name | Exc % | Exc | Inc % | Inc |
|------|-------|-----|-------|-----|
| module system.xaml <<system.xaml!XamlServices.Parse>> | 24.0 | 1,073 | 91.6 | 4,090 |
| module system.private.corelib <<system.private.corelib!Attribute.GetCustomAttributes>> | 14.8 | 662 | 36.3 | 1,619 |
| module coreclr <<coreclr!JIT_New>> | 10.3 | 462 | 10.6 | 472 |
| module clrjit <<clrjit!CILJit::compileMethod>> | 5.4 | 239 | 5.4 | 243 |
| module coreclr <<coreclr!JIT_NewArr1>> | 5.4 | 239 | 5.7 | 253 |
| module coreclr <<coreclr!ModuleHandle::ResolveType>> | 5.2 | 234 | 5.2 | 234 |
| module coreclr <<coreclr!MetaDataImport::GetCustomAttributeProps>> | 5.2 | 231 | 5.6 | 248 |
| module system.private.corelib <<system.private.corelib!CustomAttributeData.GetCustomAttributesInternal>> | 4.9 | 219 | 22.9 | 1,023 |
| module coreclr <<coreclr!Attribute::ParseAttributeArguments>> | 4.1 | 183 | 5.4 | 241 |
| module coreclr <<coreclr!MetaDataImport::Enum>> | 3.4 | 152 | 4.0 | 178 |
| module ntdll <<ntdll!LdrpDispatchUserCallTarget>> | 2.6 | 116 | 2.6 | 116 |
| module coreclr <<coreclr!MetaDataImport::GetParentToken>> | 2.6 | 116 | 2.8 | 127 |
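
To get a rough feel for why the number of loaded assemblies matters, here is a minimal sketch that times a walk over the assembly-level custom attributes of every loaded assembly. It is not the code System.Xaml runs: the class name `AssemblyAttributeScan` is hypothetical, and `GetCustomAttributesData` is used only as a stand-in for the attribute lookups that show up in the profile above. The point is simply that the work grows with the number of assemblies examined.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

// Hypothetical helper, for illustration only; not part of System.Xaml.
public static class AssemblyAttributeScan
{
    // Sums the number of assembly-level custom attribute records
    // across the given assemblies.
    public static int CountAssemblyAttributes(Assembly[] assemblies)
    {
        int total = 0;
        foreach (Assembly assembly in assemblies)
        {
            // Reads attribute metadata without instantiating the attributes.
            total += assembly.GetCustomAttributesData().Count;
        }
        return total;
    }

    public static void Main()
    {
        // Every assembly currently loaded in this process.
        Assembly[] loaded = AppDomain.CurrentDomain.GetAssemblies();

        Stopwatch stopwatch = Stopwatch.StartNew();
        int attributeCount = CountAssemblyAttributes(loaded);
        stopwatch.Stop();

        Console.WriteLine(
            $"{loaded.Length} assemblies, {attributeCount} attribute records, " +
            $"{stopwatch.Elapsed.TotalMilliseconds:F3} ms");
    }
}
```

Running this once on each runtime (from an otherwise identical app) should show both the assembly count and the scan time, which makes it easier to check whether the 24-versus-5 assembly difference accounts for the gap on a given machine.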

Also, looking at allocations while running this parse method many times, I see this hit over 40 times per parse: [image: allocation trace]

And here’s some of the CPU analysis: [image: CPU profile]

Moving this to 3.1. The discussion has targeted the initialization of XamlSchemaContext within XamlServices.Parse (et al.), which examines more assemblies than it did on .NET Framework. I don’t think it’s critical to fix this for 3.0, because:

  1. it only affects parse/load of loose XAML
  2. it’s an initialization expense, whose relative effect diminishes as the string being parsed grows longer or more complex
  3. it’s already possible for an app to mitigate this

In short, the effect will be felt in practice only by apps that load lots of small XAML snippets. This isn’t the mainstream case, and such apps have a workaround.

Details supporting these claims.

All apps load BAML, which uses a built-in SchemaContext. But relatively few apps load loose XAML via explicit calls to XamlServices.Parse et al. At least that’s my impression from looking at apps that have come my way (largely through bug reports) - we don’t have telemetry for this.

The 27% figure comes from parsing a very simple string (one XAML tag); a real app typically parses longer strings, and will see less of a difference. I tried strings with a few hundred tags, and the difference was not worth noting.

Most of the XamlServices methods call new XamlSchemaContext(), but an app can avoid the initialization expense by using the one method that doesn’t. Replace XamlServices.Parse(mystring) with

            StringReader stringReader = new StringReader(mystring);
            using (XmlReader xmlReader = XmlReader.Create(stringReader))
            {
                XamlXmlReader xamlReader = new XamlXmlReader(xmlReader, CachedSchemaContext);
                return XamlServices.Load(xamlReader);
            }

where CachedSchemaContext is a XamlSchemaContext that the app creates once and re-uses every time it calls Parse. (Similarly for XamlServices.Load(stream) et al.)

An app that does this is responsible for releasing the cached context when it’s no longer needed, if that’s relevant to the overall behavior. This is probably not an issue, except perhaps for apps with complex interaction with AppDomain, or similar advanced architecture.
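
The workaround above could be wrapped up roughly as follows. This is a sketch under assumptions, not an official API: the class `CachedXamlParser` and its `ReleaseContext` method are hypothetical names, and the lock only guards creation and release of the cached field. It says nothing about whether sharing one XamlSchemaContext across threads suits a particular app.

```csharp
using System.IO;
using System.Xaml;
using System.Xml;

// Hypothetical wrapper around the workaround; the names are illustrative.
public static class CachedXamlParser
{
    private static readonly object s_lock = new object();
    private static XamlSchemaContext s_context;

    // Creates the schema context on first use and reuses it afterwards.
    private static XamlSchemaContext GetContext()
    {
        lock (s_lock)
        {
            if (s_context == null)
            {
                s_context = new XamlSchemaContext();
            }
            return s_context;
        }
    }

    // Drop-in replacement for XamlServices.Parse(xaml) that avoids
    // constructing a new XamlSchemaContext on every call.
    public static object Parse(string xaml)
    {
        XamlSchemaContext context = GetContext();
        using (StringReader stringReader = new StringReader(xaml))
        using (XmlReader xmlReader = XmlReader.Create(stringReader))
        {
            XamlXmlReader xamlReader = new XamlXmlReader(xmlReader, context);
            return XamlServices.Load(xamlReader);
        }
    }

    // Lets the app release the cached context when it is no longer needed.
    public static void ReleaseContext()
    {
        lock (s_lock)
        {
            s_context = null;
        }
    }
}
```

An app would call `CachedXamlParser.Parse(mystring)` wherever it previously called `XamlServices.Parse(mystring)`, and call `ReleaseContext()` at whatever point holding on to the context is no longer appropriate.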

I’ve tried building this idea into XamlServices itself. The early results are good, but it needs more investigation. For example, XamlServices can’t know when to release the cached context, and I don’t fully understand the consequences of keeping it alive “forever”.

Just FYI it is issues like this that have caused us to stop our port to netcore. I had hoped performance would be better. I hope these issues continue to see activity and improvement.

For reference: @bugproof’s comment is unrelated to this issue or to WPF. (.NET Core implements HTTPRequest differently from .NET Framework.)

This was milestoned to 7.0, but the startup time of WPF apps running on .NET 7.0 is still very slow compared to .NET Framework 4.8.

Just my 2 cents: I also ported a large WPF+EntityFramework app to core 3.0 and found that load times increased from <4 seconds for .Net Framework to 5+ seconds with .Net Core 3.0. This is with creating native images for both in release and debug builds. Runtime performance seemed largely unchanged so I abandoned this porting effort and will try this again with .Net 5 when it is released. There are numerous reports of slow-downs like in the post above and also this one: https://github.com/dotnet/wpf/issues/94

My impression is that some work is required to address all these already reported WPF + .Net Core performance issues in order to make it worthwhile to perform this migration for more complex applications with lots of assemblies and XAML resources.

Cheers Philip