runtime: Proposal: ImmutableArray.UnsafeFreeze(T[])

Use case: efficiently creating an ImmutableArray with a fixed (>4) number of elements.

Presently if you need to create an ImmutableArray<T> with a known collection of elements, the most efficient way to do so is to create an ImmutableArray<T>.Builder, set its Capacity to allocate an array of the correct size, fill the Builder with repeated calls to Add, and then use MoveToImmutable to create the ImmutableArray. (This strategy can be marginally improved by storing a cached Builder in a ThreadLocal.)

For element counts smaller than 4, there are overloads of ImmutableArray.Create which are implemented by creating an array and then immediately turning it into an ImmutableArray, but for more than 4 Create takes a params[] and copies it into an ImmutableArray. There is no way to create an array directly, fill it, and freeze it into an ImmutableArray without copying while pinky-promising never to mutate the original array again.

I set up a microbenchmark (see below) which uses codegen to call ImmutableArray<T>'s private T[] constructor, and it’s around 4-5 times faster for an array of 1000 longs. I found that a library of mine was spending a reasonable chunk of its time creating ImmutableArrays, and implementing this hack led to a global ~10% speedup in my end-to-end benchmarks. I’m not entirely happy about depending upon private implementation details like that though.

I propose adding a officially-supported static method to freeze an array into an ImmutableArray without copying it. I suggest calling it UnsafeFreeze to emphasise the fact that this method is risky: to use it correctly you have to make sure never to mutate the array that you passed in. (This name may be confusing given that it doesn’t have anything to do with C#'s unsafe; I’m open to alternative suggestions.)

public static class ImmutableArray
{
    [EditorBrowsable(EditorBrowsableState.Never)]
    // might wanna add [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static ImmutableArray<T> UnsafeFreeze<T>(T[] array)
        => new ImmutableArray<T>(array);
}

There may be a use case for a similar UnsafeThaw method which turns an ImmutableArray<T> into a T[] without copying, but I can’t think of one right now.

NB. There is precedent in the BCL for APIs which ask you to relinquish ownership of the argument. ArrayPool.Return requires you not to continue using the array you returned because it is liable to get handed out to a future caller of Rent.


Here’s the code for the microbenchmark I mentioned above (using BenchmarkDotNet):

public class Bench
{
    static readonly ThreadLocal<ImmutableArray<long>.Builder> _cachedBuilder = new ThreadLocal<ImmutableArray<long>.Builder>(ImmutableArray.CreateBuilder<long>);
    static readonly Func<long[], ImmutableArray<long>> _unsafeFreeze = GetUnsafeFreeze<long>();

    [Benchmark(Baseline = true)]
    public void CachedBuilder()
    {
        var builder = _cachedBuilder.Value;
        builder.Capacity = 1000;
        for (long i = 0; i < builder.Capacity; i++)
        {
            builder.Add(i);
        }
        builder.MoveToImmutable();
    }
    [Benchmark]
    public void UnsafeFreeze()
    {
        var array = new long[1000];
        for (long i = 0; i < array.Length; i++)
        {
            array[i] = i;
        }
        _unsafeFreeze(array);
    }

    static Func<T[], ImmutableArray<T>> GetUnsafeFreeze<T>()
    {
        var ctor = typeof(ImmutableArray<T>)
            .GetConstructors(BindingFlags.NonPublic | BindingFlags.Instance)
            .Single(c => c.GetParameters().Count() == 1 && c.GetParameters().Single().ParameterType.Equals(typeof(T[])));
        var param = Expression.Parameter(typeof(T[]));
        var body = Expression.New(ctor, param);
        var func = Expression.Lambda<Func<T[], ImmutableArray<T>>>(body, param);
        return func.Compile();
    }
}

And the results:

        Method |     Mean |     Error |    StdDev | Scaled | ScaledSD |
-------------- |---------:|----------:|----------:|-------:|---------:|
 CachedBuilder | 8.439 us | 0.1667 us | 0.4089 us |   1.00 |     0.00 |
  UnsafeFreeze | 1.944 us | 0.0399 us | 0.1169 us |   0.23 |     0.02 |

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Reactions: 2
  • Comments: 15 (15 by maintainers)

Most upvoted comments

I agree that getting around immutability is fundamentally unsafe operation. The question is whether we should have a API to codify the unsafe pattern instead of telling everybody to depend on internal implementation details.

For example, we have introduced Memory<T> System.Runtime.InteropServices.MemoryMarshal.AsMemory<T>(ReadOnlyMemory<T> readOnlyMemory). We could have said that folks should use Unsafe to cast ReadOnlyMemory<T> to Memory<T> and take a fragile dependency on the internal implementation details, but we did not. We have API to codify the pattern that gives us control over it - our API telemetry can see how many places it used for, etc. The API is in System.Runtime.InteropServices that is namespace reserved for unsafe code.

The leanest and fastest version of this hack is to use System.Runtime.CompilerServices.Unsafe. With that, the hack is one-linker like this:

byte[] a = ...
ImmutableArray<byte> ia = Unsafe.As<byte[], ImmutableArray<byte>>(ref a);

We had methods like these, but they were removed: https://github.com/dotnet/corefx/pull/196/

The current “solution” is to use hack like what you have done. You can find another version of the hack in dotnet/corefx#196. Multiple versions of this hack are replicating in uncontrolled way. I have even seen one in F#. I think it is a much worse situation than having officially santioned unsafe interop methods. Maybe we should reconsider?

@Sergio0694 I think the approved Span-based additions to the ImmutableArray API (https://github.com/dotnet/runtime/issues/22928#issuecomment-855154372) will work well for that use case:

var span = new ReadOnlySpan<byte>(buffer, length);

return ImmutableArray.Create(span);

More powerful builder helps in some cases, but it won’t ever solve the interop case when you need to interop efficiently with the existing code that expects regular arrays and guarantees immutability via documentation. Here is an example of this use case from Roslyn: https://github.com/dotnet/roslyn/blob/944ae114140ba77cbd4be370bf13a7f758f740b3/src/Compilers/Core/Portable/NativePdbWriter/PdbWriter.cs#L593

I looked into why the safe code is so much slower than the unsafe code, since a good optimizing compiler should be able to make the performance of using Builder close to the performance of bypassing that abstraction.

My conclusion is that adding a new dangerous API is not necessary. What is necessary is:

  1. The caller should be changed from setting Capacity and using Add() to setting Count and using the indexer setter:

    public void CachedBuilderIndexer()
    {
        var builder = _cachedBuilder.Value;
        builder.Count = 1000;
        for (int i = 0; i < builder.Count; i++)
        {
            builder[i] = i;
        }
        builder.MoveToImmutable();
    }
    
  2. The implementation of ImmutableArray<T>.Builder indexer setter should be tweaked so that the CLR will inline it.

With these two changes, by my measurements, the performance is only 1.24× slower than UnsafeFreeze, which I think is acceptable, especially considering that the original code is 5× slower than UnsafeFreeze.


If you want more details, here’s what I did:

First I looked at the disassembly of CachedBuilder (whose scaled performance relative to UnsafeFreeze is 5×) and noticed that the Add() method was not being inlined. So I created my own stripped-down copy of ImmutableArray<T> and tweaked its Add() to make it shorter in IL. This didn’t actually put it under the threshold, but it did trigger “Inline candidate is mostly loads and stores.”, which meant the method was inlined. The result is that the performance of this method, MyInlinedCachedBuilder, was 2.1×.

When I looked at the disassembly of the previous method, I noticed the presence of (inlined) EnsureCapacity. Since the array will not be resized in the loop, that is not necessary. So I switched to setting Count and then using the indexer setter. The performance of this method, MyCachedBuilderIndexer, was 3.0×. This is worse than the previous method, but we’re not done yet.

When I looked at the disassembly, I noticed that the indexer was not being inlined. So I tweaked its code by extracting throw into a separate method. This meant that the indexer was now being inlined, resulting in performance of this method, MyInlinedCachedBuilderIndexer of 1.24×, relative to UnsafeFreeze.

I ran this on .NET Core 2.1.0-preview2-26131-06. The code is here, the raw results were:

Method Mean Error StdDev Scaled ScaledSD Gen 0 Allocated
CachedBuilder 7.562 us 0.0469 us 0.0416 us 4.93 0.41 2.5024 7.84 KB
CachedBuilder
Indexer
4.502 us 0.0122 us 0.0108 us 2.94 0.25 2.5024 7.84 KB
MyCachedBuilder 7.894 us 0.0195 us 0.0183 us 5.15 0.43 2.4414 7.84 KB
MyInlinedCached
Builder
3.247 us 0.0147 us 0.0137 us 2.12 0.18 2.5024 7.84 KB
MyCachedBuilder
Indexer
4.591 us 0.0691 us 0.0612 us 2.99 0.25 2.5024 7.84 KB
MyInlinedCached
BuilderIndexer
1.909 us 0.0352 us 0.0312 us 1.24 0.11 2.5330 7.84 KB
UnsafeFreeze 1.545 us 0.0448 us 0.1322 us 1.00 0.00 2.5330 7.84 KB

Hey @krwq, thank you for chiming in! @svick’s code will not work in my scenario, as I’m working with a native buffer, so the best I can do is getting a ReadOnlySpan<byte> from it, which unfortunately won’t work with the builder approach as there currently doesn’t seem to be an overload of AddRange taking a ReadOnlySpan<T>. If that was added, then I guess I could just switch to that, if the difference is just about 20% (which arguably is still not negligible, but that’s another topic).

To recap:

using ComPtr<IDxcBlob> dxcBlobBytecode = ShaderCompiler.Instance.CompileShader(hlslSource.AsSpan());

byte* buffer = (byte*)dxcBlobBytecode.Get()->GetBufferPointer();
int length = checked((int)dxcBlobBytecode.Get()->GetBufferSize());

ImmutableArray<byte>.Builder builder = ImmutableArray.CreateBuilder<byte>(length);

builder.AddRange(new ReadOnlySpan<byte>(buffer, length)); // This API is missing (also, why?)

ImmutableArray<byte> bytecode = builder.MoveToImmutable();

Should we open a separate proposal for that? I don’t really understand why a ReadOnlySpan<T> overload isn’t there already, so it seems to me that would be a nice addition regardless, unless there was a specific reason why that wasn’t added?

“byte array is an implementation detail and it should not be exposed as public API”

I have to say, the current approach of considering Unsafe.As<TFrom, TTo> a “valid solution” is exposing the underlying implementation detail much more than a public facing API like the one proposed here would do 🤔

EDIT: ah, @svick beat me to it ahahah Yeah, that proposed API seems like it would solve my problem completely, that’s awesome! 😄 I guess I’ll just stick to Unsafe.As<TFrom, TTo> for now then until that comes out. Thanks!