runtime: `InlineArray` size validation does not handle all cases

Description

The size of an InlineArray is supposed to be limited to 1MiB, but the example below slips through:

using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

[InlineArray(2)]
struct BigArray
{
    BigElement element;
}

// Each element claims int.MaxValue bytes, so the total size of BigArray
// (2 * int.MaxValue) far exceeds the 1MiB limit and overflows Int32.
[StructLayout(LayoutKind.Sequential, Size = int.MaxValue)]
struct BigElement
{
}

Reproduction Steps

Given the type definitions above:

unsafe { Console.WriteLine(sizeof(BigArray)); }

Expected behavior

The runtime should throw a TypeLoadException.

Actual behavior

No exception occurs, and sizeof(BigArray) returns -2.
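
For reference, -2 matches an unchecked 32-bit wrap of the computed size (2 elements × int.MaxValue bytes):

unchecked
{
    // 2 * int.MaxValue = 4,294,967,294, which wraps to -2 as a signed Int32
    Console.WriteLine(2 * int.MaxValue); // prints -2
}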

Regression?

No response

Known Workarounds

No response

Configuration

No response

Other information

No response

About this issue

  • Original URL
  • State: open
  • Created 7 months ago
  • Comments: 16 (16 by maintainers)

Most upvoted comments

“arbitrary implementation detail”

This is no different than the 1MB currently selected.

“potentially platform specific”

This is no different than things like Array.MaxLength or other limits that can be hit. There are platform- or runtime-specific behaviors in some cases. Even with InlineArray and the current 1MB, whether that will work or not is just as platform specific. By default on macOS, such a limit will always overflow background threads (which default to 512KB stacks).

These limits exist, but they are designed to be high enough that almost no code will ever need to consider them. But even with these very high limits, people still do hit them, and then we have to decide whether it’s worth changing or adjusting a limit later (and in some cases we can’t change it later).
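
For a concrete sense of this, Array.MaxLength is one such limit (a small illustration, assuming a runtime where exceeding it throws OutOfMemoryException):

Console.WriteLine(Array.MaxLength); // 0x7FFF_FFC7 = 2,147,483,591 elements

try
{
    // One element past the limit still fits in an int,
    // but the runtime rejects the allocation outright.
    byte[] tooBig = new byte[Array.MaxLength + 1];
}
catch (OutOfMemoryException)
{
    Console.WriteLine("exceeded the supported array range");
}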

“clearly too large”

While 134,217,726 bytes (roughly 127.99 MB) may be significantly larger than normally “necessary”, I wouldn’t say it’s clearly too large. Something like string has a current limit of 2,147,483,582 bytes (roughly 2047.99 MB). Array has a limit of 0x7FFF_FFC7 elements (and so can potentially be many GB).

There are clearly cases for users to have inline buffers that large in other types, and while they may primarily use things like T[] or string to handle many of these, there are still specialized scenarios that could easily warrant larger buffers, especially as trailing data.

It’s been discussed in other threads how having the ability to define something like Array<T, TMetadata>, where the layout is logically int length; TMetadata metadata; T item0, ..., itemN;, would be beneficial. For example, it could allow us to implement a List<T> whose bounds checks can be safely elided, since we could prevent the tear between int _size and T[] _items.
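
A minimal sketch of that shape, assuming a fixed capacity (variable-length inline data is not expressible today); Buffer256 and InlineList are illustrative names, not proposed APIs:

using System;
using System.Runtime.CompilerServices;

[InlineArray(256)]
struct Buffer256<T>
{
    private T _element0;
}

// The length and the storage live in one object, so there is no tear
// between a separate _size field and a separate _items array.
sealed class InlineList<T>
{
    private int _size;
    private Buffer256<T> _items;

    public void Add(T item)
    {
        Span<T> span = _items;  // inline arrays convert to Span<T>
        span[_size++] = item;   // capacity handling omitted for brevity
    }

    public T this[int index]
    {
        get
        {
            if ((uint)index >= (uint)_size)
                throw new IndexOutOfRangeException();
            ReadOnlySpan<T> span = _items;
            return span[index];
        }
    }
}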

Another example is that it’s common to do instancing in games/graphics, so you may have n sequential Matrix4x4 values. A Matrix4x4 is 64 bytes, so you get up to 16k instances before you hit the limit. Some games can easily instance 16k meshes, so this unnecessarily forces you to break up your data structures.
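
To make the arithmetic concrete (InstanceTransforms is an illustrative name):

using System.Numerics;
using System.Runtime.CompilerServices;

// 16_384 * 64 bytes = 1,048,576 bytes, the stated 1MiB ceiling;
// one more instance pushes past it.
[InlineArray(16_384)]
struct InstanceTransforms
{
    private Matrix4x4 _element0;
}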

Users can likewise define structs that are bigger than 128 MB, and so while a field offset can’t be greater than 128 MB, it seems like it generally “just works” provided it’s the trailing field and therefore the offset to the next field isn’t greater than 127.99 MB.


It really seems better to just not give this a real limit (outside preventing overflow) and to let users hit it themselves, as they would if they wrote [StructLayout(LayoutKind.Sequential, Size = 1024 * 1024 * 256)] by hand: they hit a TypeLoadException if it isn’t the last field in a struct. This doesn’t unnecessarily restrict things while also avoiding many of those edge case bugs.

// This is fine today
[StructLayout(LayoutKind.Sequential, Size = 1024 * 1024 * 256)]
struct S1 { }

// This is fine today
struct S2 { public int x; public S1 s; }

// This fails today
struct S3 { public int x; public S1 s; public int y; }

// This fails today
class C { public S2 s; }

S2 is then still fully usable with NativeMemory and in other scenarios, without users having to write even worse or more complex code. Allowing InlineArray here just makes this simpler, cleaner, and ultimately safer for the users already doing dangerous or out-of-the-norm things, while still failing in the same cases the alternative already would.
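
A minimal sketch of that NativeMemory usage, reusing the S1/S2 definitions above:

using System;
using System.Runtime.InteropServices;

unsafe
{
    // The 256MB payload lives in native memory; no stack frame
    // or GC heap allocation is involved.
    S2* p = (S2*)NativeMemory.Alloc((nuint)sizeof(S2));
    p->x = 42;
    Console.WriteLine(p->x);
    NativeMemory.Free(p);
}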

This is likely just a bug with the size check not covering one of the scenarios or not checking for overflow. Possibly just on one runtime. I do not think this is a reason to change the actual limit.

The idea is that once an inline array is too large to be safely used on the stack, the heap-only use does not have enough advantage over regular arrays. Another concern is that even in borderline cases the behavior could be fragile: it could expose differences in codegen strategy between JITs and modes (i.e. Release/Debug) and in stack size limits between platforms, and lead to random stack overflows.

Thus the limit is 1MiB. More than that is likely a mistake or a misuse of the feature.
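
A sketch of the stack-size concern (StackBuffer is an illustrative type, sized to fit under the cap while exceeding the 512KB macOS background-thread stack mentioned earlier):

using System;
using System.Runtime.CompilerServices;

[InlineArray(768 * 1024)] // 768KB: under the 1MiB cap, yet bigger than a
struct StackBuffer        // 512KB background-thread stack
{
    private byte _element0;
}

class Example
{
    static void UseOnStack()
    {
        // The frame needs ~768KB for this local alone, so a 512KB stack
        // can overflow as soon as the method is entered.
        StackBuffer buffer = default;
        Span<byte> span = buffer;
        span[0] = 1;
    }
}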

“the heap-only use does not have enough advantage over regular arrays.”

There is notably significant advantage in certain types of perf-oriented code. Removing that indirection can be very important and can allow developers to codify their own powerful types that have large amounts of inline data. For example, it could be used to build specialized types of ECS.
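
A sketch of the ECS shape (PositionBlock and ComponentChunk are hypothetical names):

using System.Numerics;
using System.Runtime.CompilerServices;

[InlineArray(4096)]
struct PositionBlock
{
    private Vector3 _element0; // 4096 * 12 bytes = 48KB stored inline
}

struct ComponentChunk
{
    public int Count;
    public PositionBlock Positions; // component data lives in the chunk itself,
                                    // not behind a separate Vector3[] indirection
}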

The current limit is in fact arbitrary, and there are platforms where it might overflow sooner. There are likewise platforms where it might overflow later. This is no different than stackalloc or other types of custom functionality, which can incur the same issue.

I think it’s ultimately better to just not restrict it, outside of disallowing overflow, and to allow users to freely use it as they desire. If there is concern over “too large” types, then having an analyzer that warns when users specify more than a certain size seems like a reasonable middle ground.