runtime: Improve SelectMany behavior when the selector returns null

In the current version Enumerable.SelectMany experiences a NullReferenceException when selector returns null. This is undesirable because it’s a confusing crash with a “forbidden” exception type in BCL code. This should be changed.

Example: foreach(var item in new int[][] { new int[1], null }.SelectMany(x => x)) { }

The safest approach would be to throw an exception that explains that the user has provided an invalid selector that does not obey the contract. Additionally, the documentation should state the requirement to not return null.

An alternative approach would be to treat null as an empty sequence. I’m not sure which solution is better. Sometimes, an empty sequence would be very convenient. Sometimes this would hide a bug. Also it’s unclear that this can be done due to compatibility concerns. I think throwing a different exception type would be safe enough to take the change.

About this issue

  • Original URL
  • State: closed
  • Created 8 years ago
  • Reactions: 1
  • Comments: 19 (11 by maintainers)

Most upvoted comments

I really wonder what code would depend on a NRE being thrown from a SelectMany. I would make the case to simply fix this. I think the perf loss from this null check would be very small and pale in comparison to other perf losses that are routinely accepted by LINQ to improve usability.

It is very, very rare for the BCL to take the stance that throwing a bad exception type is OK. I cannot think of a single other such case (they might exist). I personally value this quality of the framework very much.

I understand the argument of cost, but is that really worth the inconsistency? It took me a while to track the bug down that lead to this ticket. I was searching the lambda for possible causes.

Sometimes, an empty sequence would be very convenient. Sometimes this would hide a bug.

It’s the hiding a bug part that matters most to me. If I’m expecting null (which I think should be rare because null and empty list are not semantically similar from my perspective), I’d rather make that expectation explicit via .SelectMany(x => x ?? Enumerable.Empty<int>()).

I cannot think of a single other place where the BCL allows a NRE to be generated by framework code.

There are plenty of places.

Some of them are by design because we don’t want to pay any additional costs on very hot paths, e.g.

default(TaskAwaiter).GetResult();

will throw a NullReferenceException internally as it dereferences a null Task field. This is considered invalid use, and it’s not worth the extra code bloating all of the places this gets inlined into as part of awaits.

Some are accidental and we’ve just never gone back and addressed them because nulls are considered invalid inputs to them and we’d simply be changing the exception type from one indication of a null failure to another indication of a null failure, e.g.

new XPathDocument(default(Stream));

currently throws a NullReferenceException, as it doesn’t validate that the expected non-nullable Stream is actually non-null. For these, we generally accept PRs to “fix” them, primarily to a) make failures easier to debug (callers get an exception that provides more detail about what was the actual problem) and b) to highlight that the failure wasn’t actually because of a bug in the library but rather with how the library was being used; that, after all, is the primary purpose of ArgumentNullException.

This SelectMany case falls into the latter bucket. null is not allowed to be returned from the delegate, and as I noted, that is now explicitly codified with nullable reference types, with the delegate defined as Func<TSource, IEnumerable<TResult>> rather than the erroneous Func<TSource, IEnumerable<TResult>?>. I don’t see a good reason to change the signature to allow nulls nor to change the implementation to pretend that null means an empty collection. But I think it’d be fine to add a check for a null return that throws an InvalidOperationException with a message explaining that the delegate returned null when it was required to return a valid collection.

I’m not concerned about breaking changes here; we generally consider it acceptable to replace NullReferenceExceptions with something else.

I’m also not concerned about the overhead: the cost of an extra null check won’t be measurable when compared to the cost of the proceeding delegate invocation plus at a minimum an interface invocation (or if special-casing was desired, a type check), and most likely many more plus allocation as the returned enumerable is enumerated, on top of whatever is done inside of the callback and whatever is done by the consumer of the SelectMany.

I would not advise flattening but rather throwing a non-boneheaded exception. The exception comes from BCL code, not from user code.

We do not believe the value is worth the extra check (has small perf impact). We do not believe the scenario of passing null to SelectMany is that common in running applications. We agree that it might be slightly better in development time, but that’s about as much value as we can think of. If you disagree, feel free to provide additional data.