runtime: Not documented that the List.RemoveAll method can corrupt the list

I noticed that the RemoveAll method can corrupt a List<T>, in case the match predicate throws for some element, but not for all elements. Here is a minimal demonstration of this behavior. A list has three elements, the first is removed, but the predicate fails for the third element:

using System;
using System.Collections.Generic;

public static class Program
{
    public static void Main()
    {
        var list = new List<int>() { 1, 2, 3 };
        Console.WriteLine($"Before RemoveAll: [{String.Join(", ", list)}]");
        try
        {
            list.RemoveAll(x =>
            {
                if (x == 3) throw new Exception();
                return x == 1;
            });
        }
        catch { }
        Console.WriteLine($"After RemoveAll: [{String.Join(", ", list)}]");
    }
}

Output:

Before RemoveAll: [1, 2, 3]
After RemoveAll: [2, 2, 3]

Live demo.

I can’t find in the internet any report of an actual problem caused by this behavior, but nonetheless it might be prudent to document it. I would suggest something along the lines:

Caution: In case the match predicate fails for some element, the List<T> might become corrupted. It is advised that the match predicate is fail-proof, otherwise the RemoveAll method should be invoked in a try block that has a catch block where the potentially corrupted List<T> is discarded.

About this issue

Original URL
State: closed
Created 2 years ago
Comments: 35 (35 by maintainers)

Most upvoted comments

Exceptions - in general, not just here - should be assumed in all cases to corrupt the state/be the result of corrupt state of any object they are returned from (barring some specific documented cases).

Even if this case was fixed (so the list wasn’t mutated this specific way), the list is corrupted simply because RemoveAll didn’t finish. It’s not usable, and there’s no way to automatically recover from it (ie, you can’t call the method again, because presumably it fails again).

Clockwork-Muse on Mar 6, 2022

On top of what @PathogenDavid said, if your predicate is throwing an exception, you simply have a bug. Rather than hoping the library will help you recover from that bug and plow on, you should fix the bug.

danmoseley on Mar 6, 2022

While @PathogenDavid is correct that the list isn’t corrupted, that’s an implementation detail, and in no way guaranteed. Almost all cases of “resistant” calls are due to such implementation details.

It doesn’t really matter. though - any of the ArgumentExceptions represent a bug in programmer code;

They should (barring some truly special cases that don’t apply to normal use) never be caught, and allow the program to crash.
The state of the program up to that point should be assumed to be corrupt, or otherwise not usable.

None of these methods seems to leave the collection in an unusable state. Items might be partially removed, and the collection might be partially sorted, but no duplicate values appear out of nothing

Your second sentence contradicts the first sentence. Those are corrupt states, that the “invariants” (expectations) for the method have been violated. If you were expecting a sorted list, the “partially sorted” one is unusable.

So maybe it’s just undocumented side effect feature?

It’s “undocumented”, because if it happens it’s due to a bug in your code, which is better off being fixed, and there’s no guarantee at all about the resulting state of the object, so is better for it not to be documented (and thus maybe guaranteed).

Generally speaking, the “safe” exceptions are going to be IOExceptions - they represent a boundary to the “outside” world. In most cases program state won’t be corrupted, but certainly the state of the filesystem might be (ie, file missing, partially written, etc); whether that’s “recoverable” depends on a variety of factors.

That aside, unless a method or library specifically states some form of “mutable inputs or outputs will be in such-and-such a state”, assume that they’re corrupted/unusable and discard them (and let the exception crash your program) - even if they appear safe from observation.

Clockwork-Muse on Mar 6, 2022

it should be assumed that the list has been corrupted? Or this assumption should be made only for non-documented exceptions?

No. The list will be fine in this case.

The problem is not necessarily the exception its self, but the reason it was thrown and who threw it. In your case the predicate threw an unexpected exception before RemoveAll could complete its work. The state you’re seeing is RemoveAll having only finished part of its work.

On the flip side, ArgumentOutOfRangeException is explicitly thrown by RemoveAt before work has even begun in order to tell you that you messed up. Since no work had begin, things will still be in a good state inside List<T>. – However, things are arguably bad with whatever was calling RemoveAt. Depending on what it was doing at the time things may be in a very bad state for it. (Although it was arguably already in a bad state since it passed an invalid index to RemoveAt.)

The main issue is you caught an exception you could not have reasonably expected and continued to use data which was being manipulated when the exception was thrown.

PathogenDavid on Mar 6, 2022

@theodorzoulias I agree that it is a little fuzzy at the highest level, as there are indeed some exceptions that .NET core API throw that it is reasonable to recover from (eg IO exceptions, in many cases). However one can assume at a very minimum that throwing an exception into library code is not safe unless explicitly documented otherwise.

You could add a line to the doc for RemoveAll to note that the delegate should not throw an exception. I am not sure it is worth doing (as I do not think I have seen other reports) but if you want to edit that doc (click the edit button in top right) feel free to offer that change.

As the docs for collections are currently maintained in that docs repo, it would not be a change in this repo so I’ll close this now.

danmoseley on Mar 7, 2022

@theodorzoulias LINQ does not modify the input collection. You can safely use ids in the catch block.

teo-tsirpanis on Mar 7, 2022

I think it should catch the exception, make sure the list’s state is consistent (e.g. keep this and the rest of the elements), and rethrow it.

Edit: If similar cases are not hardened, it makes no sense to do it only for List.RemoveAll, and starting doing it widely does not seems like worth it (most exceptions are exceptional either way 😛)

teo-tsirpanis on Mar 6, 2022

To add – I would be opposed to documenting cases where throwing from a delegate would not (currently) corrupt the collection. We should reserve the right to change the implementation for whatever reason (such as to use a more performant algorithm or data structure) in such a way that such an exception would indeed corrupt the object.

danmoseley on Mar 7, 2022

Exceptions - in general, not just here - should be assumed in all cases to corrupt the state/be the result of corrupt state of any object they are returned from (barring some specific documented cases).

@Clockwork-Muse hmm, so in case for example that a list.RemoveAt(index) resulted in a (documented) ArgumentOutOfRangeException, it should be assumed that the list has been corrupted? Or this assumption should be made only for non-documented exceptions?

theodorzoulias on Mar 6, 2022