FASTER: (1.7.2) NotImplementedException thrown sometimes in VarLenBlittableAllocator.cs when deleting entry, using SpanByte as key and value

I’m using FasterKV<SpanByte, SpanByte> from the newest release (1.7.2) in an otherwise fully functional project, but I sometimes get a System.NotImplementedException at VarLenBlittableAllocator.cs:line 191 when deleting entries in bulk.

When I delete entries one by one manually, everything works fine, but if I iterate through keys from a List<> or Dictionary<>, or just call Delete many times during stress tests, I always get this exception.

Stacktrace:

System.NotImplementedException
  HResult=0x80004001
  Message=The method or operation is not implemented.
  Source=FASTER_Test2
  StackTrace:
   at FASTER.core.VariableLengthBlittableAllocator`2.ShallowCopy(Key& src, Key& dst) in ...\FASTER\core\Allocator\VarLenBlittableAllocator.cs:line 191
   at FASTER.core.FasterKV`2.InternalDelete[Input,Output,Context,FasterSession](Key& key, Context& userContext, PendingContext`3& pendingContext, FasterSession fasterSession, FasterExecutionContext`3 sessionCtx, Int64 lsn) in ...\FASTER\core\Index\FASTER\FASTERImpl.cs:line 1029
   at FASTER.core.FasterKV`2.ContextDelete[Input,Output,Context,FasterSession](Key& key, Context context, FasterSession fasterSession, Int64 serialNo, FasterExecutionContext`3 sessionCtx) in ...\FASTER\core\Index\FASTER\FASTER.cs:line 537
   at FASTER.core.ClientSession`6.Delete(Key& key, Context userContext, Int64 serialNo) in ...\FASTER\core\ClientSession\ClientSession.cs:line 415
   at KV.KVStore.WorkThread.Loop() in ...\FASTER\KVStore.cs:line 2370
   at System.Threading.ThreadHelper.ThreadStart_Context(Object state) in /_/src/System.Private.CoreLib/src/System/Threading/Thread.CoreCLR.cs:line 50
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state) in /_/src/System.Private.CoreLib/shared/System/Threading/ExecutionContext.cs:line 172
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() in /_/src/System.Private.CoreLib/shared/System/Runtime/ExceptionServices/ExceptionDispatchInfo.cs:line 63
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state) in /_/src/System.Private.CoreLib/shared/System/Threading/ExecutionContext.cs:line 200
   at System.Threading.ThreadHelper.ThreadStart() in /_/src/System.Private.CoreLib/src/System/Threading/Thread.CoreCLR.cs:line 100

  This exception was originally thrown at this call stack:
    FASTER.core.VariableLengthBlittableAllocator<Key, Value>.ShallowCopy(ref Key, ref Key) in VarLenBlittableAllocator.cs
    FASTER.core.FasterKV<Key, Value>.InternalDelete<Input, Output, Context, FasterSession>(ref Key, ref Context, ref FASTER.core.FasterKV<Key, Value>.PendingContext<Input, Output, Context>, FasterSession, FASTER.core.FasterKV<Key, Value>.FasterExecutionContext<Input, Output, Context>, long) in FASTERImpl.cs
    FASTER.core.FasterKV<Key, Value>.ContextDelete<Input, Output, Context, FasterSession>(ref Key, Context, FasterSession, long, FASTER.core.FasterKV<Key, Value>.FasterExecutionContext<Input, Output, Context>) in FASTER.cs
    FASTER.core.ClientSession<Key, Value, Input, Output, Context, Functions>.Delete(ref Key, Context, long) in ClientSession.cs
    KV.KVStore.WorkThread.Loop() in KVStore.cs
    System.Threading.ThreadHelper.ThreadStart_Context(object) in Thread.CoreCLR.cs
    System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, object) in ExecutionContext.cs
    System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() in ExceptionDispatchInfo.cs
    System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, object) in ExecutionContext.cs
    System.Threading.ThreadHelper.ThreadStart() in Thread.CoreCLR.cs

As the simplest fix, I tried modifying these two functions in VarLenBlittableAllocator.cs:

        public override void ShallowCopy(ref Key src, ref Key dst)
        {
            dst = src;
            // throw new NotImplementedException();
        }

        public override void ShallowCopy(ref Value src, ref Value dst)
        {
            dst = src;
            // throw new NotImplementedException();
        }

…and well, it works. I’m pretty sure this is not the proper solution, but since the modification I have not seen the exception again, and deleting entries works just fine.
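For context on what the `dst = src` workaround does: in C#, assigning one struct to another copies its fields, so a struct that holds a reference ends up sharing the underlying bytes instead of duplicating them. The sketch below uses a hypothetical `ByteView` struct as a stand-in. It is not FASTER's `SpanByte`, and it makes no claim about whether a shallow copy is the right behavior inside the allocator.

```csharp
using System;

// Hypothetical stand-in for a variable-length view type; NOT FASTER's SpanByte.
public struct ByteView
{
    public int Length;      // number of valid bytes
    public byte[] Payload;  // reference to the backing storage

    public ByteView(byte[] payload)
    {
        Payload = payload;
        Length = payload.Length;
    }
}

public static class ShallowCopyDemo
{
    // Mirrors the shape of the workaround: plain struct assignment through refs.
    public static void ShallowCopy(ref ByteView src, ref ByteView dst)
    {
        dst = src; // copies Length and the Payload reference, not the bytes
    }

    // Returns true when both views point at the same backing array.
    public static bool SharesStorage(ByteView a, ByteView b)
    {
        return ReferenceEquals(a.Payload, b.Payload);
    }
}
```

After `ShallowCopyDemo.ShallowCopy(ref a, ref b)`, `SharesStorage(a, b)` returns true, and a write to `a.Payload[0]` is visible through `b.Payload[0]`.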

I didn’t have this issue with 1.7.1, but I can’t simply switch library versions to check whether the previous version throws this exception in this test project: SpanByte and the SpanByte Functions class differ considerably between 1.7.1 and 1.7.2, and my custom Functions classes are not really interchangeable between them.

Why do I get this exception? Is the solution I came up with OK? I can’t really figure out how and why these functions are called internally; I’d need more time to understand how FASTER works.

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 54 (28 by maintainers)

Most upvoted comments

@badrishc

Does this help explain the issue and fix?

Yes, thank you for this info; it was helpful and made things clear. I changed the compaction method to

        long toAddr = kv.Log.BeginAddress + (long)Math.Round(((double)(kv.Log.SafeReadOnlyAddress - kv.Log.BeginAddress) * ratio));
        long compAddr = session.Compact(toAddr, false);
        Checkpoint(true, true);
        kv.Log.ShiftBeginAddress(compAddr);
        Checkpoint(true, true);

and it works as intended.

Everything seems to work fine for now. Thanks again!
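To make the target-address arithmetic in that snippet concrete, here is a small sketch of the same calculation using plain `long` values instead of the `kv.Log` properties. The `CompactionTarget` class, the `UntilAddress` name, and the range check on `ratio` are illustrative assumptions, not part of FASTER.

```csharp
using System;

// Hypothetical helper for illustration; not part of FASTER.
public static class CompactionTarget
{
    // Picks an address `ratio` of the way from beginAddress to safeReadOnlyAddress,
    // using the same arithmetic as the snippet above.
    public static long UntilAddress(long beginAddress, long safeReadOnlyAddress, double ratio)
    {
        // Assumption: a ratio outside [0, 1] is treated as a caller error.
        if (ratio < 0.0 || ratio > 1.0)
            throw new ArgumentOutOfRangeException(nameof(ratio));

        return beginAddress + (long)Math.Round((safeReadOnlyAddress - beginAddress) * ratio);
    }
}
```

For example, with a begin address of 64, a safe read-only address of 1064, and a ratio of 0.5, the span is 1000 bytes, half of it is 500, and the result is 564.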

Very nice repro, thank you @resetcoder. Here is the explanation:

  • Before you start the first compaction (with shiftBeginAddress=false), the database still has its log begin address at the original location (64) in the log. Then you perform the compaction without shifting the begin address.
  • Then you take a checkpoint. FASTER saves all details, including the current begin address (at the original location 64 in the log).
  • Then you shift the begin address, so now the log has only the newly compacted data in it. But you have not checkpointed this new state of the system.
  • Now if you recover the original checkpoint, it will restore the original begin address. This is why you are also seeing the older entries after recovery.
  • So, to fix this, after shifting the begin address, you need to take another checkpoint. That will store the new begin address in the checkpoint file for recovery.
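The ordering described in the steps above can be sketched with a toy model. Everything here (`ToyLog`, `CheckpointOrdering`, and their members) is hypothetical and only mimics the one behavior under discussion: a checkpoint records the begin address at the moment it is taken, and recovery restores exactly that value. It does not use or represent FASTER's actual API.

```csharp
using System;

// Toy model only: hypothetical types, not FASTER's API.
public sealed class ToyLog
{
    public long BeginAddress { get; private set; }

    public ToyLog(long beginAddress) { BeginAddress = beginAddress; }

    // Advances the begin address (never moves it backward).
    public void ShiftBeginAddress(long newBegin)
    {
        if (newBegin > BeginAddress) BeginAddress = newBegin;
    }

    // A checkpoint captures the begin address at the moment it is taken.
    public long TakeCheckpoint() => BeginAddress;

    // Recovery restores whatever the checkpoint captured.
    public static ToyLog Recover(long checkpointedBegin) => new ToyLog(checkpointedBegin);
}

public static class CheckpointOrdering
{
    // Checkpoint, then shift: the shift is never persisted.
    public static long RecoverWithoutSecondCheckpoint(long originalBegin, long compactedUntil)
    {
        var log = new ToyLog(originalBegin);
        long checkpoint = log.TakeCheckpoint();
        log.ShiftBeginAddress(compactedUntil);
        return ToyLog.Recover(checkpoint).BeginAddress;
    }

    // Checkpoint, shift, checkpoint again: the latest checkpoint holds the new address.
    public static long RecoverWithSecondCheckpoint(long originalBegin, long compactedUntil)
    {
        var log = new ToyLog(originalBegin);
        log.TakeCheckpoint();
        log.ShiftBeginAddress(compactedUntil);
        long checkpoint = log.TakeCheckpoint();
        return ToyLog.Recover(checkpoint).BeginAddress;
    }
}
```

With an original begin address of 64 and a compacted-until address of 564, the first method recovers to 64 (so the older entries reappear), while the second recovers to 564.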

Does this help explain the issue and fix?

Fix is in the linked PR.