orleans: Consul KV entries for silos are not cleaned up on a new run

Hello

I’m facing a problem similar to https://github.com/dotnet/orleans/issues/6090

On the first run, I’m able to spin up the silo fine and see a KV entry generated per silo in Consul.

However, when I spin up the silo container the next time on a new port, the client tries to reach the silo from the previous run on the previous port.

I’m seeing this error in the silo container:

Exception when trying to get GrainInterfaceMap for silos S10.84.36.86:20038:319257509

I also have these settings in my code:

options.DefunctSiloCleanupPeriod = TimeSpan.FromMinutes(2);
options.DefunctSiloExpiration = TimeSpan.FromMinutes(2);

I believed these would clean up the old KV entries, but that doesn’t seem to be happening.
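
For context, here is a minimal sketch of where these two settings would typically live. The post does not show the surrounding code, so the options class (`ClusterMembershipOptions`) and the extension method name `ConfigureDefunctSiloCleanup` are assumptions for illustration; only the two property assignments come from the post.

```csharp
using System;
using Orleans.Configuration;
using Orleans.Hosting;

public static class MembershipCleanupConfig
{
    // Hypothetical helper name, not part of Orleans.
    // Assumes the settings in the post belong to ClusterMembershipOptions.
    public static ISiloBuilder ConfigureDefunctSiloCleanup(this ISiloBuilder siloBuilder) =>
        siloBuilder.Configure<ClusterMembershipOptions>(options =>
        {
            // How often the silo asks the membership table to purge defunct entries.
            options.DefunctSiloCleanupPeriod = TimeSpan.FromMinutes(2);
            // How long an entry must be stale before it counts as defunct.
            options.DefunctSiloExpiration = TimeSpan.FromMinutes(2);
        });
}
```

Whether the purge actually happens depends on the membership provider implementing `CleanupDefunctSiloEntries`, so these values alone may not be enough.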

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 16 (6 by maintainers)

Most upvoted comments

We found a hacky workaround, figured I’d share in case anyone else was running into this as well. Create a hosted service, something like this:

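using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;
using Orleans;
using Orleans.Runtime;

public class ConsulCleanup : IHostedService
{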
    private readonly ILocalSiloDetails localSiloDetails;
    private readonly IMembershipTable membershipTable;

    public ConsulCleanup(ILocalSiloDetails localSiloDetails, IMembershipTable membershipTable)
    {
        this.localSiloDetails = localSiloDetails;
        this.membershipTable = membershipTable;
    }

    public Task StartAsync(CancellationToken cancellationToken) => Task.CompletedTask;

    public async Task StopAsync(CancellationToken cancellationToken)
    {
        await this.membershipTable.UpdateIAmAlive(new MembershipEntry
        {
            SiloAddress = this.localSiloDetails.SiloAddress,
            IAmAliveTime = DateTime.UtcNow.AddMinutes(-10),
            Status = SiloStatus.Dead, // This isn't used to determine if a silo is defunct but I'm going to set it anyway.
        });

        await this.membershipTable.CleanupDefunctSiloEntries(DateTimeOffset.UtcNow.AddMinutes(-5));
    }
}

Then register the hosted service with the HostBuilder:

IHostBuilder hoster = new HostBuilder()
    .ConfigureServices(serviceCollection =>
    {
        serviceCollection.AddHostedService<ConsulCleanup>();
    });

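Now when the process receives a SIGTERM, the ConsulCleanup hosted service will remove its entry. It back-dates the silo’s IAmAliveTime by 10 minutes, so the cleanup call with a 5-minute cutoff should treat the entry as defunct. It’s a hack, but hopefully it helps.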