azure-signalr: Timeout when add to group
Hi, we have a Web API (hosted in Azure on an App Service plan) configured with Azure SignalR Service (ASRS). At times, we experience timeouts when adding a connection to a group. Here is our hub method that adds the connection to a group:
public async Task AddToGroup(string groupName) { await Groups.AddToGroupAsync(Context.ConnectionId, groupName); }
Here is the exception message:
“Ack-able message Microsoft.Azure.SignalR.Protocol.JoinGroupWithAckMessage waiting for ack timed out.”
And here is the stack trace:
System.TimeoutException:
at Microsoft.Azure.SignalR.ServiceConnectionContainerBase+<WriteAckableMessageAsync>d__49.MoveNext (Microsoft.Azure.SignalR.Common, Version=1.0.14.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.Azure.SignalR.MultiEndpointServiceConnectionContainer+<WriteAckableMessageAsync>d__18.MoveNext (Microsoft.Azure.SignalR.Common, Version=1.0.14.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at BDL.TwilioAPI.Hub.CallStatusHub+<AddToGroup>d__0.MoveNext (TwilioAPI, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null: D:\a\1\s\TwilioAPI\Hub\CallStatusHub.cs: 9)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.AspNetCore.SignalR.Internal.DefaultHubDispatcher`1+<ExecuteHubMethod>d__15.MoveNext (Microsoft.AspNetCore.SignalR.Core, Version=1.0.4.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.AspNetCore.SignalR.Internal.DefaultHubDispatcher`1+<Invoke>d__13.MoveNext (Microsoft.AspNetCore.SignalR.Core, Version=1.0.4.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
What could be the cause, and how can we fix this? Our clients are considerably inconvenienced by having to retry. The ASRS instance is configured in “Classic” mode on the Standard pricing tier. The ASRS metrics for server and client connections looked normal when these timeouts occurred.
We are using azure-signalr version 1.0.14.
Is there any workaround that can help us in the meantime?
Regards, Tresa
About this issue
- State: closed
- Created 5 years ago
- Comments: 41 (17 by maintainers)
We released 1.13.0 containing the fix.
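Until an upgrade to a fixed release is possible, one stopgap is to retry the group join on the server so that clients do not have to. The sketch below is an illustration under stated assumptions, not something proposed in this thread: `GroupJoinRetry`, its parameters, and the backoff values are hypothetical, and it assumes that repeating `AddToGroupAsync` for the same connection and group is harmless.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper; not part of Microsoft.Azure.SignalR.
public static class GroupJoinRetry
{
    // Runs `operation` and retries only when it throws TimeoutException.
    // After `maxAttempts` failed attempts the last TimeoutException propagates.
    public static async Task RunAsync(
        Func<Task> operation,
        int maxAttempts = 3,
        TimeSpan? initialDelay = null)
    {
        if (operation is null) throw new ArgumentNullException(nameof(operation));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        // Assumed starting delay; tune it for your own workload.
        var delay = initialDelay ?? TimeSpan.FromMilliseconds(200);

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                await operation();
                return;
            }
            catch (TimeoutException) when (attempt < maxAttempts)
            {
                // Wait before the next attempt, doubling the delay each time.
                await Task.Delay(delay);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }
}

// Usage inside the hub from the original post:
//
// public async Task AddToGroup(string groupName)
// {
//     await GroupJoinRetry.RunAsync(
//         () => Groups.AddToGroupAsync(Context.ConnectionId, groupName));
// }
```

Retrying only on `TimeoutException` leaves other failures visible to the caller. Each retry still waits for the full ack timeout, so the hub invocation can take noticeably longer before it succeeds or fails.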
I was never able to reliably reproduce this in my environment, and it was pretty rare (I last saw it 1 month ago). This seems to support the idea that it is caused by a race condition. Yes, we do have multiple service endpoints.
Fix tested & confirmed. Thank you!
@vicancy I just gave it a try, and it seems the issue is gone in 1.13.0-preview1-10900. I will give it more testing tomorrow. Thanks for looking into this, great work 👍
Here’s an isolated test case that causes the error on 1.6.1 and not on 1.8.1
Error:
Base docker image is mcr.microsoft.com/dotnet/aspnet:5.0-alpine3.12. Running on AKS in EU-West
I started seeing these errors right after I switched my solution from .NET Core 3.1 to .NET 5. I also upgraded Microsoft.Azure.SignalR from 1.6.1 to 1.8.0. It turned out that when I downgraded the package back to 1.6.1, it started working again. I did not try any version between the two, though. So my solution was to stay on 1.6.1, as it works fine.
I’m also seeing this same timeout when JoinGroupWithAckMessage is called. Seems to happen when there is lots of traffic on our hub server. I’ve tried adding more app server instances and scaling up ASRS. But no luck. We’re using Microsoft.Azure.SignalR 1.5.1 and .NET Core 3.1. Any help or guidance on debugging would be greatly appreciated.
@ajbeaven The fix for the original issue is done. This kind of exception can have various causes. Would you share further information about your case with us (jixin[at]microsoft.com), such as the time of the failures and the resourceId?
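The details requested above (time and resourceId) can be captured where the timeout surfaces. A minimal sketch, assuming you supply the resource ID of your ASRS instance yourself; `AckTimeoutDiagnostics`, its parameters, and the report format are hypothetical and not part of the SDK:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper; not part of Microsoft.Azure.SignalR.
public static class AckTimeoutDiagnostics
{
    // Awaits `operation`. If it throws TimeoutException, passes a one-line
    // report (UTC timestamp, resource ID, connection ID, group) to `log`
    // and then rethrows so the caller still sees the failure.
    public static async Task RunAsync(
        Func<Task> operation,
        string connectionId,
        string groupName,
        string resourceId,
        Action<string> log)
    {
        try
        {
            await operation();
        }
        catch (TimeoutException ex)
        {
            log($"{DateTimeOffset.UtcNow:O} ack timeout; resourceId={resourceId}; " +
                $"connectionId={connectionId}; group={groupName}; message={ex.Message}");
            throw;
        }
    }
}

// Usage inside a hub method (resourceId is a value you provide, for example from configuration):
//
// await AckTimeoutDiagnostics.RunAsync(
//     () => Groups.AddToGroupAsync(Context.ConnectionId, groupName),
//     Context.ConnectionId, groupName, resourceId, Console.WriteLine);
```

Recording the timestamp in UTC makes it easier to line up the report with service-side logs.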