apollo-kotlin: Retrying a subscription does not renew the id and may cause an error on the server because the id is already used
Version
4.0.0-alpha.2
Summary
I have noticed few crashes of my app when using subscription. It looks like SDK have problem with parsing errors. I am not sure exactly why. I have contacted my backend team but maybe letting you know can resolve that issue faster.
My setup of ApolloClient looks like this
ApolloClient.Builder()
.networkTransport(
HttpNetworkTransport.Builder()
.serverUrl(BuildConfig.BASE_URL)
.okHttpClient(okHttpClient)
.build()
)
.subscriptionNetworkTransport(
WebSocketNetworkTransport.Builder()
.protocol(GraphQLWsProtocol.Factory())
.serverUrl(BuildConfig.BASE_URL)
.okHttpClient(okHttpClient)
.reopenWhen { throwable, attempt ->
if (attempt < 13) {
delay(2.0.pow(attempt.toDouble()).toLong())
} else {
delay(5000L)
}
true
}
.build()
)
.build()
and subscription like this
apolloClient.subscription(BoostCompetitionSubscription(roomId))
.toFlow()
.mapNotNull { response ->
val exception = response.exception
if (exception is ApolloGraphQLException) {
Timber.e(BoostCompetitionException(roomId, exception.errors, exception))
}
try {
response.data?.boostCompetitions?.firstOrNull()?.toDomain()
} catch (ex: Exception) {
Timber.e(BoostCompetitionParsingException(roomId, ex.message, ex))
null
}
}
Just before the crash reopenWhen set on WebScoketNetworkTransport was fired once with attempt 0
Is there something I could do to prevent app from crashing?
Steps to reproduce the behavior
No response
Logs
com.apollographql.apollo3.exception.JsonDataException: Expected BEGIN_ARRAY but was BEGIN_OBJECT at path errors
at com.apollographql.apollo3.api.json.MapJsonReader.peek
at com.apollographql.apollo3.api.json.MapJsonReader.beginArray
at com.apollographql.apollo3.api.json.MapJsonReader.beginArray
at com.apollographql.apollo3.api.json.MapJsonReader.beginArray
at com.apollographql.apollo3.api.json.MapJsonReader.beginArray
at com.apollographql.apollo3.api.internal.ResponseParser.readErrors
at com.apollographql.apollo3.api.internal.ResponseParser.readErrors
at com.apollographql.apollo3.api.internal.ResponseParser.parse
at com.apollographql.apollo3.api.Operations.parseJsonResponse
at com.apollographql.apollo3.network.ws.WebSocketNetworkTransport$execute$$inlined$map$1$2.emit
at com.apollographql.apollo3.network.ws.WebSocketNetworkTransport$execute$$inlined$map$1$2.emit$bridge
at kotlinx.coroutines.flow.internal.SafeCollectorKt$emitFun$1.invoke
at kotlinx.coroutines.flow.internal.SafeCollectorKt$emitFun$1.invoke
at kotlinx.coroutines.flow.internal.SafeCollectorKt$emitFun$1.invoke
at kotlinx.coroutines.flow.internal.SafeCollector.emit
at kotlinx.coroutines.flow.internal.SafeCollector.emit
at kotlinx.coroutines.flow.internal.SafeCollector.emit
at com.apollographql.apollo3.network.ws.WebSocketNetworkTransport$execute$3.invokeSuspend
at androidx.compose.material.AnchoredDraggableKt$snapTo$2.invokeSuspend$bridge
at com.apollographql.apollo3.network.ws.WebSocketNetworkTransport$execute$3.invoke
at com.apollographql.apollo3.network.ws.WebSocketNetworkTransport$execute$3.invoke
at androidx.compose.material.AnchoredDraggableKt$snapTo$2.invoke$bridge(SourceFile:0)
at com.apollographql.apollo3.internal.FlowsKt$transformWhile$1$invokeSuspend$$inlined$collectWhile$1.emit
at com.apollographql.apollo3.internal.FlowsKt$transformWhile$1$invokeSuspend$$inlined$collectWhile$2.emit
at androidx.compose.ui.platform.WindowRecomposer_androidKt$createLifecycleAwareWindowRecomposer$2$onStateChanged$1$1$1$1.emit$bridge
at com.apollographql.apollo3.network.ws.WebSocketNetworkTransport$execute$$inlined$filter$1$2.emit
at androidx.compose.foundation.text.selection.SelectionMagnifierKt$rememberAnimatedMagnifierPosition$1$2.emit$bridge
at kotlinx.coroutines.flow.SubscribedFlowCollector.emit
at kotlinx.coroutines.flow.SharedFlowImpl.collect$suspendImpl
at kotlinx.coroutines.flow.SharedFlowImpl$collect$1.invokeSuspend
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith
at kotlinx.coroutines.DispatchedTask.run
at kotlinx.coroutines.DispatchedTask.run
at kotlinx.coroutines.internal.LimitedDispatcher.run
at kotlinx.coroutines.scheduling.TaskImpl.run
at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run
About this issue
- Original URL
- State: open
- Created 10 months ago
- Comments: 16 (8 by maintainers)
Oooh my…, the KDoc for
ApolloClient.Builderis severaly lacking. Follow up issue here.ApolloCall.retryOnError(Boolean)uses a boolean that lets individual operation decide if they want to be retriedApolloClient.Builder.retryOnError((ApolloRequest) -> Boolean)uses a function that determines the default value if noretryOnErrorhas been set on theApolloCall. Usually you’ll do{ it.operation is Subscription }to only retry subscriptions (but you might want to retry queries too)The actual logic is done in RetryOnErrorInterceptor. It leverages coroutines to retry the
Flowwhenever an error happens. If no error happens then no retry is done.Hope this helps, since this is all pretty new there’s not a lot of docs but I’ll get working on this (edit: KDoc PR here).
Hey @martinbonnin Thanks for update, I was looking for those changes. No worries about delay. We get this SDK for free so appreciate any feedback and support 😄
I have updated to beta5 now and added retryOnError. Will see how that works. Regarding incubating, I wouldn’t mind to try that as well but would appreciate some hints on it. We use it on prod so don’t want to take too much risk with experimentation🙈
Here is how we setup apollo client currently:
Things that I would need to address:
KtorWebSocketEngine. Looks like there is just one implementation ofWebSocketEngineinside incubating which isAppleWebSocketEngineFactoryforGraphQLWsProtocolbut I guess it suppose to be created now with constructor?reopenWhenwhich was called on network disconnections. Is it not needed anymore? Is every disconnection now passed to each subscription or is it handled silently by SDK?Additionally, you can see that I am reducing idle timeout. I discovered that re-login user kept using previous connection. Do you know if there is a better way to invalidate connection?
Thanks!
@damianpetla apologies for the super late response here. This ended up being quite the rabbit hole…
The tlrd; of that rabbit hole is that retrying at the WebSocket like was previously done has major limitations as you found out:
All in all,
4.0.0-beta.5adds a retry option on theApolloClientitself:This code also uses an incubating
WebSocketNetworkTransportthat should make it a lot easier to debug things moving forward:For more details, you can also check the integration test here
The current plan is to replace the default implementation of
WebSocketNetworkTransportin the next version if everything looks good with the incubating one. If you’re still around and have the time to try it, I’d love to hear what you think. Again, deep apologies for the delay on this issue and thanks for raising it 🙏thanks @martinbonnin can’t do that unfortunately because we use KKM and I have Apollo in shared module exposing
Flowand two collectors in android module. I will use single flow, not a big deal. Just wanted to raise this as it might be a potential improvement to Apollo allowing multiple collectors.Yes, instead of a crash, you will have an
ApolloResponsewith nulldataand non-nullexceptioninsteadWe typically release ~1 a month these days but there is no fixed release schedule. You can already try in the SNAPSHOTs though