apollo-kotlin: Retrying a subscription does not renew the id and may cause an error on the server because the id is already used
Version
4.0.0-alpha.2
Summary
I have noticed few crashes of my app when using subscription. It looks like SDK have problem with parsing errors. I am not sure exactly why. I have contacted my backend team but maybe letting you know can resolve that issue faster.
My setup of ApolloClient looks like this
ApolloClient.Builder()
.networkTransport(
HttpNetworkTransport.Builder()
.serverUrl(BuildConfig.BASE_URL)
.okHttpClient(okHttpClient)
.build()
)
.subscriptionNetworkTransport(
WebSocketNetworkTransport.Builder()
.protocol(GraphQLWsProtocol.Factory())
.serverUrl(BuildConfig.BASE_URL)
.okHttpClient(okHttpClient)
.reopenWhen { throwable, attempt ->
if (attempt < 13) {
delay(2.0.pow(attempt.toDouble()).toLong())
} else {
delay(5000L)
}
true
}
.build()
)
.build()
and subscription like this
apolloClient.subscription(BoostCompetitionSubscription(roomId))
.toFlow()
.mapNotNull { response ->
val exception = response.exception
if (exception is ApolloGraphQLException) {
Timber.e(BoostCompetitionException(roomId, exception.errors, exception))
}
try {
response.data?.boostCompetitions?.firstOrNull()?.toDomain()
} catch (ex: Exception) {
Timber.e(BoostCompetitionParsingException(roomId, ex.message, ex))
null
}
}
Just before the crash reopenWhen
set on WebScoketNetworkTransport
was fired once with attempt 0
Is there something I could do to prevent app from crashing?
Steps to reproduce the behavior
No response
Logs
com.apollographql.apollo3.exception.JsonDataException: Expected BEGIN_ARRAY but was BEGIN_OBJECT at path errors
at com.apollographql.apollo3.api.json.MapJsonReader.peek
at com.apollographql.apollo3.api.json.MapJsonReader.beginArray
at com.apollographql.apollo3.api.json.MapJsonReader.beginArray
at com.apollographql.apollo3.api.json.MapJsonReader.beginArray
at com.apollographql.apollo3.api.json.MapJsonReader.beginArray
at com.apollographql.apollo3.api.internal.ResponseParser.readErrors
at com.apollographql.apollo3.api.internal.ResponseParser.readErrors
at com.apollographql.apollo3.api.internal.ResponseParser.parse
at com.apollographql.apollo3.api.Operations.parseJsonResponse
at com.apollographql.apollo3.network.ws.WebSocketNetworkTransport$execute$$inlined$map$1$2.emit
at com.apollographql.apollo3.network.ws.WebSocketNetworkTransport$execute$$inlined$map$1$2.emit$bridge
at kotlinx.coroutines.flow.internal.SafeCollectorKt$emitFun$1.invoke
at kotlinx.coroutines.flow.internal.SafeCollectorKt$emitFun$1.invoke
at kotlinx.coroutines.flow.internal.SafeCollectorKt$emitFun$1.invoke
at kotlinx.coroutines.flow.internal.SafeCollector.emit
at kotlinx.coroutines.flow.internal.SafeCollector.emit
at kotlinx.coroutines.flow.internal.SafeCollector.emit
at com.apollographql.apollo3.network.ws.WebSocketNetworkTransport$execute$3.invokeSuspend
at androidx.compose.material.AnchoredDraggableKt$snapTo$2.invokeSuspend$bridge
at com.apollographql.apollo3.network.ws.WebSocketNetworkTransport$execute$3.invoke
at com.apollographql.apollo3.network.ws.WebSocketNetworkTransport$execute$3.invoke
at androidx.compose.material.AnchoredDraggableKt$snapTo$2.invoke$bridge(SourceFile:0)
at com.apollographql.apollo3.internal.FlowsKt$transformWhile$1$invokeSuspend$$inlined$collectWhile$1.emit
at com.apollographql.apollo3.internal.FlowsKt$transformWhile$1$invokeSuspend$$inlined$collectWhile$2.emit
at androidx.compose.ui.platform.WindowRecomposer_androidKt$createLifecycleAwareWindowRecomposer$2$onStateChanged$1$1$1$1.emit$bridge
at com.apollographql.apollo3.network.ws.WebSocketNetworkTransport$execute$$inlined$filter$1$2.emit
at androidx.compose.foundation.text.selection.SelectionMagnifierKt$rememberAnimatedMagnifierPosition$1$2.emit$bridge
at kotlinx.coroutines.flow.SubscribedFlowCollector.emit
at kotlinx.coroutines.flow.SharedFlowImpl.collect$suspendImpl
at kotlinx.coroutines.flow.SharedFlowImpl$collect$1.invokeSuspend
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith
at kotlinx.coroutines.DispatchedTask.run
at kotlinx.coroutines.DispatchedTask.run
at kotlinx.coroutines.internal.LimitedDispatcher.run
at kotlinx.coroutines.scheduling.TaskImpl.run
at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run
About this issue
- Original URL
- State: open
- Created 10 months ago
- Comments: 16 (8 by maintainers)
Oooh my…, the KDoc for
ApolloClient.Builder
is severaly lacking. Follow up issue here.ApolloCall.retryOnError(Boolean)
uses a boolean that lets individual operation decide if they want to be retriedApolloClient.Builder.retryOnError((ApolloRequest) -> Boolean)
uses a function that determines the default value if noretryOnError
has been set on theApolloCall
. Usually you’ll do{ it.operation is Subscription }
to only retry subscriptions (but you might want to retry queries too)The actual logic is done in RetryOnErrorInterceptor. It leverages coroutines to retry the
Flow
whenever an error happens. If no error happens then no retry is done.Hope this helps, since this is all pretty new there’s not a lot of docs but I’ll get working on this (edit: KDoc PR here).
Hey @martinbonnin Thanks for update, I was looking for those changes. No worries about delay. We get this SDK for free so appreciate any feedback and support 😄
I have updated to beta5 now and added retryOnError. Will see how that works. Regarding incubating, I wouldn’t mind to try that as well but would appreciate some hints on it. We use it on prod so don’t want to take too much risk with experimentation🙈
Here is how we setup apollo client currently:
Things that I would need to address:
KtorWebSocketEngine
. Looks like there is just one implementation ofWebSocketEngine
inside incubating which isAppleWebSocketEngine
Factory
forGraphQLWsProtocol
but I guess it suppose to be created now with constructor?reopenWhen
which was called on network disconnections. Is it not needed anymore? Is every disconnection now passed to each subscription or is it handled silently by SDK?Additionally, you can see that I am reducing idle timeout. I discovered that re-login user kept using previous connection. Do you know if there is a better way to invalidate connection?
Thanks!
@damianpetla apologies for the super late response here. This ended up being quite the rabbit hole…
The tlrd; of that rabbit hole is that retrying at the WebSocket like was previously done has major limitations as you found out:
All in all,
4.0.0-beta.5
adds a retry option on theApolloClient
itself:This code also uses an incubating
WebSocketNetworkTransport
that should make it a lot easier to debug things moving forward:For more details, you can also check the integration test here
The current plan is to replace the default implementation of
WebSocketNetworkTransport
in the next version if everything looks good with the incubating one. If you’re still around and have the time to try it, I’d love to hear what you think. Again, deep apologies for the delay on this issue and thanks for raising it 🙏thanks @martinbonnin can’t do that unfortunately because we use KKM and I have Apollo in shared module exposing
Flow
and two collectors in android module. I will use single flow, not a big deal. Just wanted to raise this as it might be a potential improvement to Apollo allowing multiple collectors.Yes, instead of a crash, you will have an
ApolloResponse
with nulldata
and non-nullexception
insteadWe typically release ~1 a month these days but there is no fixed release schedule. You can already try in the SNAPSHOTs though