urql: Subscription cache updates triggered during a mutation can cause cache tearing

I may have found a race condition in Graphcache updates for cache updates that are triggered by both a mutation and a subscription. Both are needed for a chat where messages are sent that need to be synced across multiple devices.

In particular I believe that it’s caused by multiple rapid mutations causing the subscription updater to be called before the updates for a successful mutation which in turns breaks reconciliation.

I’ve been able to track it down to a specific cache manipulation function (added at the end of this post for completeness). The function appends the new message to a message list in a conversation, if the message does not already exist in the list.

This function is called with a message from the result of a mutation and from the subscription that receives new messages (including ones sent by the client since the server can’t link a subscription to the origin of a mutation).

This works fine when the mutation is sent slowly as can be seen by the log ouput that I added in the cache.

timestamp location __typename id cacheResolveResult filename:linenumber:column
19:17:10.452 mutation ChatMessage optimistic-a5e38364-adaf-4fe5-8884-a1e118bac338 null cache.js:115:18
19:17:11.122 mutation ChatMessage 29c86790-f904-46c4-af98-68d70fef4a8e null cache.js:115:18
19:17:11.219 subscription ChatMessage 29c86790-f904-46c4-af98-68d70fef4a8e null cache.js:185:18

When two mutations are sent in rapid succession however the execution is no longer linear and as a result it seems the ChatConversation that’s updated in updateConversationMessageLists is turned into a null result in a component that depends on it causing the component to have no data where it expects data.

This is the log output in that scenario

timestamp location __typename id cacheResolveResult filename:linenumber:column
19:17:17.851 mutation ChatMessage optimistic-d569bd44-0ef5-41ec-95c2-868b5b6dee87 null cache.js:115:18
19:17:18.189 mutation ChatMessage optimistic-c94ddd7e-91d0-4f7b-9810-54e3ff8cfd99 null cache.js:115:18
19:17:18.477 subscription ChatMessage 160ea03f-70a7-4621-92e6-e2255bc7ef11 null cache.js:185:18
19:17:18.838 subscription ChatMessage cada3d6e-e8e9-497d-89f1-27b8bd512e0f null cache.js:185:18
19:17:18.942 mutation ChatMessage 160ea03f-70a7-4621-92e6-e2255bc7ef11 null cache.js:115:18
19:17:18.945 mutation ChatMessage cada3d6e-e8e9-497d-89f1-27b8bd512e0f null cache.js:115:18

urql version & exchanges:

├─ @urql/core@2.0.0 ├─ @urql/devtools@2.0.3 └─ @urql/exchange-graphcache@4.0.0

  • devtoolsExchange
  • dedupExchange
  • graphcacheExchange
  • fetchExchange
  • subscriptionExchange

Steps to reproduce

  1. Create a cache update function that reads and writes from the cache
  2. Call the function in a mutation and a subscription updater
  3. Trigger a new subscription message on the server while processing the mutation with the same data
  4. On the client trigger two mutations in rapid succession

Expected behavior All cache updates happen correctly and the component that has queried the updated data renders as usual.

Actual behavior The component that has queried the data receives null causing it to fail its data dependencies, unable to show the user a meaningful screen.


function updateConversationMessageLists(cache, message) {
  const cacheKey = {
      __typename: message.conversation.__typename,
      id: message.conversation.id
  };
  const allFields = cache.inspectFields(cacheKey);

  // Find all message lists that were looking at the end of the conversation
  // where new messages would appear.
  const messageQueries = allFields.filter(field => field.fieldName === "messages" && field.arguments.last && !field.arguments.before);

  const fragment = gql`
    fragment _ on ChatConversation {
      __typename
      id
      messages(last: $last) {
        edges {
          node {
            id
          }
        }
      }
    }
  `;

  // Update all messages field instances for the affected conversation.
  messageQueries.forEach(({ arguments: fieldArgs }) => {
    const fragmentVariables = { last: fieldArgs.last };
    const data = cache.readFragment(
      fragment,
      cacheKey,
      fragmentVariables
    );
    if (data) {

      if (data.messages.edges === null) {
        data.messages.edges = [];
      }

      // We just ignore existing messages. We don't expect changes here.
      if (data.messages.edges.findIndex(({ node }) => node.id === message.id) !== -1) {
        return;
      }

      data.messages.edges.push({
        __typename: "ChatConversationChatMessageEdge",
        node: message
      });

      cache.writeFragment(
        fragment,
        data,
        fragmentVariables
      );
    }
  });
}

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 17 (17 by maintainers)

Most upvoted comments

When the message is already present in the list and the normalised cache writes the new data automatically to the message entity directly. That’s when the message updates without any of your updaters.

Aaaaah I see, okay 😃

Yeah unfortunately my subscription usually arrives (but not always) before my mutation result at the moment. Given that my subscriptions are served by a separate process from a message over a RabbitMQ queue and the message is put on the RabbitMQ queue before the mutation returns (since that’s a PHP web request in Drupal that needs to finish its response). My constraints are, shall we say, interesting 🙈 .

Thanks for the help! At least this means I have some improvements to make in the caching code of the actual application 🎉 . I’ll get back to that on Monday (today was my day off but didn’t want to let you wait too long for my reply) and then I’ll see if those improvements also solve my original problem (or whether I need to improve my reproduction sandbox).

However, what’s the third case here that seems to be bypassing both of my updaters?

When the message is already present in the list and the normalised cache writes the new data automatically to the message entity directly. That’s when the message updates without any of your updaters.

Edit: One idea I’d have would maybe work here too if you can’t add a stable secondary ID for the clients. You can filter out optimistic messages (as you’re already doing) but only if your current updater is non-optimistic, i.e. triggered by a mutation result. You can detect this by checking !info.optimistic

Edit 2: Then you’d only have duplicate messages if the subscription event comes in before the mutation result. You can make this less likely on the API by delaying the publish of the “new message” event until after the current mutation completes. You can also completely solve this by ignoring newMessage events that have originated from the current device. This detection is trickier without a client-side ID, but you could try to add a heuristic for this.