stripe-cli: Webhooks are being sent in incorrect order.

Issue

Sometimes webhooks are being sent incorrectly. Specific events:

  • customer.subscription.created
  • customer.subscription.updated

This causes our system to record the subscription status incorrectly. In the Stripe dashboard, the actual status of the subscription is already active, but our system recorded incomplete because we received the created event later than its updated counterpart.

I wonder if this will also happen in production?

Expected Behavior

Expected behavior would be:

  • customer.subscription.created first
  • then customer.subscription.updated

Steps to reproduce

Note: network throttling might help reproduce this. I have a slow internet connection.

  1. Create a checkout session
  2. Checkout using session id
  3. Checkout success
  4. Webhooks are delivered successfully, but the events are sent in the wrong order.

Traceback

Correct (created then updated) image

Wrong (updated then created) image

Environment

ArchLinux

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 6
  • Comments: 19

Most upvoted comments

@tomer-stripe How is this not considered a bug? I am seeing the same issue, and it is wreaking havoc with my integration. If I can’t rely on the subscription.created > subscription.updated > session.completed events coming in that order, how do I know what the current status is of my subscription? Note that those events also need to be synchronous - subscription.updated shouldn’t be sent until Stripe receives a 200 response from the subscription.created webhook (same thing with session.completed waiting for subscription.updated)… otherwise, this could lead to a race condition on my server.

I am relying on the subscription status field, and if the last webhook event that I receive happens to be subscription.created, it will have a status of “incomplete” which will overwrite the status of “active” sent by the subscription.updated event that arrived first.

@tomer-stripe This is super problematic for your clients. It would be possible to send them in order from your side, by sending requests to webhooks, one at a time, and by waiting for the 200 status code response (or a failure after a number of retries) and only then sending the next event. Instead you move the complexity to your clients.

The fact that your support recommends ignoring the webhook request payload and querying the API instead is complete BS. There can still be a race condition unless the client ensures that only one request per customer is processed at a time, globally. Did you try to implement a reliable solution using your webhooks yourself? It is a nightmare.

Finally, could you at least provide more fine-grained event timestamps (finer than seconds) so that your clients can reorder the events on their own?

Really sloppy design on your side, heavily disappointed and we are now strongly considering another payment processor.

I understand this is by ‘design’ as per the event-ordering docs. I spent a whole day building webhooks that rely on the lifecycle of the subscription, but now I realise the order is not guaranteed, and I don’t know what to do.

And polling or calling the API every time the user logs in seems like the wrong thing to do. What’s the recommended solution here? Are consumers of the Stripe API supposed to implement some form of event bus and order the events by created time?

I know the stripe CLI has nothing to do with this, but if someone in the team can re-direct us to the right place for this, that would be great!

So the solution I came up with is to completely ignore the subscription data that is posted when my webhook handler is called. Rather than relying on that data, I use the subscription ID to call the subscriptions.retrieve API to get the current subscription data. THAT data is up-to-date, so I don’t have to worry about the order of the subscription and session events arriving.
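
A minimal TypeScript sketch of this "webhook as trigger, retrieve for data" approach. The local types, the in-memory `subscriptionStatus` map, and the injected `retrieve` parameter are assumptions made so the sketch runs without network access; in a real integration `retrieve` would wrap the SDK's subscriptions.retrieve call and the map would be your database.

```typescript
// Minimal local types; a real integration would use the types from the Stripe SDK.
type Subscription = { id: string; status: string };
type SubscriptionEvent = { type: string; data: { object: Subscription } };

// Hypothetical in-memory store standing in for your database.
const subscriptionStatus = new Map<string, string>();

// `retrieve` is injected (hypothetical signature) so the sketch is testable offline.
async function handleSubscriptionEvent(
  event: SubscriptionEvent,
  retrieve: (id: string) => Promise<Subscription>,
): Promise<void> {
  // Use the payload only to learn WHICH subscription changed.
  const { id } = event.data.object;
  // Fetch the current state instead of trusting the possibly stale payload.
  const fresh = await retrieve(id);
  subscriptionStatus.set(fresh.id, fresh.status);
}
```

With this shape, a late customer.subscription.created event can no longer overwrite an active status with incomplete, because the stored status always comes from the retrieve call. It does not by itself stop two handlers for the same subscription from running in parallel.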

I think the documentation on handling session completion is misleading. Under “webhooks” on this page https://stripe.com/docs/payments/checkout#payment-success it reads:

Stripe sends the checkout.session.completed event for a successful Checkout payment. The webhook payload includes the Checkout Session object, which contains information about the Customer, PaymentIntent, or Subscription,

While this may be technically correct, it implies that the whole process has completed, when in actuality the corresponding events may not have yet been sent. I think that should be clarified to avoid confusion.

Hi, I also ran into this while implementing a new subscription model and found no way to determine the order of events, making webhook data quite useless 😦 and only usable as a trigger to fetch data from the Stripe side. Is that correct? Any new solution around?

:: EDITED :: 2 cents

objective

I want to SAVE in my DB all the events in the correct order. I don’t want to use a pre-defined order of events (“customer.created goes before customer.updated” etc)

what we know

  • webhooks do not deliver events in any specific order
  • there is no property on the event data you can use to infer the relative order from past or future events to come to your webhook

first conclusion

  • no way you can use webhook and the received data alone to do it

solution

  • as proposed by @spiffytech, use the /events API, as it looks like the only “ordered” list of events 😦
  • we use the webhook not to process data, but to trigger an /events fetch
  • big big big issues:
    • /events fetch seems to have a fetch limit (100 events)
      • meaning, if your system has for example 1M invoices created every second (for analysis), your fetch of the last 100 (on any of the 1M webhook events) may not end up listing all the created events…
    • inefficient, obviously.
  • fetching data more frequently than your event creation rate (and not using webhooks at all) may work, but that will be even more inefficient

damn

  • I will go for it for now - webhook + using the /events API to order the events on my side. But damn, this is not a good solution.
  • @tomer-stripe any proposal/correction?
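
A rough TypeScript sketch of the "webhook triggers an /events fetch" idea, paging past the 100-event limit with a cursor. It assumes the list is returned newest first and that each page can be continued from the ID of its last item. `ListPage` is a hypothetical wrapper around whatever call you use to list events, and `lastSeenId` is the ID of the newest event you have already stored.

```typescript
type StripeEvent = { id: string; type: string; created: number };
type Page = { data: StripeEvent[]; has_more: boolean };

// Hypothetical page fetcher: returns one page of events, newest first,
// starting after the given event ID (or from the newest event if omitted).
type ListPage = (startingAfter?: string) => Promise<Page>;

// Collects every event newer than `lastSeenId`, returned oldest first.
async function fetchEventsSince(
  lastSeenId: string | undefined,
  listPage: ListPage,
): Promise<StripeEvent[]> {
  const collected: StripeEvent[] = [];
  let cursor: string | undefined;

  while (true) {
    const page = await listPage(cursor);
    for (const ev of page.data) {
      // Stop as soon as we reach an event we have already stored.
      if (ev.id === lastSeenId) return collected.reverse();
      collected.push(ev);
    }
    const last = page.data[page.data.length - 1];
    if (!page.has_more || !last) return collected.reverse();
    cursor = last.id; // continue with the next (older) page
  }
}
```

This only removes the single-page limit. The inefficiency noted above remains, since every webhook still costs at least one list call.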

I know this is an old issue but let me introduce what kind of “workaround” I’ve implemented on my end.

Note: Even if this “solution” provides a degree of consistency, it comes at a multiplied cost in performance and resource overhead.

First of all, IMHO the events should be fully sequential. When the state of an entity (customer, subscription, invoice) is involved, each event in the sequence depends on the previous one, and the processor (Stripe in this case) must guarantee on their end that you receive them in the proper order. I still have doubts that they went with this asynchronous approach because of huge technical architecture issues. It’s unacceptable for their documentation to say that ensuring event ordering is the integrator’s job. See

Allowing multiple events to be processed simultaneously on your server, even when refreshing the StripeObject by ID through their SDKs, is error-prone and dangerous because in a large percentage of cases you’ll end up with a Unique Constraint Violation. This happens because of the small time gap between events being triggered on Stripe’s side.

Imagine following event sequence:

  1. customer.subscription.updated {“created”: 1696754206}
  2. customer.subscription.created {“created”: 1696754206}

You have a database table for storing subscriptions. In your server’s webhook implementation, you optimistically try to insert a new subscription if it does not exist each time an event is received. Because of the small time gap, in a large percentage of cases there will be serious race conditions on the uniqueness constraint. Moreover, if your application is designed to respond with errors in such cases, Stripe will automatically retry calling your server, which I find overkill.

What we want to achieve:

  • Prevent race conditions between events that touch the same entity (customer, subscription, invoice).
  • Have a valid state of the entity while processing an event. Re-fetching is not the best solution (the proper one would be for the event payload to be valid at the time the event was sent), but it works.
  • Based on the fresh state of the entity (re-fetched via Stripe’s SDK), decide whether or not the data from the current event should be processed.
  • Fully control retries of failed events.
  • Of course… integrity.

Implementation:

  1. Let’s define the storage where we’re going to keep incoming events from Stripe. I will use PostgreSQL as the storage engine, but you can implement it in any RDBMS.
create type stripe_event_status_enum as enum ('pending', 'processing', 'processed', 'failed');

create table stripe_event (
    id serial primary key,
    event_id varchar(255) not null constraint uq_stripe_event_id unique,
    event_name varchar(255) not null,
    payload json not null,
    status stripe_event_status_enum not null default 'pending',
    created_at timestamp not null default now(),
    executed_at timestamp
);
  2. In your webhook you only need to store the incoming event in our newly defined queue table:

webhook.php

// verify the Stripe signature...
try {
    $event = \Stripe\Webhook::constructEvent(
        file_get_contents('php://input'),
        $_SERVER['HTTP_STRIPE_SIGNATURE'],
        'your-stripe-webhook-signing-secret-here'
    );

    // $db->insert() is a placeholder for your own data-access layer
    $db->insert([
        'event_id' => $event->id,
        'event_name' => $event->type,
        'payload' => $event->toJSON(),
        'created_at' => (new DateTime)->setTimestamp($event->created)
    ]);

    http_response_code(200); // OK
} catch (\UnexpectedValueException | \Stripe\Exception\SignatureVerificationException $exception) {
    // invalid payload or invalid signature
    http_response_code(400); // bad request
}
  3. Create a worker script that loops indefinitely over the event queue table and runs in a separate process (the important part is to ensure that only a single instance of this script is running, in order to avoid race conditions).

worker.php

$stripeClient = new \Stripe\StripeClient(/* your server secret key here */);

while (true) {
    $retriesCount = 0;
    $done = false;

    while (!$done) {
        $db->beginTransaction();

        try {
            // lock the oldest pending event so it is picked up only once
            $stmt = $db->prepare("
                SELECT *
                FROM stripe_event
                WHERE executed_at IS NULL
                AND status = 'pending'
                ORDER BY id
                LIMIT 1
                FOR UPDATE
            ");

            $stmt->execute();
            $event = $stmt->fetch(\PDO::FETCH_ASSOC);

            if (!$event) {
                $db->commit();
                $done = true;
                sleep(1); // queue is empty: wait instead of busy-looping
                continue;
            }

            // give up on this event after three failed attempts
            if ($retriesCount > 2) {
                $db->prepare("UPDATE stripe_event SET status = 'failed' WHERE id = :id")
                   ->execute(['id' => $event['id']]);

                $db->commit();
                $done = true;
                continue;
            }

            $db->prepare("UPDATE stripe_event SET status = 'processing', executed_at = now() WHERE id = :id")
               ->execute(['id' => $event['id']]);

            // $event is an associative array, so read the payload column directly
            $stripeEventObject = \Stripe\Event::constructFrom(json_decode($event['payload'], true));

            switch ($stripeEventObject->type) {
                case \Stripe\Event::INVOICE_CREATED:
                case \Stripe\Event::INVOICE_UPDATED:
                case \Stripe\Event::INVOICE_PAID:
                case \Stripe\Event::INVOICE_PAYMENT_FAILED:
                    handleInvoiceEvent($stripeEventObject, $stripeClient, $db);
                    break;
                // list the events you have subscribed to here
                default:
                    throw new \Exception('Unsupported event type.');
            }

            $db->prepare("UPDATE stripe_event SET status = 'processed' WHERE id = :id")
               ->execute(['id' => $event['id']]);

            $db->commit();
            $done = true;
        } catch (\Throwable $exception) {
            // undo the status change and try the same event again
            $db->rollBack();
            $retriesCount++;
        }
    }
}

// named functions cannot capture variables with `use`, so dependencies are passed in
function handleInvoiceEvent(\Stripe\Event $stripeEvent, \Stripe\StripeClient $stripeClient, \PDO $db): void
{
    $invoiceId = $stripeEvent->data->object->id;

    // always fetch a fresh entity from the Stripe API
    try {
        $invoice = $stripeClient->invoices->retrieve($invoiceId);
    } catch (\Stripe\Exception\ApiErrorException $stripeException) {
        // not found
        if ($stripeException->getHttpStatus() === 404) {
            $db->prepare('DELETE FROM invoice WHERE stripe_invoice_id = :stripe_invoice_id')
               ->execute(['stripe_invoice_id' => $invoiceId]);
            return;
        }

        throw $stripeException;
    }

    $invoiceFromDbStmt = $db->prepare('SELECT * FROM invoice WHERE stripe_invoice_id = :stripe_invoice_id');
    $invoiceFromDbStmt->execute(['stripe_invoice_id' => $invoice->id]);
    $invoiceFromDb = $invoiceFromDbStmt->fetch(\PDO::FETCH_ASSOC);

    if (!$invoiceFromDb) {
        $db->prepare('INSERT INTO invoice (stripe_invoice_id, status) VALUES (:stripe_invoice_id, :status)')
           ->execute(['stripe_invoice_id' => $invoice->id, 'status' => $invoice->status]);
    } else {
        // restrict the update to this invoice; without WHERE every row would change
        $db->prepare('UPDATE invoice SET status = :status WHERE stripe_invoice_id = :stripe_invoice_id')
           ->execute(['status' => $invoice->status, 'stripe_invoice_id' => $invoice->id]);
    }
}

Running into this as well. At first I thought my webhooks might be too slow and that I needed transactions on my database. Then I saw the events coming in the “wrong” order. Unfortunately, the resolution of the created field is too low to use it as a workaround. I’m afraid of running into usage limits some day when fetching data from Stripe on every event to get the newest data.

I’m guessing the best solution is to store the created timestamp and only update if the new event happened after the last event.

In some of my tests, multiple webhooks for a resource have come through with the same created value. That seems not unlikely, since the field is relatively low resolution at whole seconds.

For now I’ve settled on treating webhooks as a signal to retrieve, using Redlock to ensure no single resource has parallel retrieve operations happening.
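
To illustrate the "no parallel retrieves per resource" idea, here is a small TypeScript sketch of a per-key queue. Note that this is an in-process stand-in, not Redlock: it only serializes work inside a single Node.js process, whereas a Redis-based lock is needed once several server instances handle webhooks. `runExclusive` and the `tails` map are hypothetical names.

```typescript
// Tail of the pending work chain for each resource ID.
const tails = new Map<string, Promise<void>>();

// Runs `task` only after every earlier task for the same key has settled.
function runExclusive<T>(key: string, task: () => Promise<T>): Promise<T> {
  const previous = tails.get(key) ?? Promise.resolve();
  const result = previous.then(task);
  // The stored tail never rejects, so one failure does not block later tasks.
  const tail = result.then(
    () => undefined,
    () => undefined,
  );
  tails.set(key, tail);
  // Drop the entry once the queue for this key has drained.
  void tail.then(() => {
    if (tails.get(key) === tail) tails.delete(key);
  });
  return result;
}
```

A webhook handler would then wrap its retrieve-and-save step as `runExclusive(subscriptionId, () => syncFromStripe(subscriptionId))`, where `syncFromStripe` is whatever function fetches and stores the fresh object.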

The workaround that has worked for me without problems so far is to use a create op for customer.subscription.created, and upsert for customer.subscription.updated.

I have a unique key constraint on the Stripe subscription ID in my database. If creating a new record in response to customer.subscription.created fails with a unique key constraint error, I just ignore it. I know the initial record could only have been created in response to a customer.subscription.updated event.

It’s not an ideal solution, but it’s the best I could come up with. I’m watching this space for a better solution!

Here’s the high-level TypeScript code:

const stripeWebhookEventHandlers: Record<
  string,
  (e: Stripe.Event) => void | Promise<void>
> = {
  'customer.subscription.created': handleSubscriptionEvent(async event => {
    // Stripe doesn't guarantee the order in which webhooks are called, which
    // means we may receive the `.created` event AFTER receiving the `.updated`
    // event once the subscription is activated. We don't want to accidentally
    // overwrite the subscription with an outdated event. That's why we'll use
    // a `create` operation here, while using `upsert` elsewhere.
    try {
      await syncSubscriptionViaWebhook(event.data.object, 'create')
    } catch (error) {
      // if it's a unique key constraint error, it means we've already created
      // the subscription via a more up-to-date (but out-of-order) event
      if (isPrismaError(error, PrismaError.UniqueKeyConstraint)) {
        return
      }

      throw error
    }
  }),

  'customer.subscription.updated': handleSubscriptionEvent(async event => {
    await syncSubscriptionViaWebhook(event.data.object, 'upsert')
  }),

  'customer.subscription.deleted': handleSubscriptionEvent(async event => {
    await syncSubscriptionViaWebhook(event.data.object, 'upsert')
  }),
}

I think this was a little misleading as well, but each event object contains a created timestamp. Checking the events, it seems like this created field is accurate. @tomer-stripe can you confirm this?

I’m guessing the best solution is to store the created timestamp and only update if the new event happened after the last event. This can be a simple upsert-style update with a check on this column if you’re using a traditional DB like Postgres.
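
A minimal TypeScript sketch of that timestamp guard, using an in-memory map in place of the database table. The `rows` map, the `Row` shape, and `applyIfNewer` are hypothetical names for illustration; in Postgres the same check would live in the upsert's condition.

```typescript
type SubscriptionEvent = {
  created: number; // event timestamp in whole seconds
  data: { object: { id: string; status: string } };
};
type Row = { status: string; lastEventCreated: number };

// Hypothetical stand-in for a subscriptions table keyed by Stripe subscription ID.
const rows = new Map<string, Row>();

// Applies the event unless a strictly newer event was already applied.
// Returns true when the row was written.
function applyIfNewer(event: SubscriptionEvent): boolean {
  const { id, status } = event.data.object;
  const existing = rows.get(id);
  if (existing && event.created < existing.lastEventCreated) {
    return false; // stale event: keep the newer state
  }
  rows.set(id, { status, lastEventCreated: event.created });
  return true;
}
```

Because created has one-second resolution, two events with the same timestamp still fall back to last-write-wins here, so the guard only helps when the events land in different seconds.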

Calling retrieve is probably the simplest solution, though.