The Order That Existed and Didn't Exist at the Same Time

Here's a fun one from last December. A client's Shopify store processed an order. Stripe charged the card. The customer got a confirmation email from Stripe. But the order never showed up in the client's fulfillment system. No shipping label. No inventory update. The customer waited a week, emailed support, and learned their order was basically lost in the void.

The cause? A webhook from Shopify to their fulfillment app failed on the first attempt (the receiving server was momentarily overloaded), and the retry mechanism wasn't configured. One failed HTTP request and the entire order fulfillment chain broke. That's what happens when webhook retry monitoring doesn't exist.

Why Webhooks Fail More Than You'd Think

Webhooks are close to fire-and-forget HTTP requests. Your e-commerce platform sends a POST request to another system saying "hey, an order just happened." If the receiving system is down, slow, or returns an error, that webhook either gets retried or gets lost. Which one happens depends entirely on how the sending platform handles failures.

Shopify has documented retrying failed webhooks up to 19 times over 48 hours (the exact schedule has changed over time, so check the current docs). That sounds generous. But if your receiving endpoint has a misconfigured SSL certificate or returns a 401 because someone rotated an API key, every single retry will fail too. And once the retries are exhausted, Shopify gives up and can delete your webhook subscription entirely. Now you've got no data flowing at all and no alert to tell you it stopped.

I've seen this play out with Stripe webhooks, HubSpot webhooks, Zapier webhooks, and custom integrations. The failure pattern is always the same: something changes on the receiving end, retries exhaust, and data stops flowing silently.
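One way to tell a momentary blip from the "something changed on the receiving end" pattern is to look at the run of failures at the end of the delivery history. The sketch below is a rough heuristic, not any platform's logic: the status set, the `min_streak` threshold, and the function names are all our own assumptions.

```python
from typing import Optional, Sequence

# Assumption: these statuses usually mean something changed on the receiving
# end (rotated key, moved route), so retrying the same request will not help.
CONFIG_ERROR_STATUSES = frozenset({401, 403, 404, 410})


def is_success(status: Optional[int]) -> bool:
    """A delivery counts as successful only on a 2xx response."""
    return status is not None and 200 <= status < 300


def classify_failures(statuses: Sequence[Optional[int]], min_streak: int = 3) -> str:
    """Classify recent delivery attempts for one endpoint, oldest first.

    Returns "ok", "transient", or "persistent". A status of None means no
    HTTP response came back (timeout, refused connection, TLS failure).
    """
    # Count the unbroken run of failures at the end of the history.
    streak = 0
    for status in reversed(statuses):
        if is_success(status):
            break
        streak += 1
    if streak == 0:
        return "ok"

    trailing = statuses[len(statuses) - streak:]
    # A long run made up only of config-style errors will not fix itself.
    if streak >= min_streak and all(s in CONFIG_ERROR_STATUSES for s in trailing):
        return "persistent"
    return "transient"
```

With these assumptions, `classify_failures([200, 401, 401, 401])` comes back as "persistent", while a single 503 after a success is "transient". TLS failures show up as `None` here and stay "transient", so a bad certificate would need its own check.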

What Webhook Retry Monitoring Needs to Cover

Good webhook retry monitoring watches three things:

  • Delivery confirmation. Did the webhook reach the destination and get a 2xx response?
  • Retry patterns. Are webhooks being retried? How often? What's the failure rate?
  • Flow continuity. Is data still moving between systems? If you normally get 50 webhook events per hour and suddenly get zero, something's wrong.
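To make the first two checks concrete, here is a minimal sketch that summarizes a batch of delivery attempts. It assumes you can export attempts as (event ID, status) pairs from your platform's logs; the `Attempt` record and `summarize` function are hypothetical names, not part of any platform's API.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Attempt:
    event_id: str           # hypothetical ID shared by every retry of one event
    status: Optional[int]   # HTTP status, or None if no response came back


def _ok(status: Optional[int]) -> bool:
    """Only a 2xx response counts as a confirmed delivery."""
    return status is not None and 200 <= status < 300


def summarize(attempts: Iterable[Attempt]) -> dict:
    """Group attempts by event and report delivery and retry figures."""
    by_event: Dict[str, List[Attempt]] = defaultdict(list)
    for attempt in attempts:
        by_event[attempt.event_id].append(attempt)

    total = sum(len(group) for group in by_event.values())
    failed = sum(
        1 for group in by_event.values() for a in group if not _ok(a.status)
    )
    return {
        "events": len(by_event),
        "attempts": total,
        # Share of all attempts that did not get a 2xx.
        "failure_rate": failed / total if total else 0.0,
        # Events that needed more than one attempt.
        "retried": sorted(e for e, g in by_event.items() if len(g) > 1),
        # Events where no attempt ever succeeded.
        "undelivered": sorted(
            e for e, g in by_event.items() if not any(_ok(a.status) for a in g)
        ),
    }
```

The "undelivered" list is the one to alert on first: those are events that never got a successful response at all. Flow continuity needs timestamps, so it is not covered by this summary.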

Most platforms have webhook logs. Shopify shows webhook delivery attempts in Settings > Notifications. Stripe has a webhook events dashboard with retry status. But nobody sits there watching webhook logs all day. You need automated monitoring that alerts you when delivery rates drop or retries spike.

A Practical Webhook Monitoring Setup

Here's what we run for clients with critical webhook integrations:

First, we monitor the receiving endpoint independently. A simple HTTP check every 5 minutes confirms the endpoint is reachable and returning 200s. If the endpoint goes down, we know within minutes, ideally before more than a handful of webhooks have failed.
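A check like this can be very small. The sketch below uses only Python's standard library and assumes the receiver exposes a URL that answers a plain GET with a 2xx (many webhook routes only accept POST, so a separate health path is common). The function name, the `CheckResult` record, and the injectable `opener` parameter are our own illustration; scheduling it every 5 minutes is left to cron or whatever scheduler you already run.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    healthy: bool
    status: Optional[int]   # HTTP status, or None if there was no response
    detail: str


def check_endpoint(
    url: str,
    timeout: float = 5.0,
    opener: Callable = urllib.request.urlopen,  # swappable for testing
) -> CheckResult:
    """Send one GET and report whether the endpoint answered with a 2xx."""
    request = urllib.request.Request(url, method="GET")
    try:
        with opener(request, timeout=timeout) as response:
            status = response.status
            return CheckResult(200 <= status < 300, status, f"HTTP {status}")
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        return CheckResult(False, exc.code, f"HTTP {exc.code}")
    except OSError as exc:
        # Covers URLError, timeouts, refused connections, and TLS failures.
        return CheckResult(False, None, f"unreachable: {exc}")
```

Anything with `healthy` set to false goes to whatever alerting channel you already use. Keeping the "no response" case separate from the "bad status" case helps, because an expired certificate and a rotated API key need different fixes.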

Second, we track webhook volume. We log incoming webhook events and set up alerts for anomalies. If a client normally receives 30-50 order webhooks per day and we see zero for 6 hours during their normal selling hours, that triggers an alert. Tools like Datadog or even a simple log aggregator can handle this.
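The core of a silence alert is just "how long since the last event?" Here is a minimal sketch, assuming you already have the timestamps of received webhooks from your logs. The function name and the 6-hour default are illustrative, and the timestamps and `now` must be either all naive or all timezone-aware.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional


def silence_alert(
    event_times: Iterable[datetime],
    now: datetime,
    max_silence: timedelta = timedelta(hours=6),
) -> Optional[str]:
    """Return an alert message if webhooks have gone quiet, else None."""
    latest = max(event_times, default=None)
    if latest is None:
        return "no webhook events recorded at all"

    gap = now - latest
    if gap > max_silence:
        return f"no webhook events for {gap}"
    return None
```

For a low-volume store, a fixed threshold will fire overnight, so in practice you would either run the check only during selling hours or tune `max_silence` per client.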

Third, we run reconciliation checks. Once a day, we compare orders in the e-commerce platform against orders in the fulfillment system. Any mismatches get flagged for investigation. This catches the webhooks that failed silently and slipped through every other check.
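At its simplest, the reconciliation is a set difference on order IDs. The sketch below assumes both systems can export the day's order IDs under a shared identifier; the names are hypothetical, and a real check would also need to allow for orders still in flight.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ReconciliationReport:
    missing_in_fulfillment: List[str]   # paid for, but never reached fulfillment
    unknown_to_platform: List[str]      # in fulfillment, but not in the store

    @property
    def clean(self) -> bool:
        return not (self.missing_in_fulfillment or self.unknown_to_platform)


def reconcile(
    platform_order_ids: Iterable[str],
    fulfillment_order_ids: Iterable[str],
) -> ReconciliationReport:
    """Compare order IDs from both systems and list the mismatches."""
    platform = set(platform_order_ids)
    fulfillment = set(fulfillment_order_ids)
    return ReconciliationReport(
        missing_in_fulfillment=sorted(platform - fulfillment),
        unknown_to_platform=sorted(fulfillment - platform),
    )
```

The `missing_in_fulfillment` list is the one that maps to lost webhooks: each ID there is a customer who paid and is waiting on a shipment nobody knows about.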

Don't Wait for a Customer to Tell You

Webhook failures are invisible to your marketing team, invisible to your analytics, and invisible to your monitoring unless you're looking for them on purpose. The customer who paid but never got their order is the worst kind of bug report.

We built webhook flow monitoring into FunnelLeaks because this class of problem sits right at the intersection of marketing and operations. Your funnel doesn't end at the thank-you page. It ends when the customer gets what they paid for. If webhooks are the glue holding your systems together, webhook retry monitoring is what makes sure that glue doesn't silently fail.

Check your webhook logs this week. See how many retries you've had in the last 30 days. If the number surprises you, get proper monitoring set up before the next failure costs you a customer.