Enterprise SaaS Leadership Insights

The SaaS Dunning Strategy Playbook: A Framework for Recovering Failed Payments

How to structure your retry logic, timing, and communications to maximise payment recovery — without damaging customer relationships

Last Updated

February 16, 2026

Dunning is one of those operational disciplines that most SaaS businesses set up once and rarely revisit. A retry schedule is configured, a couple of payment failure emails are written, and the system runs quietly in the background. It either recovers the payment or it doesn't.

The problem with that approach is that dunning performance degrades over time without anyone noticing. Your customer base evolves. Your billing model changes. Your transaction volume grows. Retry logic configured for 5,000 monthly transactions is unlikely to produce the same recovery rate at 500,000. And because the degradation is gradual, it rarely triggers an alert.

This playbook gives you a framework to audit your current dunning setup and build something that actively manages recovery rather than passively running it.

What Dunning Actually Covers

Dunning is the full sequence of actions taken in response to a payment failure — from the first retry attempt through to the final resolution, whether that is a successful recovery, a customer-initiated card update, or an account suspension.

It has three components that need to work together:

Retry logic — the automated payment re-attempts that run after a failure, governed by timing rules, decline code routing, and processor decisions.

Customer communications — the emails, in-product notifications, and SMS messages that prompt the customer to act when a retry alone is unlikely to recover the payment.

Escalation logic — the rules that govern what happens when retries and communications fail to produce a recovery: account suspension, service restriction, final notice, write-off.

Most dunning setups handle these three components separately, with limited coordination between them. The retry schedule runs regardless of whether a communication has been sent. The communication sequence runs regardless of whether the retry succeeded. The escalation rules fire based on time elapsed rather than on what has actually happened in the recovery process.

A well-designed dunning strategy coordinates all three — retry outcomes inform communication timing, communication responses inform retry decisions, and escalation is triggered by the actual state of the recovery attempt rather than a fixed timer.
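One way to picture this coordination is as a small state machine in which retry outcomes and customer actions, rather than elapsed time, move an account between stages. The sketch below is illustrative only: the state names, event names, and the advance function are hypothetical and not taken from any particular billing system.

```python
from enum import Enum


class State(Enum):
    RETRYING = "retrying"
    AWAITING_CUSTOMER = "awaiting_customer"
    RECOVERED = "recovered"
    ESCALATED = "escalated"


# Hypothetical events. Each transition is driven by what actually happened,
# not by a fixed timer.
TRANSITIONS = {
    (State.RETRYING, "retry_succeeded"): State.RECOVERED,
    (State.RETRYING, "retries_exhausted"): State.AWAITING_CUSTOMER,
    (State.AWAITING_CUSTOMER, "card_updated"): State.RETRYING,
    (State.AWAITING_CUSTOMER, "communications_exhausted"): State.ESCALATED,
}


def advance(state: State, event: str) -> State:
    """Return the next state, or the current one if the event does not apply."""
    return TRANSITIONS.get((state, event), state)
```

Because unknown state and event pairs leave the state unchanged, a recovered account cannot be escalated by a late-arriving event.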

Part 1: Retry Logic

Separate hard from soft declines before anything else

This is the single most impactful change most businesses can make to their retry logic. Hard declines — card reported stolen, invalid account, do not retry — have near-zero recovery probability. Running them through a standard retry queue wastes retry capacity and, for compromised card numbers, risks triggering issuer-level responses that affect your broader payment acceptance.

Every payment failure should be categorised at the point of failure:

Route hard declines directly to customer communications. There is nothing a retry can do here. The customer needs to update their payment method, so send that communication immediately.

Route soft declines to the retry queue with a code-specific schedule. Soft declines are recoverable, but the timing depends on the reason.
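A minimal sketch of this routing step in Python. The code sets are illustrative assumptions based on commonly seen ISO 8583-style response codes; real codes vary by processor and card scheme. The names HARD_DECLINE_CODES, SOFT_DECLINE_CODES, and route_failure are hypothetical, as is the choice to send unrecognised codes to manual review.

```python
# Illustrative code sets only. Check your processor's documentation
# before relying on any of these mappings.
HARD_DECLINE_CODES = {"04", "07", "14", "41", "43"}  # e.g. pick up card, invalid card number, lost, stolen
SOFT_DECLINE_CODES = {"05", "51", "61", "91", "96"}  # a subset of the soft codes covered in this playbook


def route_failure(decline_code: str) -> str:
    """Classify a failure at the point of failure and return its destination."""
    code = decline_code.strip().zfill(2)  # normalise "5" to "05"
    if code in HARD_DECLINE_CODES:
        return "communications"  # no retry can recover a hard decline
    if code in SOFT_DECLINE_CODES:
        return "retry_queue"     # recoverable, on a code-specific schedule
    return "manual_review"       # assumption: unknown codes are not retried blindly
```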

Retry timing by decline reason

The following timing guidance reflects the recovery probability curves for each major soft decline category. These are the intervals at which retry attempts are most likely to succeed:

Insufficient funds (code: 51). The account temporarily lacks the funds or available credit to cover the charge. Recovery probability is highest at the start of the next billing cycle, when monthly limits reset, or shortly after typical payday dates for your customer geography (25th–1st for UK customers, for example). Retrying mid-cycle is the least productive timing. Recommended: Hold 7–10 days, then retry at the start of the next cycle.

Do not honour, unspecified (code: 05). The issuer declined without a specific reason. This covers a wide range of temporary issuer-side holds and risk flags, most of which resolve within 24–72 hours. Recommended: First retry at 24 hours. Second retry at 72 hours. Escalate to communications if both fail.

Temporary bank-side error or network issue (codes: 96, 91, 06). A system error on the issuer or network side. These resolve quickly: often within minutes, and usually within a few hours. Recommended: First retry at 1–2 hours. Second retry at 6 hours. If both fail, treat as a different failure type and re-route accordingly.

Card velocity limit exceeded (codes: 61, 65). The card has hit its daily or monthly limit. Code 61 typically signals an amount limit and code 65 a transaction count (frequency) limit, though exact mappings vary by processor. These limits reset on a known cycle, usually daily for transaction count limits and monthly for spend limits. Recommended: Retry the following day for transaction count limits. Retry at the start of the next month for spend limits.

Refer to card issuer (code: 01). The issuer wants the cardholder to call before authorising. This often resolves when the customer contacts their bank, so a communication prompting them to do so is more effective than a retry. Recommended: Send customer communication immediately. Single retry at 48 hours if no response.

Expired card (code: 54). The card on file is past its expiry date. A retry will not succeed. The customer needs to update their card details, or an automated card updater service (such as Visa Account Updater, VAU, or Mastercard Automatic Billing Updater, ABU) needs to supply the updated card details. Recommended: If card updater is running, allow 24–48 hours for it to supply updated details before retrying. If not, route directly to communications.
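The timing guidance above can be expressed as a lookup from decline code to retry offsets. This is a sketch under stated assumptions rather than a definitive schedule: RETRY_OFFSETS and next_retry_at are hypothetical names, and the calendar-aware rules (payday windows, cycle starts, monthly spend limits) are reduced to fixed intervals for brevity.

```python
from datetime import datetime, timedelta
from typing import Optional

# Offsets are measured from the original failure. Fixed intervals follow
# the guidance above. Calendar-aware rules are simplified.
RETRY_OFFSETS = {
    "51": [timedelta(days=7)],                         # simplification: lower bound of the 7-10 day hold
    "05": [timedelta(hours=24), timedelta(hours=72)],
    "96": [timedelta(hours=2), timedelta(hours=6)],
    "91": [timedelta(hours=2), timedelta(hours=6)],
    "06": [timedelta(hours=2), timedelta(hours=6)],
    "61": [timedelta(days=1)],                         # transaction count limit case only
    "01": [timedelta(hours=48)],                       # single retry, after the customer has been contacted
    "54": [],                                          # expired card: no blind retry
}


def next_retry_at(decline_code: str, failed_at: datetime, retries_made: int) -> Optional[datetime]:
    """Return when the next retry is due, or None if the schedule is exhausted."""
    offsets = RETRY_OFFSETS.get(decline_code, [])  # unknown codes get no retries
    if retries_made >= len(offsets):
        return None
    return failed_at + offsets[retries_made]
```

Returning None is the signal to hand the failure over to communications or escalation instead of retrying again.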

Retry attempt limits

Three attempts is the most common default. It is also arbitrary. The appropriate number of retry attempts depends on the failure type and the recovery probability curve:

  • Hard declines: zero retries

  • Expired card: zero retries (pending card updater or customer action)

  • Temporary errors: two retries within the first 12 hours, then stop

  • Insufficient funds: two retries at the timing intervals above, then escalate

  • Do not honour: two retries over 72 hours (at 24 and 72 hours), then escalate

More retries do not produce proportionally more recoveries. Beyond two or three attempts, the marginal recovery rate drops sharply while the risk of triggering issuer-level responses or customer complaints increases.
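As a rough illustration, the limits above can be held in a per-category table in place of a single global retry count. The category names and the can_retry helper are hypothetical, and treating unknown categories as non-retryable is an assumption made here for safety.

```python
# Retry caps per failure category, taken from the list above.
MAX_RETRIES = {
    "hard_decline": 0,
    "expired_card": 0,        # wait for the card updater or customer action instead
    "temporary_error": 2,     # both within the first 12 hours
    "insufficient_funds": 2,
    "do_not_honour": 2,       # two here, matching the 24 hour and 72 hour schedule
}


def can_retry(category: str, retries_made: int) -> bool:
    """True if another automated retry is allowed for this failure category."""
    limit = MAX_RETRIES.get(category, 0)  # assumption: unknown categories are never retried
    return retries_made < limit
```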

Multi-processor fallback

For businesses running more than one payment processor, retry logic should include a processor routing decision. A transaction declined by your primary processor on a soft decline code should be routed to a fallback processor on the second attempt, for the subset of failures where the decline pattern is consistent with a processor-level acceptance rate issue rather than a genuine card problem.

This is most relevant for specific card type and geography combinations where processor acceptance rates diverge materially. If you are running a single processor, this option is unavailable — which is itself a dunning constraint worth noting.
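A sketch of how the routing decision might look, assuming processors are supplied in priority order. Which decline codes point to a processor-level acceptance issue is not something this playbook specifies, so the default fallback_codes set below is a placeholder assumption, and pick_processor is a hypothetical helper.

```python
from typing import FrozenSet, Sequence


def pick_processor(
    attempt: int,
    decline_code: str,
    processors: Sequence[str],
    fallback_codes: FrozenSet[str] = frozenset({"05", "91", "96"}),  # placeholder assumption
) -> str:
    """Choose a processor for this attempt.

    attempt is 1-based, where 1 is the original charge. From the second
    attempt onwards, eligible soft declines go to the first fallback processor.
    """
    if not processors:
        raise ValueError("at least one processor is required")
    if attempt >= 2 and len(processors) > 1 and decline_code in fallback_codes:
        return processors[1]
    return processors[0]  # single-processor stacks always land here
```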

Part 2: Customer Communications

The post-failure window

There is a short window immediately after a payment failure during which a customer is most likely to update their payment details without additional prompting. They are in your product, aware something went wrong, and not yet disconnected from the intent to continue.

This window is typically minutes to a few hours. Standard dunning email sequences — sent on a fixed schedule, often 24–72 hours after the failure — miss it entirely.

The most effective recovery sequences trigger an in-product prompt immediately at the point of failure: a payment update banner, a card update modal, or a contextual notification in the product UI. This captures a cohort of self-service recoveries before the email sequence even begins.

Communication sequence structure

A well-designed dunning communication sequence has three phases:

Phase 1: Immediate (0–2 hours after failure). In-product prompt if the customer is in session. Automated payment update notification if they are not. Tone: factual and helpful. Do not imply fault. "We were unable to process your payment. Here's how to update your card details."

The link in this communication should go directly to a card update flow, not to the homepage or account settings. Every additional step between the customer and updating their card reduces completion rate.

Phase 2: Follow-up (24–72 hours after failure). If Phase 1 produced no action and the retry has not succeeded, a follow-up email. This communication can be slightly more direct about the consequence (service interruption if not resolved) without being aggressive. Include the specific card that failed (last four digits), the amount that could not be processed, and a direct link to the update flow.

Subject lines that reference the specific product or amount ("Your Chargehive subscription — payment update needed") consistently outperform generic subjects ("Action required on your account").

Phase 3: Final notice (5–7 days after failure). If Phase 2 produced no action and retries are exhausted, a final notice before service suspension. This communication should be clear about what will happen and when, for example "Your account will be suspended on [date] unless your payment details are updated", and should include a prominent, single call to action.

After Phase 3, the account enters the escalation stage.
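The three phases can be sketched as a function that decides which communication, if any, is due. The start hours use the lower bound of each window above. The phase names, the next_communication function, and its parameters are hypothetical. The key point is that the function checks recovery state and retry state before sending anything.

```python
from typing import Optional, Set

# (phase name, earliest hour after failure), in sending order.
PHASES = [("immediate", 0), ("follow_up", 24), ("final_notice", 120)]


def next_communication(
    hours_since_failure: float,
    recovered: bool,
    retries_exhausted: bool,
    already_sent: Set[str],
) -> Optional[str]:
    """Return the next phase to send now, or None if nothing is due."""
    if recovered:
        return None  # a successful retry or card update ends the sequence
    for name, start_hour in PHASES:
        if name in already_sent:
            continue
        if hours_since_failure < start_hour:
            return None  # next phase is not due yet
        if name == "final_notice" and not retries_exhausted:
            return None  # hold the final notice while retries are still pending
        return name
    return None  # all phases sent: the account moves to escalation
```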

Segmenting communications by customer profile

Not all customers should receive the same dunning sequence. Two variables that should govern segmentation:

Customer tenure and payment history. A customer with three years of uninterrupted payments is almost certainly dealing with a temporary card issue. A more patient, lower-urgency sequence with a longer window before escalation is appropriate. A newer customer, or one with a history of payment issues, warrants a more direct sequence with a shorter escalation timeline.

Subscription value. High-value subscribers warrant higher-touch intervention — in some cases, a direct outreach from the account team rather than an automated sequence. The economics of a manual intervention are justified at enterprise or high-ACV subscription levels.

Communication tone

Dunning communications have a reputation for being adversarial. The best-performing sequences are not. They are factual, specific, and frame the situation as a problem to be solved rather than a failure to be addressed.

The customer receiving a dunning email did not set out to not pay. In the majority of cases, it is a card issue they were unaware of. Communications that treat them accordingly — informative, easy to act on, not guilt-inducing — produce better response rates and preserve the relationship for customers who do re-engage.

Part 3: Escalation Logic

What escalation means

Escalation is the set of rules that govern what happens when the dunning sequence does not produce a recovery. It should be designed deliberately rather than defaulting to immediate account suspension.

The options at escalation are:

Service restriction. Limit access to the product rather than suspending it entirely. The customer can log in and see their data but cannot use the core functionality. This maintains the relationship while making the payment situation visible. Some businesses find that service restriction produces more recoveries than suspension, because the customer engages with the product and encounters the restriction naturally.

Account suspension. Full suspension of access. Appropriate for shorter grace periods or lower-value subscriptions where the cost of carrying non-paying accounts is significant.

Extended grace period. For high-tenure, high-value customers, an extended grace period before escalation — 14–21 days rather than 7 — gives more time for the customer to act without the friction of a suspension and reactivation cycle.

Write-off and win-back. For accounts that do not recover through the dunning sequence, a win-back flow — a communication sent 30–60 days after suspension offering a path back — recovers a small but non-trivial proportion of churned subscribers.

Defining the escalation timeline

The escalation timeline should be defined by subscription value and customer tenure, not by a single fixed rule:

| Customer profile | Grace period | Escalation action |
| --- | --- | --- |
| New subscriber, standard plan | 7 days | Service restriction, then suspension at day 14 |
| Established subscriber, standard plan | 14 days | Service restriction, then suspension at day 21 |
| High-value or enterprise subscriber | 21 days | Account team outreach, extended restriction |
| Repeated payment failures (3+ in 12 months) | 7 days | Suspension, targeted win-back sequence |

These are starting points, not fixed rules. Calibrate against your own churn data — specifically, what proportion of suspended accounts reactivate within 30 days, and how that varies by segment.
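For illustration, the starting-point timeline can be encoded as a policy lookup. Several details here are assumptions and not part of the playbook: the 12-month threshold for "established", the precedence when a customer matches more than one profile (high-value is checked first), and the names EscalationPolicy and escalation_policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EscalationPolicy:
    grace_days: int
    action: str
    suspend_on_day: Optional[int]  # None where the timeline gives no suspension day


def escalation_policy(
    is_high_value: bool,
    tenure_months: int,
    failures_last_12_months: int,
    established_after_months: int = 12,  # assumption: threshold for "established"
) -> EscalationPolicy:
    """Map a customer profile to the starting-point timeline above."""
    if is_high_value:  # assumption: high-value takes precedence over other profiles
        return EscalationPolicy(21, "account team outreach, extended restriction", None)
    if failures_last_12_months >= 3:
        return EscalationPolicy(7, "suspension, targeted win-back sequence", 7)
    if tenure_months >= established_after_months:
        return EscalationPolicy(14, "service restriction", 21)
    return EscalationPolicy(7, "service restriction", 14)
```

Keeping the values in one place makes them easier to recalibrate as reactivation data by segment comes in.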

Part 4: Measuring Dunning Performance

A dunning strategy that is not being measured is not being managed. These are the metrics that tell you whether the strategy is working:

Gross recovery rate: The percentage of payment failures that are eventually recovered — through any combination of retry, customer-initiated card update, or dunning communication response. Benchmark: 55–70% for a well-operated SaaS business.

Recovery rate by channel: What proportion of recoveries came from successful retries vs. customer-initiated card updates vs. direct communication response? This breakdown tells you which element of the dunning sequence is doing the work — and which is not.

Recovery rate by decline code: Are you recovering a higher proportion of insufficient funds failures than do-not-honour failures? If not, your timing logic may not be calibrated by decline reason.

Time to recovery: How long does the average recovery take from failure to successful payment? Shorter is better — longer recovery windows mean longer periods of service disruption and higher churn risk.

Churn rate post-dunning: What proportion of accounts that enter the dunning sequence eventually churn, regardless of payment recovery? A high post-dunning churn rate, even among accounts that recover payment, can indicate that the communications or the escalation experience is damaging the relationship.

Dunning sequence open and click rates: For the communication phase, standard email engagement metrics tell you whether your sequence is reaching and prompting customers. Low open rates suggest deliverability or subject line issues. Low click rates on a high open rate suggest friction in the payment update flow.
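The first three metrics can be computed from a simple list of failure records. A minimal sketch, assuming each record carries the decline code and the channel that recovered it, if any; FailureRecord and the function names are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FailureRecord:
    decline_code: str
    recovered_via: Optional[str]  # "retry", "card_update", "communication", or None


def gross_recovery_rate(records: List[FailureRecord]) -> float:
    """Share of all failures that were eventually recovered by any channel."""
    if not records:
        return 0.0
    recovered = sum(1 for r in records if r.recovered_via is not None)
    return recovered / len(records)


def recovery_share_by_channel(records: List[FailureRecord]) -> Dict[str, float]:
    """Of the recovered failures, the share attributable to each channel."""
    channels = Counter(r.recovered_via for r in records if r.recovered_via is not None)
    total = sum(channels.values())
    return {ch: n / total for ch, n in channels.items()} if total else {}


def recovery_rate_by_code(records: List[FailureRecord]) -> Dict[str, float]:
    """Recovery rate for each decline code."""
    totals = Counter(r.decline_code for r in records)
    recovered = Counter(r.decline_code for r in records if r.recovered_via is not None)
    return {code: recovered[code] / totals[code] for code in totals}
```

Comparing the per-code rates side by side is a quick way to check whether retry timing is actually calibrated by decline reason.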

The Data Infrastructure Requirement

The playbook above describes what a well-designed dunning strategy looks like. Implementing it requires that the systems governing each component can communicate with each other.

Decline code-based retry routing requires that the retry logic has access to the decline code from the PSP in real time. Customer-segment-based communication sequencing requires that the communication platform has access to tenure and payment history from the CRM and billing system. Escalation logic calibrated by subscription value requires that the escalation rules have access to ACV data.

In stacks where these systems are disconnected — where the retry logic runs in the billing system, the communications are sent from the email platform, and the customer data lives in the CRM — the coordination required to run a proper dunning strategy is manual, fragile, and does not scale.

This is the architectural reality underneath the operational playbook. The strategy is only as good as the data connections that make it executable.

→ See how Chargehive coordinates retry logic, communications, and customer data in a single operational layer: Billing

→ Read next: How to Reduce Involuntary Churn in SaaS: The Operational Playbook

©Chargehive 2026