How to evaluate a cold email deliverability agency (and avoid the ones that cost you three months of pipeline)
Choosing a cold email deliverability agency? Here's what to actually check: bounce rates, reply rates, infrastructure setup, and red flags most buyers miss.
How to evaluate a cold email deliverability agency
Deliverability is the single variable that decides whether your cold email program generates pipeline or burns your sending domains into the ground. Most agencies will tell you they "handle deliverability" without being able to explain what that means in practice, which makes choosing one genuinely difficult.
This page breaks down what good deliverability work actually looks like, what to ask any agency before you sign, and where the market currently overcharges and underdelivers.
Why most advice on deliverability is wrong
Search "cold email deliverability agency" and you'll find listicles built around open rates. Agency A claims 60% open rates. Agency B claims 72%. The problem: open rates have been meaningless since Apple's Mail Privacy Protection (MPP) launched in September 2021. Apple's mail client prefetches images, which fires your tracking pixel regardless of whether the recipient ever reads the email. You're measuring Apple's servers, not human eyeballs.
We stopped reporting open rates to clients the day MPP went live. The real signals are bounce rate (keep it below 2% on any sending domain) and positive reply rate, defined as the percentage of contacted accounts that reply with genuine buying interest. Those two numbers tell you whether your emails are landing and whether the message is working. Everything else is noise.
The mistake I see most often is founders hiring an agency because it quoted a strong open rate without ever asking what the bounce rate or reply rate looked like. One number is broken by design; the other two are what drive revenue.
What a deliverability agency actually controls
Deliverability isn't one thing. It's a stack of decisions, each with its own failure mode. Here's what competent agencies manage, and what each layer costs when it breaks.
Domain and mailbox infrastructure
A properly built outbound infrastructure uses secondary sending domains, never your primary business domain. Each domain needs correct SPF, DKIM, and DMARC records. Mailboxes on those domains need a warmup period of at least three to four weeks before sending at volume. Skipping warmup is the fastest way to land in spam folders and trigger Google and Microsoft's bulk sender filters.
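As a rough illustration of the DNS checks described above, the sketch below inspects TXT record strings you have already looked up (for example with dig) and confirms each one has the expected basic shape. The function names and sample records are illustrative assumptions, and the checks are deliberately shallow: they confirm a record looks well-formed, not that the policy behind it is a good one.

```python
def looks_like_spf(txt: str) -> bool:
    """SPF records are TXT records that begin with the version tag v=spf1."""
    return txt.strip().lower().startswith("v=spf1")


def parse_tags(txt: str) -> dict[str, str]:
    """Split a 'tag=value; tag=value' record (DKIM, DMARC) into a dict."""
    tags = {}
    for part in txt.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def looks_like_dkim(txt: str) -> bool:
    """A usable DKIM key record carries a non-empty public key in its p= tag."""
    return bool(parse_tags(txt).get("p"))


def looks_like_dmarc(txt: str) -> bool:
    """DMARC records start with v=DMARC1 and declare a policy in the p= tag."""
    policy = parse_tags(txt).get("p", "").lower()
    starts_ok = txt.strip().upper().startswith("V=DMARC1")
    return starts_ok and policy in {"none", "quarantine", "reject"}


# Made-up sample records for a hypothetical sending domain.
records = {
    "spf": "v=spf1 include:_spf.example.com ~all",
    "dkim": "v=DKIM1; k=rsa; p=MIIBIjANBgkqh",
    "dmarc": "v=DMARC1; p=quarantine; rua=mailto:dmarc@example.com",
}

print(looks_like_spf(records["spf"]),
      looks_like_dkim(records["dkim"]),
      looks_like_dmarc(records["dmarc"]))
```

A check like this is only a first pass. It is worth asking an agency how they confirm that the records actually align with the mailbox provider they send through.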
Proper infrastructure takes time and adds tooling cost. Expect to pay $15 to $30 per month per mailbox in combined hosting and warmup-tool fees before you send a single email. An agency that sets up five sending domains with three mailboxes each is running 15 mailboxes. That's $225 to $450 per month in infrastructure alone. Competent agencies build this cost into their program rather than hiding it.
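The arithmetic above is easy to rerun for your own setup. This is a minimal sketch using the $15 to $30 per-mailbox range quoted in the text; the function name and parameters are illustrative assumptions, not part of any tool.

```python
def monthly_infrastructure_cost(
    domains: int,
    mailboxes_per_domain: int,
    cost_per_mailbox_low: float = 15.0,   # low end of the range quoted above
    cost_per_mailbox_high: float = 30.0,  # high end of the range quoted above
) -> tuple[int, float, float]:
    """Return (mailbox count, low monthly cost, high monthly cost)."""
    mailboxes = domains * mailboxes_per_domain
    return (
        mailboxes,
        mailboxes * cost_per_mailbox_low,
        mailboxes * cost_per_mailbox_high,
    )


mailboxes, low, high = monthly_infrastructure_cost(domains=5, mailboxes_per_domain=3)
print(f"{mailboxes} mailboxes: ${low:,.0f} to ${high:,.0f} per month")
# prints "15 mailboxes: $225 to $450 per month"
```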
List hygiene and bounce control
Bounce rate is a direct deliverability signal. Google and Microsoft track it. If your domain consistently generates hard bounces above 2%, inbox placement degrades quickly, and recovery can take 60 to 90 days of reduced volume before the domain regains trust.
Good agencies verify contact lists before sending, typically through a combination of tools like ZeroBounce or NeverBounce, plus a catch-all verification pass. The catch-all problem is real: a domain that accepts all email regardless of whether the mailbox exists will pass basic verification but still bounce when you send. Agencies that don't account for this will routinely burn domains every 90 days and replace them, which looks like "normal churn" until you understand what's happening.
For a European print-on-demand marketplace where we run US-targeted outbound, we maintain bounce rate below 1.4% across roughly 3,000 contacts per month by running a three-step verification pass on every list before any send goes out. That verification step alone drops roughly 8 to 12% of raw leads from the send list each cycle.
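To make the verification step concrete, here is a minimal sketch of filtering a list after a verification tool has labeled each contact. The status labels ("valid", "catch_all", "invalid") and the contact structure are hypothetical; real verifiers use their own vocabularies, so you would map their output onto something like this. The sketch assumes the conservative choice of holding back catch-all addresses unless a separate pass confirms them.

```python
from collections import Counter

# Hypothetical status labels; map your verifier's real output onto these.
SENDABLE_STATUSES = {"valid"}


def filter_sendable(contacts: list[dict]) -> tuple[list[dict], Counter]:
    """Keep contacts verified as valid; count what was dropped, by status."""
    kept = [c for c in contacts if c["status"] in SENDABLE_STATUSES]
    dropped = Counter(
        c["status"] for c in contacts if c["status"] not in SENDABLE_STATUSES
    )
    return kept, dropped


def drop_rate(total: int, kept: int) -> float:
    """Share of the raw list removed by verification (0.0 for an empty list)."""
    return 0.0 if total == 0 else (total - kept) / total


# Made-up sample data.
contacts = [
    {"email": "a@example.com", "status": "valid"},
    {"email": "b@example.com", "status": "valid"},
    {"email": "c@example.org", "status": "catch_all"},
    {"email": "d@example.net", "status": "invalid"},
]

kept, dropped = filter_sendable(contacts)
print(len(kept), dict(dropped), drop_rate(len(contacts), len(kept)))
```

Tracking the drop rate per list is useful on its own: a sudden jump usually points to a data source that has gone stale.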
Sending volume and rotation logic
Fresh domains and mailboxes can't send 200 emails per day from day one. A reasonable ramp looks like this: weeks one and two, 20 to 30 emails per day per mailbox. Weeks three and four, 40 to 50. Full volume, typically 80 to 100 per day per mailbox, only after the warmup window closes. Agencies that skip this because a client wants fast volume are trading short-term activity for long-term domain health.
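The ramp above can be written down as a simple lookup, which makes it harder to quietly skip when a client pushes for volume. This sketch assumes weeks are counted from 1 and that the warmup window closes after week four; both are assumptions for illustration, as is the function name.

```python
def daily_send_cap(week: int) -> tuple[int, int]:
    """Return the (low, high) emails-per-day range per mailbox for a given week."""
    if week < 1:
        raise ValueError("week is counted from 1")
    if week <= 2:
        return (20, 30)   # weeks one and two
    if week <= 4:
        return (40, 50)   # weeks three and four
    return (80, 100)      # full volume, assumed to start in week five


for week in (1, 3, 5):
    low, high = daily_send_cap(week)
    print(f"week {week}: {low} to {high} emails per day per mailbox")
```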
We track spam placement rate using inbox placement tests, typically running a test batch through seed-list tools before each new campaign phase. If placement drops below 90% inbox on a given domain, that domain gets pulled back to reduced volume while we diagnose the cause. The cause is almost always one of three things: sending too fast, a list quality problem, or a copy trigger hitting spam filters.
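The pull-back rule described above amounts to a threshold check on seed-list results. A minimal sketch, assuming you record for each sending domain how many seed emails were sent and how many reached the inbox; the domain names and counts are made up, and the function names are illustrative.

```python
INBOX_PLACEMENT_FLOOR = 0.90  # the 90% threshold described above


def inbox_placement(inbox_hits: int, seeds_sent: int) -> float:
    """Fraction of seed emails that reached the inbox."""
    if seeds_sent <= 0:
        raise ValueError("seeds_sent must be positive")
    return inbox_hits / seeds_sent


def domains_to_throttle(results: dict[str, tuple[int, int]]) -> list[str]:
    """Return domains whose inbox placement fell below the floor."""
    return [
        domain
        for domain, (inbox_hits, seeds_sent) in results.items()
        if inbox_placement(inbox_hits, seeds_sent) < INBOX_PLACEMENT_FLOOR
    ]


# Made-up seed test results: domain -> (inbox hits, seeds sent).
seed_results = {
    "send-one.example": (47, 50),
    "send-two.example": (41, 50),
}

print(domains_to_throttle(seed_results))
```

The check only tells you which domain to pull back. Working out whether the cause is speed, list quality, or copy is still a manual diagnosis.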
Copy and spam filter interaction
Deliverability isn't purely technical. Copy matters. Certain phrases, HTML-heavy formatting, and tracking link density all affect whether Gmail's filters classify your email as promotional or spam. We strip click-tracking links from outbound emails in almost every program we run. The data those links produce isn't reliable anyway since it conflates link-preview bots with actual clicks, and the links themselves add a redirect hop that some filters flag.
Plain-text or near-plain-text emails, written to sound like a genuine one-to-one message, outperform heavily formatted templates on deliverability and reply rate. This isn't a stylistic preference. It's a pattern we've seen hold across more than 40 retainer engagements, across industries from B2B SaaS to promotional products to apparel.
What to ask before hiring a cold email deliverability agency
Here are the five questions I'd ask any agency in a first call, and what their answers tell you.
What bounce rate do you maintain, and how do you verify lists before sending? If they quote a number above 2% or can't describe their verification process, walk away. A competent agency can tell you their average bounce rate across active programs without hesitation.
How do you monitor inbox placement? The honest answer involves inbox placement tests run against seed lists. "We monitor open rates" is the wrong answer for reasons already covered above.
What does your domain and mailbox setup look like? They should describe secondary domains, SPF, DKIM, and DMARC records on each one, and a warmup protocol. If they're sending from your primary domain, that's a red flag.
What's your average positive reply rate across active programs? Anything above 2% on a well-targeted list is solid. Above 3% is good. Below 1% on a properly built list suggests a copy or targeting problem, not just a deliverability problem.
Do you use tracking links in the emails you send? If yes, ask how they account for the deliverability impact. Good agencies have a clear position on this, not a non-answer.
What cold email deliverability agency retainers actually cost
Most cold email agencies in this space charge between $4,000 and $8,000 per month for a full-service retainer that includes deliverability infrastructure, list building, copywriting, and ongoing optimization. At the lower end of that range, you're usually getting a smaller team, fewer contacts per month (typically 1,000 to 2,000), and less senior copy involvement. At the higher end, volume scales up and you should expect dedicated account management and weekly reporting on the metrics that matter.
Deliverability-only services, where an agency handles infrastructure and list hygiene but not strategy or copy, run $1,500 to $3,000 per month. That's a reasonable split if you have in-house writers who understand cold email copy but lack the technical setup expertise. The catch: if your copy is what's suppressing replies, a deliverability-only vendor won't catch it.
One-time domain and mailbox setup, without ongoing management, typically runs $500 to $1,500 depending on the number of domains. That's fine for teams that can manage ongoing operations internally, but most founders underestimate how much ongoing maintenance the infrastructure actually needs: monitoring bounce rates, rotating domains when signals degrade, and re-verifying lists as contact data ages.
If you want to understand what a full outbound program looks like before pricing conversations, the cold email agency guide on this site covers the full scope of what agencies do and how to evaluate them end to end.
The deliverability metrics that actually predict outcomes
Let's lay out the tracking framework clearly, because this is where most buyers get confused by what agencies report.
Bounce rate: should stay below 2% per sending domain. Above that threshold, inbox placement starts to degrade within two to three weeks of sustained sending. Below 1% is the target for well-verified lists.
Positive reply rate: the north-star metric. This is the percentage of contacted accounts that reply with genuine buying interest, not auto-replies, not unsubscribes, not "wrong person" redirects. A well-run program targeting a qualified list should sit between 1.5% and 3.5% positive reply rate on campaigns that have been iterated at least once.
Meetings booked per 1,000 contacts: a useful benchmark across programs. Across the programs we run, this typically lands between 4 and 12 meetings per 1,000 contacts, depending on the market, offer specificity, and list quality. US markets for B2B SaaS tend toward the higher end. Broader outreach with a less defined ICP sits at the lower end.
Spam placement rate: measured through inbox placement tests, not inferred from reply rates. A domain can generate replies while still having 20% of sends going to spam, which means you're leaving pipeline on the table without knowing it.
Open rates are not on this list. They haven't been a reliable signal since September 2021, and any agency leading with open rate benchmarks is either uninformed or hoping you are.
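The four metrics above reduce to simple ratios, and it helps to compute them the same way every reporting period. The sketch below is one way to do that; the field names are hypothetical, the sample numbers are made up, and the choice of denominators (emails sent for bounces, accounts contacted for replies and meetings, seed emails for spam placement) is an assumption you should confirm against how your agency reports.

```python
from dataclasses import dataclass


@dataclass
class CampaignStats:
    emails_sent: int
    hard_bounces: int
    accounts_contacted: int
    positive_replies: int    # genuine buying interest only
    meetings_booked: int
    seed_emails_sent: int    # inbox placement test volume
    seed_emails_in_spam: int


def pct(part: int, whole: int) -> float:
    """Percentage of part in whole; 0.0 when there is no data yet."""
    return 0.0 if whole == 0 else 100.0 * part / whole


def summarize(stats: CampaignStats) -> dict[str, float]:
    return {
        "bounce_rate_pct": pct(stats.hard_bounces, stats.emails_sent),
        "positive_reply_rate_pct": pct(stats.positive_replies, stats.accounts_contacted),
        "meetings_per_1000": 10.0 * pct(stats.meetings_booked, stats.accounts_contacted),
        "spam_placement_pct": pct(stats.seed_emails_in_spam, stats.seed_emails_sent),
    }


# Made-up numbers for illustration only.
example = CampaignStats(
    emails_sent=3000, hard_bounces=36,
    accounts_contacted=1000, positive_replies=21, meetings_booked=8,
    seed_emails_sent=50, seed_emails_in_spam=4,
)

for name, value in summarize(example).items():
    print(f"{name}: {value:.1f}")
```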
When deliverability work alone won't fix your results
This is the nuance most agencies skip. Deliverability work is a prerequisite, not a complete solution. If your emails are landing in the inbox but generating a 0.3% positive reply rate, the deliverability is fine. The problem is copy, targeting, or offer.
A US promotional products brand we work with came to us after a previous agency had rebuilt their infrastructure twice in six months without improving results. The domains were clean. The bounce rate was under 2%. But the copy read like a press release, the targeting was pulling from a list built on SIC codes with no firmographic filtering, and the offer was vague. We rebuilt the ICP definition, rewrote the sequence from scratch, and the positive reply rate went from 0.4% to 2.1% in eight weeks. Infrastructure wasn't the issue.
Before you hire a cold email deliverability agency, honestly assess whether your actual constraint is technical or strategic. Technical means: domains are flagged, bounce rate is high, emails are going to spam. Strategic means: emails land but nobody replies. The two problems need different fixes, and conflating them is expensive.
How to structure the agency evaluation process
If you're evaluating agencies right now, here's the sequence I'd follow.
Define your constraint first. Is your current program failing on delivery (spam placement, high bounce rate) or on response (emails land, nobody replies)? Write a one-sentence answer before you talk to anyone.
Run a deliverability audit on your current setup. Tools like GlockApps or Mail-Tester can give you a baseline in about 30 minutes: GlockApps reports inbox placement against a seed list, while Mail-Tester scores a single message for spam risk. If you're below 85% inbox placement, you have a technical problem. If you're at 95% inbox but not generating replies, you have a copy or targeting problem.
Ask agencies for their average bounce rate and positive reply rate across active programs. Not their best-ever campaign: their average. If they won't share it, that tells you something.
Understand what they're actually building for you: secondary domains, number of mailboxes, warmup protocol, list verification approach, and copy iteration process. A proposal that doesn't specify these things is a proposal you can't evaluate.
Treat the first 60 days as a data-collection phase, not a results phase. Good deliverability work is invisible when it's working. The signal that it's working is a bounce rate below 2% and an inbox placement rate above 90%. Meetings booked before week eight are a bonus, not the baseline expectation.
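The first two steps in this sequence can be sketched as a small decision rule. The 85% and 95% inbox placement cut-offs and the 2% bounce ceiling come from the text. Using a positive reply rate below 1% as the marker for "nobody replies" and labeling everything in between "unclear" are assumptions made for this illustration, as is the function name.

```python
def diagnose_constraint(
    inbox_placement_pct: float,
    bounce_rate_pct: float,
    positive_reply_rate_pct: float,
) -> str:
    """Classify the likely constraint as 'technical', 'strategic', or 'unclear'."""
    # Delivery problems come first: nothing else can be judged until mail lands.
    if inbox_placement_pct < 85 or bounce_rate_pct > 2:
        return "technical"
    # Mail is landing, but nobody is replying (assumed cut-off: below 1%).
    if inbox_placement_pct >= 95 and positive_reply_rate_pct < 1:
        return "strategic"
    # Anything in between needs more data before you can call it.
    return "unclear"


print(diagnose_constraint(80, 1.0, 2.0))
print(diagnose_constraint(96, 1.0, 0.3))
print(diagnose_constraint(90, 1.0, 2.0))
```

Checking delivery before response is the point of the ordering: a low reply rate means little if a large share of sends never reached an inbox.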
If you're ready to talk through what a program looks like for your specific market and offer, book a discovery call and we'll audit what you have and tell you exactly where the constraint is.
Dedicated team structure versus account managers with 40 clients
One thing that rarely gets discussed in agency comparisons is the staffing model. Many agencies at the $4,000 to $6,000 per month price point are running a single account manager across 30 to 50 active clients. That's not a judgment; it's a business model. But it has a direct impact on how quickly deliverability problems get caught and fixed.
When a domain starts to degrade, speed of diagnosis matters. A domain sending at high volume with degraded inbox placement for three weeks will need 60 to 90 days to recover. An account manager with 40 clients checking weekly reports will miss a two-week degradation window regularly. A smaller team with dedicated attention to your program will catch it in days.
Ask any agency directly: how many active clients does the person managing my account handle? Below 15 is healthy. Above 30, you're getting reactive management, not proactive optimization. This is worth naming explicitly in any negotiation, and it's something most buyers don't ask about until they've already had a domain burned.
For more on how agencies differ in structure and what to look for across the full evaluation, the cold email agency guide on this site goes deeper on team models, pricing tiers, and how to compare proposals side by side. Worth reading before any shortlist conversation.
The one thing that predicts whether a deliverability agency is worth hiring
After running programs for European companies breaking into the US market, ecommerce brands pushing B2B buyers to webshops via targeted discount-code outreach, and growth equity firms building deal-flow pipelines, the clearest signal I've found is this: can the agency tell you what positive reply rate they're generating across their active book of business, broken down by industry vertical, right now?
Not their best campaign. Not a case study. Their current average, across all active programs, by vertical. If they can answer that with a real number, they're measuring what matters. If they deflect to open rates, vanity metrics, or "it depends on so many factors," they're not.
A competent cold email deliverability agency runs below 1.5% bounce rate, above 90% inbox placement on placement tests, and generates between 1.5% and 3.5% positive reply rate on properly targeted lists. Those are the numbers. Hold any agency to them, including us.
If your current program isn't hitting those benchmarks, or you're starting from scratch and want to build it right the first time, book a discovery call and we'll spend 30 minutes telling you exactly what needs to change.
