How to choose a B2B cold email agency: 9 things that actually separate good ones from bad ones
Looking for a B2B cold email agency? Here's a practical breakdown of what separates high-performing agencies from expensive disappointments, with real benchmarks.
The real question when evaluating a B2B cold email agency isn't whether they can send emails. It's whether they can generate positive replies from accounts that actually match your ICP, at a cost per meeting that makes commercial sense. Most agencies pitch deliverability and volume. The ones worth hiring pitch reply rate and pipeline fit.
We've run cold email programs for European companies breaking into the US market, ecommerce brands pushing B2B buyers through discount-code outbound, and growth-equity firms targeting specific fund cycles. Across those engagements, one thing stays constant: the gap between a mediocre cold email program and a productive one almost never comes down to sending volume. It comes down to targeting precision, message quality, and how quickly the team iterates on signal.
This page breaks down the 9 criteria we'd use if we were hiring an outside agency, the benchmarks worth demanding upfront, and the structural red flags that tell you to walk away before signing a contract.
1. They track reply rate, not open rate
Open rates have been broken since Apple Mail Privacy Protection launched in 2021. Apple prefetches tracking pixels, which fires an "open" event whether or not the email was ever read. Any agency presenting open rates as a primary KPI either doesn't know this or is hoping you don't. Either way, it's a bad sign.
The metric that matters is positive reply rate: the percentage of contacted accounts that respond with genuine buying interest. A realistic benchmark for a well-targeted B2B cold email program is 1.5% to 3.5% positive reply rate, depending on ICP fit, offer clarity, and how competitive the vertical is. Below 1% after 6 weeks of iteration, something structural is broken. Bounce rate is the other signal worth watching. Keep it below 2% or your sending infrastructure will degrade faster than the agency can fix it.
The mistake I see most often is founders letting agencies report "70% open rate" as proof of performance. That number is noise. Ask for positive reply rate per 1,000 contacts and meetings booked per month. If they can't produce those figures, that tells you everything.
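The figures worth asking for can be derived from a handful of raw campaign counts. Below is a minimal Python sketch, assuming the agency can export four counts per campaign. The `CampaignStats` field names and the sample numbers are hypothetical and purely illustrative, and bounces are counted per contacted account for simplicity.

```python
from dataclasses import dataclass


@dataclass
class CampaignStats:
    """Raw counts for one campaign (hypothetical field names)."""
    contacted: int         # unique accounts emailed
    bounced: int           # contacts whose email bounced
    positive_replies: int  # replies showing genuine buying interest
    meetings_booked: int


def rate(count: int, contacted: int) -> float:
    """Share of contacted accounts, as a fraction between 0 and 1."""
    if contacted <= 0:
        raise ValueError("contacted must be positive")
    return count / contacted


def per_thousand(count: int, contacted: int) -> float:
    """Normalize a raw count to 'per 1,000 contacts'."""
    return 1000 * rate(count, contacted)


# Illustrative numbers only, not data from a real program.
stats = CampaignStats(contacted=1200, bounced=18, positive_replies=30, meetings_booked=15)

print(f"Positive reply rate: {rate(stats.positive_replies, stats.contacted):.1%}")
print(f"Bounce rate: {rate(stats.bounced, stats.contacted):.1%}")
print(f"Meetings per 1,000 contacts: {per_thousand(stats.meetings_booked, stats.contacted):.1f}")
```

Open counts are left out of the sketch on purpose, since they say nothing reliable about whether an email was read.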
2. The agency has real deliverability infrastructure, not just a Smartlead account
Deliverability is boring until it kills your program. A cold email deliverability agency running things properly will have dedicated sending domains, aged inboxes (minimum 30 days of warmup before live sends), inbox placement tests run via tools like GlockApps or Lemwarm, and bounce rates tracked against the 2% threshold mentioned above.
What most agencies actually have is a shared Smartlead or Instantly account, a few domains registered last week, and a warmup sequence they turned on the same day they started your campaign. That combination produces one outcome reliably: your domains hit spam folders within 60 days, your reply rate collapses, and you're told to "give it more time."
A serious agency will own its sending infrastructure or have explicit policies for how many clients share a domain pool. Ask them. If they can't answer without hesitation, assume the worst.
We use dedicated domain setups per client, run inbox placement tests before any live sequence launches, and treat bounce rate as a weekly health signal rather than an afterthought. If bounce rate crosses 2%, we pause and fix it before continuing. That costs us speed occasionally. It saves programs from irreversible domain damage.
3. They can describe their data sourcing process in one paragraph
List quality is where most cold email programs die quietly. An agency that can't explain precisely where they source contacts, how they verify emails before sending, and what their rejection criteria are for a given ICP is going to waste 40% of your sends on bad data.
The benchmark I'd demand: less than 3% hard bounces on any new list segment before it hits live sequences. If they're using unverified scraped lists or bulk-exported databases without enrichment, you'll see bounces climb fast and domain reputation follow. Ask them: "How do you build a list for a company targeting mid-market US logistics firms with 50-500 employees?" A good answer takes 90 seconds and mentions specific tools or data sources. A vague answer about "proprietary databases" means they're buying ZoomInfo exports and calling it a day.
4. They understand that message volume and message quality pull in opposite directions
There's a persistent myth that more sending volume equals more pipeline. It can, but only up to the point where list quality degrades or message personalization drops. After that, volume is actively harmful: it brings higher bounce rates, more spam reports, and sequences that train recipient domains to filter your infrastructure.
A 500-contact sequence with a 3% positive reply rate generates 15 conversations. A 5,000-contact spray with a 0.2% reply rate generates 10 conversations and costs you two sending domains. The math isn't complicated, but it runs against the incentive structure of agencies billing on volume metrics.
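The arithmetic in that comparison is easy to reproduce for your own numbers. A small Python sketch, using the two scenarios from the paragraph above; the function names are illustrative:

```python
def expected_conversations(contacts: int, positive_reply_rate: float) -> float:
    """Expected positive replies for a list of a given size."""
    return contacts * positive_reply_rate


def contacts_per_conversation(positive_reply_rate: float) -> float:
    """How many contacts you burn, on average, to start one conversation."""
    if positive_reply_rate <= 0:
        raise ValueError("positive_reply_rate must be greater than zero")
    return 1 / positive_reply_rate


targeted = expected_conversations(500, 0.03)   # tight list, 3% positive replies
spray = expected_conversations(5_000, 0.002)   # broad list, 0.2% positive replies

print(f"Targeted: {targeted:.0f} conversations, "
      f"{contacts_per_conversation(0.03):.0f} contacts per conversation")
print(f"Spray: {spray:.0f} conversations, "
      f"{contacts_per_conversation(0.002):.0f} contacts per conversation")
```

The second figure is the one to watch: the spray burns roughly fifteen times as many contacts, and the sending reputation that goes with them, for each conversation it starts.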
The agencies worth working with set a ceiling on daily send volume per inbox (usually 30-50 emails per inbox per day) and hold that line even when clients push for acceleration. The tradeoff is that you can't scale to 10,000 contacts per week overnight. Building the infrastructure to do that safely takes 8-12 weeks of domain aging. Any agency promising fast ramp-up at high volume is cutting corners somewhere.
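To see why the per-inbox ceiling limits how fast a program can scale, it helps to work out how many warmed inboxes a given weekly volume implies. A rough Python sketch, assuming five sending days per week and one email per contact per week (both are simplifying assumptions, and follow-up steps would raise the count):

```python
import math


def inboxes_needed(
    contacts_per_week: int,
    emails_per_inbox_per_day: int,
    sending_days_per_week: int = 5,        # assumption: weekdays only
    emails_per_contact_per_week: int = 1,  # assumption: ignores follow-ups
) -> int:
    """Minimum number of inboxes required to stay under the daily cap."""
    if emails_per_inbox_per_day <= 0 or sending_days_per_week <= 0:
        raise ValueError("caps and sending days must be positive")
    weekly_emails = contacts_per_week * emails_per_contact_per_week
    weekly_capacity_per_inbox = emails_per_inbox_per_day * sending_days_per_week
    return math.ceil(weekly_emails / weekly_capacity_per_inbox)


# 10,000 contacts per week at the two ends of the 30-50 per-inbox range.
for cap in (50, 30):
    print(f"{cap} emails/inbox/day -> {inboxes_needed(10_000, cap)} inboxes")
```

Under these assumptions, 10,000 contacts per week needs somewhere between 40 and 67 aged inboxes, which is why that volume cannot be reached safely on a handful of new domains.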
5. They've worked in your specific market or motion, not just "B2B"
Cold email for a European SaaS breaking into the US market is structurally different from cold email for a US-based promotional products brand trying to push B2B buyers to an ecommerce checkout. The ICP research, the sequence structure, the offer framing, the follow-up cadence, even the sending timezone logic: all of it changes.
For a European print-on-demand marketplace where we run a US-targeted outbound program, the core challenge isn't finding potential buyers. It's communicating trust and fulfillment reliability to US procurement contacts who've never heard of a European vendor in that category. The messaging has to address that objection before it's raised. That's not something an agency figures out in week one unless they've done it before.
For ecommerce brands running B2B cold email, the motion is completely different. We've built sequences for a European apparel brand entering the US wholesale market where the email's only job is to get a B2B buyer to click through to a product page with a pre-loaded discount code. The conversion happens on-site, not in the reply. Most cold email agencies don't have any experience structuring that kind of program because they've only ever done lead-gen for SaaS. If that's your use case, see our breakdown of B2B ecommerce cold email to understand what the setup actually looks like.
6. Their pricing reflects a real deliverable, not a retainer with soft commitments
Most B2B cold email agencies charge between $3,000 and $8,000 per month on retainer. Some go higher for enterprise programs or dedicated SDR-plus-copy packages in the $10,000-$15,000 range. A few charge on a performance basis (per meeting booked), usually $300-$600 per qualified meeting, which aligns incentives better but often comes with tighter ICP restrictions.
The problem with pure retainers is that the agency gets paid whether the program works or not. That's not automatically disqualifying, but it means you need hard deliverables written into the agreement: contacts built per month, sequences launched, reply rate thresholds that trigger a strategy review. "We'll send emails and optimize" is not a deliverable. "We'll contact 1,200 accounts per month across two ICP segments, maintain bounce rate below 2%, and share reply-rate data weekly" is a deliverable.
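One way to compare a retainer against a pay-per-meeting quote is to convert both into an effective cost per meeting. A hedged Python sketch; the $5,000 retainer and $500 fee are hypothetical figures picked from inside the ranges quoted above, not prices from any specific agency:

```python
def cost_per_meeting(monthly_retainer: float, meetings_booked: int) -> float:
    """Effective cost per meeting under a flat retainer."""
    if meetings_booked <= 0:
        return float("inf")  # retainer paid, nothing booked
    return monthly_retainer / meetings_booked


def break_even_meetings(monthly_retainer: float, fee_per_meeting: float) -> float:
    """Monthly meetings at which a retainer costs the same as pay-per-meeting."""
    if fee_per_meeting <= 0:
        raise ValueError("fee_per_meeting must be positive")
    return monthly_retainer / fee_per_meeting


retainer = 5_000  # hypothetical monthly retainer
fee = 500         # hypothetical fee per qualified meeting

print(f"Break-even: {break_even_meetings(retainer, fee):.0f} meetings per month")
for meetings in (8, 16):
    print(f"{meetings} meetings -> ${cost_per_meeting(retainer, meetings):,.2f} per meeting")
```

Below the break-even count the retainer is the more expensive option per meeting, and above it the retainer is cheaper, which is one reason to tie a retainer to written deliverables.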
Vectify doesn't publish fixed pricing because program scope varies too much to quote a flat rate honestly. What I'd tell any founder evaluating us or anyone else: demand a scope document with specific monthly deliverables before you sign anything.
7. They give you a realistic timeline, not a 2-week guarantee
Cold email programs need time to work. Domain warming takes 3-4 weeks before you can send at meaningful volume. ICP refinement from early replies typically takes another 2-3 weeks. The first genuine pipeline meetings usually land in week 5 to week 8 of a new program, assuming the targeting and messaging are solid from the start.
Any agency guaranteeing meetings in 2 weeks is either starting you on pre-warmed infrastructure they've been building for other clients (possible, but worth asking about directly) or setting expectations they can't meet. The "10x your meetings overnight" pitch is a lead-gen hook, not a realistic program outcome.
Here's what actually happens when a program is set up properly. Weeks 1-3 are infrastructure and list build. Week 4 brings the first live sequences at conservative volume, 20-30 emails per inbox per day. Weeks 5-6 produce the first replies, most of which are nos or unsubscribes, but some of which are signal. In weeks 7-8 you start seeing positive replies from the right accounts. By weeks 10-12 you have enough data to know whether the program is working and what to change if it isn't.
If you want a sense of what the full outbound setup looks like before meetings start landing, our outbound lead generation agency page walks through the structural components in more detail.
8. The comparison that actually matters: agency types
Not all B2B cold email agencies run the same model. Here's a practical breakdown of the three most common types and who each one suits.
| Agency type | Best for | Watch out for | Typical cost |
|---|---|---|---|
| Full-service cold email retainer | Companies that want strategy, copy, data, and infrastructure managed end-to-end | Vague deliverables, no reply-rate SLAs | $4k-$10k/month |
| Performance/pay-per-meeting | Companies with a clear ICP and proven offer | Narrow ICP filters, volume caps, meeting quality varies | $300-$600/meeting |
| Done-with-you (setup + coaching) | In-house teams that need infrastructure built, then want to own it | Steep learning curve after handoff, no ongoing iteration | $5k-$15k one-time |
Most early-stage B2B companies are better served by a full-service retainer that runs the whole program, because they don't have an SDR team to own it internally. Companies that have already validated their cold email motion and want to scale volume are better candidates for a performance model. The done-with-you option makes sense specifically when you have a founder or ops person who's willing to own it long-term. Most aren't, which is why handoff programs have a high failure rate after month three.
If you're a European company targeting the US market, the agency you choose needs experience with cross-timezone sends, US compliance norms (CAN-SPAM, not GDPR), and the specific trust gap that comes with an unfamiliar European brand name in a US inbox. That's a different conversation than generic cold email. Our European cold email agency page covers the structural differences in detail.
At the halfway point of your evaluation, one question cuts through most of the noise: can this agency show me reply-rate data from a program that targets accounts similar to mine? If the answer is yes with specific numbers, keep talking. If it's a case study PDF with "increased pipeline by 300%" and no underlying metrics, treat it as marketing, not evidence.
Book a discovery call if you want to talk through what a cold email program for your specific market and motion would actually look like before committing to anything.
9. They have a clear position on what they won't do
Good agencies turn down work. They turn down clients with a vague ICP who want to start sending in a week. They turn down verticals where cold email structurally underperforms (consumer finance, regulated healthcare in some jurisdictions). They push back on clients who want to send 10,000 emails per week on two domains registered last Tuesday.
The mistake I see most often with founder-led cold email evaluation: treating willingness to start immediately as a green flag. It's often the opposite. An agency eager to sign you without asking hard questions about your ICP, your offer, and your sales cycle is one that optimizes for getting you under contract, not for making the program work.
We've turned down ecommerce brands that wanted to send cold email to scraped consumer lists and call it B2B outbound. We've pushed back on SaaS companies with a 12-month enterprise sales cycle who expected cold email to produce signed contracts in 90 days. Cold email is a top-of-funnel motion. It books first conversations. What happens in those conversations is on you.
Decision framework: when to hire a B2B cold email agency and when not to
Hire an agency if: you have a defined ICP (job title, company size, industry, geography), a clear value proposition you can articulate in two sentences, a sales process that can handle inbound replies from cold leads, and budget for at least a 3-month engagement. Below that time horizon, you're paying for setup without getting to the optimization phase where programs actually produce consistent pipeline.
Don't hire an agency if: you're still figuring out who your customer is, your product isn't deployed yet, or your sales team can't follow up with new leads within 24 hours. Cold email generates conversations. If your side of the funnel leaks, the agency's work is wasted and you'll blame the channel rather than the conversion problem.
For European companies specifically, there's a third constraint worth naming: if you don't have a US entity or at least a US-facing email domain and point-of-contact, US prospects are more likely to treat your cold email as spam even if the message is good. Setting up a US domain and a forwarding number costs under $200 and materially improves reply rates. Do it before the program launches, not after.
What the numbers should look like at 90 days
After 90 days of a properly run cold email program, here's a realistic benchmark set:
Bounce rate: below 2% on all active sequences
Positive reply rate: 1.5% to 3.5% depending on ICP tightness and offer clarity
Meetings booked per 1,000 contacts: 8 to 20, depending on vertical and deal size
Spam placement rate: below 5% on inbox placement tests
For context on what 8-20 meetings per 1,000 contacts looks like in practice: for an NYC growth-equity firm's outbound program, we run US-targeted sequences at roughly 1,200 contacts per month, which produces around 12-18 qualified conversations monthly after the first 6 weeks of iteration. That's a tight ICP with a clear seasonal trigger (fund cycle timing). A broader ICP in a more competitive vertical will probably land at the lower end of the 8-20 range.
If you're benchmarking against agencies quoting open rates as their primary performance metric, those numbers aren't comparable to the ones above. Open rates post-MPP are inflated and don't predict replies. Demand reply rate and bounce rate, or the comparison is meaningless.
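The 90-day benchmark set above can be turned into a simple pass/fail check against whatever an agency reports. A minimal Python sketch; the thresholds come from the list above (using the lower bound of each range as the floor), while the metric names and the sample report are hypothetical:

```python
# Thresholds from the 90-day benchmark list. "max" means the value must stay
# below the threshold, "min" means it must reach at least the threshold.
BENCHMARKS = {
    "bounce_rate": ("max", 0.02),
    "positive_reply_rate": ("min", 0.015),
    "meetings_per_1000": ("min", 8),
    "spam_placement_rate": ("max", 0.05),
}


def check_benchmarks(metrics: dict[str, float]) -> dict[str, bool]:
    """Return True/False per metric depending on whether it meets the benchmark."""
    results = {}
    for name, (kind, threshold) in BENCHMARKS.items():
        value = metrics[name]
        results[name] = value < threshold if kind == "max" else value >= threshold
    return results


# Hypothetical 90-day report, for illustration only.
report = {
    "bounce_rate": 0.012,
    "positive_reply_rate": 0.022,
    "meetings_per_1000": 12.5,
    "spam_placement_rate": 0.07,
}

for metric, ok in check_benchmarks(report).items():
    print(f"{metric}: {'ok' if ok else 'needs review'}")
```

In this made-up report everything passes except spam placement, which is the kind of early warning that tends to show up before reply rates start to fall.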
The one question worth asking every agency you evaluate
"What's the positive reply rate on your best-performing active program, and what ICP is it targeting?"
A good agency answers with a number and a sentence of context. A mediocre agency deflects to open rates, volume stats, or client logos. The answer to that one question tells you more than an hour of sales calls.
If you want to see how we'd approach a program for your specific ICP and market, book a discovery call and we'll tell you honestly whether cold email is the right motion for where you are right now, and if it is, what the first 90 days would actually look like in terms of contacts, sequences, and expected reply rate.
