How to Build AI Agents for GTM: The Stack That Replaced My 50-Person Sales Team

My old sales team had 50 people. My current team has 6.

The 6-person team books more meetings every month than the 50-person team did, at a fraction of the cost per meeting, and the business is still growing 10 to 20 percent month over month.

I get asked about this setup more than anything else. People want to know which agents I use, where I draw the line between agent and human, what the budget actually looks like, and what failed hard enough that I had to rip it out.

This post is the answer. The full stack I run in production every day for GTM, including the parts that did not work.

TL;DR

The reason my smaller team outperforms the old one is not headcount reduction for its own sake. It is that a large chunk of what 50 SDRs used to do is work an AI agent now handles reliably, which includes sourcing leads against an ideal customer profile, writing personalised cold email and LinkedIn copy, following up, routing replies, and reporting. I orchestrate that work on top of Salesforge, Mailforge, Warmforge, and Leadsforge, with an AI SDR called Agent Frank handling daily execution. The six humans on the team do what agents still cannot do well, which is live conversations, complex discovery, and closing. The maths works out to roughly 650 dollars a month in tooling to book around 20 meetings, versus the six-figure monthly payroll it used to take to get the same output.

What actually happened when I rebuilt my outbound around agents?

Back when I was a VP of Sales running a 50-person outbound team, the only way I knew how to scale pipeline was to hire more bodies. The problem was that the tools my reps were stuck on had per-seat pricing, and that model quietly pushed the whole industry toward bigger teams because every new seat made the vendor more money. Conversion rates barely moved while headcount kept growing.

The turning point was realising the stack I wanted did not exist. I wanted a lean team where every person was on the phone closing, and the research, drafting, sending, follow-up, and reporting was done by software. Large language models made the drafting and reasoning parts finally good enough to rely on. The rest was plumbing.

So I built Salesforge, and I use it to run my own go-to-market today. Six people, more output than 50, and the business is profitable while growing double digits every month.

Why did the old 50-person SDR model stop working?

Several things compounded at once. Every serious buyer has seen thousands of cold emails by now, so the generic template that worked in 2019 does not get a reply in 2026. Mail providers got strict about domain reputation and volume spikes, which means the volume tricks that used to work push you straight into spam today. LinkedIn started rate-limiting aggressive senders. The cost of hiring SDRs kept climbing while reply rates kept dropping.

The headcount model broke on unit economics long before it broke on anything else. A human SDR who costs 6,000 dollars a month fully loaded needs to book a lot of qualified meetings for that maths to work. When reply rates drop from 3 percent to half a percent, you need roughly six times the sends to get the same number of replies, so you either multiply headcount to match or take a different approach entirely. I took the second option.
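To make the unit-economics point concrete, here is a minimal sketch of how cost per meeting moves with reply rate. The 6,000 dollar cost and the two reply rates come from the paragraph above. The send volume and the share of replies that turn into meetings are placeholder assumptions for illustration, not figures from a real team.

```python
def sdr_cost_per_meeting(
    monthly_cost: float,
    sends_per_month: int,
    reply_rate: float,
    reply_to_meeting_rate: float,
) -> float:
    """Fully loaded SDR cost divided by the meetings that SDR books in a month."""
    meetings = sends_per_month * reply_rate * reply_to_meeting_rate
    if meetings <= 0:
        raise ValueError("no meetings booked; cost per meeting is undefined")
    return monthly_cost / meetings


# Assumed for illustration only: 2,400 sends a month, 1 in 4 replies books a meeting.
SENDS = 2_400
REPLY_TO_MEETING = 0.25

healthy = sdr_cost_per_meeting(6_000, SENDS, 0.03, REPLY_TO_MEETING)
degraded = sdr_cost_per_meeting(6_000, SENDS, 0.005, REPLY_TO_MEETING)

print(f"3.0% reply rate: ${healthy:,.0f} per meeting")
print(f"0.5% reply rate: ${degraded:,.0f} per meeting")
```

Whatever volume you plug in, cost per meeting scales inversely with reply rate, so headcount has to grow by the same factor just to hold meetings flat.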

What does a real GTM agent stack actually look like?

Before anyone reads another post about AI agents, here is the anatomy in a way that makes buying decisions easier. A GTM agent is not one thing. It is five components working together, and most vendors only ship one or two of them, which is why operators end up duct-taping software to fill the gaps. If you want the wider tooling map around this, I broke down the GTM tools the top 1 percent use separately.

1. Connectors

Your agent needs access to the tools your sales motion already runs on. Lead source, sender infrastructure, sequencer, CRM, inbox. Modern agents connect through either direct APIs or MCP, a protocol built for agents to call tools reliably. Salesforge ships its own MCP server at mcp.salesforge.ai/mcp, so a chat agent like Claude can source leads, write copy, set up mailboxes, and launch a sequence inside Salesforge without anyone touching the UI.

The step-by-step on that is in how to automate cold email with Claude Code and Salesforge MCP.
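For a feel of what an MCP client sends over the wire, here is a rough sketch of the JSON-RPC 2.0 messages behind a tool call. The `tools/list` and `tools/call` methods come from the MCP specification. The tool name `create_sequence` and its arguments are hypothetical, so list the server's real tools before relying on any name. The sketch only builds the messages and does not open a connection.

```python
import json
from itertools import count
from typing import Any, Optional

_request_ids = count(1)  # JSON-RPC needs a unique id per request


def mcp_request(method: str, params: Optional[dict[str, Any]] = None) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 request envelope as used by MCP."""
    message: dict[str, Any] = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
    }
    if params is not None:
        message["params"] = params
    return message


def tool_call(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Build a `tools/call` request for one named tool."""
    return mcp_request("tools/call", {"name": name, "arguments": arguments})


# Step 1: ask the server which tools it actually exposes.
print(json.dumps(mcp_request("tools/list"), indent=2))

# Step 2: call one of them. Tool name and arguments below are hypothetical.
print(json.dumps(
    tool_call("create_sequence", {"name": "Q3 fintech test", "steps": 3}),
    indent=2,
))
```

In practice an MCP-compatible assistant builds and sends these messages for you. Seeing the shape mostly helps when you are debugging why a tool call failed.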

2. Skills

Skills are reusable instructions the agent follows for specific tasks. A skill is a written playbook for something your best SDR does over and over. Lead sourcing. Icebreaker writing. ICP scoring. Reply classification. The more precisely you write the skill, the more predictable the output gets. Treat your agent like a new hire on day one. You would not expect a brand new SDR to write great cold email without a copy framework in hand, and the same logic applies here.

3. Memory and context

Context is the single biggest lever on output quality. The agent needs your offer, your ICP, your objections, your positioning, and the context of past conversations so it does not ask the same question five times. A serious stack has three memory layers. A static layer for offer and ICP, a dynamic layer that pulls live from the tools you have connected, and a conversational layer that remembers what you told it last week. When any of the three is missing, the agent starts repeating itself or hallucinating the obvious.
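The three layers can be sketched as one small structure that gets flattened into the text the agent reads before each task. This is an illustrative shape, not how any particular product stores memory. The class, field names, and sample values are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class AgentContext:
    """Three memory layers merged into one prompt preamble."""

    static: dict[str, str] = field(default_factory=dict)      # offer, ICP, objections
    dynamic: dict[str, str] = field(default_factory=dict)     # live data from connected tools
    conversational: list[str] = field(default_factory=list)   # what the operator said earlier

    def missing_layers(self) -> list[str]:
        """Names of empty layers, so gaps are visible before a run."""
        layers = {
            "static": self.static,
            "dynamic": self.dynamic,
            "conversational": self.conversational,
        }
        return [name for name, value in layers.items() if not value]

    def render(self) -> str:
        """Flatten all layers into plain text the model can read."""
        lines = ["STATIC CONTEXT"]
        lines += [f"{key}: {value}" for key, value in self.static.items()]
        lines.append("LIVE CONTEXT")
        lines += [f"{key}: {value}" for key, value in self.dynamic.items()]
        lines.append("EARLIER INSTRUCTIONS")
        lines += [f"- {note}" for note in self.conversational]
        return "\n".join(lines)


context = AgentContext(
    static={"offer": "outbound infrastructure", "icp": "B2B SaaS, 20 to 200 staff"},
    conversational=["Avoid mentioning pricing in the first touch."],
)
print(context.missing_layers())  # the dynamic layer has not been wired up yet
print(context.render())
```

Checking `missing_layers()` before a run is a cheap way to catch the failure mode described above, where one layer is absent and the agent starts repeating itself.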

4. Database

Every part of go-to-market runs on tables. Lead lists, enrichment fields, AI-generated variables, reply logs, campaign stats. If your agent cannot store and query structured data natively, every task becomes a rebuild. This is where most do-it-yourself setups in Claude Code hit a wall, because a shell-based agent does not have a persistent database by default.
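If you are on a do-it-yourself setup, the smallest workable fix is a local SQLite file the agent can read and write between runs. The sketch below uses Python's built-in `sqlite3` module. The table layout, column names, and sample rows are assumptions for illustration, not a schema from any product.

```python
import sqlite3

# ":memory:" keeps the example self-contained; use a file path to persist between runs.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE leads (
        email      TEXT PRIMARY KEY,
        company    TEXT NOT NULL,
        icp_score  INTEGER NOT NULL,          -- enrichment / scoring output
        icebreaker TEXT,                      -- AI-generated variable
        replied    INTEGER NOT NULL DEFAULT 0 -- reply log flag
    )
    """
)

sample_leads = [
    ("ana@example.com", "Acme", 90, "Saw your hiring post for RevOps."),
    ("ben@example.com", "Globex", 82, "Congrats on the Series A."),
    ("cy@example.com", "Initech", 40, None),
]
conn.executemany(
    "INSERT INTO leads (email, company, icp_score, icebreaker) VALUES (?, ?, ?, ?)",
    sample_leads,
)
conn.execute("UPDATE leads SET replied = 1 WHERE email = ?", ("ana@example.com",))
conn.commit()


def unreplied_high_fit(connection: sqlite3.Connection, min_score: int = 70) -> list[str]:
    """Emails of strong-fit leads that have not replied yet, best fit first."""
    cursor = connection.execute(
        "SELECT email FROM leads WHERE replied = 0 AND icp_score >= ? "
        "ORDER BY icp_score DESC",
        (min_score,),
    )
    return [row[0] for row in cursor]


print(unreplied_high_fit(conn))
```

With a file-backed database, the follow-up task next week queries the same table instead of rebuilding the lead list from scratch.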

5. Code execution

Sounds technical, but the point is simple. An agent that can write and run its own scripts moves past the hard ceiling of whatever tools you connected. It can build a custom dashboard. It can pull conversion rates from three different sources into one view. It can run a background worker on a schedule. This is the capability that quietly makes the difference between a chat interface and an agent that actually replaces a workflow.

| Component | What it is | Why it matters |
| --- | --- | --- |
| Connectors | Access to your sales tools via API or MCP | Without this the agent is a chatbot |
| Skills | Reusable task playbooks in markdown | Turns one-off chats into repeatable jobs |
| Memory | Static + dynamic + conversational context | Output quality tracks context quality |
| Database | Native table storage for leads and results | GTM work is tabular, this is non-negotiable |
| Code execution | Ability to run its own scripts | Covers anything your connectors do not |

Which tasks should go autonomous first, and which should stay with a human?

This is the question I get asked most, and there is a simple mental model for it. Plot every task on two axes. The first axis is how much human input the task needs while the agent is running. The second is how deterministic the output is. You get four quadrants, and the answer for which to automate first is different in each one.

                                                                                                                                          
| Quadrant | Example task | What to do |
| --- | --- | --- |
| High determinism, low human input | Source 50 ICP accounts, enrich, push to CRM | Automate first. Fastest payback. |
| Low determinism, low human input | Refine ICP based on last 50 closed customers | Iterate in co-pilot until stable, then automate. |
| High determinism, high human input | Build a targeted list with unusual filter logic | Keep in co-pilot. The human steering is the value. |
| Low determinism, high human input | Ideate 10 outbound plays for a new segment | Keep in co-pilot. This is creative work. |

The rule I follow is simple. Anything I would describe to a junior SDR in three bullet points on a Monday morning is a strong candidate for a scheduled autonomous agent. Anything I would describe to my head of sales over a 30-minute call is not. Keep that one in co-pilot.
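The quadrant model reduces to a two-question lookup, sketched here so you can run your own task list through it. The function name and the boolean framing are illustrative. Real tasks sit on a spectrum, so treat the output as a starting point.

```python
def rollout_mode(high_determinism: bool, high_human_input: bool) -> str:
    """Map a task's position on the two axes to a rollout recommendation."""
    if high_human_input:
        # Both high-human-input quadrants stay with a person steering.
        return "keep in co-pilot"
    if high_determinism:
        return "automate first"
    return "iterate in co-pilot, then automate"


# Example tasks: (high_determinism, high_human_input)
tasks = {
    "Source 50 ICP accounts, enrich, push to CRM": (True, False),
    "Refine ICP based on last 50 closed customers": (False, False),
    "Build a targeted list with unusual filter logic": (True, True),
    "Ideate 10 outbound plays for a new segment": (False, True),
}

for task, (deterministic, human_heavy) in tasks.items():
    print(f"{task}: {rollout_mode(deterministic, human_heavy)}")
```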

What does my own GTM stack look like day to day?

Here is the actual shape of my stack, because real setups are more useful than theoretical ones.

For lead data, I run Leadsforge as the primary source. It covers more than 500 million contacts, waterfall enrichment across providers, and ICP-based search. I used to pay for Sales Navigator on top of this and stopped after testing. The underlying data across most providers is the same data, and paying twice never made sense.

For sender infrastructure, I run a mix of Mailforge for shared IP volume when I am scaling domain count, and Infraforge when a specific campaign needs dedicated IPs and more reputation control. Shared IPs are cheaper and fine for most volumes. Dedicated IPs pay off once you are sending meaningful volume from a smaller number of well-warmed domains.

For warmup and inbox placement, Warmforge runs in the background. I do not think about it unless a heat score drops below 85. The 14-day warmup before first send is not optional, because sending on a cold mailbox is the fastest way to torch a domain reputation you just paid for.
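A minimal sketch of that rule as a pre-send check. The function and its inputs are hypothetical, not a Warmforge API, and treating 85 as a hard cutoff is an assumption layered on the rule of thumb above.

```python
from datetime import date

WARMUP_DAYS = 14       # minimum warmup before the first send
MIN_HEAT_SCORE = 85    # assumed cutoff, taken from the rule of thumb above


def ready_to_send(warmup_started: date, heat_score: float, today: date) -> bool:
    """True only when the mailbox has warmed long enough and scores well enough."""
    days_warmed = (today - warmup_started).days
    return days_warmed >= WARMUP_DAYS and heat_score >= MIN_HEAT_SCORE


start = date(2026, 1, 1)
print(ready_to_send(start, heat_score=92, today=date(2026, 1, 10)))  # still warming
print(ready_to_send(start, heat_score=92, today=date(2026, 1, 15)))  # warmed and healthy
print(ready_to_send(start, heat_score=80, today=date(2026, 1, 20)))  # warmed, score too low
```

Running a gate like this before a scheduled campaign pauses the mailbox automatically instead of relying on someone noticing a bad score.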

For sequencing and AI SDR execution, everything lands in Salesforge. Agent Frank runs on auto-pilot for a big chunk of outbound, and my humans use co-pilot mode when they want to review drafts before send. Primebox handles the unified inbox where replies from email and LinkedIn both show up. That unification matters more than it sounds, because the moment you run multi-channel outreach you need one single surface to reply from.

The six humans on top of that stack handle live calls, objection handling, product demos, negotiation, and closing. That is the work no agent does well yet, which is also why every serious operator I know is still hiring account executives. For a longer read on where the SDR role is shifting, I broke it down in AI SDR vs human SDR.

How much does this actually cost to book 20 meetings a month?

Real numbers on this, because the maths is where most people get tripped up. The industry benchmark for cost per booked B2B meeting sits between 200 and 1,000 dollars, depending on market and deal size. Enterprise motions sit at the top of that range. Commodity SMB motions sit at the bottom.

To book 20 meetings a month in a typical mid-market motion, you are either paying for people or paying for infrastructure plus a small team. Here is the rough split of what I see working in 2026.

                                                                                                                
| Setup | Monthly spend | What you get |
| --- | --- | --- |
| 2 human SDRs (no agent stack) | $10,000 to $14,000 | Two people, manual sourcing and sending, inconsistent output |
| Agent stack with 1 part-time operator | $400 to $700 | Full outbound infra, AI SDR, unified inbox, automated follow-up |
| Agent stack plus 2 closers (my model) | $650 tooling + AE salaries | Outbound fully run by agents, humans on demos and close |

A benchmark to keep in your head is that an outbound motion in 2026 needs somewhere between 200 and 8,000 touches across email and LinkedIn to book one meeting. Where you sit in that range depends on market saturation, deal size, and how good your messaging is. That spread is also why AI SDRs that can run those touches at scale keep winning the unit-economics argument.
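Here is the same arithmetic as a short script, using only the figures quoted in this section: 20 meetings a month, the monthly spend ranges, and the 200 to 8,000 touches benchmark. The helper names are illustrative. People costs for the part-time operator and the account executives are left out because the section does not put a number on them.

```python
MEETINGS_PER_MONTH = 20


def spend_per_meeting(monthly_spend: float, meetings: int = MEETINGS_PER_MONTH) -> float:
    """Monthly spend divided by meetings booked."""
    return monthly_spend / meetings


def monthly_touches(touches_per_meeting: int, meetings: int = MEETINGS_PER_MONTH) -> int:
    """Total email and LinkedIn touches needed to hit the meeting target."""
    return touches_per_meeting * meetings


# (low, high) monthly spend in dollars, as quoted in this section.
setups = {
    "2 human SDRs (no agent stack)": (10_000, 14_000),
    "Agent stack with 1 part-time operator (tooling only)": (400, 700),
}

for name, (low, high) in setups.items():
    print(f"{name}: ${spend_per_meeting(low):,.0f} to ${spend_per_meeting(high):,.0f} per meeting")

print(f"Touches per month: {monthly_touches(200):,} to {monthly_touches(8_000):,}")
```

The touch range is the part worth noticing. At the saturated end of the benchmark, the monthly volume is far more than a two-person team could send by hand.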

Where do human SDRs still fit in 2026?

This is where most AI SDR content gets it wrong. The role is not disappearing. It is shifting. A lot of the classic SDR tasks have moved to agents, which is uncomfortable news for anyone whose job was built on sending 120 manual emails a day. But the second-order effect of all this outbound automation is more noise in the inbox, which makes real human outreach more valuable in specific pockets.

The places humans are still winning are warm-lead cold calls, executive-level multi-threaded outreach, and anything that needs judgment about where a conversation should go next. A cold call to a warm lead right after they opened a personalised email converts better than either channel on its own. That is a very human move and no agent handles it well yet. For a live example of how agents and humans split the work, I wrote up four outbound workflows I run with Claude Code that save hours of manual work every week, and my team picks up the last mile on the phone or LinkedIn.

Can you put LinkedIn on full autopilot with a GTM agent?

Technically yes, and I would push back a little on the idea. Part of LinkedIn can be automated safely. Commenting on posts in your niche, scheduling content, liking relevant posts, sending first-touch connection requests at a measured pace. All of that runs fine on a scheduled agent with the right guardrails. What I do not automate is actual DMs after someone has replied. Those conversations are where trust is built, and I want my team on them.

For the automatable part, guardrails matter. One comment per run. Never comment on your own posts. Only reply in your own tone of voice. Respect platform rate limits. I broke the commenting strategy down in our blog on LinkedIn commenting plays that generate pipeline.
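Those guardrails can be enforced in code before the agent drafts anything. The sketch below picks at most one post per run, skips your own posts, and stops at a daily cap. The `Post` shape, the function, and the cap of 5 are hypothetical placeholders. The cap is not a LinkedIn limit, so set it from the limits you actually observe.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Post:
    post_id: str
    author: str


def pick_comment_target(
    posts: list[Post],
    own_handle: str,
    already_commented: set[str],
    comments_today: int,
    daily_cap: int = 5,  # placeholder value, not a platform limit
) -> Optional[Post]:
    """Return at most one post to comment on this run, or None."""
    if comments_today >= daily_cap:
        return None  # respect the rate limit before anything else
    for post in posts:
        if post.author == own_handle:
            continue  # never comment on your own posts
        if post.post_id in already_commented:
            continue  # never comment on the same post twice
        return post  # one comment per run, so stop at the first match
    return None


feed = [Post("p1", "me"), Post("p2", "founder_a"), Post("p3", "founder_b")]
print(pick_comment_target(feed, own_handle="me", already_commented={"p2"}, comments_today=1))
```

Tone of voice is the one guardrail code cannot check, which is why it belongs in the skill the agent writes from.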

The 5-step playbook to deploy your first GTM agent this week

If you want to act on this, here is the sequence I would follow. It works whether you are a founder doing outbound solo or a sales leader rebuilding around agents.

Step 1. Write down your ICP and offer in one page

Agent output quality tracks context quality. If your ICP lives in three different Notion pages and no two people agree on what it is, fix that first. One page. Titles, industries, company size, pain points, objections, differentiation, proof. Everything downstream depends on this document.

Step 2. Pick your connectors and wire them up

For most teams this means a lead source, a sequencer, a warmup tool, and a CRM. The Forge stack is designed to connect out of the box. Leadsforge for data, Mailforge or Infraforge for infrastructure, Warmforge for inbox placement, Salesforge for sending and agent execution.

Step 3. Spend your first week in co-pilot mode

Do not flip to autonomous on day one. Run Salesforge in co-pilot and review every draft. This is where you catch the edge cases in your product positioning that the agent got slightly wrong, and where you tighten the context. By week's end, drafts should feel like they came from your best rep.

Step 4. Move the high-determinism tasks to autonomous

Once drafts feel consistent, move the obvious low-risk tasks to autonomous. First-touch outreach on a tight ICP is the usual first one.

Step 5. Keep humans on reply and close

The minute a reply lands, a human should pick it up. Primebox makes this manageable because replies from email and LinkedIn sit in the same inbox, and your team works from one surface. Do not automate reply handling in the first quarter. That trust loop is where deals live or die.

Ready to run your outbound on agents instead of headcount?

If you want to see what this looks like for your own motion, start a free trial of Salesforge and connect your first mailbox. It takes about ten minutes. For the version where the AI SDR takes over daily execution so you spend your time on closing, hire Agent Frank and I will walk you through the setup.

FAQ

What is a GTM AI agent?

A GTM AI agent is a software system that takes over go-to-market tasks that used to need a human. That includes sourcing leads against an ICP, writing and sending personalised outreach, following up, routing replies, and reporting. A proper GTM agent is not a chatbot. It has connectors to your sales tools, reusable skills, persistent memory, a database, and the ability to run its own code.

How is a GTM agent different from Claude Code or a generic LLM?

Claude Code is a fantastic general-purpose coding agent and I use it every day. But running outbound on Claude Code alone means you rebuild a lead database, a memory system, a warmup engine, and compliant sender infrastructure from scratch. That is a lot of custom code to maintain. A GTM agent orchestrating a product like Salesforge gives you those pieces out of the box.

Will an AI SDR replace my human SDRs?

Not entirely, and anyone saying otherwise is selling you something. An AI SDR reliably replaces the repetitive parts of the job: sourcing, drafting, sending, follow-up, and reply classification. What stays with humans is cold calls to warm leads, executive-level multi-threaded outreach, and the judgment calls about where a conversation goes next. A team that used to need 50 SDRs can now run with 6 closers and an agent stack.

How much does a GTM agent stack cost to run?

For a mid-market motion, a full agent stack runs between 400 and 700 dollars a month in tooling. That covers lead data through Leadsforge, sender infrastructure through Mailforge or Infraforge, warmup through Warmforge, and Salesforge for sequencing and execution. Compared to the 10,000 dollars plus a month it takes to run two human SDRs, the maths is straightforward.

How long does it take to set up a GTM agent?

The infrastructure side is fast. Mailforge gets sender infrastructure ready in about five minutes. Warmforge kicks in automatically for a 14-day warmup. Agent Frank goes through a structured onboarding where you upload your knowledge base, set your ICP, and choose auto-pilot or co-pilot. From start to first live campaign, plan on two weeks. Skipping the warmup is the fastest way to torch your domain.

Can I use Claude or another AI assistant to control Salesforge directly?

Yes. Salesforge ships an MCP server at mcp.salesforge.ai/mcp, which means any MCP-compatible AI assistant can describe a campaign in natural language and have the steps, copy, mailboxes, and schedule populate inside Salesforge without anyone opening the product UI. That is the setup I use when I want to spin up a new experiment quickly.

Is cold calling with AI voice agents legal?

Outbound cold calling with autonomous voice agents is restricted in both Europe and the United States unless the recipient has explicitly consented. Inbound, where someone has opted in and the terms of service cover automated voice handling, is a different story and plenty of teams run it with tools like ElevenLabs. The safe default for outbound remains humans on the phone, agents on email and LinkedIn.