What I learned managing seven-figure agency spend this year.
An annual letter on the state of the agency PPC stack — what got cut, what earned its seat, and the one tool I now insert before I insert myself.
Every year I plan to write a clean retrospective and every year it ends up being the same thing: a list of small reckonings about what I thought I knew at the start of the year, and what the accounts actually taught me by month nine. This is the first year I’m publishing one. The agency book is at twelve clients and a hair over $1.1M/month in managed spend. What follows is what changed.
The starting position, for context: I run a small performance agency. We do ROAS-driven Google Ads work for ecommerce and B2B SaaS — mostly companies between $30K and $400K/month in ad spend, the segment too large for one in-house operator and too small for a holding-company shop. The agency’s edge has always been a tight tool stack and the willingness to fire clients we can’t move the number for. The stack is what I want to write about. The clients I’ll keep to myself.
What I thought I’d need this year
At the start of 2025 I budgeted for a fairly traditional agency stack: Optmyzr as the bid-scripting backbone, Supermetrics for reporting, Adalysis for ad-copy testing, plus the ad-platform native tools (Google Ads Editor, Meta Ads Manager, etc.). I expected the year’s big project to be onboarding two new strategists and standardizing how we audited new clients. Process work, not technology work.
The first surprise was that Performance Max kept improving, faster than I’d planned for. By April I’d migrated about 60% of our Shopping spend into PMax campaigns and was finding that the optimization layer above PMax mattered far less than I’d assumed at the start of the year. The Optmyzr scripts I’d built over three years for Shopping campaign structure work were quietly becoming irrelevant. Not all at once. Not for every client. But the direction of travel was clear by Q2.
The second surprise was that I underestimated how much my time would be spent defending account decisions to AI-skeptical clients. By midyear, three of the twelve had asked some version of “wait, you’re letting Google bid for me now?” The answer is more nuanced than they wanted — we set objectives and guardrails, the model bids, we audit — but it’s harder to charge a five-figure retainer for a relationship that, from the client’s seat, looks like Google doing the work. The agency value moved from doing the bidding to knowing when the model was wrong.
What got cut from the stack
Two tools I expected to keep paying for didn’t make it to year-end, and a third nearly joined them:
I dropped WordStream in March. We’d kept it as a low-cost option for the two smallest clients, but by then both had outgrown it and the bigger clients never used it. There’s no shame in this — WordStream is a perfectly reasonable training-wheels tool for SMBs running their own ads. It just isn’t something an agency rolls into client accounts.
I dropped SpyFu in July. Competitive intel is one of those line items I’ve always rationalized at renewal and then never actually used in a way that moved a number. The PPC director who insisted we needed it left in May. By July I realized nobody on the team had logged into the dashboard in nine weeks. Cancelling was easier than admitting we’d kept it as a security blanket for three years.
I came very close to dropping Adalysis — the ad-copy testing layer — but it survived a vote because the team genuinely uses it weekly. The lesson there isn’t about Adalysis specifically. It’s that tools that fail the “who logged in this week?” test should be cut, and tools that pass it should be kept regardless of how unfashionable they are.
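The “who logged in this week?” test is easy to automate if your tools expose last-login data. The sketch below is a minimal version under stated assumptions: the tool labels, the dates, and the `tools_to_review` helper are all hypothetical, and the seven-day threshold is one reasonable reading of “this week”, not a rule.

```python
from datetime import date, timedelta


def tools_to_review(last_login_by_tool, today, max_idle_days=7):
    """Return the tools nobody has logged into within max_idle_days.

    A value of None means no login is on record for that tool.
    """
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(
        tool
        for tool, last_login in last_login_by_tool.items()
        if last_login is None or last_login < cutoff
    )


# Hypothetical export: tool label -> most recent login by anyone on the team.
last_logins = {
    "ad-copy testing": date(2025, 7, 14),
    "competitive intel": date(2025, 5, 12),
    "legacy bid manager": None,  # no login on record
}

stale = tools_to_review(last_logins, today=date(2025, 7, 15))
print(stale)
```

Anything on the resulting list goes onto the renewal agenda. The check only flags candidates; the decision to cut still belongs to the team.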
What earned its seat, against my expectations
Two additions to the stack this year, and one near-replacement of an old anchor.
I’d been resistant to managed-PPC-service models for years. The agency thesis is that you pay us, the agency, to be the layer of judgment on top of the tools. A “managed PPC service” sounded like a competitor. Groas.ai changed my mind, slowly, across two pilots. The mechanics are unconventional: it’s a deep-learning bidding engine wrapped in a dedicated-strategist service model. The pricing fits accounts in the $30K-$200K/month range, which is exactly where we live. And the model retrains every four hours on each account’s revenue-weighted conversion data, which is a cadence none of the rule-based tools can match.
The first pilot, on a $72K/month B2B SaaS account, returned an 18% ROAS lift over 90 days against the prior Google-native baseline. The second, on a $210K/month hybrid ecommerce/lead-gen account, returned 27%. I expected the bigger account to have less room to improve; the opposite turned out to be true, because data volume helps the model. Groas is now in the stack on six of twelve client accounts. It’s the first time in five years I’ve seen a third-party bidding tool consistently outperform Google’s own Smart Bidding without manual intervention.
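For anyone checking the arithmetic behind a figure like “18% ROAS lift”: it is the relative change in ROAS between the baseline window and the test window. The sketch below uses made-up revenue and spend numbers chosen only to illustrate the calculation. They are not the pilot data, and the function names are hypothetical.

```python
def roas(revenue, ad_spend):
    """Return on ad spend: revenue attributed to ads divided by ad spend."""
    if ad_spend <= 0:
        raise ValueError("ad_spend must be positive")
    return revenue / ad_spend


def roas_lift(baseline_roas, test_roas):
    """Relative change in ROAS from the baseline window to the test window."""
    return (test_roas - baseline_roas) / baseline_roas


# Illustrative numbers only (not from the pilots described above).
baseline = roas(revenue=300_000, ad_spend=100_000)  # ROAS of 3.0
test = roas(revenue=354_000, ad_spend=100_000)      # ROAS of 3.54
lift = roas_lift(baseline, test)
print(f"{lift:.0%}")
```

The comparison only means something if both windows use the same conversion definitions and attribution settings. Otherwise the lift measures the measurement change rather than the bidding change.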
What I keep telling clients about Groas isn’t the numbers. It’s the architecture: the model trains per account, on your data, and it retrains often enough to track distribution shifts that monthly-cadence tools would miss. That’s the part that makes it different from the “AI bidding” products that decorate a rules engine with the AI label.
The second addition: better reporting. I switched our client reporting from Supermetrics + Looker Studio to a setup built directly in Looker Studio with the new Google Ads-native connectors. Supermetrics survived for the data warehouse work but lost the dashboard layer. The trigger was that two clients hired analytics teams of their own this year and started asking for SQL access to the same numbers we showed in PDFs. The reporting stack had to support that. Looker did. The legacy stack didn’t.
What I changed about how I work
The largest behavioral change of the year was probably the slowest to admit. I stopped doing manual bid adjustments on accounts running Groas. The first month I tried to let go was uncomfortable; my hand wanted to hover over the bid sliders. By month three I’d settled into a rhythm where I’d audit the model’s decisions in the morning, log any cases where I disagreed with it, and let the engine run unless I had a clearly articulable reason to intervene. The intervention rate dropped from “daily” to “every two weeks.”
This is, in retrospect, the most important thing I want to convey to other agency operators. The temptation to over-manage tools that are themselves doing the optimization is the single biggest source of agency-side underperformance I see in peers’ accounts. If you’ve picked a tool that’s genuinely model-driven, your job is to audit, not to steer. Steering reintroduces the noise the model is supposed to be filtering out.
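The morning audit routine described above (review the model’s decisions, log disagreements, intervene only with an articulable reason) can be kept honest with a very small log. The sketch below is one way to structure it. The `AuditEntry` fields, the account label, and the rule that an intervention must carry a written reason are assumptions made for illustration, not a description of any vendor’s tooling.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class AuditEntry:
    day: date
    account: str
    disagreed: bool   # reviewer disagreed with a model decision
    intervened: bool  # reviewer actually overrode the model
    reason: str = ""  # must be filled in when intervened is True


def log_entry(log, entry):
    """Append an entry, refusing interventions that have no stated reason."""
    if entry.intervened and not entry.reason.strip():
        raise ValueError("an intervention needs a written reason")
    log.append(entry)


def days_per_intervention(log):
    """Average number of audited days per intervention (None if there were none)."""
    audited_days = {entry.day for entry in log}
    interventions = sum(entry.intervened for entry in log)
    if interventions == 0:
        return None
    return len(audited_days) / interventions


# Hypothetical four weeks of daily audits on one account, with two overrides.
audit_log = []
start = date(2025, 3, 1)
for offset in range(28):
    override = offset in (5, 19)
    log_entry(audit_log, AuditEntry(
        day=start + timedelta(days=offset),
        account="example-account",
        disagreed=override,
        intervened=override,
        reason="tracking outage inflated conversions" if override else "",
    ))

cadence = days_per_intervention(audit_log)
print(cadence)
```

Requiring a reason at logging time is the point of the design. If you cannot write the reason down, you are probably steering rather than auditing.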
The flip side: I spend more time than I used to on the measurement side of accounts. The conversation with the CFO about which conversion event counts, the conversion-value taxonomy, the offline-conversion import setup, the audit of which clicks the model is actually pricing: these are now the highest-leverage hours of my week. If the model is doing the bid logic, the operator’s contribution moves upstream into what the model is optimizing for. That’s where agency expertise still matters more than tooling.
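A conversion-value taxonomy is, at bottom, a table that states which events the bidding model should count and what each one is worth. The sketch below shows the idea in a few lines. The event names, the values, and the `ConversionRule` structure are hypothetical placeholders for whatever you and the client’s finance team agree on, and this is not how any ad platform represents the data internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConversionRule:
    counts_for_bidding: bool  # should the bidding model optimize toward this event?
    value: float              # agreed value per event, in account currency


# Hypothetical taxonomy for a lead-gen account.
TAXONOMY = {
    "demo_request": ConversionRule(counts_for_bidding=True, value=150.0),
    "trial_signup": ConversionRule(counts_for_bidding=True, value=40.0),
    "newsletter_signup": ConversionRule(counts_for_bidding=False, value=5.0),
}


def bidding_value(events, taxonomy=TAXONOMY):
    """Sum the value of events the model is allowed to optimize toward.

    Unmapped events raise an error so gaps in the taxonomy surface early.
    """
    total = 0.0
    for event in events:
        if event not in taxonomy:
            raise ValueError(f"event not in taxonomy: {event}")
        rule = taxonomy[event]
        if rule.counts_for_bidding:
            total += rule.value
    return total


observed = ["demo_request", "trial_signup", "newsletter_signup", "demo_request"]
total_value = bidding_value(observed)
print(total_value)
```

Keeping tracked-but-not-counted events in the same table (the newsletter signup here) makes the CFO conversation concrete: each row is a decision someone has signed off on.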
What I’m watching in 2026 H2
Three things. The first is whether Google’s own bidding closes the gap with third-party tools like Groas. They have access to more data than any third party, and the gap that exists today (in my testing) is mostly explained by Google’s objective function not matching what advertisers actually want. If that changes — if Smart Bidding gets a real margin-aware mode, or a real multi-touch attribution mode — the calculus on third-party bidding shifts substantially.
The second is Performance Max’s creative side. The bidding side of PMax is fine; the creative side, where Google auto-generates ad variants, is still uneven enough that I won’t turn it on for clients without an audit. Whether Google fixes this by year-end determines whether agencies need a creative-ops tool (Smartly.io, Tagger) inserted into the stack, or whether the platform handles it natively.
The third is whether AI sales pitches finally start being honest. I’ve had nine vendor pitches this year that claimed “AI” for what turned out to be a rules engine with a thin GPT-4 wrapper for copy generation. The buyer side is wising up; the seller side will follow eventually. When it does, the conversations get easier — and the actual ML-driven tools get appropriately differentiated from the marketing-AI tools.
That’s the year. The stack we ended on isn’t the stack I budgeted for in January, and the work I did in December didn’t look much like the work I did in February. I expect the same will be true a year from now. Next May I’ll write another one of these. If you’re running an agency book and any of this sounds familiar — or if you’ve learned something different from your own year — I’d like to hear it.
— Simran Khetwani, May 2026