I recently started advising an up-and-coming startup. In one of our first sessions, I sat in on their weekly growth meeting.
They kicked it off by reviewing the experiments pipeline. At the 5-minute mark, they switched gears and took a look at key metrics.
Now, 10 minutes into the meeting, the head of growth headed off a couple of potential rabbit holes and announced it was time to talk learnings.
So far so good 👏. They're already demonstrating more process and discipline than most.
Their paid marketing lead kicked things off. She'd just wrapped up a batch of ad-creative tests on Meta and was clearly eager to share what she'd learned. It went something like this:
"Our competitors predominantly run ads featuring male models. And the market is getting tired of it. People are starting to feel like the other brands don't 'get them.' So, we hypothesized that ad creatives featuring female models would allow us to not only stand out, but also build trust with our market."
Love that they're taking a hypothesis-driven approach. But I think I see where this is headed…
She then walked us through the data. By every traditional success metric, the new creatives outperformed the control.
"Great result. But what can we learn from this test?" asked the head of growth.
"We've learned that creative featuring female models performs best with our audience. Our hypothesis was right. Our market is tired of the incumbent brands. This is how we differentiate. This feels like a huge learning that should be applied across all of our marketing surfaces."
The head of growth nodded in agreement. A brief discussion took place about next steps for applying those learnings more broadly. And then they moved on.
Uh-oh. 😬 Think we might've found our problem.
So what went wrong here? They mistook "what happened" for "why it happened" – or what I call the "what vs. why" trap.
Here's what they learned: A couple new ad creatives featuring female models produced a higher ROAS than their champion variations that don't. That's it.
You know what they didn't learn? Why.
Was it because the model was a woman? Maybe. Or maybe it was Meta's algorithm. Could there have been some kind of cultural dynamic at play? Certainly possible. Right alongside a million other possibilities.
The bottom line is that an ads test, like any other quantitative testing method, can only tell us what happened. It provides clues as to the why (which is super valuable). But that's as far as it goes.
Why This Matters
When I introduce this concept to teams for the first time, it's not uncommon to get pushback.
"OK, I get that you're technically right. Maybe we can't prove the why with 100% confidence, but this is close enough, right?"
Nope.
"Fine. So we just need to do a better job with our experiment design. If we can isolate a single variable and prove it's responsible for the observed effect, then we'll have our why, right?"
Nope again.
"Wtf are you talking about, bro? We only changed 1 variable. It's the only possible explanation for the result."
Maybe. But it still doesn't tell us why.
At first, this can feel a bit pedantic. It's not.
For a later-stage company, years of rich, compounding learnings can serve as a moat. All else being equal, a team that knows more about its customers and market than anyone else will win.
For early-stage companies, it's an existential matter. At the end of the day, a startup is just a collection of unproven hypotheses. Product-market fit being the most fundamental. "We believe this audience has this problem and will pay for this solution."
Learnings are how you validate or invalidate those beliefs. If your learning system is broken, you can't course-correct. You drift further from reality with every decision.
Getting Closer To The "Why"
At the highest level, experimentation methods can be put into one of two categories:
- Quantitative methods
- Qualitative methods
So far, we've been talking about numero uno. Think A/B tests, funnel data, cohort analyses, heat maps, etc. They all have one thing in common. They do a great job of telling us what happened or where a constraint may exist, but they can't tell us why.
Have a new LP that's converting at a higher rate than your control? That's a what.
Funnel data showing a major dropoff? A what.
Cohort analysis showing significantly lower retention for a particular customer segment? Another what.
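To make that concrete, here's a minimal sketch in Python of what a quantitative test actually hands you. The numbers are hypothetical, not the team's real campaign data, and the simple two-proportion z-test stands in for whatever stats your testing tool runs under the hood:

```python
import math

# Hypothetical numbers for illustration only -- not real campaign data.
# Control: existing ad creative. Variant: new creative featuring female models.
control_conversions, control_visitors = 120, 10_000
variant_conversions, variant_visitors = 155, 10_000

p_control = control_conversions / control_visitors
p_variant = variant_conversions / variant_visitors

# Pooled two-proportion z-test: did the conversion rate really move?
p_pool = (control_conversions + variant_conversions) / (control_visitors + variant_visitors)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / control_visitors + 1 / variant_visitors))
z = (p_variant - p_control) / se
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

print(f"Control: {p_control:.2%}  Variant: {p_variant:.2%}  z = {z:.2f}  p = {p_value:.3f}")
# The output is a lift and a p-value: "the variant converted at a higher rate,
# and the difference probably isn't noise." That's the entire deliverable.
# Nothing in it says *why* -- the model, the algorithm, novelty, timing, etc.
```

Run it and you get a "what" with a confidence level attached. The "why" isn't anywhere in the output.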
And these "whats" are extremely important. They give us clues. Clues that can be used to form new hypotheses or shape existing ones. And that's where the team I'm advising went wrong. They mistook their Meta Ads experiment as validation of their entire hypothesis. And they were ready to shift their whole GTM strategy accordingly.
But here's something that the ultra data-driven teams don't like to hear: it's impossible to answer "why" through quantitative methods alone.
Which brings us to the second category: qualitative methods. User interviews, surveys, and so on.
Most growth teams skip the qualitative layer. Shipping new tests is fun. Mining through big data sets—yummy.
Slowing down to talk to customers? Not so much.
But if the goal is to maximize the quality of our learnings—and hopefully I've made a compelling case for why that should be the goal of any growth team—then you need both.
Quantitative to tell you what's happening. Qualitative to give you the human context and get closer to why.
Closing Note
The key word is "closer." We'll never know with 100% certainty why humans behave the way they do. But we can triangulate. Layer different types of evidence. Get close enough to make better decisions.
Most teams treat learnings like a checkbox. Test ran. Winner picked. Learning logged. Move on.
But a learning that's wrong is worse than no learning at all. It compounds in the wrong direction.
So before you log your next "learning," ask yourself: have we done the work to get closer to why? Or are we only at the "what happened" layer?
If it's the latter, you're not done yet.
Justin Setzer
Demand Curve Co-Founder & CEO