
Designing Against Human Nature: The Research Behind Assumption Mapper's Next Version

I spent a week in the research literature on cognitive bias, innovation methodology, and interface design. What I found is reshaping how I build Assumption Mapper — and what it means to track assumptions at all.

March 12, 2026 · 11 min read

I've written before about why I built Assumption Mapper and how AI can surface evidence from messy notes. Those posts covered what the tool does. This one is about what it should become — and why.

I went deep into the research literature: thirty years of innovation methodology, cognitive science on why people defend beliefs, and interface design patterns from tools that actually change behavior. What I found confirmed some of my instincts, challenged others, and gave me a concrete roadmap for the next version.

The core insight is uncomfortable: assumptions are the fundamental unit of risk in any new venture, yet humans are neurologically wired to forget, distort, and defend them. The frameworks for managing assumptions are mature and convergent — Lean Startup, Strategyzer, and Discovery-Driven Planning all agree on the lifecycle. The tools are not. No dominant software product owns this category. That's the gap Assumption Mapper is built to fill.


The six-week half-life

Rita McGrath cites Russell Ackoff's research revealing that the half-life of an organization's ability to remember critical assumptions is just six weeks. After that, nobody involved can recall what assumptions underpinned their decisions. Worse, McGrath found that "we take our assumptions and magically turn them into facts in our minds." Once a number enters a business plan — say, a 20% conversion rate — it loses its provisional character and gets treated as established truth.

This was the original problem that made me build Assumption Mapper. I'd seen it happen on every venture engagement: the initial assumption-mapping workshop generates energy and clarity, and then reality takes over. The assumptions get buried under feature backlogs, sprint planning, and investor decks. Six months later the venture fails — and the fatal assumption was sitting in row 47 of a spreadsheet nobody had opened since week two.

Steve Blank learned this the hard way when his seventh startup failed: "I was so good at creating a reality-distortion field that I believed I was right. Customers voted with their feet." Eric Ries built on this with the concept of "leap-of-faith assumptions" — the riskiest elements everything else depends on. Alex Osterwalder and David Bland operationalized it further with Assumption Mapping, structured around a single question: "What are all the things that need to be true for this idea to work?"

Every major framework converges on the same lifecycle: Surface → Prioritize → Hypothesize → Test → Learn → Decide → Iterate. The methodology is settled. The experience of actually doing it is not.
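
To make that lifecycle concrete, here's a minimal sketch of how it might be modeled in code. The seven stage names come straight from the frameworks above; the TypeScript shape and the loop back from iterate to prioritize are my illustration, not any framework's spec.

```typescript
// The converged assumption lifecycle as an ordered set of stages.
// Stage names are from the frameworks; the structure is illustrative.
const STAGES = [
  "surface",
  "prioritize",
  "hypothesize",
  "test",
  "learn",
  "decide",
  "iterate",
] as const;

type Stage = (typeof STAGES)[number];

// Advance to the next stage; "iterate" loops back to "prioritize",
// since each cycle should re-rank what is riskiest now.
function nextStage(current: Stage): Stage {
  const i = STAGES.indexOf(current);
  return current === "iterate" ? "prioritize" : STAGES[i + 1];
}
```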


Seven cognitive biases form a wall around your assumptions

This is the part of the research that changed how I think about the product. It's not enough to give people a place to log assumptions. You have to design against the psychological forces that make people defend them.

Confirmation bias is the most pervasive. Founders "tend to convince themselves that their product is innovative, ignoring evidence to the contrary." They set their success criteria too low so they always prove themselves right. Stanovich, West, and Toplak demonstrated that this bias persists even among highly intelligent individuals — it's motivational, not a failure of reasoning.

Overconfidence compounds it. A meta-analysis of 62 studies found that all three types — overprecision, overestimation, overplacement — "stimulate individuals to engage in entrepreneurship but impair their performance after the venture has been founded." Even when entrepreneurs are reminded of past forecasting errors, overconfidence persists: they sustain it psychologically by replacing hindsight with misattribution.

The IKEA effect is particularly insidious for assumptions. Norton, Mochon, and Ariely found that people pay 63% more for furniture they assembled themselves. Applied to business ideas: psychological ownership of assumptions makes them resistant to challenge. The effect is strongest when the effort was successful — meaning a founder who has successfully articulated a business assumption values it more, precisely when challenging it is most needed.

Belief perseverance — clinging to beliefs even after disconfirming evidence — was demonstrated in Ross, Lepper, and Hubbard's foundational Stanford studies. Even after participants were told the data was fabricated, they maintained their beliefs. The mechanism: people spontaneously generate causal explanations that become independent of the original evidence.

But there's hope. Anglin's 2019 research found that when presented with a clear, consistent pattern of disconfirming findings, participants shifted beliefs and the change persisted. That's a design mandate: present disconfirming evidence clearly and consistently, not in fragmented or ambiguous ways.

The most effective interventions aren't motivational — they're structural. Gary Klein's pre-mortem technique, where teams imagine a plan has already failed and generate reasons why, increases the ability to accurately forecast risks by 30%. Philip Tetlock's superforecasting research shows the best forecasters treat "beliefs as hypotheses to be tested, not treasures to be protected." And the single most effective debiasing technique is counterexplanation — asking people to imagine how the opposite belief might be true. Simply asking people to "be unbiased" does not work.

That last finding is the one that matters most for product design. You can't rely on users being disciplined. You have to build the discipline into the tool.


What the best teams actually do — and where they break

The research confirms a clear weekly cadence among the best innovation teams: assumption mapping workshops to identify risks, one to two experiments per week targeting the riskiest assumptions, Test Cards before each experiment (hypothesis, method, metric, success criteria), Learning Cards after (observations, insights, actions), and weekly reviews connecting learnings to business model updates.
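
For concreteness, here's how those two card formats might look as record types. The fields are exactly the ones listed above; the TypeScript shapes are my sketch, not Strategyzer's official template.

```typescript
// Test Card: filled out before an experiment runs.
interface TestCard {
  hypothesis: string;      // the belief being tested
  method: string;          // how it will be tested
  metric: string;          // what will be measured
  successCriteria: string; // the bar, set before the experiment starts
}

// Learning Card: filled out after, closing the loop.
interface LearningCard {
  observations: string; // what actually happened
  insights: string;     // what that means for the assumption
  actions: string;      // what changes in the business model as a result
}
```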

Kromatic benchmarks innovation teams on experiment velocity (at least one per week) and insight velocity (at least one actionable insight per week). 7-Eleven Japan has run a weekly hypothesis-review-learn cycle for over thirty years.

But the failure modes are pervasive and predictable. The most common is "log and forget" — teams document assumptions in an initial workshop and never revisit them. Nielsen Norman Group specifically warns that "assumptions are often treated the same as facts later in the project." David Bland observes that newly motivated teams run landing pages or interviews, "only to ignore the riskiest assumptions in their strategy." Teams test comfortable assumptions, not dangerous ones. They write vague hypotheses that are unmeasurable. They skip defining success criteria upfront — which Grace Ng identifies as fatal: "Without defining success up front, teams will argue over whether an experiment was successful or not."

These failure modes aren't character flaws. They're predictable consequences of using tools that weren't designed for assumption management. Teams cobble together Miro templates, Trello boards, and Google Sheets. None of them fight the natural decay. None of them surface what's going stale. None of them make it easier to challenge an assumption than to ignore it.


What's coming to Assumption Mapper

The research pointed me to three specific features that address the biggest gaps in the current product. These are what I'm building next.

The 2x2 matrix

Strategyzer's Assumption Mapping uses a simple spatial model: Importance on one axis, Evidence on the other. The top-right quadrant — assumptions that are critical for success yet have the least evidence — is where your attention should be.

The current Assumption Mapper does this with automatic risk scoring and a ranked list. But David Bland's key insight is that "it's not about who is the loudest or who gets paid the most. It's about having observable evidence." The power of the 2x2 isn't the ranking — it's the conversation it creates when a team debates where an assumption falls. Bland found that "the shared understanding through the mapping conversation is much more valuable than the map itself."

A list doesn't create that conversation. A spatial view does. You see the landscape of your risk. You drag assumptions and argue about where they belong. That argument is the product.
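
For anyone curious how the spatial view relates to the scoring the product already does, here's a minimal sketch of deriving quadrant placement from the two scores. The field names, the 0-10 scales, and the threshold are all my illustration, not the actual schema.

```typescript
// Place an assumption on the Importance x Evidence 2x2.
interface MappedAssumption {
  statement: string;
  importance: number; // 0-10: how critical is this to the idea working?
  evidence: number;   // 0-10: how much observable evidence exists?
}

type Quadrant =
  | "test-now"      // important, little evidence: the top-right danger zone
  | "keep-watching" // important, well evidenced
  | "park"          // unimportant, little evidence
  | "deprioritize"; // unimportant, well evidenced

function quadrant(a: MappedAssumption, threshold = 5): Quadrant {
  const important = a.importance >= threshold;
  const evidenced = a.evidence >= threshold;
  if (important && !evidenced) return "test-now";
  if (important && evidenced) return "keep-watching";
  if (!important && !evidenced) return "park";
  return "deprioritize";
}
```

The function is the least interesting part, per Bland's point. The value is in the team dragging an assumption across that threshold and arguing about whether it belongs there.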

Up Next

Superhuman's core workflow insight is that automatically surfacing the most important item creates a pull-based flow state. The user doesn't browse or scan — the system presents what matters, and the user acts on it.

Applied to assumptions: when you open Assumption Mapper, instead of seeing a dashboard, you see your single most critical untriaged assumption. What do you know about it? What evidence do you have? What's your next move? Combined with a constraint mechanism — you can only have three assumptions in the "Test This Week" zone — this creates the forcing function that moves teams from logging to acting.

The research backs this up: the endowed progress effect (starting progress bars slightly filled) makes completion twice as likely. Sprint-style time-boxing with clear constraints is what separates teams that log from teams that learn.
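
Here's a rough sketch of the mechanism, assuming a single risk score per assumption. Every name in it is hypothetical; this is the shape of the idea, not the implementation.

```typescript
interface TrackedAssumption {
  id: string;
  riskScore: number; // higher = more important and less evidenced
  triaged: boolean;  // has the user decided what to do with it?
  testThisWeek: boolean;
}

const WEEKLY_LIMIT = 3;

// What you see on open: the single riskiest untriaged assumption.
function upNext(all: TrackedAssumption[]): TrackedAssumption | undefined {
  return all
    .filter((a) => !a.triaged)
    .sort((a, b) => b.riskScore - a.riskScore)[0];
}

// The forcing function: a fourth item is refused, so committing to a
// new test means deciding what to drop first.
function addToWeek(all: TrackedAssumption[], id: string): boolean {
  const inZone = all.filter((a) => a.testThisWeek).length;
  if (inZone >= WEEKLY_LIMIT) return false;
  const target = all.find((a) => a.id === id);
  if (!target) return false;
  target.testThisWeek = true;
  return true;
}
```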

"We Believe That" framing

This one is subtle but important. Strategyzer's format frames assumptions as "We believe that [assumption]" rather than declarative statements. The difference matters psychologically: a declarative statement ("Our conversion rate will be 20%") feels like a fact. A belief statement ("We believe that our conversion rate will be around 20%") stays provisional. It invites challenge.

Combined with the counterexplanation research — the most effective debiasing technique is asking people to imagine how the opposite might be true — the creation flow will prompt: "What evidence would prove this wrong?" Not as optional metadata. As a structural part of capturing the assumption. You define what would break your belief before you go test it.

This maps directly to Grace Ng's insight: defining success and failure criteria upfront is the difference between teams that learn and teams that rationalize.
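
As a sketch of what "structural, not optional" means in the creation flow: the record below can't be created without a falsification condition and success criteria. The field names are my own placeholders, not the product's schema.

```typescript
interface Belief {
  statement: string;       // the claim, without the "We believe that" prefix
  wouldProveWrong: string; // counterexplanation, captured before any testing
  successCriteria: string; // defined up front, per Grace Ng
}

function createBelief(input: Partial<Belief>): Belief {
  if (!input.statement) throw new Error("An assumption needs a statement.");
  if (!input.wouldProveWrong) {
    // Structural debiasing: creation fails without a falsification condition.
    throw new Error("What evidence would prove this wrong?");
  }
  if (!input.successCriteria) {
    throw new Error("Define success before you test, not after.");
  }
  return {
    statement: input.statement,
    wouldProveWrong: input.wouldProveWrong,
    successCriteria: input.successCriteria,
  };
}

// Always rendered in the provisional frame, never as a bare fact.
function render(b: Belief): string {
  return `We believe that ${b.statement}`;
}
```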


What I explored and may build later

The research surfaced several other concepts that are compelling but need more thought before I commit to them. I'm sharing them because they shaped my thinking even if they're not in the next release.

Visual confidence encoding. The idea that uncertain assumptions should literally appear fuzzy or hand-drawn, while validated ones appear sharp and solid. Color saturation gradients, opacity levels, and border treatments encoding confidence without requiring users to read numbers. Research from Padilla et al. confirms this works for dual-process cognition — creating accurate fast impressions while supporting deep analysis. This is beautiful in theory. I need to figure out if it's practical in a tool people use daily or if it crosses into gimmick territory.
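
If I did build it, the mapping itself would be simple. A sketch, with ranges that are guesses rather than tested design values:

```typescript
// Map a 0-1 confidence score to visual properties, so uncertain
// assumptions literally look fuzzy and validated ones look solid.
interface VisualEncoding {
  opacity: number;                 // 0.4 (speculative) up to 1.0 (validated)
  saturationPct: number;           // 20 to 100
  borderStyle: "dashed" | "solid"; // hand-drawn feel vs. settled
  blurPx: number;                  // literal fuzziness at low confidence
}

function encodeConfidence(confidence: number): VisualEncoding {
  const c = Math.min(1, Math.max(0, confidence));
  return {
    opacity: 0.4 + 0.6 * c,
    saturationPct: 20 + 80 * c,
    borderStyle: c < 0.5 ? "dashed" : "solid",
    blurPx: c < 0.3 ? 1.5 * (1 - c) : 0,
  };
}
```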

The detective board. An evidence wall metaphor — corkboard, pushpins, connecting strings between evidence and assumptions. Viscerally compelling for making evidence collection feel like building a case. Each assumption becomes a case to be built or broken. The risk: it could feel more fun than useful. But the spatial arrangement and mixed media support (quotes, screenshots, data, links) could transform how teams see the connections between what they're learning.

Pre-mortem integration. Gary Klein's technique where teams imagine the venture has already failed and generate reasons why. The research shows this reverses social dynamics — "people show how smart they are by the quality of issues they raise" rather than by defending the plan. Building this into a team workflow, where everyone independently generates risks before sharing, could be a powerful assumption-surfacing mechanism. It needs a team collaboration layer that doesn't exist yet.

Confidence calibration. Rather than binary tracking, using probability estimates (0-100 confidence scale). Tetlock's research shows granular thinkers make better decisions — distinguishing 60% confident from 90% confident enables incremental updating and reveals where evidence is actually moving beliefs. The Strategyzer evidence hierarchy (strong: actual behavior, moderate: stated intentions, weak: opinions) could weight evidence automatically. This is intellectually rigorous but might add friction that kills adoption.
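
A sketch of what evidence-weighted updating could look like. The weights follow the Strategyzer hierarchy just described, but the numbers and the linear update rule are my assumptions; a serious version might use proper Bayesian updating instead.

```typescript
type EvidenceStrength = "behavior" | "statedIntent" | "opinion";

// Strong: actual behavior. Moderate: stated intentions. Weak: opinions.
const WEIGHTS: Record<EvidenceStrength, number> = {
  behavior: 1.0,
  statedIntent: 0.5,
  opinion: 0.2,
};

interface Evidence {
  strength: EvidenceStrength;
  supports: boolean; // does this support or undercut the assumption?
}

// Nudge a 0-100 confidence score by each piece of evidence,
// weighted by strength and capped at the scale's bounds.
function updateConfidence(confidence: number, evidence: Evidence[]): number {
  const maxStep = 10; // movement per piece of strong evidence
  const next = evidence.reduce((c, e) => {
    return c + maxStep * WEIGHTS[e.strength] * (e.supports ? 1 : -1);
  }, confidence);
  return Math.min(100, Math.max(0, next));
}
```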

Opportunity Solution Trees. Teresa Torres's framework provides a hierarchical view: Business Goal → Opportunities → Solutions → Assumption Tests. Invalidated assumptions cascade to affect connected strategies. This "zoom" from strategic to tactical is directly applicable — but it requires users to map their entire strategy tree, which is a much bigger ask than tracking individual assumptions.
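
The cascade itself is simple to model. A structural sketch with names of my own invention; Torres's framework defines the levels, not this code:

```typescript
type Level = "goal" | "opportunity" | "solution" | "assumptionTest";

interface TreeNode {
  label: string;
  level: Level;
  invalidated: boolean; // set when an assumption test fails
  children: TreeNode[];
}

// A node is at risk if anything beneath it has been invalidated,
// so a failed test at the leaves flags the strategies above it.
function atRisk(node: TreeNode): boolean {
  return node.invalidated || node.children.some(atRisk);
}
```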


From tracker to thinking environment

The research crystallized something I'd been circling around but hadn't articulated clearly: Assumption Mapper shouldn't be a better spreadsheet. It should be a cognitive environment — a place where innovators think more clearly about what they don't know and act more decisively on what they learn.

Donald Norman's concept of cognitive artifacts captures this: the tool doesn't amplify strategic thinking — it transforms the task from "holding a mental model of all assumptions" into "spatially arranging and connecting visual objects." Proximity implies relationship. Position implies priority. Color implies status. The design shapes the thinking, not just the workflow.

The aspiration for the next version: make challenging assumptions the path of least resistance. Not through willpower. Through structure.

What makes assumption tracking stick as a habit comes down to five factors the research identified: embedding it in weekly rhythms (not one-time workshops), keeping it lightweight, making it visible (not buried in documents), connecting every learning to a decision, and celebrating learning rather than penalizing invalidation.

The 2x2, Up Next, and "We Believe That" are the structural interventions. They turn the tool from something you fill out into something that fights for your attention and pushes back on your certainty. That's the product I want to build.

Want to discuss this further?

I'm always happy to chat about building products and validation.

Get in touch