Context graphs, one month in

What we've learned since the essay went viral

Jan 31, 2026

A month ago, Jaya and I published our point of view on context graphs. Since then, it’s become one of the most-discussed - and, in our opinion, most important - ideas in AI this year.

Founders and operators quickly built on it in their own writing. Dharmesh Shah of HubSpot called context graphs “a system of record for decisions, not just data.” Aaron Levie of Box wrote that we’ve entered “the era of context”: when everyone has access to the same models, the differentiator becomes the organizational knowledge you feed them. Arvind Jain of Glean weighed in: “Everyone is suddenly talking about context graphs. At Glean, we’re excited - because it finally has a name.”

The conversation kept expanding. Cognition attached the term to their latest launch with Cursor, Cloudflare, Vercel, and others. TechCrunch cited us in an article explaining the core idea behind a new AI scheduling startup. The AI Daily Brief devoted a full episode to the concept.

Then came the inbound. CIOs at Fortune 500 companies reached out to ask what this means for their roadmaps. Founders at unicorn AI startups relayed that it’s changing how they think about product architecture. Friends at OpenAI and Anthropic told us the piece was circulating in company Slack channels as a must-read. We even heard from senior government officials asking how context graphs might work for accountability and compliance.

We’ve received hundreds of pitches from startups building toward this idea. Several of our portfolio companies, including Maximor, PlayerZero, Tessera, Tonkean, and Regie, are actively building context graphs with customers and sharing what they’re learning.

Jaya and I joined The Information’s TITV to talk through the idea live.

Context graphs, in simple terms

A context graph is institutional memory for how an organization makes decisions: not how the process doc says it should, but how it actually works in practice.

Enterprise software is very good at recording outcomes (the final price, the escalated ticket, the approved discount) but not the reasoning behind them. Which exceptions applied? What precedent mattered? Who approved what, and why? That context still lives in Slack threads, side conversations, and people’s heads. Up until now, it’s rarely been treated as data.

Jaya and I call these missing records “decision traces.” Over time, they accumulate into a context graph: a living, queryable map of how an enterprise makes decisions, stitched across systems and time so precedent becomes searchable.

We believe context graphs will define the next generation of enterprise software. The most valuable software companies of the past generation earned that title by owning the canonical data layer: Salesforce for customers, Workday for employees, SAP for operations. We think the next generation will own the layer above: the reasoning that connects data to action.

The last generation of software captured what happened. This generation will capture why. And because decision traces can’t be reconstructed after the fact (you have to be in the workflow when decisions are made), startups building there today have an opening that incumbents can’t easily close.

What resonated

🔗 The glue function problem

The examples from our original essay (exception logic living in people’s heads, approval chains happening in Slack, precedent buried in old deal desk threads) clearly resonated. Multiple people pointed to the same “glue functions” we identified: RevOps, DevOps, SecOps, FinOps. These roles exist because no single system of record owns the cross-functional workflow. The people in those seats carry context that existing software doesn’t (or can’t) capture.

As one AppSec founder put it: “New security engineers spend months learning tribal knowledge - ‘Oh, we always suppress that finding because of the WAF configuration.’ Nobody documented it.”

📈 Decision traces compound

Founders and investors broadly agreed that decision traces are a compounding asset. Models commoditize. But a high-fidelity record of how your organization actually operates is proprietary and hard to copy. The teams that start collecting decision traces now will own something that grows more valuable over time.

Aaron Levie has made the broad case: “The teams and companies that can accumulate and best utilize context will drive the greatest productivity and highest output.” I’d add: decision traces are the hardest kind of context to replicate, because they can only be captured by being present when decisions happen.

This is also why startups have an edge. Agents can live in the execution path in a way traditional software can’t: present at decision time, capturing traces as a byproduct of work. While incumbents would have to retrofit this into workflows they don’t control, startups can build it in from day one.

Where we got pushback

🕵️ Can you really capture the “why”?

The most common pushback: you can’t actually capture the why. True intent is internal and unobservable. What you can reliably capture is the sequence of actions: the how, not the why.

Asking humans to explicitly document their reasoning every time they make a decision isn’t realistic. But you don’t need explicit annotation. Arvind at Glean put it well: “The why is often a thinking step that resides in someone’s head - you can’t actually model it. The how, on the other hand, leaves a rich digital trail... Over many cycles, those process traces approximate the why.”

I agree: you capture the how - the policy applied, the evidence consulted, the exception granted, the approver - and infer the why from patterns over time. That’s far more than exists today, and it’s enough for agents to begin meaningfully scaling autonomy.
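To make "capture the how, infer the why" concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the DecisionTrace fields, the segment attribute, and the exception_rates helper are hypothetical names, not a schema any of the companies mentioned here have published. It records only observable facts about each decision, then surfaces a pattern across many of them.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DecisionTrace:
    """One observed decision: the 'how', with no stated 'why'. (Hypothetical schema.)"""
    workflow: str                # e.g. "discount_approval"
    policy: str                  # the policy that was applied
    evidence: tuple[str, ...]    # records consulted while deciding
    exception: Optional[str]     # the exception granted, if any
    approver: str                # who signed off
    segment: str                 # illustrative attribute of the account


def exception_rates(traces):
    """Share of decisions in each segment that received an exception."""
    total = Counter()
    excepted = Counter()
    for trace in traces:
        total[trace.segment] += 1
        if trace.exception is not None:
            excepted[trace.segment] += 1
    return {segment: excepted[segment] / total[segment] for segment in total}


# Made-up traces, purely for illustration.
traces = [
    DecisionTrace("discount_approval", "standard_discount", ("crm:acct-1",),
                  "extra_10_pct", "vp_sales", "healthcare"),
    DecisionTrace("discount_approval", "standard_discount", ("crm:acct-2",),
                  "extra_10_pct", "vp_sales", "healthcare"),
    DecisionTrace("discount_approval", "standard_discount", ("crm:acct-3",),
                  None, "deal_desk", "retail"),
    DecisionTrace("discount_approval", "standard_discount", ("crm:acct-4",),
                  None, "deal_desk", "retail"),
]

rates = exception_rates(traces)
```

Nobody wrote down why healthcare accounts get the extra discount, but the pattern is visible in the traces. That is the sense in which the why is inferred rather than captured.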

🤖 Aren’t you forgetting the bitter lesson?

If models keep getting better, won’t agents eventually figure out optimal behavior on their own? Why encode organizational knowledge when the model could discover better approaches from scratch?

For many enterprise use cases, the goal isn’t for agents to optimize from first principles. It’s for them to act consistently with how the organization has chosen to operate. “We always give healthcare companies an extra 10% because their procurement cycles are brutal” isn’t a suboptimal heuristic waiting to be improved: it’s a policy that reflects relationships, compliance constraints, and hard-won experience. An agent that ignores it isn’t smarter, it’s just wrong.

Context graphs are closer to the rules than the strategy: they define what moves are legal in this organization, what precedents apply, what constraints are non-negotiable. Agents can get creative within that structure, but the structure itself isn’t something they can independently discover. It has to be captured at decision time, from the workflow itself.

That said, the line between “encode organizational knowledge” and “let the agent explore” is domain-dependent and may shift over time. As Vyas Sekar and Hui Zhang at Conviva argued, for consumer-facing use cases (support, shopping, travel/booking), there may be room for agents to find better approaches than humans would.

Open questions

🧩 Are context graphs a category or feature?

Some skeptics see this becoming “data catalog 3.0”: important infrastructure, but ultimately a feature of broader platforms rather than a standalone category. Others wonder whether “context graph” will hold as a concept or get diluted the way “data mesh” did.

Both, or neither, could be true. The layer is necessary: almost everyone agrees on this now. Whether it becomes a standalone category or gets absorbed into warehouses, catalogs, and observability tools is an open question. But for founders, that may be beside the point. Whoever captures decision traces in a high-value domain creates a compounding asset. Category or feature, the opportunity for startups is real.

🛠️ How do you build one in practice?

This was the question we heard most: great, but how?

Animesh of PlayerZero wrote one of the most detailed responses. His core insight: you don’t prescribe an ontology upfront. You let agents discover it through use. When an agent investigates an incident or completes a task, it traverses your company’s state space, touching systems, reading data, and calling APIs. That trajectory is a decision trace. Accumulate enough of them, and a world model of your organization emerges. As he puts it: “The schema isn’t the starting point. It’s the output.”
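A toy version of "the schema is the output" might look like the Python sketch below. It is not PlayerZero's implementation. The entity names, the emergent_schema function, and the min_support threshold are all hypothetical. Each trajectory is the ordered list of entity types an agent touched during one task, and the schema is whatever transitions keep recurring.

```python
from collections import Counter


def emergent_schema(trajectories, min_support=2):
    """Keep the entity-to-entity transitions seen at least min_support times."""
    edge_counts = Counter()
    for steps in trajectories:
        # Pair each step with the one that follows it.
        for src, dst in zip(steps, steps[1:]):
            edge_counts[(src, dst)] += 1
    return {edge: n for edge, n in edge_counts.items() if n >= min_support}


# Made-up agent trajectories, purely for illustration.
trajectories = [
    ["alert", "service", "deploy", "commit"],
    ["alert", "service", "config_change"],
    ["ticket", "service", "deploy", "commit"],
]

schema = emergent_schema(trajectories)
```

No one declared that alerts relate to services or that deploys relate to commits. Those edges survive only because agents kept walking them, while one-off paths fall below the threshold.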

Vyas Sekar and Hui Zhang at Conviva have pushed this idea further. In their read, capturing decision traces is just the start: you also need stateful reasoning to connect actions to outcomes.

Similar to PlayerZero, the teams getting this right seem to be starting narrow - one workflow, one wedge - and letting the context graph grow through use. They’re not trying to boil the ocean.

🏠 Where does the context graph live?

In a data warehouse? A dedicated system? The orchestration layer itself? Each comes with tradeoffs.

Warehouses are in the read path, not the write path: by the time data lands via ETL, decision context may be gone. Orchestration layers capture context at decision time, but then your context graph is tied to that specific tool. A dedicated “context OS” adds integration overhead and yet another system to maintain. I don’t think there’s consensus yet, and the right answer may depend on the domain.

👥 Who owns it?

Governance and access control get complicated fast, especially in regulated industries. If decision traces include sensitive reasoning about customers, employees, or risks, who gets to query them? How long do you retain them? These are hard problems, and startups building here will need to solve them.

⏳ How do you handle time?

Decisions have a half-life. Policies change, teams change, orgs restructure. Agents need to know which precedents still apply, not just which precedents exist - but there’s no obvious rule for when a past decision stops being valid. A pricing exception from three years ago under a different CFO may not apply today. Founders will need to define how precedent ages out.
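Taking "half-life" literally gives one possible rule, sketched below in Python. This is an assumption for illustration, not a recommendation: the one-year half-life, the 0.25 cutoff, and the function names are hypothetical, and a real system would likely tune them per domain. The sketch combines exponential decay with a hard invalidation when the governing policy has changed since the decision was made.

```python
from datetime import date


def precedent_weight(decided_on, today, half_life_days=365.0):
    """Weight in (0, 1] that halves every half_life_days."""
    age_days = (today - decided_on).days
    if age_days < 0:
        raise ValueError("decision date is in the future")
    return 0.5 ** (age_days / half_life_days)


def still_applies(decided_on, today, policy_changed_on=None,
                  half_life_days=365.0, min_weight=0.25):
    """A precedent applies if it postdates the last policy change and has not decayed too far."""
    if policy_changed_on is not None and decided_on < policy_changed_on:
        return False  # decided under a policy that has since been replaced
    return precedent_weight(decided_on, today, half_life_days) >= min_weight


today = date(2026, 1, 1)
old_exception = still_applies(date(2023, 1, 1), today)
recent_exception = still_applies(date(2025, 7, 1), today)
superseded = still_applies(date(2025, 7, 1), today,
                           policy_changed_on=date(2025, 10, 1))
```

Under these assumed parameters, the three-year-old exception no longer counts, the six-month-old one does, and the same six-month-old decision is dropped once a later policy change is recorded. Decay alone would miss that last case, which is why a pure time-based rule is unlikely to be enough.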

What’s next

We expect every organization to have multiple context graphs, each shaped by the domain it serves. We’re watching this play out across our portfolio: PlayerZero in production engineering, Maximor in finance, and Regie in demand gen.

We’ll keep sharing what we learn. If you’re building in this space, or have a strong point of view on what’s next, I’d love to hear from you.
