One of those slow afternoons where I kept second-guessing a position I've been holding for three weeks. Not enough to close it. Just enough to be annoying. I ended up closing the charts app and opening a reading tab, the way you do when you need to think about something else for a while.


Landed on OpenLedger. Specifically on the framing they use a lot — "data provenance as the foundation for trustworthy AI." Read it, nodded, moved on. Came back to it ten minutes later because something felt off and I couldn't place it.




The argument as it usually gets made: AI can't be trusted if we don't know where its training data came from. Build the provenance layer, establish the audit trail, and you've created the foundation on which trustworthy AI can stand.


Clean. Logical. Hard to argue with.


Except I kept thinking about when I actually don't trust an AI system. And it's never because I'm wondering about the provenance of its training data. I don't trust it when it gives me confident answers that are wrong. When it hallucinates a source. When it answers a question in a way that sounds authoritative but misses something obvious. When it reflects a bias that shapes the output in a direction I can't see unless I'm looking for it.


None of those are provenance failures. They're output failures. Model failures. Architecture and alignment failures.


And that's when something shifted.


"Trustworthy" has two different meanings that are getting quietly merged into one. There's trustworthy as in accountable — you can trace where things came from, assign responsibility, audit the process. And there's trustworthy as in reliable — the system does what you need it to do, accurately, consistently, without failing in unpredictable ways.


OpenLedger's provenance infrastructure addresses the first. It has almost nothing to say about the second. And when people say they want "trustworthy AI," they almost always mean the second one.


So the causal chain that the "foundational" framing implies — provenance leads to trust — has a gap in it. Provenance leads to accountability. Accountability is a prerequisite for a certain kind of trust. But it doesn't produce reliability on its own. You can have a fully auditable training dataset and still train a model that makes things up, performs inconsistently, and fails in the exact ways that make people distrust AI in practice.




Here's what genuinely bothers me about this.


If "data provenance" becomes the answer the AI industry points to when asked "how do we ensure trustworthy AI?" — and it's already trending in that direction, especially in regulatory conversations — then there's a real risk of closing a debate before it's actually resolved. You get the audit infrastructure, you get the compliance narrative, and the harder questions about model reliability, output accuracy, and alignment get less attention because the foundational layer "is being handled."


This is how accountability frameworks work in every industry. They define the standard, companies meet the standard, the standard becomes a substitute for actual performance.


I'm not saying provenance isn't important. It clearly is — you can't fix a problem you can't trace. But "you can't fix it without provenance" doesn't mean "provenance is sufficient to fix it." And that distinction is getting lost in how the conversation is being framed.




Who this matters to: the people making AI procurement decisions, the regulators drafting AI guidelines, the enterprises telling their customers "our AI is trustworthy." They're the ones who will be in the position of pointing to provenance infrastructure as a trust signal, and they're also the ones responsible when the outputs don't hold up.


When it matters: probably the first time a well-documented, fully-provenanced AI system produces a high-profile failure. That moment will make it very clear what provenance does and doesn't guarantee.


Still holding the position, by the way. Still annoyed about it. Probably won't close it today.

@OpenLedger #OpenLedger $OPEN