@OpenGradient ran the same inference twice and got different latency on the second call.

Same model. Same prompt length. Same node selection on paper. The first call settled clean.
Second one dragged somewhere between the TEE attestation and the proof hitting the chain.

Spent time looking at where the gap was sitting.

It was not the model. The GPU finished fast both times. The pressure was in the verification handoff. When attestation output has to move from the enclave boundary into the proof propagation layer, that transition does not always cost the same. Queue state at that moment matters. What else the node was verifying in parallel matters.
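For anyone who wants to find the same gap: a minimal sketch of timing a call stage by stage and diffing two runs. The stage names (inference, attestation, handoff, proof) are my own labels for illustration, not OpenGradient API names, and the sketch assumes you can wrap each stage from the client side.

```python
import time
from contextlib import contextmanager


class StageTimer:
    """Collects wall-clock durations for named stages of one call."""

    def __init__(self):
        self.durations = {}

    @contextmanager
    def stage(self, name):
        # perf_counter is monotonic, so it is safe for measuring intervals.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.durations[name] = time.perf_counter() - start


def stage_gap(first, second):
    """Per-stage difference (second minus first) between two calls."""
    return {name: second[name] - first[name] for name in first if name in second}


def widest_gap(first, second):
    """Name of the stage whose duration grew the most on the second call."""
    gap = stage_gap(first, second)
    return max(gap, key=gap.get)
```

Wrap each stage in `with timer.stage("handoff"):` on both calls, then pass the two `durations` dicts to `widest_gap`. If the model is not the problem, the inference stage will show close to zero difference and one of the verification stages will carry the rest.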

OpenGradient has now crossed 150,000 privately executed inferences. Every one of those went through that same handoff. TEE enclave in. Encrypted end-to-end out. Nobody reading the prompt in between. Not the node operator. Not the team.

That boundary held across 150,000 calls. That is not a small number for a system still mapping its own edge cases.

But latency variance at the verification layer is the kind of thing that feels acceptable in testing and becomes a real design constraint when agent chains start depending on consistent timing.

The real test is whether that handoff gets more predictable as load increases or whether the variance grows with it.
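A rough way to frame that test, as a sketch: bucket handoff latencies by how many jobs the node was verifying in parallel, then check whether the spread widens from one bucket to the next. The `(concurrent_jobs, handoff_seconds)` sample shape is an assumption about what you can observe, and the strict "grows at every step" rule is a deliberately simple choice.

```python
import statistics


def spread_by_load(samples):
    """Group (concurrent_jobs, handoff_seconds) pairs by load level.

    Returns {load: population standard deviation of latency}, ordered by load.
    """
    buckets = {}
    for load, latency in samples:
        buckets.setdefault(load, []).append(latency)
    return {load: statistics.pstdev(values) for load, values in sorted(buckets.items())}


def variance_grows_with_load(samples):
    """True if the latency spread strictly increases at every load step.

    Needs at least two load levels to say anything; otherwise returns False.
    """
    spreads = list(spread_by_load(samples).values())
    if len(spreads) < 2:
        return False
    return all(later > earlier for earlier, later in zip(spreads, spreads[1:]))
```

A flat or shrinking spread as load goes up would mean the handoff is getting more predictable. A growing one means the variance scales with load.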

Still measuring.

#opg $OPG