@OpenGradient The rollback was easy to do. Explaining what happened before that is a lot harder.

I only noticed the rollback after the outputs stopped changing constantly.

That was the odd part.

The model started working again, but it did not feel like everything was okay. Most systems treat a rollback as the end of the problem. Something goes wrong, an old version comes back, and everyone moves on.

What I found interesting with OpenGradient was what was left after the rollback.

Some records still showed the new version was being used. One agent had already changed its workflow to fit the behavior that was later found to be wrong. A payment was also made during that time. The technical issue was fixed, but what happened in the past still mattered.

That is when rollback becomes a problem.

The question is not whether the old model works. The question is whether the system can show which version produced which output at a given time.

In systems where one person is in charge, people usually accept whatever explanation they get later. In a system where everything can be checked, that is not good enough. Records need to match what really happened.
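To make that concrete, here is a rough sketch of what "records that match what really happened" could look like. It is not how OpenGradient does it; `InferenceLog`, its fields, and the version labels are all made up for illustration. The idea is an append-only log where each entry commits to the one before it, so the rollback shows up as a new entry instead of erasing the old ones.

```python
import hashlib
import json
from dataclasses import dataclass, field


def _digest(payload: dict) -> str:
    # Hash a canonical JSON encoding so the same entry always yields the same digest.
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()


@dataclass
class InferenceLog:
    """Hypothetical append-only log linking each output to a model version."""
    entries: list = field(default_factory=list)

    def append(self, timestamp: int, model_version: str, output: str) -> dict:
        # Each entry stores the hash of the previous entry, forming a chain.
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else "0" * 64
        body = {
            "timestamp": timestamp,
            "model_version": model_version,
            "output_hash": hashlib.sha256(output.encode()).hexdigest(),
            "prev_hash": prev_hash,
        }
        entry = {**body, "entry_hash": _digest(body)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        # Recompute every hash; any edited entry breaks the chain.
        prev_hash = "0" * 64
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "entry_hash"}
            if body["prev_hash"] != prev_hash or _digest(body) != entry["entry_hash"]:
                return False
            prev_hash = entry["entry_hash"]
        return True

    def version_for(self, output: str):
        # Return the version recorded for the first entry matching this output.
        target = hashlib.sha256(output.encode()).hexdigest()
        for entry in self.entries:
            if entry["output_hash"] == target:
                return entry["model_version"]
        return None


log = InferenceLog()
log.append(100, "v1", "approve")
log.append(200, "v2", "reject")         # produced by the version later rolled back
log.append(300, "v1", "approve again")  # produced after the rollback
```

With a log like this, `log.version_for("reject")` still points at the rolled-back version, and rewriting that entry after the fact makes `log.verify()` fail. The rollback changes what runs next, not what was recorded.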

I keep thinking about what the real problem is. Is it harder to recover from a mistake, or to keep people trusting the timeline?

Because once people make decisions, payments are made, and automated agents start using the outputs, rolling back the software does not make the consequences go away.
#OPG @OpenGradient $OPG $RE $LAB