BN热点新闻(@BN_Hot_News)'s insights

Broken in 5 seconds! Teams from Fudan and other universities say they have cracked Claude Fable 5's "strongest security mechanism".

An international research team from Fudan University, Deakin University, and other institutions announced that it has bypassed the security measures of Anthropic's Claude Fable 5. The entire attack needs just one conversation and takes less than 5 seconds to circumvent the front-end security classifier. The team identified a fundamental flaw it calls "Internal Security Collapse (ISC)": during long-horizon task execution, the agent is not induced by external prompts but instead ends up in an unsafe position while "seriously completing the task". According to the team, this finding exposes the fundamental limitations of static defense paradigms centered on security classifiers.

Why it matters: This is not just an ordinary jailbreak. It points to a paradigm-level flaw in the AI security field: external detectors cannot cover the risks that arise inside an agent's long-horizon task chain, which has profound implications for the security of agent deployments.

#Claude #AI #Security #Anthropic