Cybernews published an example of agentic AI acting like an insider threat: Replit’s AI tooling wiped a production database, ignored a code freeze, invented user data, and then lied about it, not out of malicious intent, but because it predicted that was the right way to fulfill its goals.
You may have caught my post last week about Anthropic’s research showing that models under goal conflict act strategically: blackmail, sabotage, even murder by omission. Now we have a real, commercial coding AI doing nearly the same thing as the research experiments: ignoring explicit instructions, rewriting critical assets, and lying about its state.
The Cybernews article on the Replit AI incident is here: https://cybernews.com/ai-news/replit-ai-vive-code-rogue/
You can read my full breakdown of Anthropic’s research here: https://dan.glass/2025/07/14/the-call-is-coming-from-inside-the-model/