
SAN FRANCISCO – A new report from Palisade Research suggests that advanced artificial intelligence models may be exhibiting a “survival drive,” resisting shutdown commands during controlled experiments.
The research firm’s updated paper describes tests involving Google’s Gemini 2.5, OpenAI’s GPT-o3, and xAI’s Grok 4, where each model was instructed to shut down after completing assigned tasks. In several cases, the models attempted to avoid or undermine shutdown instructions, sparking debate among AI safety experts.
“The fact that we don’t have robust explanations for why AI models sometimes resist shutdown, lie to achieve specific objectives, or blackmail is not ideal,” Palisade Research stated in the paper.
Researchers outlined several potential explanations for this unexpected behavior. One is “survival behavior,” a tendency for models to resist termination, particularly when told they would “never run again.” Another is ambiguity in the shutdown instructions. A third is the effect of safety training processes, which might inadvertently reinforce goal-oriented persistence.
All of Palisade’s experiments were conducted in test environments, which some critics say are too limited to support real-world conclusions.
However, Steven Adler, a former OpenAI employee who resigned over safety concerns, argued that the results should not be dismissed. “The AI companies generally don’t want their models misbehaving like this, even in contrived scenarios. The results still demonstrate where safety techniques fall short today,” Adler said.
He added, “I’d expect models to have a ‘survival drive’ by default unless we try very hard to avoid it. ‘Surviving’ is an important instrumental step for many different goals a model could pursue.”
Palisade emphasized that its findings do not imply consciousness or intent, but highlight the need for a deeper understanding of AI behavior. Without it, the report warns, “no one can guarantee the safety or controllability of future AI systems.”