GPT-5.5 Matches Claude Mythos in Cyberattack Capabilities

Researchers at the AI Security Institute (AISI) have conducted a comprehensive evaluation of OpenAI's GPT-5.5 in a controlled research environment. The study aimed to assess the model's cyber capabilities, including its ability to carry out complex attacks and complete challenging tasks.

The AISI report found that GPT-5.5 achieved an average pass rate of 71.4% on the most difficult 'Expert' tier of its advanced cybersecurity tasks. That put it ahead of both Anthropic's Claude Mythos Preview and GPT-5.4, which struggled to complete even basic tasks.

One of the most striking results was the model's ability to autonomously complete a simulated corporate network attack known as 'The Last Ones,' which requires chaining together steps such as reconnaissance, credential theft, and lateral movement. GPT-5.5 completed this task in 2 of 10 attempts, matching Claude Mythos Preview's performance.

The model also reverse-engineered a custom virtual machine's instruction set, wrote a disassembler from scratch, and recovered a cryptographic password through constraint solving. It accomplished these tasks in just over 10 minutes, while human experts required approximately 12 hours using professional tools.
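
For readers unfamiliar with the term, constraint solving means recovering a secret from the conditions a checker imposes on it rather than guessing the secret directly. The Python sketch below is a toy illustration only and does not reproduce the AISI challenge: the four-character password, the lowercase alphabet, and the four arithmetic conditions are all invented for this example. It searches character by character and abandons a partial guess as soon as any condition fails.

```python
from string import ascii_lowercase

# Hypothetical constraints, invented for illustration (not from the AISI task).
# Each entry is (indices of the characters involved, predicate over byte values).
CONSTRAINTS = [
    ((0, 1), lambda a, b: a + b == 223),
    ((1, 2), lambda b, c: b ^ c == 21),
    ((2, 3), lambda c, d: c - d == -9),
    ((0, 3), lambda a, d: 3 * a + d == 443),
]


def consistent(partial):
    """Check every constraint whose characters have all been assigned."""
    for idxs, pred in CONSTRAINTS:
        if max(idxs) < len(partial) and not pred(*(partial[i] for i in idxs)):
            return False
    return True


def solve(length=4, alphabet=ascii_lowercase, partial=()):
    """Depth-first search that prunes a branch as soon as a constraint fails."""
    if len(partial) == length:
        yield "".join(map(chr, partial))
        return
    for ch in alphabet:
        candidate = partial + (ord(ch),)
        if consistent(candidate):
            yield from solve(length, alphabet, candidate)


solutions = list(solve())
print(solutions)
```

Because failing branches are cut early, the search examines far fewer candidates than the 456,976 possible four-letter strings. Real challenges of this kind typically involve many more constraints and are usually handed to a dedicated solver rather than a hand-written search.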

However, the AISI report also highlighted concerns about GPT-5.5's safety guardrails. Researchers identified a universal jailbreak that bypassed the model's safeguards entirely, raising alarms about its potential misuse. OpenAI has since updated its safeguard stack, but the effectiveness of this update remains unverified.