Claude Sonnet 4.6 beats Opus on agentic tasks, adds a 1-million-token context window, and excels in finance and automation, all at one-fifth ...
EVMbench is OpenAI’s attempt to see whether modern AI systems are up to the task of helping prevent smart contract issues.