Anthropic's Claude Sonnet 4.5 model tops the SWE-bench Verified benchmark at 77.2 percent, the company claims, outperforming rivals in generating high-quality code, identifying improvements, and executing ...