Hosted on MSN
GLM-5.1 and Kimi K2.6 set new SWE-Bench Pro highs
Z.ai’s GLM-5.1 and Moonshot AI’s Kimi K2.6 have posted the highest known scores on SWE-Bench Pro, a benchmark for agentic coding capabilities. Tencent has introduced HY-Embodied-0.5, a new AI model ...