It may be overfit, so it’s a bit early to declare any sort of victory, but it is really good to see open-weight models catch up to the closed-source labs so quickly after their latest releases.
With recursive RL, the time gap between a closed-source model and an equivalent open-weight (and ideally open-source) model will shrink considerably over the next few years.
Without distinct data sets to train on and narrow use cases to build expertise in, convergence seems at hand for this class of model architecture.
Design Arena (@Designarena)
BREAKING: GLM-5.2 is now 1st on Design Arena.
With an Elo of 1360, GLM-5.2 has jumped ahead of the now unavailable Claude Fable 5.
And it's open weights.
This is an improvement of 4 positions and 27 Elo points to achieve one of the highest Elo scores in our code categories since Design Arena started.
Huge congratulations to the @Zai_org on the release!
— https://nitter.net/Designarena/status/2066940737011560652#m