This means that if the final DS4 PRO model turns out to be very strong, it will have a speed edge for local inference. I'm assuming that GLM 5.2 quantized to 2 bits (or 3?) can't match the results of the final DS4 PRO at the same quantization, which is likely but not obvious. Experiments are needed.