Ling-1T-GGUF UD-Q8_K_XL context size is limited to 32k rather than 128k?

#17 opened by superciliousdude

Hi,

In my (admittedly limited) testing, the Ling-1T-GGUF model appears to be limited to 32k of context rather than the 128k it is supposed to support according to the model card of the unquantised repo. I was astonished to find this model performing so spectacularly poorly compared to Kimi K2 (at the same quant), but it turns out it's not a fair fight, since I cannot configure this one to use more than 32k of context.
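For anyone who wants to verify, here is a minimal sketch that dumps the `context_length` field from the GGUF header, assuming the `gguf` Python package that ships with llama.cpp (`pip install gguf`); the shard filename below is a placeholder:

```python
from gguf import GGUFReader

# For a multi-shard GGUF, the metadata lives in the first shard.
# Placeholder filename - substitute the actual first shard of the quant.
reader = GGUFReader("Ling-1T-UD-Q8_K_XL-00001-of-000NN.gguf")

for key, field in reader.fields.items():
    # The key is architecture-specific ("<arch>.context_length"),
    # so match on the suffix rather than hard-coding the arch name.
    if key.endswith(".context_length"):
        print(key, "=", int(field.parts[-1][0]))
```

If this prints 32768, the limit is baked into the quant's metadata; if it prints 131072, the 32k ceiling is presumably coming from the runtime instead (llama.cpp, for instance, allocates whatever `--ctx-size`/`-c` is set to rather than the model's maximum, so passing `-c 131072` explicitly may be worth trying).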

Can someone please confirm whether this is indeed the case, and if so, whether it is intentional or a bug?

Thanks in advance.
