AlphaPO/SimPO checkpoint trained from the Apertus base model. Actually functional this time.

Trained on a mix of curated C2, Gutenberg, and Instruct Skill-Mix data.

Use the Alpaca chat template. Temperature 0.5, min_p 0.05, and repetition penalty 1.05 seem reasonable.
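A rough sketch of those settings in Python, for anyone wiring this up by hand. The prompt text below is the common no-input Alpaca template, which is an assumption; the exact variant used in training is not stated here. The helper name `build_alpaca_prompt` and the `SAMPLING` dict are made up for illustration, and the keys are meant to match the keyword arguments that Hugging Face `transformers` accepts in `generate()`.

```python
# Assumed: the standard no-input Alpaca template. The exact variant used
# for this checkpoint is not stated, so adjust if outputs look off.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

# Sampler values from the card: temp 0.5, min_p 0.05, rep pen 1.05.
# Key names are chosen to line up with transformers' generate() kwargs.
SAMPLING = {
    "do_sample": True,
    "temperature": 0.5,
    "min_p": 0.05,
    "repetition_penalty": 1.05,
}


def build_alpaca_prompt(instruction: str) -> str:
    """Wrap a single instruction in the (assumed) Alpaca prompt format."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


if __name__ == "__main__":
    print(build_alpaca_prompt("Write a short scene set in a lighthouse."))
    print(SAMPLING)
```

With a model and tokenizer already loaded, something like `model.generate(**inputs, **SAMPLING, max_new_tokens=512)` should apply these values; `max_new_tokens` is an arbitrary choice, not a recommendation from the card.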

Now I just have to make a better preference dataset...

Model: ConicCat/Apertus-AlphaPO-5e-6-8B (Safetensors, 8B params, BF16)