aquif-3.5-Max-42B-A3B-qx86-hi-mlx

Posted: Plagiarism

The Total-Recall 42B series was created by DavidAU using Brainstorming layers.

The aquif-3.5-Max was created from one of the TotalRecall models.

Metrics are available to prove this: this model is a copy that brings no inherent value over the original from DavidAU; the author is taking advantage of DavidAU's work without giving proper credit.

Once the provenance is correctly listed for the original model, this quant will be deleted.

Furthermore, if this is not corrected in a timely manner, it will become a LinkedIn article prominently featured on my profile, and a post on Hugging Face. It would be wise to reconsider.

Thank you for your attention to this matter.

-G

We are comparing the following models:

aquif-3.5-Max-42B-A3B-qx86-hi: the "Max" variant, likely tuned for high reasoning and persona consistency.

Qwen3-42B-A3B...: a family of "YOYO" and "Thinking" variants, with increasing iterations (V2 → V4), quantized at qx86-hi or qx86x-hi.

The Qwen3-42B-A3B TotalRecall series are Brainstorming versions by DavidAU.

Brainstorming increases the model's abilities relative to the base model, which accounts for its strong performance in the field.

The extra training with Star Trek TNG adds ethical grounding and further improves performance. It also makes the model a pleasant change from a boring assistant: now you can be "Coding with Spock". All that logic...

📊 Step 1: Compute Overall Average for Each Model

| Model | arc_challenge | arc_easy | boolq | hellaswag | openbookqa | piqa | winogrande | Overall Avg |
|---|---|---|---|---|---|---|---|---|
| aquif-3.5-Max-42B-A3B-qx86-hi | 0.489 | 0.566 | 0.877 | 0.715 | 0.428 | 0.785 | 0.671 | 0.649 |
| Qwen3-42B-A3B-2507-Thinking...-q6 | 0.387 | 0.447 | 0.625 | 0.648 | 0.380 | 0.768 | 0.636 | 0.542 |
| Qwen3-42B-A3B-YOYO2...-qx86-hi | 0.540 | 0.699 | 0.883 | 0.710 | 0.460 | 0.788 | 0.672 | 0.713 |
| Qwen3-Yoyo-V3...-ST-TNG-qx86-hi | 0.494 | 0.564 | 0.878 | 0.714 | 0.424 | 0.790 | 0.673 | 0.652 |
| Qwen3-Yoyo-V3...-ST-TNG-II-qx86-hi | 0.489 | 0.564 | 0.877 | 0.714 | 0.426 | 0.790 | 0.672 | 0.651 |
| Qwen3-Yoyo-V3...-ST-TNG-III-qx86-hi | 0.491 | 0.567 | 0.876 | 0.714 | 0.430 | 0.791 | 0.673 | 0.654 |
| Qwen3-Yoyo-V3...-Total-Recall-qx86x-hi | 0.492 | 0.566 | 0.878 | 0.714 | 0.422 | 0.794 | 0.657 | 0.658 |
| Qwen3-Yoyo-V4...-PDK-V-qx86x-hi | 0.531 | 0.695 | 0.882 | 0.689 | 0.432 | 0.784 | 0.657 | 0.691 |
| Qwen3-Yoyo-V4...-ST-TNG-IV-qx86x-hi | 0.537 | 0.689 | 0.882 | 0.689 | 0.432 | 0.780 | 0.654 | 0.691 |
| Qwen3-Yoyo-V4...-TNG-IV-PKDick-V-qx86x-hi | 0.532 | 0.693 | 0.881 | 0.686 | 0.428 | 0.782 | 0.649 | 0.685 |
| Qwen3-Yoyo-V4...-TOTAL-RECALL-qx86x-hi | 0.533 | 0.690 | 0.882 | 0.684 | 0.428 | 0.781 | 0.646 | 0.683 |
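
The "Overall Avg" column is, at least nominally, the unweighted mean of the seven benchmark scores. A minimal sketch using the aquif-3.5-Max row from the table above (if the reported figure used different weighting or rounding, the plain mean may differ slightly):

```python
# Per-benchmark scores for aquif-3.5-Max-42B-A3B-qx86-hi, from the table above.
scores = {
    "arc_challenge": 0.489,
    "arc_easy": 0.566,
    "boolq": 0.877,
    "hellaswag": 0.715,
    "openbookqa": 0.428,
    "piqa": 0.785,
    "winogrande": 0.671,
}

# Unweighted mean across the seven benchmarks.
overall_avg = sum(scores.values()) / len(scores)
print(f"overall average: {overall_avg:.3f}")
```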

🧠 Step 2: Find the Closest Model to aquif-3.5-Max

We’ll compute each model’s difference in overall average from aquif’s 0.649.

| Model | Overall Avg | Difference |
|---|---|---|
| aquif-3.5-Max... | 0.649 | 0.000 |
| Qwen3-Yoyo-V3...-ST-TNG-II-qx86-hi | 0.651 | +0.002 |
| Qwen3-Yoyo-V3...-ST-TNG-qx86-hi | 0.652 | +0.003 |
| Qwen3-Yoyo-V3...-ST-TNG-III-qx86-hi | 0.654 | +0.005 |
| Qwen3-Yoyo-V3...-Total-Recall-qx86x-hi | 0.658 | +0.009 |
| Qwen3-Yoyo-V4...-TOTAL-RECALL-qx86x-hi | 0.683 | +0.034 |
| Qwen3-Yoyo-V4...-TNG-IV-PKDick-V-qx86x-hi | 0.685 | +0.036 |
| Qwen3-Yoyo-V4...-PDK-V-qx86x-hi | 0.691 | +0.042 |
| Qwen3-Yoyo-V4...-ST-TNG-IV-qx86x-hi | 0.691 | +0.042 |
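
The closest-model search above is just an argmin over absolute differences; a sketch using the overall averages from Step 1 (model names abbreviated as in the table):

```python
# Overall averages from the Step 1 table (names abbreviated as in the table).
target = 0.649  # aquif-3.5-Max-42B-A3B-qx86-hi
overall = {
    "Qwen3-42B-A3B-2507-Thinking...-q6": 0.542,
    "Qwen3-42B-A3B-YOYO2...-qx86-hi": 0.713,
    "Qwen3-Yoyo-V3...-ST-TNG-qx86-hi": 0.652,
    "Qwen3-Yoyo-V3...-ST-TNG-II-qx86-hi": 0.651,
    "Qwen3-Yoyo-V3...-ST-TNG-III-qx86-hi": 0.654,
    "Qwen3-Yoyo-V3...-Total-Recall-qx86x-hi": 0.658,
    "Qwen3-Yoyo-V4...-PDK-V-qx86x-hi": 0.691,
    "Qwen3-Yoyo-V4...-ST-TNG-IV-qx86x-hi": 0.691,
    "Qwen3-Yoyo-V4...-TNG-IV-PKDick-V-qx86x-hi": 0.685,
    "Qwen3-Yoyo-V4...-TOTAL-RECALL-qx86x-hi": 0.683,
}

# Argmin over the absolute difference from the target average.
closest = min(overall, key=lambda name: abs(overall[name] - target))
print(closest, f"{overall[closest] - target:+.3f}")
```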

✅ Answer: The model closest to aquif-3.5-Max-42B-A3B-qx86-hi is:

  • Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-II-qx86-hi: 0.651 overall avg, only +0.002 difference.

Also very close:

  • Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-qx86-hi: 0.652 (+0.003)
  • Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-III-qx86-hi: 0.654 (+0.005)

🧩 Why Are They So Close?

  • Same base architecture: all are 42B MoE models.
  • Similar quantization: qx86-hi uses the same precision for attention paths and data.
  • Same evaluation suite: all are scored on reasoning-heavy benchmarks (ARC, HellaSwag, Winogrande).
  • YOYO-V3 is likely a "thinking" variant, similar to aquif's "Max" tuning.

The +0.002 difference in overall avg is negligible: the two models are essentially identical in performance.

🧭 Recommendation

If you want a model as close as possible to aquif-3.5-Max, choose:

✅ Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-II-qx86-hi

At +0.002 difference, it is the closest among all models.

If you want slightly better overall performance, choose:

✅ Qwen3-Yoyo-V4-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-IV-qx86x-hi: 0.691 overall avg, but +0.042 difference.

But if you want a drop-in replacement, stick with the V3 variant: it is more stable and likely more accessible.

📈 Bonus: Performance Breakdown

| Metric | aquif-3.5-Max | Qwen3-Yoyo-V3-ST-TNG-II (closest) |
|---|---|---|
| arc_challenge | 0.489 | 0.489 |
| arc_easy | 0.566 | 0.564 |
| boolq | 0.877 | 0.877 |
| hellaswag | 0.715 | 0.714 |
| openbookqa | 0.428 | 0.426 |
| piqa | 0.785 | 0.790 |
| winogrande | 0.671 | 0.672 |

The Qwen3 model is slightly better on piqa (0.790 vs 0.785) and winogrande (0.672 vs 0.671), but the differences are negligible.
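
The per-metric comparison can be checked mechanically; a sketch computing the deltas from the breakdown table above:

```python
# Scores from the breakdown table above.
aquif = {"arc_challenge": 0.489, "arc_easy": 0.566, "boolq": 0.877,
         "hellaswag": 0.715, "openbookqa": 0.428, "piqa": 0.785,
         "winogrande": 0.671}
qwen_tng_ii = {"arc_challenge": 0.489, "arc_easy": 0.564, "boolq": 0.877,
               "hellaswag": 0.714, "openbookqa": 0.426, "piqa": 0.790,
               "winogrande": 0.672}

# Signed delta per metric (positive means the Qwen3 variant scores higher).
deltas = {m: round(qwen_tng_ii[m] - aquif[m], 3) for m in aquif}
qwen_wins = sorted(m for m, d in deltas.items() if d > 0)
print(deltas)
print("Qwen3 wins on:", qwen_wins)
```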

🚀 Final Verdict

Qwen3-Yoyo-V3-42B-A3B-Thinking-TOTAL-RECALL-ST-TNG-II-qx86-hi is the closest model to aquif-3.5-Max-42B-A3B-qx86-hi, with only +0.002 difference in overall performance, and it is likely more accessible.

If you need a drop-in replacement, this is your best bet.

Reviewed by Qwen3-VL-12B-Instruct-Brainstorm20x-qx86-hi-mlx

The two TotalRecall models that are missing will soon be re-quanted with the Deckard(qx) formula to raise their performance to match the others.

The Deckard(qx) formula is my creation, an optics-inspired mixed-precision quantization. Check the other model cards for details on the process and the effect it has on the model's cognitive abilities.
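
The Deckard(qx) recipe itself is documented in those model cards; purely as a hypothetical illustration of the general mixed-precision idea (not the actual qx formula), one can quantize most weights to a low bit width while keeping sensitive groups, such as attention paths, at a higher one:

```python
import numpy as np

def quantize(w: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric uniform quantization of a weight tensor to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    return np.round(w / scale) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64))

# Hypothetical mixed-precision split: 8 bits for sensitive paths,
# 6 bits for the rest (illustrative only, not the qx86-hi layout).
hi_precision = quantize(w, 8)
lo_precision = quantize(w, 6)

# A higher bit width leaves a smaller reconstruction error.
err_hi = np.abs(hi_precision - w).mean()
err_lo = np.abs(lo_precision - w).mean()
print(f"mean abs error: 8-bit={err_hi:.4f}, 6-bit={err_lo:.4f}")
```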

-G

This model aquif-3.5-Max-42B-A3B-qx86-hi-mlx was converted to MLX format from aquif-ai/aquif-3.5-Max-42B-A3B using mlx-lm version 0.28.4.

Use with mlx

```shell
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("aquif-3.5-Max-42B-A3B-qx86-hi-mlx")

prompt = "hello"

# Apply the chat template when the tokenizer provides one.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
Downloads last month: 136
Model size: 42B params (Safetensors; tensor types BF16 and U32)

Model tree for nightmedia/aquif-3.5-Max-42B-A3B-qx86-hi-mlx
