sf-diogenes-v0.1

This model is a merge of the Qwen/Qwen3-Next-80B-A3B-Instruct base model with the urm3l/diogenes-v.01-lora-adapter LoRA adapter.

Model Details

  • Base Model: Qwen/Qwen3-Next-80B-A3B-Instruct
  • LoRA Adapter: urm3l/diogenes-v.01-lora-adapter
  • Merge Date: 2025-11-01
  • Model Size: ~80B parameters
  • Precision: BF16
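A merged checkpoint folds the adapter's low-rank update directly into the base weights, so inference needs no separate adapter pass. A minimal NumPy sketch of the arithmetic behind the merge (toy shapes for illustration, not the real 80B tensors; `alpha` and `r` are hypothetical values standing in for the adapter's actual LoRA config):

```python
import numpy as np

# Toy dimensions; the real model uses hidden sizes in the thousands
# and applies LoRA to many layers, but the arithmetic is identical.
d_out, d_in, r, alpha = 8, 8, 2, 4

rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))   # frozen base weight
A = rng.normal(size=(r, d_in))       # LoRA down-projection
B = rng.normal(size=(d_out, r))      # LoRA up-projection

# Merging: W' = W + (alpha / r) * B @ A
W_merged = W + (alpha / r) * (B @ A)

# The merged weight reproduces base-plus-adapter inference exactly.
x = rng.normal(size=(d_in,))
y_separate = W @ x + (alpha / r) * (B @ (A @ x))
y_merged = W_merged @ x
print(np.allclose(y_merged, y_separate))  # True
```

This is why the merged model has the same size and precision as the base model: the low-rank factors disappear into the existing weight matrices.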

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged model; device_map="auto" shards it across available GPUs.
model = AutoModelForCausalLM.from_pretrained(
    "urm3l/sf-diogenes-v0.1",
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True
)

tokenizer = AutoTokenizer.from_pretrained(
    "urm3l/sf-diogenes-v0.1",
    trust_remote_code=True
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
]

# Render the conversation with the model's chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, not the echoed prompt.
reply = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True
)
print(reply)

Training Details

This model was fine-tuned using LoRA (Low-Rank Adaptation). For training details, see the adapter repository.
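LoRA's appeal is that it trains two small factor matrices per adapted layer instead of the full weight. Back-of-envelope parameter arithmetic for a single projection (the dimensions and rank below are illustrative assumptions, not this adapter's actual configuration):

```python
# Hypothetical projection size and LoRA rank for illustration only.
d_out, d_in = 4096, 4096
r = 16

full = d_out * d_in        # parameters updated by full fine-tuning
lora = r * (d_out + d_in)  # parameters in the LoRA A and B factors

print(full)         # 16777216
print(lora)         # 131072
print(lora / full)  # 0.0078125 -> under 1% of the layer's weights
```

At rank 16 the trainable footprint per layer is under one percent of full fine-tuning, which is what makes adapting an ~80B model tractable.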

Limitations

  • This is a large language model and may produce incorrect or biased outputs
  • Should not be used for high-stakes decision making without human oversight
  • May require significant computational resources for inference
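The last point can be made concrete with back-of-envelope arithmetic for the weights alone (assuming BF16 throughout; the KV cache and activations add to this at runtime):

```python
params = 80e9        # ~80B parameters
bytes_per_param = 2  # BF16 = 16 bits per parameter

weight_gib = params * bytes_per_param / 2**30
print(round(weight_gib))  # 149 -> ~149 GiB of weights, i.e. multiple GPUs
```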

License

This model inherits the license from the base model: Apache 2.0

Citation

If you use this model, please cite the original Qwen3 paper and acknowledge the LoRA fine-tuning.
