# sf-diogenes-v0.1
This model is a merge of the Qwen/Qwen3-Next-80B-A3B-Instruct base model with the urm3l/diogenes-v.01-lora-adapter LoRA adapter.
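A merge like this is typically produced with peft's `merge_and_unload`. The sketch below is illustrative, not the exact script used: the output directory name is hypothetical, and loading an 80B model in BF16 needs on the order of 160 GB of memory.

```python
# Illustrative sketch of the merge; repo IDs are from this card,
# the output path "sf-diogenes-v0.1" is a hypothetical local directory.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-Next-80B-A3B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base, "urm3l/diogenes-v.01-lora-adapter")
model = model.merge_and_unload()  # fold the LoRA deltas into the base weights
model.save_pretrained("sf-diogenes-v0.1")

tokenizer = AutoTokenizer.from_pretrained(
    "Qwen/Qwen3-Next-80B-A3B-Instruct", trust_remote_code=True
)
tokenizer.save_pretrained("sf-diogenes-v0.1")
```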
## Model Details
- Base Model: Qwen/Qwen3-Next-80B-A3B-Instruct
- LoRA Adapter: urm3l/diogenes-v.01-lora-adapter
- Merge Date: 2025-11-01
- Model Size: ~80B total parameters (~3B active per token; mixture-of-experts)
- Precision: BF16
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged model; device_map="auto" shards it across available GPUs.
model = AutoModelForCausalLM.from_pretrained(
    "urm3l/sf-diogenes-v0.1",
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "urm3l/sf-diogenes-v0.1",
    trust_remote_code=True,
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# Render the chat template and append the assistant-turn prompt.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the echoed prompt.
response = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
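Greedy decoding (the default above) can be repetitive in chat use. The sampling values below are in the range commonly suggested for Qwen3 instruct checkpoints, but treat them as assumptions and check the base model card for the authoritative settings:

```python
# Assumed sampling settings for chat use; verify against the base model card.
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
)
```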
## Training Details
This model was fine-tuned using LoRA (Low-Rank Adaptation). For training details, see the adapter repository.
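For orientation, a typical peft LoRA setup looks like the sketch below; the rank, alpha, and target modules shown are placeholders, not the values used for this adapter (those live in the adapter repository):

```python
# Placeholder LoRA configuration for illustration only; the real
# hyperparameters are documented in urm3l/diogenes-v.01-lora-adapter.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-Next-80B-A3B-Instruct", trust_remote_code=True
)
config = LoraConfig(
    r=16,                                 # assumed rank
    lora_alpha=32,                        # assumed scaling factor
    target_modules=["q_proj", "v_proj"],  # assumed attention targets
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```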
## Limitations
- This is a large language model and may produce incorrect or biased outputs
- Should not be used for high-stakes decision making without human oversight
- Inference is resource-intensive: the BF16 weights alone occupy roughly 160 GB (80B parameters × 2 bytes)
## License
This model inherits the license from the base model: Apache 2.0
## Citation
If you use this model, please cite the original Qwen3 paper and acknowledge the LoRA fine-tuning.