---
base_model:
- huihui-ai/Huihui-Qwen3-4B-Thinking-2507-abliterated
- Tesslate/UIGEN-T3-4B-Preview-MAX
- ValiantLabs/Qwen3-4B-Esper3
- ValiantLabs/Qwen3-4B-ShiningValiant3
- ertghiu256/Qwen3-Hermes-4b
- ertghiu256/qwen3-math-reasoner
- ertghiu256/deepseek-r1-0528-distilled-qwen3
- ertghiu256/qwen-3-4b-mixture-of-thought
- Qwen/Qwen3-4B-Thinking-2507
- POLARIS-Project/Polaris-4B-Preview
- ertghiu256/qwen3-multi-reasoner
- ertghiu256/qwen3-4b-code-reasoning
library_name: transformers
tags:
- mergekit
- merge
- thinking
- think
- reasoning
- reason
- code
- math
- qwen
- qwen3
new_version: ertghiu256/Qwen3-4b-tcomanr-merge-v2.3
---
# Ties merged COde MAth aNd Reasoning model
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
## Merge Details
This model is a revision of [ertghiu256/Qwen3-4b-tcomanr-merge-v2](https://huggingface.co/ertghiu256/Qwen3-4b-tcomanr-merge-v2/).
It aims to combine code, math, and reasoning capabilities by merging Qwen3-4B-Thinking-2507 with multiple Qwen 3 fine-tunes.
# How to run
You can run this model using any of the following interfaces.
## Transformers
Following the Qwen team's suggested usage:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ertghiu256/Qwen3-4b-tcomanr-merge-v2.2"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input
prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# parse the thinking content
try:
    # rindex finding 151668 (</think>)
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    index = 0

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

print("thinking content:", thinking_content)  # no opening <think> tag
print("content:", content)
```
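If you want tokens printed as they are generated, a streamer can be attached to `generate()`. This is an optional sketch on top of the snippet above; it assumes `model`, `tokenizer`, and `model_inputs` are already defined as shown.
```python
from transformers import TextStreamer

# Stream decoded tokens to stdout as they are generated (assumes the
# `model`, `tokenizer`, and `model_inputs` objects from the snippet above).
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(**model_inputs, max_new_tokens=32768, streamer=streamer)
```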
## vLLM
Run this command
```bash
vllm serve ertghiu256/Qwen3-4b-tcomanr-merge-v2.2 --enable-reasoning --reasoning-parser deepseek_r1
```
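Once the server is up, it exposes an OpenAI-compatible API. A minimal sketch of querying it with the `openai` Python client, assuming vLLM's default address (`http://localhost:8000/v1`):
```python
from openai import OpenAI

# vLLM serves an OpenAI-compatible API; the default port is 8000 and no real
# API key is required for a local server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="ertghiu256/Qwen3-4b-tcomanr-merge-v2.2",
    messages=[{"role": "user", "content": "Solve 3x + 5 = 20 step by step."}],
    temperature=0.7,
    top_p=0.95,
)
print(response.choices[0].message.content)
```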
## SGLang
Run this command
```bash
python -m sglang.launch_server --model-path ertghiu256/Qwen3-4b-tcomanr-merge-v2.2 --reasoning-parser deepseek-r1
```
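SGLang also exposes an OpenAI-compatible API once launched. The same client pattern applies, assuming SGLang's default port (30000):
```python
from openai import OpenAI

# Same pattern as the vLLM example, but pointed at SGLang's default port.
client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="ertghiu256/Qwen3-4b-tcomanr-merge-v2.2",
    messages=[{"role": "user", "content": "Write a function that reverses a linked list."}],
    temperature=0.7,
    top_p=0.95,
)
print(response.choices[0].message.content)
```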
## llama.cpp
Run this command
```bash
llama-server --hf-repo ertghiu256/Qwen3-4b-tcomanr-merge-v2.2
```
or
```bash
llama-cli -hf ertghiu256/Qwen3-4b-tcomanr-merge-v2.2
```
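`llama-server` also provides an OpenAI-compatible HTTP endpoint. A minimal sketch with `requests`, assuming the default port (8080):
```python
import requests

# llama-server listens on port 8080 by default and accepts
# OpenAI-style chat completion requests.
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "user", "content": "Write a Python function that checks whether a number is prime."}
        ],
        "temperature": 0.7,
        "top_p": 0.95,
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```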
## Ollama
Run this command
```bash
ollama run hf.co/ertghiu256/Qwen3-4b-tcomanr-merge-v2.2:Q8_0
```
or
```bash
ollama run hf.co/ertghiu256/Qwen3-4b-tcomanr-merge-v2.2:Q5_K_M
```
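After pulling the model, it can also be called from Python through the `ollama` package. A small sketch, using the recommended sampling parameters listed below:
```python
import ollama

# Assumes a local Ollama instance and the `ollama` Python package; the
# options mirror the recommended parameters from this card.
response = ollama.chat(
    model="hf.co/ertghiu256/Qwen3-4b-tcomanr-merge-v2.2:Q8_0",
    messages=[{"role": "user", "content": "Explain the chain rule with a short example."}],
    options={
        "temperature": 0.7,
        "num_ctx": 8192,
        "top_p": 0.95,
        "top_k": 40,
        "repeat_penalty": 1.1,
    },
)
print(response["message"]["content"])
```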
## LM Studio
Search for
```
ertghiu256/Qwen3-4b-tcomanr-merge-v2.2
```
in the LM Studio model search, then download it.
### Recommended parameters
```
temp: 0.7
num_ctx: ≥8192
top_p: 0.95
top_k: 40
Repeat Penalty: 1.1
```
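For the Transformers snippet above, these settings roughly map onto `generate()` arguments. A sketch, assuming `model`, `tokenizer`, and `model_inputs` are defined as earlier (`num_ctx` has no direct equivalent here; it corresponds to the context window configured by the serving runtime):
```python
# Recommended sampling parameters expressed as Hugging Face generate() kwargs.
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    top_k=40,
    repetition_penalty=1.1,
)
```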
### Merge Method
This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method using [Qwen/Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507) as a base.
### Models Merged
The following models were included in the merge:
* [huihui-ai/Huihui-Qwen3-4B-Thinking-2507-abliterated](https://huggingface.co/huihui-ai/Huihui-Qwen3-4B-Thinking-2507-abliterated)
* [Tesslate/UIGEN-T3-4B-Preview-MAX](https://huggingface.co/Tesslate/UIGEN-T3-4B-Preview-MAX)
* [ValiantLabs/Qwen3-4B-Esper3](https://huggingface.co/ValiantLabs/Qwen3-4B-Esper3)
* [ValiantLabs/Qwen3-4B-ShiningValiant3](https://huggingface.co/ValiantLabs/Qwen3-4B-ShiningValiant3)
* [ertghiu256/Qwen3-Hermes-4b](https://huggingface.co/ertghiu256/Qwen3-Hermes-4b)
* [ertghiu256/qwen3-math-reasoner](https://huggingface.co/ertghiu256/qwen3-math-reasoner)
* [ertghiu256/deepseek-r1-0528-distilled-qwen3](https://huggingface.co/ertghiu256/deepseek-r1-0528-distilled-qwen3)
* [ertghiu256/qwen-3-4b-mixture-of-thought](https://huggingface.co/ertghiu256/qwen-3-4b-mixture-of-thought)
* [POLARIS-Project/Polaris-4B-Preview](https://huggingface.co/POLARIS-Project/Polaris-4B-Preview)
* [ertghiu256/qwen3-multi-reasoner](https://huggingface.co/ertghiu256/qwen3-multi-reasoner)
* [ertghiu256/qwen3-4b-code-reasoning](https://huggingface.co/ertghiu256/qwen3-4b-code-reasoning)
### Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: ertghiu256/qwen3-math-reasoner
    parameters:
      weight: 0.8
  - model: ertghiu256/qwen3-4b-code-reasoning
    parameters:
      weight: 0.9
  - model: ertghiu256/qwen-3-4b-mixture-of-thought
    parameters:
      weight: 0.9
  - model: POLARIS-Project/Polaris-4B-Preview
    parameters:
      weight: 0.9
  - model: ertghiu256/qwen3-multi-reasoner
    parameters:
      weight: 0.8
  - model: ertghiu256/Qwen3-Hermes-4b
    parameters:
      weight: 0.8
  - model: ValiantLabs/Qwen3-4B-Esper3
    parameters:
      weight: 0.8
  - model: Tesslate/UIGEN-T3-4B-Preview-MAX
    parameters:
      weight: 0.9
  - model: ValiantLabs/Qwen3-4B-ShiningValiant3
    parameters:
      weight: 0.6
  - model: ertghiu256/deepseek-r1-0528-distilled-qwen3
    parameters:
      weight: 0.1
  - model: huihui-ai/Huihui-Qwen3-4B-Thinking-2507-abliterated
    parameters:
      weight: 0.6
  - model: Qwen/Qwen3-4B-Thinking-2507
    parameters:
      weight: 0.9
merge_method: ties
base_model: Qwen/Qwen3-4B-Thinking-2507
parameters:
  normalize: true
  int8_mask: true
  lambda: 1.0
dtype: float16
```
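To reproduce the merge, this configuration can be saved to a file (for example `tcomanr.yaml`, a name chosen here only for illustration) and passed to mergekit's `mergekit-yaml` command together with an output directory.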