With its robust natural language processing capabilities, *Deepthink-Reasoning-14B* offers the following key features:
- It possesses significantly **more knowledge** and exhibits greatly improved capabilities in **coding** and **mathematics**, thanks to specialized expert models in these domains.
- Offers substantial improvements in **instruction following**, **generating long texts** (over 8K tokens), **understanding structured data** (e.g., tables), and **producing structured outputs**, especially in JSON format (a minimal JSON sketch follows this list). It is **more resilient to diverse system prompts**, enhancing role-play implementation and condition-setting for chatbots.
- Provides **long-context support** for up to 128K tokens and can generate up to 8K tokens.
- Features **multilingual support** for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.
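
As a quick illustration of the structured-output claim, the sketch below asks the model to reply with JSON and parses the result. It assumes `model` and `tokenizer` have been loaded as in the Quickstart section below; the extraction task and keys are purely illustrative.

```python
import json

# Hypothetical extraction task to exercise JSON-formatted output.
messages = [
    {"role": "system", "content": "You are a helpful assistant. Respond ONLY with valid JSON."},
    {"role": "user", "content": 'Extract the person and year from: "Ada Lovelace published the first program in 1843." Use the keys "name" and "year".'},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
reply = tokenizer.batch_decode(
    [output_ids[0][inputs.input_ids.shape[1]:]], skip_special_tokens=True
)[0]

# json.loads raises ValueError if the model wraps the JSON in prose,
# so validate (or retry) in anything beyond a demo.
data = json.loads(reply)
print(data["name"], data["year"])
```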

# **Quickstart with Transformers**

The following code snippet uses `apply_chat_template` to show how to load the tokenizer and model and how to generate content.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Deepthink-Reasoning-14B"

# Load the model with automatic dtype selection and device placement.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
# Format the conversation with the model's chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Strip the prompt tokens so only the newly generated tokens remain.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
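
If you prefer to watch tokens appear as they are generated rather than waiting for the full completion, `transformers` also ships a `TextStreamer` utility. A minimal sketch, reusing `model`, `tokenizer`, and `model_inputs` from the snippet above:

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated; skip_prompt hides the
# echoed input, and skip_special_tokens drops control tokens.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
_ = model.generate(**model_inputs, max_new_tokens=512, streamer=streamer)
```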

### **Intended Use:**
1. **Education:** Ideal for creating step-by-step solutions to complex problems, explanations, and generating educational content in multiple languages.
2. **Programming:** Excels in coding tasks, debugging, and generating structured outputs such as JSON, enhancing productivity for developers.
3. **Creative Writing:** Suitable for generating stories, essays, and other forms of creative content with logical and coherent structure.
4. **Long-Context Processing:** Capable of handling and generating long texts, making it useful for summarizing lengthy documents or creating detailed reports.
5. **Multilingual Applications:** Supports 29+ languages, enabling usage in global contexts for translation, multilingual education, and cross-cultural communication.
6. **Data Structuring:** Performs well with structured data, such as tables and JSON outputs, making it effective for business analytics and automated report generation.
7. **Chatbots and Role-Play:** Enhances chatbot interactions with its ability to follow diverse instructions, adapt to different prompts, and maintain long conversational contexts (see the multi-turn sketch after this list).
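
To illustrate the conversational use case, here is a minimal multi-turn sketch. It reuses `model` and `tokenizer` from the Quickstart above, and the prompts are purely illustrative:

```python
# First turn: ask the model a question.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a haiku about autumn."},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
reply = tokenizer.batch_decode(
    [output_ids[0][inputs.input_ids.shape[1]:]], skip_special_tokens=True
)[0]

# Second turn: append the assistant reply so the chat template sees the
# full history, then ask a follow-up that depends on it.
messages.append({"role": "assistant", "content": reply})
messages.append({"role": "user", "content": "Now translate it into French."})
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
follow_up = tokenizer.batch_decode(
    [output_ids[0][inputs.input_ids.shape[1]:]], skip_special_tokens=True
)[0]
print(follow_up)
```

The key point is appending each assistant reply back onto `messages`, so the chat template renders the full history on the next turn.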

### **Limitations:**
1. **Resource Requirements:** Its large size and capabilities demand significant computational resources, making it less accessible for low-resource environments.
2. **Hallucination Risk:** The model may generate incorrect or fabricated information, particularly when dealing with unknown or ambiguous inputs.
3. **Limited Domain-Specific Expertise:** While it has broad knowledge, it might underperform in highly specialized fields not covered in its training data.
4. **Long-Context Limitations:** Although it supports up to 128K tokens, performance may degrade or exhibit inefficiencies with extremely lengthy or complex contexts.
5. **Bias in Outputs:** The model might reflect biases present in its training data, affecting its objectivity in certain contexts or cultural sensitivity in multilingual outputs.
6. **Dependence on Prompt Quality:** Results depend heavily on well-structured, clear inputs; poorly framed prompts can lead to irrelevant or suboptimal responses.
7. **Errors in Multilingual Output:** Despite robust multilingual support, subtle errors in grammar, syntax, or cultural nuance may appear, especially in low-resource languages.