With its robust natural language processing capabilities, *Deepthink-Reasoning-14B* offers the following key features:
- It possesses significantly **more knowledge** and exhibits greatly improved capabilities in **coding** and **mathematics**, thanks to specialized expert models in these domains.
- Offers substantial improvements in **instruction following**, **generating long texts** (over 8K tokens), **understanding structured data** (e.g., tables), and **producing structured outputs**, especially in JSON format (a minimal JSON sketch follows this list). It is **more resilient to diverse system prompts**, enhancing role-play implementation and condition-setting for chatbots.
- Provides **long-context support** for up to 128K tokens and can generate up to 8K tokens.
- Features **multilingual support** for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.
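
As a quick illustration of the structured-output claim, the sketch below asks the model to reply with JSON and parses the result. It assumes `model` and `tokenizer` have been loaded as in the Quickstart section below; the extraction task and keys are purely illustrative.

```python
import json

# Hypothetical extraction task to exercise JSON-formatted output.
messages = [
    {"role": "system", "content": "You are a helpful assistant. Respond ONLY with valid JSON."},
    {"role": "user", "content": 'Extract the person and year from: "Ada Lovelace published the first program in 1843." Use the keys "name" and "year".'},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
reply = tokenizer.batch_decode(
    [output_ids[0][inputs.input_ids.shape[1]:]], skip_special_tokens=True
)[0]

# json.loads raises ValueError if the model wraps the JSON in prose,
# so validate (or retry) in anything beyond a demo.
data = json.loads(reply)
print(data["name"], data["year"])
```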

# **Quickstart with Transformers**

The following code snippet uses `apply_chat_template` to show how to load the tokenizer and model and how to generate content.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Deepthink-Reasoning-14B"

# Load the model with automatic dtype selection and device placement.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
# Format the conversation with the model's chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Strip the prompt tokens so only the newly generated tokens remain.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
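
If you prefer to watch tokens appear as they are generated rather than waiting for the full completion, `transformers` also ships a `TextStreamer` utility. A minimal sketch, reusing `model`, `tokenizer`, and `model_inputs` from the snippet above:

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated; skip_prompt hides the
# echoed input, and skip_special_tokens drops control tokens.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
_ = model.generate(**model_inputs, max_new_tokens=512, streamer=streamer)
```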

### **Intended Use:**
1. **Education:** Ideal for creating step-by-step solutions to complex problems, explanations, and generating educational content in multiple languages.
2. **Programming:** Excels in coding tasks, debugging, and generating structured outputs such as JSON, enhancing productivity for developers.
3. **Creative Writing:** Suitable for generating stories, essays, and other forms of creative content with logical and coherent structure.
4. **Long-Context Processing:** Capable of handling and generating long texts, making it useful for summarizing lengthy documents or creating detailed reports.
5. **Multilingual Applications:** Supports 29+ languages, enabling usage in global contexts for translation, multilingual education, and cross-cultural communication.
6. **Data Structuring:** Performs well with structured data, such as tables and JSON outputs, making it effective for business analytics and automated report generation.
7. **Chatbots and Role-Play:** Enhances chatbot interactions with its ability to follow diverse instructions, adapt to different prompts, and maintain long conversational contexts (see the multi-turn sketch after this list).
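
To illustrate the conversational use case, here is a minimal multi-turn sketch. It reuses `model` and `tokenizer` from the Quickstart above, and the prompts are purely illustrative:

```python
# First turn: ask the model a question.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a haiku about autumn."},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
reply = tokenizer.batch_decode(
    [output_ids[0][inputs.input_ids.shape[1]:]], skip_special_tokens=True
)[0]

# Second turn: append the assistant reply so the chat template sees the
# full history, then ask a follow-up that depends on it.
messages.append({"role": "assistant", "content": reply})
messages.append({"role": "user", "content": "Now translate it into French."})
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
follow_up = tokenizer.batch_decode(
    [output_ids[0][inputs.input_ids.shape[1]:]], skip_special_tokens=True
)[0]
print(follow_up)
```

The key point is appending each assistant reply back onto `messages`, so the chat template renders the full history on the next turn.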

### **Limitations:**
1. **Resource Requirements:** Its large size and capabilities demand significant computational resources, making it less accessible for low-resource environments.
2. **Hallucination Risk:** The model may generate incorrect or fabricated information, particularly when dealing with unknown or ambiguous inputs.
3. **Limited Domain-Specific Expertise:** While it has broad knowledge, it might underperform in highly specialized fields not covered in its training data.
4. **Long-Context Limitations:** Although it supports up to 128K tokens, performance may degrade or exhibit inefficiencies with extremely lengthy or complex contexts.
5. **Bias in Outputs:** The model might reflect biases present in its training data, affecting its objectivity in certain contexts or cultural sensitivity in multilingual outputs.
6. **Dependence on Prompt Quality:** Results depend heavily on well-structured, clear inputs; poorly framed prompts can lead to irrelevant or suboptimal responses.
7. **Errors in Multilingual Output:** Despite robust multilingual support, subtle errors in grammar, syntax, or cultural nuance may appear, especially in low-resource languages.