Add pipeline tag: text-generation

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +46 -45
README.md CHANGED
@@ -1,46 +1,47 @@
- ---
- license: apache-2.0
- language:
- - en
- library_name: transformers
- ---
-
- # Introduction
-
- This repository contains the checkpoints of ICLR 2025 paper **[“Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models](https://arxiv.org/pdf/2411.03884)”.**
- In this work, we introduce a novel activation function called **Polynomial Composition (PolyCom)**, which enhances the expressiveness of large language models (LLMs) through dynamic polynomial compositions. Our method significantly improves the performance of dense and mixture of experts (MoE) models across a variety of downstream tasks, without adding significant computational overhead.
-
- # Datasets and Training
-
- We use the [RedPajama-Data-1T](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T) dataset and pretrain the PolyCom model on 250B tokens. For more training details, please refer to [the source code](https://github.com/BryceZhuo/PolyCom).
-
-
- # Inference
-
- Here is an example of how to use the PolyCom model for inference:
-
- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
-
- model = AutoModelForCausalLM.from_pretrained(path_of_model, device_map="cuda",trust_remote_code=True)
- tokenizer = AutoTokenizer.from_pretrained(path_of_model, padding_side="right",trust_remote_code=True)
-
- prompt = "Hello, my name is"
- input_ids = tokenizer.encode(prompt, return_tensors='pt').to('cuda')
-
- greedy_output = model.generate(input_ids)
- print(tokenizer.decode(greedy_output[0], skip_special_tokens=True))
- ```
-
-
- # Citing this work
-
- If you find this work helpful or use it in your research, please consider citing our paper:
- ```bibtex
- @inproceedings{zhuo2025polycom,
- title={Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models},
- author={Zhijian Zhuo and Ya Wang and Yutao Zeng and Xiaoqing Li and Xun Zhou and Jinwen Ma},
- booktitle={ICLR 2025},
- year={2025}
- }
+ ---
+ language:
+ - en
+ library_name: transformers
+ license: apache-2.0
+ pipeline_tag: text-generation
+ ---
+
+ # Introduction
+
+ This repository contains the checkpoints for the ICLR 2025 paper **["Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models"](https://arxiv.org/pdf/2411.03884)**.
+ In this work, we introduce a novel activation function called **Polynomial Composition (PolyCom)**, which enhances the expressiveness of large language models (LLMs) through dynamic polynomial compositions. Our method significantly improves the performance of dense and mixture-of-experts (MoE) models across a variety of downstream tasks, without adding significant computational overhead.
+
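+ As an illustration of the activation itself, the sketch below implements an order-3 PolyReLU-style module, i.e. a learnable linear combination of powers of ReLU(x); the coefficient initialization and exact parameterization here are illustrative assumptions, so please refer to the paper and [the source code](https://github.com/BryceZhuo/PolyCom) for the formulation actually used in these checkpoints.
+
+ ```python
+ import torch
+ import torch.nn as nn
+
+ class PolyReLU(nn.Module):
+     """Illustrative order-r polynomial composition of ReLU:
+     y = a_0 + a_1 * relu(x) + ... + a_r * relu(x)**r, with learnable a_i.
+     The parameterization in the released checkpoints may differ."""
+
+     def __init__(self, order: int = 3):
+         super().__init__()
+         self.order = order
+         # one learnable coefficient per polynomial order (assumed initialization)
+         self.coeffs = nn.Parameter(torch.full((order + 1,), 1.0 / (order + 1)))
+
+     def forward(self, x: torch.Tensor) -> torch.Tensor:
+         relu_x = torch.relu(x)
+         out = self.coeffs[0] * torch.ones_like(x)
+         power = torch.ones_like(x)
+         for i in range(1, self.order + 1):
+             power = power * relu_x
+             out = out + self.coeffs[i] * power
+         return out
+
+ # quick sanity check: output keeps the input shape
+ print(PolyReLU(order=3)(torch.randn(2, 4)).shape)  # torch.Size([2, 4])
+ ```
+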
+ # Datasets and Training
+
+ We use the [RedPajama-Data-1T](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T) dataset and pretrain the PolyCom model on 250B tokens. For more training details, please refer to [the source code](https://github.com/BryceZhuo/PolyCom).
+
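+ To inspect the pretraining data without downloading the full 1T-token corpus, a minimal streaming sketch with the `datasets` library is shown below; the subset name (`arxiv`) and the `trust_remote_code` flag are assumptions about the Hub dataset's configuration rather than details from the paper.
+
+ ```python
+ from datasets import load_dataset
+
+ # Stream a single RedPajama subset instead of downloading the whole corpus.
+ ds = load_dataset(
+     "togethercomputer/RedPajama-Data-1T",
+     "arxiv",                 # assumed subset name; other configs cover Common Crawl, GitHub, etc.
+     split="train",
+     streaming=True,
+     trust_remote_code=True,  # the dataset is distributed with a loading script
+ )
+ print(next(iter(ds))["text"][:200])
+ ```
+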
+ # Inference
+
+ Here is an example of how to use the PolyCom model for inference:
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # path_of_model is the local path or Hugging Face repo ID of this checkpoint
+ model = AutoModelForCausalLM.from_pretrained(path_of_model, device_map="cuda", trust_remote_code=True)
+ tokenizer = AutoTokenizer.from_pretrained(path_of_model, padding_side="right", trust_remote_code=True)
+
+ prompt = "Hello, my name is"
+ input_ids = tokenizer.encode(prompt, return_tensors="pt").to("cuda")
+
+ greedy_output = model.generate(input_ids)
+ print(tokenizer.decode(greedy_output[0], skip_special_tokens=True))
+ ```
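+
+ Because the model card now carries `pipeline_tag: text-generation`, the same checkpoint can also be driven through the high-level `pipeline` API. This is only a hedged sketch: `path_of_model` is the same placeholder as above, and the generation settings are illustrative.
+
+ ```python
+ from transformers import pipeline
+
+ # High-level text-generation pipeline; custom modeling code still needs trust_remote_code=True.
+ generator = pipeline("text-generation", model=path_of_model, device_map="cuda", trust_remote_code=True)
+ print(generator("Hello, my name is", max_new_tokens=32)[0]["generated_text"])
+ ```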
+
+ # Citing this work
+
+ If you find this work helpful or use it in your research, please consider citing our paper:
+ ```bibtex
+ @inproceedings{zhuo2025polycom,
+   title={Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models},
+   author={Zhijian Zhuo and Ya Wang and Yutao Zeng and Xiaoqing Li and Xun Zhou and Jinwen Ma},
+   booktitle={ICLR 2025},
+   year={2025}
+ }
  ```