LG-AI-EXAONE committed
Commit cf66995 · 1 Parent(s): ee10d98

Update README.md

Files changed (1): README.md (+3 -1)

README.md CHANGED
@@ -19,6 +19,7 @@ library_name: transformers
 <p align="center">
 <img src="assets/EXAONE_Symbol+BI_3d.png", width="300", style="margin: 40 auto;">
 🎉 License Updated! We are pleased to announce our more flexible licensing terms 🤗
+<br>✈️ Try on <a href="https://friendli.ai/suite/~/serverless-endpoints/LGAI-EXAONE/EXAONE-4.0-32B/overview">FriendliAI</a>
 <br>
 
 # EXAONE-4.0-1.2B-AWQ
@@ -35,7 +36,7 @@ In the EXAONE 4.0 architecture, we apply new architectural changes compared to p
 1. **Hybrid Attention**: For the 32B model, we adopt a hybrid attention scheme, which combines *Local attention (sliding window attention)* with *Global attention (full attention)* in a 3:1 ratio. We do not use RoPE (Rotary Positional Embedding) for global attention for better global context understanding.
 2. **QK-Reorder-Norm**: We adopt the Post-LN (LayerNorm) scheme for transformer blocks instead of Pre-LN, and we add RMS normalization right after the Q and K projections. This helps yield better performance on downstream tasks despite consuming more computation.
 
-For more details, please refer to our [technical report](https://www.lgresearch.ai/data/cdn/upload/EXAONE_4_0.pdf), [blog](#), and [GitHub](https://github.com/LG-AI-EXAONE/EXAONE-4.0).
+For more details, please refer to our [technical report](https://www.lgresearch.ai/data/cdn/upload/EXAONE_4_0.pdf), [blog](https://www.lgresearch.ai/blog/view?seq=576), and [GitHub](https://github.com/LG-AI-EXAONE/EXAONE-4.0).
 
 
 ### Model Configuration
@@ -183,6 +184,7 @@ print(tokenizer.decode(output[0]))
 The following tables show the evaluation results of each model, in reasoning and non-reasoning modes. The evaluation details can be found in the [technical report](https://www.lgresearch.ai/data/cdn/upload/EXAONE_4_0.pdf).
 
 - ✅ denotes that the model has hybrid reasoning capability, evaluated by selecting the reasoning / non-reasoning mode depending on the purpose.
+- To assess Korean **practical** and **professional** knowledge, we adopt both the [KMMLU-Redux](https://huggingface.co/datasets/LGAI-EXAONE/KMMLU-Redux) and [KMMLU-Pro](https://huggingface.co/datasets/LGAI-EXAONE/KMMLU-Pro) benchmarks. Both datasets are publicly released!
 - The evaluation results are based on the original model, not the quantized model.
 
 
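To make the two architectural changes described in the diff more concrete, here is a minimal, illustrative PyTorch sketch. It is not the EXAONE 4.0 implementation: the module name, the `hybrid_layer_pattern` helper, the hidden/head sizes, and the placement of the single global layer within each group of four layers are assumptions for illustration, and it assumes a recent PyTorch build that provides `nn.RMSNorm` and `scaled_dot_product_attention`.

```python
# Illustrative sketch only (assumed names and shapes), not the EXAONE 4.0 source code.
import torch
import torch.nn as nn
import torch.nn.functional as F


def hybrid_layer_pattern(num_layers: int, ratio: int = 3) -> list[str]:
    """3:1 local-to-global pattern: here the last layer of every group of
    (ratio + 1) layers is assumed to use global (full) attention."""
    return ["global" if (i + 1) % (ratio + 1) == 0 else "local" for i in range(num_layers)]


class QKReorderNormAttention(nn.Module):
    """Attention block with RMS normalization applied right after the Q/K projections."""

    def __init__(self, hidden_size: int = 512, num_heads: int = 8):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        self.q_proj = nn.Linear(hidden_size, hidden_size, bias=False)
        self.k_proj = nn.Linear(hidden_size, hidden_size, bias=False)
        self.v_proj = nn.Linear(hidden_size, hidden_size, bias=False)
        self.o_proj = nn.Linear(hidden_size, hidden_size, bias=False)
        # QK-Reorder-Norm: RMSNorm on the per-head Q and K vectors, before attention scores.
        self.q_norm = nn.RMSNorm(self.head_dim)
        self.k_norm = nn.RMSNorm(self.head_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        # Project and reshape to (batch, heads, tokens, head_dim).
        q = self.q_proj(x).view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        # Normalize Q and K right after their projections, then attend causally.
        q, k = self.q_norm(q), self.k_norm(k)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(out.transpose(1, 2).reshape(b, t, -1))


if __name__ == "__main__":
    print(hybrid_layer_pattern(8))  # ['local', 'local', 'local', 'global', ...]
    block = QKReorderNormAttention()
    print(block(torch.randn(1, 16, 512)).shape)  # torch.Size([1, 16, 512])
```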
 
 
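The two Korean benchmarks added in this commit are hosted on the Hugging Face Hub under the dataset IDs linked in the diff. A small hedged sketch for pulling them with the `datasets` library follows; the split name (and the absence of a config name) is an assumption, so adjust it to whatever the dataset cards specify.

```python
# Hedged sketch: download the publicly released Korean benchmarks from the Hub.
# The dataset IDs come from the links above; split/config names are assumptions.
from datasets import load_dataset

for repo_id in ("LGAI-EXAONE/KMMLU-Redux", "LGAI-EXAONE/KMMLU-Pro"):
    ds = load_dataset(repo_id, split="test")  # adjust split/config per the dataset card
    print(repo_id, len(ds), ds.column_names)
```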