---
license: mit
base_model:
- meta-llama/Llama-2-7b-hf
---

The model is derived from Llama-2-7b-hf through pruning using LLM-Streamline **(Streamlining Redundant Layers to Compress Large Language Models, ICLR 2025 Spotlight)**. The entire training process required only 0.06B tokens.

Below are the results of the evaluation using lm-eval:

|              | arc_c | arc_e | boolq | hellaswag | openbookqa | rte  | winogrande | Avg  |
|--------------|-------|-------|-------|-----------|------------|------|------------|------|
| Llama-2-7b   | 43.3  | 76.4  | 77.7  | 57.2      | 31.4       | 62.8 | 69.1       | 59.7 |
| Llama-2-4.7b | 34.0  | 64.6  | 74.7  | 49.8      | 27.4       | 61.7 | 66.4       | 54.1 |
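
A minimal sketch of how such numbers are typically produced with the lm-evaluation-harness CLI. The checkpoint path, dtype, and batch size below are assumptions for illustration, not values stated by the authors; running it requires the pruned checkpoint to be available locally or on the Hub.

```shell
# Hypothetical lm-eval invocation covering the tasks in the table above.
# "path/to/Llama-2-4.7b" is a placeholder for the actual checkpoint location.
lm_eval --model hf \
  --model_args pretrained=path/to/Llama-2-4.7b,dtype=bfloat16 \
  --tasks arc_challenge,arc_easy,boolq,hellaswag,openbookqa,rte,winogrande \
  --batch_size 8
```

Note that `arc_c` and `arc_e` in the table correspond to the harness task names `arc_challenge` and `arc_easy`.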