This model was quantized with GPT-QModel.
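As a minimal usage sketch (not part of the original card): the checkpoint can be loaded through the standard 🤗 Transformers API, assuming a CUDA-capable GPU and an environment with `transformers`, `accelerate`, `optimum`, and `gptqmodel` installed so the stored GPTQ config is picked up automatically.

```python
# Minimal sketch; assumes transformers + accelerate + optimum + gptqmodel and a CUDA GPU.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ModelCloud/Qwen2.5-0.5B-Instruct-gptqmodel-w4a16"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# The GPTQ quantization config shipped with the repo is detected automatically;
# the 4-bit weights are handled by the selected GPTQ inference kernel.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Summarize GPTQ quantization in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=64)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```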
| Metric | MARLIN kernel |
|--------------------------------|----------|
| arc_challenge :: acc,none | 0.2884 |
| arc_challenge :: acc_norm,none | 0.3208 |
| mmlu :: acc,none | 0.442 |
| Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
|----------------------------------------|------:|------|-----:|--------|---|-----:|---|-----:|
|arc_challenge | 1|none | 0|acc |↑ |0.2884|± |0.0132|
| | |none | 0|acc_norm|↑ |0.3208|± |0.0136|
|mmlu | 2|none | |acc |↑ |0.4420|± |0.0041|
|mmlu_humanities | 2|none | |acc |↑ |0.4066|± |0.0070|
|mmlu_formal_logic | 1|none | 0|acc |↑ |0.3413|± |0.0424|
|mmlu_high_school_european_history | 1|none | 0|acc |↑ |0.5576|± |0.0388|
|mmlu_high_school_us_history | 1|none | 0|acc |↑ |0.4804|± |0.0351|
|mmlu_high_school_world_history | 1|none | 0|acc |↑ |0.5612|± |0.0323|
|mmlu_international_law | 1|none | 0|acc |↑ |0.6364|± |0.0439|
|mmlu_jurisprudence | 1|none | 0|acc |↑ |0.5556|± |0.0480|
|mmlu_logical_fallacies | 1|none | 0|acc |↑ |0.4785|± |0.0392|
|mmlu_moral_disputes | 1|none | 0|acc |↑ |0.5289|± |0.0269|
|mmlu_moral_scenarios | 1|none | 0|acc |↑ |0.2380|± |0.0142|
|mmlu_philosophy | 1|none | 0|acc |↑ |0.4469|± |0.0282|
|mmlu_prehistory | 1|none | 0|acc |↑ |0.5062|± |0.0278|
|mmlu_professional_law | 1|none | 0|acc |↑ |0.3514|± |0.0122|
|mmlu_world_religions | 1|none | 0|acc |↑ |0.5497|± |0.0382|
|mmlu_other | 2|none | |acc |↑ |0.4841|± |0.0088|
|mmlu_business_ethics | 1|none | 0|acc |↑ |0.5000|± |0.0503|
|mmlu_clinical_knowledge | 1|none | 0|acc |↑ |0.4717|± |0.0307|
|mmlu_college_medicine | 1|none | 0|acc |↑ |0.4624|± |0.0380|
|mmlu_global_facts | 1|none | 0|acc |↑ |0.3000|± |0.0461|
|mmlu_human_aging | 1|none | 0|acc |↑ |0.4933|± |0.0336|
|mmlu_management | 1|none | 0|acc |↑ |0.6408|± |0.0475|
|mmlu_marketing | 1|none | 0|acc |↑ |0.7051|± |0.0299|
|mmlu_medical_genetics | 1|none | 0|acc |↑ |0.5000|± |0.0503|
|mmlu_miscellaneous | 1|none | 0|acc |↑ |0.5147|± |0.0179|
|mmlu_nutrition | 1|none | 0|acc |↑ |0.5196|± |0.0286|
|mmlu_professional_accounting | 1|none | 0|acc |↑ |0.3617|± |0.0287|
|mmlu_professional_medicine | 1|none | 0|acc |↑ |0.3419|± |0.0288|
|mmlu_virology | 1|none | 0|acc |↑ |0.4277|± |0.0385|
|mmlu_social_sciences | 2|none | |acc |↑ |0.5044|± |0.0089|
|mmlu_econometrics | 1|none | 0|acc |↑ |0.3158|± |0.0437|
|mmlu_high_school_geography | 1|none | 0|acc |↑ |0.5404|± |0.0355|
|mmlu_high_school_government_and_politics| 1|none | 0|acc |↑ |0.5130|± |0.0361|
|mmlu_high_school_macroeconomics | 1|none | 0|acc |↑ |0.4410|± |0.0252|
|mmlu_high_school_microeconomics | 1|none | 0|acc |↑ |0.4370|± |0.0322|
|mmlu_high_school_psychology | 1|none | 0|acc |↑ |0.6055|± |0.0210|
|mmlu_human_sexuality | 1|none | 0|acc |↑ |0.4580|± |0.0437|
|mmlu_professional_psychology | 1|none | 0|acc |↑ |0.4314|± |0.0200|
|mmlu_public_relations | 1|none | 0|acc |↑ |0.5273|± |0.0478|
|mmlu_security_studies | 1|none | 0|acc |↑ |0.5224|± |0.0320|
|mmlu_sociology | 1|none | 0|acc |↑ |0.6070|± |0.0345|
|mmlu_us_foreign_policy | 1|none | 0|acc |↑ |0.7200|± |0.0451|
|mmlu_stem | 2|none | |acc |↑ |0.3923|± |0.0086|
|mmlu_abstract_algebra | 1|none | 0|acc |↑ |0.3100|± |0.0465|
|mmlu_anatomy | 1|none | 0|acc |↑ |0.4519|± |0.0430|
|mmlu_astronomy | 1|none | 0|acc |↑ |0.5132|± |0.0407|
|mmlu_college_biology | 1|none | 0|acc |↑ |0.4167|± |0.0412|
|mmlu_college_chemistry | 1|none | 0|acc |↑ |0.3000|± |0.0461|
|mmlu_college_computer_science | 1|none | 0|acc |↑ |0.4200|± |0.0496|
|mmlu_college_mathematics | 1|none | 0|acc |↑ |0.3400|± |0.0476|
|mmlu_college_physics | 1|none | 0|acc |↑ |0.3039|± |0.0458|
|mmlu_computer_security | 1|none | 0|acc |↑ |0.6000|± |0.0492|
|mmlu_conceptual_physics | 1|none | 0|acc |↑ |0.3532|± |0.0312|
|mmlu_electrical_engineering | 1|none | 0|acc |↑ |0.5586|± |0.0414|
|mmlu_elementary_mathematics | 1|none | 0|acc |↑ |0.3386|± |0.0244|
|mmlu_high_school_biology | 1|none | 0|acc |↑ |0.5226|± |0.0284|
|mmlu_high_school_chemistry | 1|none | 0|acc |↑ |0.3941|± |0.0344|
|mmlu_high_school_computer_science | 1|none | 0|acc |↑ |0.4100|± |0.0494|
|mmlu_high_school_mathematics | 1|none | 0|acc |↑ |0.2926|± |0.0277|
|mmlu_high_school_physics | 1|none | 0|acc |↑ |0.2781|± |0.0366|
|mmlu_high_school_statistics | 1|none | 0|acc |↑ |0.3056|± |0.0314|
|mmlu_machine_learning | 1|none | 0|acc |↑ |0.4286|± |0.0470|
| Groups |Version|Filter|n-shot|Metric| |Value | |Stderr|
|--------------------|------:|------|------|------|---|-----:|---|-----:|
|mmlu | 2|none | |acc |↑ |0.4420|± |0.0041|
|mmlu_humanities | 2|none | |acc |↑ |0.4066|± |0.0070|
|mmlu_other | 2|none | |acc |↑ |0.4841|± |0.0088|
|mmlu_social_sciences| 2|none | |acc |↑ |0.5044|± |0.0089|
|mmlu_stem | 2|none | |acc |↑ |0.3923|± |0.0086|
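The tables above are raw lm-evaluation-harness output. As a hedged sketch of how numbers in this format are typically produced (the exact command and harness version used for this card are not stated), the harness's Python entry point can be invoked roughly as follows, assuming `lm-eval >= 0.4`:

```python
# Hypothetical reproduction sketch; the exact settings behind the card's tables are not documented.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=ModelCloud/Qwen2.5-0.5B-Instruct-gptqmodel-w4a16,dtype=float16",
    tasks=["arc_challenge", "mmlu"],
    num_fewshot=0,       # the tables above report n-shot = 0
    batch_size="auto",
)

# Per-task metrics (e.g. "acc,none", "acc_norm,none") live under results["results"].
for task, metrics in results["results"].items():
    print(task, metrics.get("acc,none"), metrics.get("acc_norm,none"))
```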
Quantized model: ModelCloud/Qwen2.5-0.5B-Instruct-gptqmodel-w4a16
Base model: Qwen/Qwen2.5-0.5B