Commit 5356322
Parent(s): 0d1f055
Updated README.md
README.md CHANGED
@@ -27,15 +27,15 @@ model-index:
 ---
 
 # Model Card for Llama-3.2-1B-Instruct-NL2SH
-This model translates natural language (English) instructions
+This model translates natural language (English) instructions to Bash commands.
 
 ## Model Details
 ### Model Description
-This model is a fine-tuned version of the Llama-3.2-1B-Instruct model trained on the [NL2SH-ALFA](https://huggingface.co/datasets/westenfelder/NL2SH-ALFA) dataset for the task of natural language to Bash translation (NL2SH). For more information, please refer to the
-- **Developed by:** Anyscale Learning For All (ALFA) Group at MIT-CSAIL
+This model is a fine-tuned version of the [Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct) model trained on the [NL2SH-ALFA](https://huggingface.co/datasets/westenfelder/NL2SH-ALFA) dataset for the task of natural language to Bash translation (NL2SH). For more information, please refer to the [paper](https://arxiv.org/abs/2502.06858).
+- **Developed by:** [Anyscale Learning For All (ALFA) Group at MIT-CSAIL](https://alfagroup.csail.mit.edu/)
 - **Language:** English
 - **License:** MIT License
-- **Finetuned from model:** meta-llama/Llama-3.2-1B-Instruct
+- **Finetuned from model:** [meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct)
 
 ### Model Sources
 - **Repository:** [GitHub Repo](https://github.com/westenfelder/NL2SH)
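Note: the model card's quick-start snippet falls outside this diff (only its closing `print(sh)` appears as hunk context below). As a rough sketch of typical usage, the following uses the generic Hugging Face `transformers` chat-template API; the repository id, system prompt, and generation settings are assumptions rather than the card's documented invocation.

```python
# A minimal sketch, not the model card's own quick-start code.
# Assumed: the repo id below and the system prompt; adjust both as needed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "westenfelder/Llama-3.2-1B-Instruct-NL2SH"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Assumed prompt format: a short system instruction plus the English request.
messages = [
    {"role": "system", "content": "Translate the instruction to a single Bash command."},
    {"role": "user", "content": "list all files, including hidden ones, sorted by size"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(input_ids, max_new_tokens=64)
# Decode only the newly generated tokens, i.e. the Bash command.
sh = tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(sh)
```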
@@ -47,6 +47,7 @@ This model is intended for research on machine translation. The model can also b
 
 ### Out-of-Scope Use
 This model should not be used in production or automated systems without human verification.
+
 **Considerations for use in high-risk environments:** This model should not be used in high-risk environments due to its low accuracy and potential for generating harmful commands.
 
 ## Bias, Risks, and Limitations
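Note: since the card requires human verification before any generated command is executed, one minimal gate is to echo the command and demand explicit confirmation. This is a hedged sketch; the helper name and prompts are illustrative, not part of the model card.

```python
# Hedged sketch: gate execution of a model-generated command on human approval.
# The helper name and prompt text are illustrative, not from the model card.
import subprocess

def run_with_verification(sh: str) -> None:
    print(f"Proposed command: {sh}")
    if input("Run it? [y/N] ").strip().lower() == "y":
        subprocess.run(sh, shell=True, check=False)
    else:
        print("Skipped.")

run_with_verification("ls -la")
```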
@@ -111,16 +112,16 @@ print(sh)
 This model was trained on the [NL2SH-ALFA](https://huggingface.co/datasets/westenfelder/NL2SH-ALFA) dataset.
 
 ### Training Procedure
-Please refer to section 4.1 and 4.3.4 of the paper for information about data pre-processing, training hyper-parameters and hardware.
+Please refer to sections 4.1 and 4.3.4 of the [paper](https://arxiv.org/abs/2502.06858) for information about data pre-processing, training hyperparameters, and hardware.
 
 ## Evaluation
 This model was evaluated on the [NL2SH-ALFA](https://huggingface.co/datasets/westenfelder/NL2SH-ALFA) test set using the [InterCode-ALFA](https://github.com/westenfelder/InterCode-ALFA) benchmark.
 
 ### Results
-This model achieved an accuracy of 0.37 on the InterCode-ALFA benchmark.
+This model achieved an accuracy of **0.37** on the InterCode-ALFA benchmark.
 
 ## Environmental Impact
-Experiments were conducted using a private infrastructure, which has a carbon efficiency of 0.432
+Experiments were conducted using a private infrastructure, which has an approximate carbon efficiency of 0.432 kgCO2eq/kWh. A cumulative 12 hours of computation was performed on hardware of type RTX A6000 (TDP of 300W). Total emissions are estimated to be 1.56 kgCO2eq, of which 0 percent was directly offset. Estimations were conducted using the [Machine Learning Emissions Calculator](https://mlco2.github.io/impact#compute).
 
 ## Citation
 **BibTeX:**
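Note: the emissions figure in the updated card is consistent with its own inputs; a quick arithmetic check, assuming the calculator's simple power × time × carbon-intensity model with the GPU drawing its full 300 W TDP for all 12 hours:

```python
# Sanity check of the emissions estimate quoted in the card above.
# Assumes the RTX A6000 draws its full 300 W TDP for the full 12 hours.
tdp_kw = 0.300      # GPU TDP in kW
hours = 12          # cumulative computation time
carbon_eff = 0.432  # grid carbon efficiency, kgCO2eq per kWh
print(f"{tdp_kw * hours * carbon_eff:.2f} kgCO2eq")  # -> 1.56 kgCO2eq, matching the card
```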