Update README.md
README.md CHANGED

````diff
@@ -63,13 +63,21 @@ Features of this architecture:
 
 ### Step 1: Environment Setup
 
-Since Hymba-1.5B-Instruct employs [FlexAttention](https://pytorch.org/blog/flexattention/), which relies on Pytorch2.5 and other related dependencies,
+Since Hymba-1.5B-Instruct employs [FlexAttention](https://pytorch.org/blog/flexattention/), which relies on PyTorch 2.5 and other related dependencies, we provide two ways to set up the environment:
+
+- **[Local install]** Install the related packages using our provided `setup.sh` (supports CUDA 12.1/12.4):
 
 ```
 wget --header="Authorization: Bearer YOUR_HF_TOKEN" https://huggingface.co/nvidia/Hymba-1.5B-Base/resolve/main/setup.sh
 bash setup.sh
 ```
 
+- **[Docker]** A Docker image is provided with all of Hymba's dependencies preinstalled. You can pull the image and start a container using the following commands:
+```
+docker pull ghcr.io/tilmto/hymba:v1
+docker run --gpus all -v /home/$USER:/home/$USER -it ghcr.io/tilmto/hymba:v1 bash
+```
+
 
 ### Step 2: Chat with Hymba-1.5B-Instruct
 After setting up the environment, you can use the following script to chat with our Model
@@ -99,7 +107,7 @@ stopping_criteria = StoppingCriteriaList([StopStringCriteria(tokenizer=tokenizer
 outputs = model.generate(
     tokenized_chat,
     max_new_tokens=256,
-    do_sample=
+    do_sample=False,
     temperature=0.7,
     use_cache=True,
     stopping_criteria=stopping_criteria
````
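Whichever route is used, the prerequisite that actually gates the install is FlexAttention, which ships with PyTorch 2.5. A quick sanity check of the resulting environment, assuming only the public `torch.nn.attention.flex_attention` module, might look like this:

```python
# Sketch: verify the environment meets Hymba's FlexAttention prerequisite.
import torch

print(torch.__version__)          # expect 2.5.0 or newer
print(torch.cuda.is_available())  # expect True under the CUDA 12.1/12.4 setup

# flex_attention is importable from PyTorch 2.5 onward; an ImportError here
# means the environment does not satisfy the requirement stated above.
from torch.nn.attention.flex_attention import flex_attention
print("FlexAttention available:", callable(flex_attention))
```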
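The second hunk shows only fragments of the Step 2 chat script: a truncated `StopStringCriteria` line in the hunk header and the patched `model.generate` call. A self-contained sketch that assembles those fragments is below. The repo id `nvidia/Hymba-1.5B-Instruct`, the `trust_remote_code=True` flag, the example prompt, and the `"</s>"` stop string are assumptions filled in for illustration; the diff itself does not show them.

```python
# Sketch assembling the generate() fragments visible in the second hunk.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    StoppingCriteriaList,
    StopStringCriteria,
)

repo = "nvidia/Hymba-1.5B-Instruct"  # assumed repo id for the Instruct model
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True).cuda()

# The hunk header truncates the real stop strings; "</s>" is a placeholder.
stopping_criteria = StoppingCriteriaList(
    [StopStringCriteria(tokenizer=tokenizer, stop_strings="</s>")]
)

messages = [{"role": "user", "content": "Who are you?"}]  # illustrative prompt
tokenized_chat = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    tokenized_chat,
    max_new_tokens=256,
    do_sample=False,   # greedy decoding, per the patched line
    temperature=0.7,   # only takes effect when do_sample=True
    use_cache=True,
    stopping_criteria=stopping_criteria,
)
print(tokenizer.decode(outputs[0][tokenized_chat.shape[1]:], skip_special_tokens=True))
```

Note that with `do_sample=False` decoding is greedy, so the `temperature=0.7` argument is inert here; it would only shape the output distribution if sampling were enabled.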