Update README.md
README.md
@@ -123,4 +123,18 @@ input_ids, attention_mask = preprocess_input(tokenizer, system_prompt, initial_q
predicted_approach, _ = predict_approach(router_model, input_ids, attention_mask, device)

print(f"Router predicted approach: {predicted_approach}")
```
+
+## Citation
+
+If you use this in your work, please cite:
+
+```bibtex
+@software{optillm,
+  title = {Optillm: Optimizing inference proxy for LLMs},
+  author = {Asankhaya Sharma},
+  year = {2024},
+  publisher = {GitHub},
+  url = {https://github.com/codelion/optillm}
+}
+```