hooman650 committed on
Commit b385e6a · 1 Parent(s): 69a20e6

Update README.md
---
license: mit
language:
- en
library_name: transformers
pipeline_tag: feature-extraction
---

# BGE-Large-En-V1.5-ONNX-O4

This is an `ONNX O4` strategy optimized version of [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5), intended for `CUDA` execution. It should be much faster than the original version.

![Throughput benchmark](https://media.githubusercontent.com/media/huggingface/text-embeddings-inference/main/assets/bs1-tp.png)

## Usage

```python
# pip install "optimum[onnxruntime-gpu]" transformers
import torch

from optimum.onnxruntime import ORTModelForFeatureExtraction
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('hooman650/bge-large-en-v1.5-onnx-o4')
model = ORTModelForFeatureExtraction.from_pretrained('hooman650/bge-large-en-v1.5-onnx-o4')
model.to("cuda")

sentences = ["pandas usually live in the jungles"]
with torch.no_grad():
    # Tokenize and move the inputs to the same device as the model
    inputs = tokenizer(sentences, padding=True, truncation=True,
                       return_tensors='pt', max_length=512).to("cuda")
    # A feature-extraction model returns hidden states, not logits
    last_hidden_state = model(**inputs).last_hidden_state
```
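The `last_hidden_state` above is a per-token tensor, not yet a sentence embedding. BGE models conventionally take the `[CLS]` token vector (position 0) and L2-normalize it so that dot products equal cosine similarity. A minimal sketch of that pooling step, using a dummy tensor in place of the real model output (the shapes and pooling follow the standard BGE recipe; they are not taken from this README):

```python
import torch
import torch.nn.functional as F

# Dummy stand-in for model(**inputs).last_hidden_state:
# (batch_size, sequence_length, hidden_size) — 1024 for bge-large
last_hidden_state = torch.randn(2, 8, 1024)

# BGE pooling: take the [CLS] token embedding at position 0 ...
embeddings = last_hidden_state[:, 0]
# ... and L2-normalize so dot product == cosine similarity
embeddings = F.normalize(embeddings, p=2, dim=1)

print(embeddings.shape)  # one 1024-dim unit vector per input sentence
```

With real outputs, `embeddings @ embeddings.T` then gives pairwise cosine similarities directly.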