Update README.md
Browse files
README.md
CHANGED
|
@@ -133,7 +133,7 @@ attn_implementation = "eager" # Or "flash_attention_2"
|
|
| 133 |
model = SentenceTransformer(
|
| 134 |
"nvidia/llama-embed-nemotron-8b",
|
| 135 |
trust_remote_code=True,
|
| 136 |
-
model_kwargs={"attn_implementation": attn_implementation, "torch_dtype": "bfloat16"},
|
| 137 |
tokenizer_kwargs={"padding_side": "left"},
|
| 138 |
)
|
| 139 |
|
|
|
|
| 133 |
model = SentenceTransformer(
|
| 134 |
"nvidia/llama-embed-nemotron-8b",
|
| 135 |
trust_remote_code=True,
|
| 136 |
+
model_kwargs={"attn_implementation": attn_implementation, "torch_dtype": "float32"},
|
| 137 |
tokenizer_kwargs={"padding_side": "left"},
|
| 138 |
)
|
| 139 |
|