metadata
library_name: transformers
license: mit
datasets:
- AnkitSatpute/zbMath_allft
language:
- en
base_model:
- meta-llama/Llama-3.1-8B
pipeline_tag: text-classification
Model Card for AnkitSatpute/Llama-3.1-ReRank-AllFTbtAbstr
This model generates document embeddings for mathematics research papers. The embeddings are then used to retrieve similar documents and rank them by similarity.
Model Details
A Llama 3.1 8B model (base: meta-llama/Llama-3.1-8B) fine-tuned with contrastive learning for sequence classification.
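The card does not say which contrastive objective was used for training. As a rough illustration only, one widely used form is the in-batch InfoNCE loss, where each anchor embedding is pulled toward its paired positive and pushed away from the other positives in the batch. The function name `info_nce_loss`, the temperature value, and the random toy inputs below are assumptions for this sketch and are not taken from the model's training code.

```python
import torch
import torch.nn.functional as F


def info_nce_loss(anchors: torch.Tensor, positives: torch.Tensor,
                  temperature: float = 0.05) -> torch.Tensor:
    """In-batch contrastive loss: row i of `positives` is the match for row i of `anchors`."""
    anchors = F.normalize(anchors, p=2, dim=1)
    positives = F.normalize(positives, p=2, dim=1)
    # Cosine similarity between every anchor and every candidate, scaled by temperature.
    logits = anchors @ positives.T / temperature
    # The correct candidate for anchor i sits on the diagonal.
    targets = torch.arange(anchors.size(0))
    return F.cross_entropy(logits, targets)


# Hypothetical toy batch: 4 random "embeddings" of dimension 8.
torch.manual_seed(0)
anchors = torch.randn(4, 8)
aligned = info_nce_loss(anchors, anchors.clone())      # positives match their anchors
shuffled = info_nce_loss(anchors, anchors.roll(1, 0))  # positives paired with the wrong anchors
print(float(aligned), float(shuffled))
```

The loss is small when each anchor is closest to its own positive and large when the pairing is wrong, which is the signal that shapes the embedding space during contrastive training.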
How to use for generating embeddings
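import torch
from transformers import AutoTokenizer, AutoModel

sentences = ["Hello Who are you?", "I am fine thank you"]
tokenizer = AutoTokenizer.from_pretrained('AnkitSatpute/Llama-3.1-ReRank-AllFTbtAbstr')
model = AutoModel.from_pretrained('AnkitSatpute/Llama-3.1-ReRank-AllFTbtAbstr')
model.eval()
# Llama tokenizers often ship without a padding token; fall back to EOS if so
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
# Tokenize the batch (the padding and truncation settings are assumptions)
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

with torch.no_grad():
    model_output = model(**encoded_input)
    # Pool by taking the hidden state of the first token of each sequence
    sentence_embeddings = model_output[0][:, 0]
# L2-normalize so that dot products equal cosine similarities
sentence_embeddings = torch.nn.functional.normalize(sentence_embeddings, p=2, dim=1)
print("Sentence embeddings:", sentence_embeddings)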
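Once embeddings are available, finding similar documents comes down to scoring each candidate against a query embedding and sorting by that score. The following is a minimal sketch of that ranking step using cosine similarity. The helper name `rank_by_similarity` and the 3-dimensional toy vectors are hypothetical stand-ins for real model outputs.

```python
import torch
import torch.nn.functional as F


def rank_by_similarity(query_emb: torch.Tensor, doc_embs: torch.Tensor):
    """Return (indices, scores) of documents sorted by descending cosine similarity."""
    # Normalizing again is harmless if the inputs are already unit length.
    query_emb = F.normalize(query_emb, p=2, dim=-1)
    doc_embs = F.normalize(doc_embs, p=2, dim=-1)
    scores = doc_embs @ query_emb  # shape: (num_docs,)
    order = torch.argsort(scores, descending=True)
    return order.tolist(), scores[order].tolist()


# Toy vectors standing in for one query embedding and three document embeddings.
query = torch.tensor([1.0, 0.0, 0.0])
docs = torch.tensor([
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [1.0, 0.1, 0.0],  # nearly parallel to the query
    [0.5, 0.5, 0.0],  # in between
])
order, scores = rank_by_similarity(query, docs)
print(order)  # → [1, 2, 0]
```

In practice, `query` would be one row of `sentence_embeddings` and `docs` the embeddings of the candidate papers.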