metadata
library_name: transformers
license: mit
datasets:
  - AnkitSatpute/zbMath_allft
language:
  - en
base_model:
  - meta-llama/Llama-3.1-8B
pipeline_tag: text-classification

Model Card for Llama-3.1-ReRank-AllFTbtAbstr

The model is trained to generate document embeddings for mathematical research papers. These embeddings are then used to retrieve similar documents and rank them by similarity.
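To make the ranking step concrete, here is a minimal sketch of ordering candidate documents by similarity to a query document. It assumes cosine similarity as the metric, which the card does not state explicitly (the usage snippet in this card L2-normalizes the embeddings, which is consistent with it). The function name `rank_by_similarity` and the toy vectors are hypothetical and stand in for real model embeddings.

```python
import torch
import torch.nn.functional as F


def rank_by_similarity(query_emb: torch.Tensor, doc_embs: torch.Tensor):
    """Rank documents by cosine similarity to a query (assumed metric).

    query_emb: tensor of shape (dim,)
    doc_embs:  tensor of shape (num_docs, dim)
    Returns (indices sorted from most to least similar, their scores).
    """
    # Normalize so that the dot product equals cosine similarity.
    query_emb = F.normalize(query_emb, p=2, dim=-1)
    doc_embs = F.normalize(doc_embs, p=2, dim=-1)
    scores = doc_embs @ query_emb
    order = torch.argsort(scores, descending=True)
    return order, scores[order]


# Toy 2-dimensional vectors in place of real document embeddings.
query = torch.tensor([1.0, 0.0])
docs = torch.tensor([[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]])
order, scores = rank_by_similarity(query, docs)
print("Ranking (best first):", order.tolist())
```

In practice, `query` and `docs` would be rows of the embedding matrix produced by the model for a seed paper and its candidate papers.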

Model Details

A Llama 3.1 8B model (base model: meta-llama/Llama-3.1-8B) fine-tuned with contrastive learning for sequence classification.

How to use for generating embeddings

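from transformers import AutoTokenizer, AutoModel
import torch

sentences = ["Hello Who are you?", "I am fine thank you"]

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained('AnkitSatpute/Llama-3.1-ReRank-AllFTbtAbstr')
model = AutoModel.from_pretrained('AnkitSatpute/Llama-3.1-ReRank-AllFTbtAbstr')
model.eval()

# Assumption: if the tokenizer has no padding token, fall back to EOS so batches can be padded
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Tokenize the batch; the padding/truncation settings here are an assumption
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Run the model without tracking gradients
with torch.no_grad():
    model_output = model(**encoded_input)

# Take the hidden state at the first token position of each sequence
sentence_embeddings = model_output[0][:, 0]

# L2-normalize so that dot products between embeddings are cosine similarities
sentence_embeddings = torch.nn.functional.normalize(sentence_embeddings, p=2, dim=1)

print("Sentence embeddings:", sentence_embeddings)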
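The snippet above pools the hidden state at position 0 (`model_output[0][:, 0]`). In a decoder-only model with causal attention, position 0 can only attend to itself, so it may carry little information about the rest of the input. A common alternative for such models is to pool the last non-padding token. The sketch below shows that alternative on toy tensors. Whether this checkpoint was trained with last-token pooling is not stated in this card, so treat it as an assumption to verify; the helper name `last_token_pool` is hypothetical, and the code assumes right padding.

```python
import torch


def last_token_pool(hidden_states: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Return the hidden state of the last non-padding token of each sequence.

    Assumes right padding (real tokens first, padding tokens after).
    hidden_states:  (batch, seq_len, dim)
    attention_mask: (batch, seq_len), 1 for real tokens and 0 for padding
    """
    # Index of the last real token: number of real tokens minus one.
    last_idx = attention_mask.sum(dim=1) - 1
    batch_idx = torch.arange(hidden_states.size(0), device=hidden_states.device)
    return hidden_states[batch_idx, last_idx]


# Toy tensors standing in for model_output[0] and encoded_input["attention_mask"].
hidden = torch.arange(2 * 4 * 3, dtype=torch.float32).reshape(2, 4, 3)
mask = torch.tensor([[1, 1, 1, 0], [1, 1, 1, 1]])

pooled = last_token_pool(hidden, mask)
embeddings = torch.nn.functional.normalize(pooled, p=2, dim=1)
print("Pooled embeddings shape:", tuple(embeddings.shape))
```

With real inputs, `hidden` would be `model_output[0]` and `mask` would be `encoded_input["attention_mask"]`; if the tokenizer pads on the left, the last position (`hidden[:, -1]`) would be the one to take instead.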