---
tags:
  - LLM
  - BPE
license: apache-2.0
language:
  - ru
model-index:
  - name: llm-course-hw1
    results:
      - task:
          type: text-generation
        metrics:
          - name: Crossentropy
            type: Crossentropy
            value: 3.093
datasets:
  - IgorVolochay/russian_jokes
---

# llm-course-hw1

Homework for the VK NLP course.

This repository contains a BPE tokenizer and Transformer model weights for generating Russian jokes.

The model was trained on the IgorVolochay/russian_jokes dataset with a next-token prediction objective.
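For reference, the training corpus can be loaded with the `datasets` library. A minimal sketch (the `train` split name is an assumption):

```python
from datasets import load_dataset

# Download the jokes corpus from the Hugging Face Hub and peek at one example
dataset = load_dataset("IgorVolochay/russian_jokes")
print(dataset["train"][0])  # assumes the dataset exposes a "train" split
```

Assuming the reported cross-entropy of 3.093 is measured in nats (the PyTorch default), it corresponds to a perplexity of roughly exp(3.093) ≈ 22.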

The code for the model class is available in the VK NLP course materials.

## How to use

```python
import torch

# ByteLevelBPETokenizer and TransformerForCausalLM are the course's own classes
# (see the note above); import them from the VK NLP course code.
REPO_NAME = "dmitry315/llm-course-hw1"  # assumed Hub repository id

# pick GPU if available, otherwise fall back to CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# load the tokenizer and model weights from the Hub
tokenizer = ByteLevelBPETokenizer.from_pretrained(REPO_NAME)
check_model = TransformerForCausalLM.from_pretrained(REPO_NAME)
check_model = check_model.to(device)
check_model = check_model.eval()

# generate
text = "Штирлиц пришел домой"  # your joke prompt ("Stierlitz came home")
# encode the prompt; drop the last token (presumably an EOS appended by encode)
input_ids = torch.tensor(tokenizer.encode(text)[:-1], device=device)
model_output = check_model.generate(
    input_ids[None, :], max_new_tokens=200, eos_token_id=tokenizer.eos_token_id, do_sample=True, top_k=10
)
print(tokenizer.decode(model_output[0].tolist()))
# > Штирлиц пришел домой, а снега вдруг с кем-то на лесу и пьет. Слушай, а тот сейчас и весь день победила, но не сбежал. Просто у нее сдалось.
```
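Because `do_sample=True`, every run yields a different continuation. A follow-up sketch, reusing only the arguments from the call above, that draws several independent samples from the same prompt:

```python
# Each call re-samples with top-k filtering, so the three jokes will differ.
for _ in range(3):
    sample = check_model.generate(
        input_ids[None, :], max_new_tokens=200,
        eos_token_id=tokenizer.eos_token_id, do_sample=True, top_k=10,
    )
    print(tokenizer.decode(sample[0].tolist()))
```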