	---
	library_name: transformers
	tags:
	- text-generation-inference
	- code
	- llama-3.2
	- math
	- general-purpose
	license: llama3.2
	language:
	- en
	base_model:
	- meta-llama/Llama-3.2-1B
	pipeline_tag: text-generation
	---
	![8.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/K_bYZlzTOZjl5YJnjEy8j.png)

	# Oganesson-TinyLlama-1.2B

	> Oganesson-TinyLlama-1.2B is a lightweight and efficient language model built on Meta's Llama 3.2 1B base model (about 1.2B parameters). It is fine-tuned for general-purpose inference, mathematical reasoning, and code generation, and targets edge devices, personal assistants, and educational applications that need a compact yet capable model.

	> [!note]
	> GGUF: [https://huggingface.co/prithivMLmods/Oganesson-TinyLlama-1.2B-GGUF](https://huggingface.co/prithivMLmods/Oganesson-TinyLlama-1.2B-GGUF)

	---

	## Key Features

	1. LLaMA 3.2 1.2B Core
	Built on Meta's Llama 3.2 1B base model (`meta-llama/Llama-3.2-1B`, about 1.2B parameters), offering instruction-following and multilingual capabilities in a very small footprint.

	2. Modular Fine-Tuning
	Trained on a handcrafted modular dataset covering general-purpose reasoning, programming problems, and mathematical challenges.

	3. Mathematical Competence
	Solves equations, explains concepts, and works through symbolic problems in algebra, geometry, and calculus, which suits lightweight tutoring use cases.

	4. Code Understanding & Generation
	Produces clean, interpretable code in Python, JavaScript, and more. Useful for micro-agents, code assistants, and embedded development tools.

	5. Versatile Output Formats
	Handles JSON, Markdown, LaTeX, and structured data output, enabling integration into tools and platforms needing formatted results.

	6. Edge-Optimized
	At only 1.2B parameters, this model is built for local inference, on-device usage, and battery-efficient environments.
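
As a rough illustration of why a model of this size suits local inference, the sketch below estimates the memory needed for the weights alone at several precisions. It assumes a round 1.2B parameter count and ignores the KV cache, activations, and the per-block overhead of real quantization formats, so actual usage will be higher. The function name is illustrative and not part of any library.

```python
def estimate_weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: a round 1.2B parameters; the exact count may differ slightly.
NUM_PARAMS = 1.2e9

for label, bits in [("float32", 32), ("float16/bfloat16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{estimate_weight_memory_gb(NUM_PARAMS, bits):.1f} GB")
```

Under these assumptions the weights take roughly 2.4 GB at 16-bit precision and about 0.6 GB at 4-bit, which is the range where on-device deployment becomes practical.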

	---

	## Quickstart with Transformers

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_name = "prithivMLmods/Oganesson-TinyLlama-1.2B"

	# Load the model in the checkpoint's dtype and place it on the available device(s).
	model = AutoModelForCausalLM.from_pretrained(
	    model_name,
	    torch_dtype="auto",
	    device_map="auto",
	)
	tokenizer = AutoTokenizer.from_pretrained(model_name)

	prompt = "Write a Python function to compute the Fibonacci sequence."

	messages = [
	    {"role": "system", "content": "You are a helpful coding and math assistant."},
	    {"role": "user", "content": prompt},
	]

	# Render the conversation with the model's chat template, ending with the assistant turn marker.
	text = tokenizer.apply_chat_template(
	    messages,
	    tokenize=False,
	    add_generation_prompt=True,
	)

	model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

	generated_ids = model.generate(
	    **model_inputs,
	    max_new_tokens=512,
	)

	# Drop the prompt tokens so that only the newly generated tokens are decoded.
	generated_ids = [
	    output_ids[len(input_ids):]
	    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
	]

	response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
	print(response)
	```
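
For repeated calls, the quickstart steps can be wrapped in a small helper. This is a minimal sketch, not an official API: `build_messages` and `generate_reply` are hypothetical names, the default system prompt is copied from the quickstart, and `model` and `tokenizer` are assumed to be loaded as in the quickstart. Passing a `temperature` switches `generate` to sampling; otherwise the model's default generation settings apply.

```python
DEFAULT_SYSTEM_PROMPT = "You are a helpful coding and math assistant."


def build_messages(prompt, system_prompt=DEFAULT_SYSTEM_PROMPT):
    """Build the chat message list expected by tokenizer.apply_chat_template."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt},
    ]


def generate_reply(model, tokenizer, prompt, max_new_tokens=256, temperature=None):
    """Generate one reply for a single prompt (hypothetical helper)."""
    text = tokenizer.apply_chat_template(
        build_messages(prompt),
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer([text], return_tensors="pt").to(model.device)

    gen_kwargs = {"max_new_tokens": max_new_tokens}
    if temperature is not None:
        # Sampling is opt-in; without it the model's default settings are used.
        gen_kwargs.update(do_sample=True, temperature=temperature)

    output_ids = model.generate(**inputs, **gen_kwargs)

    # Keep only the tokens produced after the prompt.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

A call such as `generate_reply(model, tokenizer, "Explain the chain rule.", temperature=0.7)` would then return the decoded reply as a string.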

	---

	## Intended Use

	* Lightweight reasoning for embedded and edge AI
	* Basic math tutoring and symbolic computation
	* Code generation and explanation for small apps
	* Technical content in Markdown, JSON, and LaTeX
	* Educational tools, personal agents, and low-power deployments
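
When the model is asked for JSON, small models often wrap the payload in extra prose or a Markdown fence. The sketch below shows one way a caller might recover the first valid JSON value from a response using only the standard library. `extract_json` is a hypothetical helper, and the sample response string is made up for illustration.

```python
import json


def extract_json(text):
    """Return the first JSON object or array found in text, or None if there is none."""
    decoder = json.JSONDecoder()
    for index, char in enumerate(text):
        if char in "{[":
            try:
                # raw_decode parses one JSON value starting at the given index.
                value, _end = decoder.raw_decode(text, index)
                return value
            except json.JSONDecodeError:
                continue  # Not valid JSON from here; try the next candidate.
    return None


# Made-up example of a response that mixes prose and JSON.
response = 'Here is the result:\n```json\n{"answer": 42, "unit": "cm"}\n```'
print(extract_json(response))
```

Returning `None` rather than raising lets the caller decide whether to retry the prompt or fall back to plain text.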

	---

	## Limitations

	* Smaller context window than 7B+ models
	* Less suitable for abstract reasoning or creative writing
	* May require prompt engineering for complex technical queries
	* Knowledge is limited to pretraining and fine-tuning datasets
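
One common form of prompt engineering for small models is few-shot prompting: showing a couple of worked examples before the real question. The following is a minimal sketch under that assumption. `build_few_shot_messages` is a hypothetical helper, and the example pairs are made up. The resulting list can be passed to `tokenizer.apply_chat_template` in the same way as `messages` in the quickstart.

```python
def build_few_shot_messages(question, examples, system_prompt="You are a helpful coding and math assistant."):
    """Build a chat message list with worked examples placed before the real question.

    examples is a list of (user_text, assistant_text) pairs.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": question})
    return messages


# Made-up examples that demonstrate the desired answer format.
examples = [
    ("Solve for x: 2x + 3 = 7", "2x = 4, so x = 2."),
    ("Solve for x: x / 5 = 3", "x = 15."),
]
messages = build_few_shot_messages("Solve for x: 3x - 4 = 11", examples)
```

Each example pair consumes context, so a small number of short examples is usually the practical choice for a model of this size.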
