granite-8b-code-instruct-4k-FP8
This is an FP8-quantized version of granite-8b-code-instruct-4k, intended for lower-memory, efficient inference.
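As a rough illustration of why FP8 helps at inference time, the arithmetic below compares weight storage at 16-bit and 8-bit precision. It is a back-of-the-envelope sketch only: the parameter count of 8 billion is an assumption read off the model name, and the helper `weight_footprint_gib` is hypothetical. Because lm_head and other non-Linear parameters stay in higher precision, the real FP8 checkpoint will be somewhat larger than this estimate.

```python
def weight_footprint_gib(n_params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in GiB for a given parameter count."""
    return n_params * bytes_per_param / 2**30


N_PARAMS = 8e9  # assumption: "8b" taken as roughly 8 billion parameters

bf16_gib = weight_footprint_gib(N_PARAMS, 2)  # 16-bit weights, 2 bytes each
fp8_gib = weight_footprint_gib(N_PARAMS, 1)   # FP8 weights, 1 byte each

print(f"16-bit weights: ~{bf16_gib:.1f} GiB")
print(f"FP8 weights:    ~{fp8_gib:.1f} GiB")
```

Activations and the KV cache are not included in this estimate; they add to the memory needed at run time.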
Model Description
- Base Model: granite-8b-code-instruct-4k
- Quantization: FP8 (E4M3 format)
- Quantization Method: llmcompressor oneshot with FP8 scheme
- Calibration Dataset: open_platypus (512 samples)
- Quantization Time: 21.6 minutes
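The settings listed above could be expressed with llm-compressor roughly as follows. This is a minimal sketch, not the exact script used to produce this checkpoint: `quantize_to_fp8` is a hypothetical helper, `max_seq_length=2048` is an assumption that the model card does not state, and the import paths may differ between llm-compressor versions (older releases expose `oneshot` under `llmcompressor.transformers`).

```python
# Calibration settings taken from the model card; max_seq_length is an assumption.
CALIBRATION_KWARGS = {
    "dataset": "open_platypus",
    "num_calibration_samples": 512,
    "max_seq_length": 2048,  # assumption: not stated in the model card
}

# Recipe settings: FP8 on every Linear layer, leaving lm_head unquantized.
RECIPE_KWARGS = {"targets": "Linear", "scheme": "FP8", "ignore": ["lm_head"]}


def quantize_to_fp8(model_id: str, output_dir: str) -> None:
    """Run one-shot FP8 quantization and write the result to output_dir."""
    # Imported lazily so the settings above can be inspected
    # without llm-compressor installed.
    from llmcompressor import oneshot
    from llmcompressor.modifiers.quantization import QuantizationModifier

    recipe = QuantizationModifier(**RECIPE_KWARGS)
    oneshot(
        model=model_id,
        recipe=recipe,
        output_dir=output_dir,
        **CALIBRATION_KWARGS,
    )


# Example call (downloads the base model and needs a suitable GPU):
# quantize_to_fp8(
#     "ibm-granite/granite-8b-code-instruct-4k",
#     "granite-8b-code-instruct-4k-FP8",
# )
```

The calibration samples are used to estimate the scales needed to map each tensor into the FP8 range, which is why a small dataset such as 512 samples is sufficient for a one-shot pass.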
Usage
With Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
# Loading the FP8 checkpoint requires the compressed-tensors package.
model = AutoModelForCausalLM.from_pretrained(
    "TevunahAi/granite-8b-code-instruct-4k-FP8",
    torch_dtype="auto",  # use the dtypes stored in the checkpoint; do not cast the whole model to FP8
    device_map="auto",
    low_cpu_mem_usage=True,
)
tokenizer = AutoTokenizer.from_pretrained("TevunahAi/granite-8b-code-instruct-4k-FP8")
# Generate
prompt = "Write a Python function to calculate fibonacci numbers:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
With vLLM (Recommended for production)
from vllm import LLM, SamplingParams
llm = LLM(model="TevunahAi/granite-8b-code-instruct-4k-FP8")
sampling_params = SamplingParams(temperature=0.7, max_tokens=256)
prompts = ["Write a Python function to calculate fibonacci numbers:"]
outputs = llm.generate(prompts, sampling_params)
# Print the generated text for each prompt
for output in outputs:
    print(output.outputs[0].text)
Quantization Details
- Target Layers: All Linear layers except lm_head
- Precision: FP8 (E4M3 format)
- Hardware Requirements: NVIDIA Ada Lovelace or Hopper GPUs for native FP8 compute; Ampere GPUs can run the model with emulation
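The hardware requirement above can be checked from a GPU's CUDA compute capability. The sketch below is illustrative: `fp8_support` is a hypothetical helper, and treating anything older than Ampere as unsupported is an assumption, since the list above does not mention earlier architectures.

```python
def fp8_support(major: int, minor: int) -> str:
    """Classify FP8 support from a CUDA compute capability (major, minor)."""
    capability = (major, minor)
    if capability >= (8, 9):  # Ada Lovelace (8.9), Hopper (9.0) and newer
        return "native"
    if capability >= (8, 0):  # Ampere (8.0, 8.6)
        return "emulated"
    return "unsupported"      # assumption: pre-Ampere GPUs are not covered


print(fp8_support(8, 9))  # Ada Lovelace
print(fp8_support(9, 0))  # Hopper
print(fp8_support(8, 6))  # Ampere
print(fp8_support(7, 5))  # Turing
```

With PyTorch installed, the tuple returned by `torch.cuda.get_device_capability()` can be passed in as `fp8_support(*torch.cuda.get_device_capability())`.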
Quantization Infrastructure
Quantization was run on the following hardware and software:
- CPUs: Dual Intel Xeon Max 9480 (224 threads, 128GB HBM2e)
- GPU: NVIDIA RTX 5000 Ada Generation (32GB VRAM) with native FP8 support
- Memory: 256GB DDR5 + 128GB HBM2e = 384GB total
- Software: Ubuntu 25.10 | Python 3.12 | PyTorch 2.8 | CUDA 13 | llm-compressor
License
Apache 2.0 (same as original model)
Credits
- Original model by IBM Granite
- Quantized by TevunahAi
- Quantization powered by llm-compressor
Model tree for TevunahAi/granite-8b-code-instruct-4k-FP8
- Base model: ibm-granite/granite-8b-code-base-4k
- Finetuned: ibm-granite/granite-8b-code-instruct-4k