blakkd
owao
6 followers · 50 following
[email protected]
AI & ML interests
None yet
Recent Activity
Reacted to Kseniase's post with 👍 about 3 hours ago:
7+ main precision formats used in AI

Precision matters in AI because it shapes how accurate and efficient models are: it controls how finely numbers are represented, approximating real-world values with formats like fixed-point and floating-point. A recent BF16 → FP16 study renewed attention to the impact of precision. Here are the main precision formats used in AI, from full precision for training to ultra-low precision for inference (a short numerical sketch comparing them follows after this activity list):

1. FP32 (Float32): Standard full-precision float used in most training: 1 sign bit, 8 exponent bits, 23 mantissa bits. The default for backward-compatible training and baseline numerical stability.

2. FP16 (Float16) → https://arxiv.org/abs/2305.10947v6
Half-precision float that balances accuracy and efficiency: 1 sign bit, 5 exponent bits, 10 mantissa bits. Common on NVIDIA Tensor Cores and in mixed-precision setups. There is now a new wave of using it in reinforcement learning: https://www.turingpost.com/p/fp16

3. BF16 (BFloat16) → https://cloud.google.com/blog/products/ai-machine-learning/bfloat16-the-secret-to-high-performance-on-cloud-tpus
Same dynamic range as FP32 but fewer mantissa bits: 1 sign bit, 8 exponent bits (same as FP32), 7 mantissa bits. Developed by Google Brain; preferred on TPUs and modern GPUs.

4. FP8 (E4M3 / E5M2) → https://proceedings.neurips.cc/paper_files/paper/2018/file/335d3d1cd7ef05ec77714a215134914c-Paper.pdf
Emerging standard for training and inference on NVIDIA Hopper (H100) and Blackwell (B200) tensor cores and AMD MI300; also supported in NVIDIA's Transformer Engine: https://developer.nvidia.com/blog/floating-point-8-an-introduction-to-efficient-lower-precision-ai-training/
E4M3 = 4 exponent bits, 3 mantissa bits; E5M2 = 5 exponent bits, 2 mantissa bits.

Read further below ⬇️ If you like this, also subscribe to the Turing Post: https://www.turingpost.com/subscribe
Reacted to Kseniase's post with 🤗 about 3 hours ago (same post as above).
Liked a model about 22 hours ago: SamuelBang/AesCoder-4B
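As referenced in the quoted precision post above, here is a minimal sketch (not part of the original post or this profile) that illustrates the range-versus-mantissa trade-off the post describes, using PyTorch dtype casts. It assumes a recent PyTorch build; the float8 dtypes (torch.float8_e4m3fn, torch.float8_e5m2) only exist in newer releases, so that part is guarded.

```python
# Minimal sketch: comparing FP32 / FP16 / BF16 (and, where available, FP8)
# with plain PyTorch dtype casts. Assumes a recent PyTorch build.
import torch

# One small value (tests mantissa precision) and one huge value (tests exponent range).
x = torch.tensor([0.1, 3.0e38], dtype=torch.float32)

# FP16: 10 mantissa bits keep 0.1 fairly accurate, but 5 exponent bits overflow 3e38 to inf.
print("fp16:", x.to(torch.float16))

# BF16: 8 exponent bits (same as FP32) keep 3e38 finite, but 7 mantissa bits round 0.1 coarsely.
print("bf16:", x.to(torch.bfloat16))

# Range/precision summary straight from torch.finfo.
for dt in (torch.float32, torch.float16, torch.bfloat16):
    info = torch.finfo(dt)
    print(f"{dt}: max={info.max:.3e}, eps={info.eps:.3e}")

# FP8 dtypes (E4M3 / E5M2) are only present in newer PyTorch builds, so this is guarded;
# finfo shows their much smaller range and much coarser resolution.
if hasattr(torch, "float8_e4m3fn"):
    print(torch.finfo(torch.float8_e4m3fn))
    print(torch.finfo(torch.float8_e5m2))
```

Running this, FP16 overflows the large value to inf while BF16 keeps it finite but rounds 0.1 more coarsely, which is exactly the trade-off between exponent range and mantissa precision described in points 2 and 3 of the post.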
Organizations
None yet
owao's models (8)
owao/ReaderLM-v2-Q8_0-GGUF • Text Generation • 2B • Updated 2 days ago • 5
owao/mem-agent-Q4_K_M-GGUF • Text Generation • 4B • Updated Sep 13 • 30 • 1
owao/MiroThinker-14B-DPO-v0.2-Q4_K_M-GGUF • Text Generation • 15B • Updated Sep 8 • 13
owao/MiroThinker-14B-DPO-v0.2-Q6_K-GGUF • Text Generation • 15B • Updated Sep 8 • 8
owao/Tri-7B-Search-preview-Q6_K-GGUF • 8B • Updated Jul 30 • 29
owao/Tri-7B-Search-preview-Q8_0-GGUF • 8B • Updated Jul 30 • 15
owao/EXAONE-4.0.1-32B-Q4_K_M-GGUF • Text Generation • 32B • Updated Jul 30 • 20
owao/RLT-32B-Q4_K_M-GGUF • Text Generation • 33B • Updated Jun 23 • 19 • 2