Model Quantization — Concepts

Model Quantization · FP32 → INT8 → INT4

unseel.com · Scale · Zero-point · 4–8× smaller

Figure legend: FP32 weight · Snapped to grid · Outlier / clipped · Quantized model
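
The labels above (scale, zero-point, snapped to grid, outlier / clipped) map onto the rounding and clamping steps of affine quantization. The following is a minimal sketch of that idea, not the page's own implementation. It assumes an unsigned integer grid and per-tensor min/max calibration, and the helper names `quant_params`, `quantize`, `dequantize`, and `size_ratio` are hypothetical, introduced only for illustration.

```python
def quant_params(w_min, w_max, bits):
    """Scale and zero-point for an unsigned affine grid (assumed scheme)."""
    q_max = 2 ** bits - 1
    # Widen the range so that 0.0 is always representable on the grid.
    w_min, w_max = min(w_min, 0.0), max(w_max, 0.0)
    scale = (w_max - w_min) / q_max
    if scale == 0.0:
        scale = 1.0  # degenerate range (all weights are zero)
    zero_point = round(-w_min / scale)
    return scale, zero_point, q_max


def quantize(w, scale, zero_point, q_max):
    """Snap one FP32 weight to the integer grid; clip values outside the range."""
    q = round(w / scale) + zero_point
    return max(0, min(q_max, q))


def dequantize(q, scale, zero_point):
    """Map a grid index back to an approximate real value."""
    return (q - zero_point) * scale


def size_ratio(bits):
    """Storage reduction vs FP32, ignoring scale/zero-point overhead."""
    return 32 / bits


# Calibrate an 8-bit grid on an assumed weight range of [-1.0, 1.55].
scale, zero_point, q_max = quant_params(-1.0, 1.55, bits=8)

in_range = quantize(0.5, scale, zero_point, q_max)  # snapped to grid
outlier = quantize(5.0, scale, zero_point, q_max)   # clipped to q_max
restored = dequantize(in_range, scale, zero_point)  # close to 0.5
```

Under these assumptions, `size_ratio(8)` and `size_ratio(4)` give the 4× and 8× reductions quoted in the tagline. Real toolchains often use per-channel or per-group scales, which add a small storage overhead that this sketch ignores.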
