AWQ vs GPTQ vs BitNet
Three ways to shrink an LLM: scale the salient weights before quantizing (AWQ), compensate for rounding error with second-order information (GPTQ), or train with ternary weights so the matmul reduces to additions and subtractions (BitNet).
A clear, side-by-side comparison with examples — part of Rudrite Research.