- HIGGS, a new method for compressing large language models, was developed in collaboration with teams at Yandex Research, MIT, KAUST, and ISTA.
- HIGGS makes it possible to compress LLMs without calibration data or resource-intensive parameter optimization.
- Unlike other compression methods, HIGGS does not require specialized hardware or powerful GPUs. Models can be quantized directly on a smartphone or laptop in just a few minutes with no significant quality loss.
- The method has already been used to quantize popular models from the Llama 3.1 and 3.2 families, as well as DeepSeek- and Qwen-family models.
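To give a feel for what data-free (calibration-free) weight quantization means in practice, here is a minimal sketch in plain Python. This is not the HIGGS algorithm itself (HIGGS uses a random Hadamard transform and MSE-optimal grids tuned to a Gaussian weight distribution); it only illustrates the general idea that a weight row can be quantized using nothing but the weights themselves, with no calibration dataset. All function names here are illustrative.

```python
import random

def quantize_row(row, bits=4):
    """Quantize one weight row to a symmetric integer grid, data-free.

    Only the row's own values are used (absmax scaling) -- no
    calibration data, no gradient-based optimization.
    """
    levels = 2 ** (bits - 1) - 1                 # e.g. 7 for 4-bit signed
    scale = max(abs(x) for x in row) / levels or 1.0
    q = [max(-levels - 1, min(levels, round(x / scale))) for x in row]
    return q, scale

def dequantize_row(q, scale):
    """Reconstruct approximate float weights from integers and the scale."""
    return [v * scale for v in q]

# Usage: quantize a synthetic Gaussian-looking weight row and check the error.
random.seed(0)
row = [random.gauss(0.0, 1.0) for _ in range(64)]
q, s = quantize_row(row)
row_hat = dequantize_row(q, s)
err = max(abs(a - b) for a, b in zip(row, row_hat))  # bounded by s / 2
```

Because rounding to the nearest grid point introduces at most half a grid step of error, the reconstruction error per weight is bounded by `scale / 2`; methods like HIGGS improve on this simple scheme by shaping the grid to the weight distribution rather than spacing it uniformly.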