LLMs No Longer Require Powerful Servers: Researchers from MIT, KAUST, ISTA, and Yandex Introduce a New AI Approach to Rapidly Compress Large Language Models without a Significant Loss of Quality

  • HIGGS, an innovative method for compressing large language models, was developed jointly by teams at Yandex Research, MIT, KAUST, and ISTA.
  • HIGGS makes it possible to compress LLMs without additional data or resource-intensive parameter optimization.
  • Unlike other compression methods, HIGGS does not require specialized hardware or powerful GPUs. Models can be quantized directly on a smartphone or laptop in just a few minutes with no significant quality loss.
  • The method has already been used to quantize popular LLaMA 3.1 and 3.2-family models, as well as DeepSeek and Qwen-family models.
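
To make "compression without additional data" concrete, here is a minimal sketch of data-free weight quantization in plain Python. It is a generic group-wise round-to-nearest scheme, not the HIGGS algorithm itself. The function names, the 4-bit width, and the group size of 64 are assumptions chosen for illustration. The point it shows is that the weights alone are enough to compute the compressed form: no calibration set and no optimization loop are involved.

```python
import random


def quantize_group(weights, bits=4):
    """Round one group of weights to the nearest of 2**bits evenly spaced levels."""
    levels = (1 << bits) - 1
    lo, hi = min(weights), max(weights)
    # Guard against a constant group, where hi == lo would give a zero scale.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def quantize(weights, bits=4, group_size=64):
    """Quantize a flat list of weights group by group, using only the weights."""
    return [
        quantize_group(weights[i:i + group_size], bits)
        for i in range(0, len(weights), group_size)
    ]


def dequantize(groups):
    """Rebuild approximate weights from (codes, scale, offset) triples."""
    restored = []
    for codes, scale, lo in groups:
        restored.extend(c * scale + lo for c in codes)
    return restored


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical stand-in for one layer's weights.
    weights = [rng.gauss(0.0, 0.02) for _ in range(1024)]
    restored = dequantize(quantize(weights))
    mse = sum((a - b) ** 2 for a, b in zip(weights, restored)) / len(weights)
    print(f"mean squared reconstruction error: {mse:.3e}")
```

Each weight is stored as a 4-bit code plus a small per-group scale and offset, which is where the memory saving comes from. More advanced data-free methods differ mainly in how they choose the quantization levels.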