Microsoft Enhances LLM Accuracy & Efficiency, Mistral's GPT-4 Rival Leaked, and Microsoft Discloses a New Risk
Microsoft shared techniques that make LLMs more accurate and efficient, Mistral's GPT-4 competitor was leaked, and Microsoft disclosed a new risk in its earnings report.
Microsoft Enhances LLM Accuracy with LASER Technique
Microsoft shared how Layer-Selective Rank Reduction (LASER) can make LLMs more accurate. The technique replaces selected weight matrices in an LLM with low-rank approximations of themselves.
Surprisingly, the right LASER adjustments can actually reduce model loss, improving performance even though the reduced matrices carry less information. Using LASER, Microsoft improved accuracy by as much as 20-30% across three open-source models, including Llama 2.
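In essence, a LASER intervention keeps only the top singular components of a chosen weight matrix. Here is a minimal NumPy sketch of that core operation; the function name and the rank choice are illustrative, not taken from Microsoft's code:

```python
import numpy as np

def laser_reduce(weight: np.ndarray, rank: int) -> np.ndarray:
    """Replace a weight matrix with its rank-`rank` approximation
    via truncated SVD (keep the top `rank` singular components)."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    return u[:, :rank] @ np.diag(s[:rank]) @ vt[:rank, :]

# Toy example: reduce a random 64x64 "weight matrix" to rank 8.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64))
w_low = laser_reduce(w, rank=8)
print(w_low.shape)  # (64, 64) -- same shape as before, but rank 8
```

In a real model, only selected matrices in selected layers are replaced this way; choosing which layer and which rank is what makes the intervention "layer-selective."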
GPT-J's accuracy on gender prediction from biographies improved from 70.9% to 97.5% after a LASER intervention.
This innovation addresses the critical issue of factual inaccuracies in AI models, offering a simple and promising solution to both hallucinations and the broader challenge of ensuring reliable and accurate AI outputs.
Microsoft Improves LLM Efficiency with SliceGPT
That wasn't all, though. Microsoft introduced SliceGPT on Hugging Face as well. SliceGPT is a post-training sparsification method that significantly reduces the computational and memory demands of large language models.
By replacing each weight matrix with a compact, dense matrix and reducing the network's embedding dimension, SliceGPT achieves remarkable efficiency. It successfully removes up to 25% of parameters in leading models like LLAMA2-70B, OPT 66B, and Phi-2 while preserving up to 99% of their zero-shot task performance.
This advancement allows the streamlined models to operate on fewer GPUs and at increased speeds, slashing the total compute for inference by over a third, all without extra code optimization. Their code is publicly available on GitHub.
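The slicing idea can be illustrated with PCA: project a layer's inputs onto their top principal directions and shrink the weight matrix to match, leaving a smaller but still dense matrix. This is a toy NumPy sketch of the concept only, not the actual SliceGPT implementation; `slice_linear` and its arguments are hypothetical:

```python
import numpy as np

def slice_linear(weight: np.ndarray, acts: np.ndarray, keep: int):
    """Shrink a linear layer's input dimension using PCA of its inputs.
    weight: (d_in, d_out); acts: (n_samples, d_in).
    Returns the sliced weight (keep, d_out) and the projection
    (d_in, keep) that must be applied to incoming activations."""
    # Principal directions of the activations entering this layer.
    cov = acts.T @ acts / len(acts)
    eigvals, eigvecs = np.linalg.eigh(cov)
    top = np.argsort(eigvals)[::-1][:keep]  # indices of top directions
    q = eigvecs[:, top]                     # (d_in, keep) projection
    return q.T @ weight, q

rng = np.random.default_rng(0)
acts = rng.standard_normal((1000, 64))   # sample inputs to the layer
w = rng.standard_normal((64, 32))
w_sliced, q = slice_linear(w, acts, keep=48)  # drop 25% of the rows
# Forward pass uses projected activations: (x @ q) @ w_sliced ~ x @ w
```

The smaller dense matrix keeps standard fast matrix-multiply kernels usable, which is why the sliced model runs on fewer GPUs with no extra code optimization.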
Mistral's Leak Signals a New GPT-4 Rival
Mistral's CEO, Arthur Mensch, confirmed a leak of a large language model that is nearing the performance levels of OpenAI's GPT-4.
Initially shared by an employee on Hugging Face and 4chan, the model, dubbed "miqu-1-70b," has showcased remarkable abilities in standard LLM benchmarks.
Funnily enough, instead of demanding that the model be taken down from Hugging Face, the CEO simply requested that the uploader "might consider attribution" to Mistral.
This unexpected leak, and the potential release of a GPT-4-class open-source model, signals a major shift in the AI landscape. It challenges the dominance of proprietary models and shows the open-source community rapidly catching up in the high-stakes race of generative AI.
Microsoft Discloses a New Risk for Strategic Alliances
Microsoft reported earnings this week, and Ben Thompson at Stratechery called out a new risk factor in its 10-Q:
"We also have limited ability to control or influence third parties with whom we have arrangements, which may impact our ability to realize the anticipated benefits."
Perhaps this addition is related to OpenAI's board firing Sam Altman last year, which could signal that Microsoft did not see that coming when it first invested in the company.