Google Gemini's Questionable Metrics, Meta & IBM Launch AI Alliance, & xAI Files for Up to $1 Billion Raise
We'll dive into the questionable metrics and exaggerated demo behind Google's Gemini Ultra announcement, discuss a new open-source AI alliance, and look at the latest fundraising round for Elon Musk's AI company.
Google Gemini Announcement Used Questionable Metrics & Exaggerated Demo
Google finally announced their long-awaited AI model Gemini. The Ultra version will be available "early next year", and the Pro version will be available on December 13th.
They tout Gemini Ultra as the highest-performing LLM on the market (a market it won't actually be on until "early next year"), saying it "exceeds current state-of-the-art results on 30 of the 32 widely-used academic benchmarks".
However, Joey Krug of Founders Fund caught that Google essentially fudged the benchmark comparisons in their fine print, and that GPT-4 is still superior at MMLU. MMLU (Massive Multitask Language Understanding) is a broad benchmark of how well a large language model understands language and can solve problems using the knowledge it encountered during training.
Notice how Gemini Ultra & GPT-4 were scored with different methodologies on the same benchmarks (blue arrows).
Google created a new, non-standard methodology for its headline MMLU number, a chain-of-thought setup with 32 samples (listed as "CoT@32" in the fine print), then showed that figure next to OpenAI's score, which was measured with the standard 5-shot setup. If you dig up Google's technical report, you can compare Gemini Ultra vs GPT-4 apples to apples, and GPT-4 still outperforms Gemini Ultra on MMLU (blue arrow). The sketch below shows why the two setups aren't directly comparable.
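For a concrete sense of the difference, here's a minimal sketch in Python. The ask_model function is a stand-in (the actual evaluation harnesses and prompts aren't public), so treat this as an illustration of the two protocols, not Google's or OpenAI's real evaluation code.

```python
from collections import Counter

# Stand-in for a real model call; in practice this would hit an LLM API.
# Returns a single answer letter ("A"-"D") for the prompt it is given.
def ask_model(prompt: str, temperature: float = 0.0) -> str:
    return "B"  # placeholder so the sketch runs end to end

def five_shot_answer(question: str, few_shot_examples: list[str]) -> str:
    # Standard MMLU-style setup: 5 solved examples, then the question,
    # and the model's single greedy answer is taken as-is.
    prompt = "\n\n".join(few_shot_examples) + "\n\n" + question + "\nAnswer:"
    return ask_model(prompt, temperature=0.0)

def cot_at_32_answer(question: str) -> str:
    # CoT@32-style setup: sample 32 chain-of-thought answers at a nonzero
    # temperature and take the most common final answer (majority vote).
    prompt = question + "\nLet's think step by step, then give a final answer letter."
    samples = [ask_model(prompt, temperature=0.7) for _ in range(32)]
    return Counter(samples).most_common(1)[0][0]

# Both functions answer the same question, but under different protocols,
# which is why accuracies from one aren't comparable with the other.
```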
It gets better: tech columnist Parmy Olson wrote that Google's impressive Gemini Ultra demo was neither done in real time nor by voice.
In her Bloomberg article, she wrote that "a Google spokesperson said it was made by using still image frames from the footage, and prompting via text." In other words, not at all like the demo Google showed above.
With all of the hype this week, here's a quick summary of Gemini:
Gemini Pro is available December 13th, no stats shared
Gemini Ultra has no firm release date, targeting "early next year"
Gemini Ultra's scores were reported with non-standard methodologies and cannot yet be independently verified
GPT-4 is superior to Gemini Ultra in MMLU using standard methodologies
And that's about it. Given that, it is surprising how the news has been framing Gemini so far.
Perhaps we'll see more articles like Parmy Olson's next week.
Meta & IBM Launch AI Alliance As Counterweight to Closed Systems
Meta and IBM have spearheaded the formation of the AI Alliance, joining forces with names like Intel, Oracle, Cornell University, and the National Science Foundation to champion open innovation in artificial intelligence.
There are 50+ companies in the alliance; Hugging Face, LangChain, and AMD are among the other names on the list.
This initiative emerges a year after the launch of OpenAI's ChatGPT, challenging the dominance of closed, proprietary AI systems like those developed by OpenAI, Anthropic, and Cohere. The Alliance, advocating for open source technology, aims to diversify the AI ecosystem, providing an alternative to the closed models that require payment for usage.
Elon Musk’s AI Venture xAI Aims for $1 Billion Funding to Challenge AI Giants
Elon Musk's AI startup, xAI, has filed to raise up to $1 billion, positioning itself as the world's AI alternative to Google, Microsoft, and OpenAI. Musk founded xAI in July with a mission "to advance our collective understanding of the universe", and the company has already raised nearly $135 million.
xAI's recent launch of Grok, a bot which accesses real-time data by integrating with X (aka Twitter), marks its first major product release. Musk stated in November that X's investors will have a 25% stake in xAI. Access to Grok is being gradually extended to X platform's Premium+ subscribers.
AI Tools & Resources
Not really a tool, but if you've ever wondered what a large language model looks like, this LLM visualization made by Brendan Bycroft is pretty fascinating.
Here's a nano-GPT, for instance.
And here's OpenAI's GPT-3.
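As a rough companion to that visualization, here's a back-of-the-envelope sketch of where a GPT-style model's parameters live. The "nano" config below is made up for illustration (the visualized nano-GPT's exact config may differ), while the GPT-3 line uses its published hyperparameters.

```python
def gpt_param_count(n_layer: int, d_model: int, vocab_size: int, context: int) -> int:
    """Rough GPT-style parameter count (weight matrices only; biases and LayerNorms ignored)."""
    embeddings = vocab_size * d_model + context * d_model  # token + position embeddings
    attention  = 4 * d_model * d_model                     # Q, K, V and output projections
    mlp        = 2 * d_model * (4 * d_model)               # up- and down-projection (4x width)
    return embeddings + n_layer * (attention + mlp)

# A made-up "nano" config, just to show the scale difference.
print(f"nano-ish GPT: {gpt_param_count(3, 48, 64, 64):,} params")        # tens of thousands
# GPT-3's published hyperparameters: 96 layers, d_model 12288, 50257-token vocab, 2048 context.
print(f"GPT-3:        {gpt_param_count(96, 12288, 50257, 2048):,} params")  # roughly 175 billion
```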