World Forbes – Business, Tech, AI & Global Insights

Meta’s vanilla Maverick AI model ranks below rivals on a popular chat benchmark

By admin, April 11, 2025

Earlier this week, Meta landed in hot water for using an experimental, unreleased version of its Llama 4 Maverick model to achieve a high score on a crowdsourced benchmark, LM Arena. The incident prompted the maintainers of LM Arena to apologize, change their policies, and score the unmodified, vanilla Maverick.

Turns out, it’s not very competitive.

The unmodified Maverick, “Llama-4-Maverick-17B-128E-Instruct,” was ranked below models including OpenAI’s GPT-4o, Anthropic’s Claude 3.5 Sonnet, and Google’s Gemini 1.5 Pro as of Friday. Many of these models are months old.

“The release version of Llama 4 has been added to LMArena after it was found out they cheated, but you probably didn’t see it because you have to scroll down to 32nd place which is where is [sic] ranks” pic.twitter.com/A0Bxkdx4LX

— ρ:ɡeσn (@pigeon__s) April 11, 2025

Why the poor performance? Meta’s experimental Maverick, Llama-4-Maverick-03-26-Experimental, was “optimized for conversationality,” the company explained in a chart published last Saturday. Those optimizations evidently played well to LM Arena, which has human raters compare the outputs of models and choose which they prefer.
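
To make the mechanism concrete, here is a minimal sketch of how pairwise human votes can be turned into a ranking. It uses an Elo-style update, which is one common approach and an assumption on our part; the article does not describe LM Arena's actual scoring method. The model names, the votes, and the `elo_ratings` helper are all hypothetical.

```python
from collections import defaultdict


def elo_ratings(votes, k=32.0, base=1000.0):
    """Return Elo-style ratings from a list of (winner, loser) votes.

    Illustrative only: not LM Arena's real scoring code.
    """
    ratings = defaultdict(lambda: base)
    for winner, loser in votes:
        # Probability the winner was expected to win, given current ratings.
        expected = 1.0 / (1.0 + 10 ** ((ratings[loser] - ratings[winner]) / 400.0))
        # An upset (low expected score) moves the ratings more.
        delta = k * (1.0 - expected)
        ratings[winner] += delta
        ratings[loser] -= delta
    return dict(ratings)


# Hypothetical votes: each tuple is (preferred model, other model).
votes = [
    ("chatty", "plain"),
    ("chatty", "plain"),
    ("plain", "chatty"),
    ("chatty", "other"),
]

ratings = elo_ratings(votes)
ranking = sorted(ratings, key=ratings.get, reverse=True)
print(ranking)
```

Under a scheme like this, a model whose answers raters simply find more pleasant to read climbs the board whether or not it is more capable on other tasks.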

As we’ve written before, LM Arena has, for various reasons, never been the most reliable measure of an AI model’s performance. Still, tailoring a model to a benchmark, besides being misleading, makes it hard for developers to predict how well the model will perform in different contexts.

In a statement, a Meta spokesperson told TechCrunch that Meta experiments with “all types of custom variants.”

“‘Llama-4-Maverick-03-26-Experimental’ is a chat optimized version we experimented with that also performs well on LM Arena,” the spokesperson said. “We have now released our open source version and will see how developers customize Llama 4 for their own use cases. We’re excited to see what they will build and look forward to their ongoing feedback.”





© 2025 world-forbes. Designed by world-forbes.
