World Forbes – Business, Tech, AI & Global Insights

OpenAI launches program to design new ‘domain-specific’ AI benchmarks

By admin, April 9, 2025

OpenAI thinks AI benchmarks are broken. Now the company is launching a program to fix how AI models are scored.

The new OpenAI Pioneers Program will focus on creating evaluations for AI models that “set the bar for what good looks like,” as OpenAI phrased it in a blog post.

“As the pace of AI adoption accelerates across industries, there is a need to understand and improve its impact in the world,” the company continued in its post. “Creating domain-specific evals are one way to better reflect real-world use cases, helping teams assess model performance in practical, high-stakes environments.”

As the recent controversy involving the crowdsourced benchmark LM Arena and Meta’s Maverick model illustrates, it’s tough these days to know precisely what differentiates one model from another. Many widely used AI benchmarks measure performance on esoteric tasks, like solving doctorate-level math problems. Others can be gamed, or don’t align well with most people’s preferences.

Through the Pioneers Program, OpenAI hopes to create benchmarks for specific domains like legal, finance, insurance, healthcare, and accounting. The lab says that, in the coming months, it’ll work with “multiple companies” to design tailored benchmarks and eventually share those benchmarks publicly, along with “industry-specific” evaluations.

“The first cohort will focus on startups who will help lay the foundations of the OpenAI Pioneers Program,” OpenAI wrote in the blog post. “We’re selecting a handful of startups for this initial cohort, each working on high-value, applied use cases where AI can drive real-world impact.”

Companies in the program will also have the opportunity to work with OpenAI’s team to create model improvements via reinforcement fine-tuning, a technique that optimizes models for a narrow set of tasks, OpenAI says.

The big question is whether the AI community will embrace benchmarks whose creation was funded by OpenAI. OpenAI has financially supported benchmarking efforts before and has designed its own evaluations. But partnering with customers to release AI tests may be seen as an ethical bridge too far.



© 2025 world-forbes. Designed by world-forbes.
