MLCommons
@mlcommons.org
mlcommons.org
MLCommons is an AI engineering consortium, built on a philosophy of open collaboration to improve AI systems. Through our collective engineering efforts, we continually measure and improve AI technologies' accuracy, safety, speed, and efficiency.
Pinned
1/
MLPerf Client v1.5 is here!
New AI PC benchmarking features:
- Windows ML support for improved GPU/NPU performance
- Expanded platforms: Windows x64, Windows on Arm, macOS, Linux, plus iPad app
- Experimental power/energy measurement
GUI improvements out of beta
Download: github.com/mlcommons/ml...
mlcommons.org
The best AI conversations don't happen in isolation.
They happen when researchers + practitioners + industry leaders are in the same room, working on shared challenges.
MLCommons Endpoints
Dec 1, San Diego (one day!)
Registration ↓
www.eventbrite.com/e/mlcommons-...
#Endpoints2025
MLCommons Endpoints
Join the MLCommons Endpoints event for a deep dive into the world of machine learning - it's going to be mind-blowing!
www.eventbrite.com
November 26, 2025 at 3:34 PM
Reposted by MLCommons
This is something we’ve put a lot of love into.

Good, bad, or indifferent, AI is something we have to deal with. Join us for a bit more of a journey into the fundamentals of AI and some of the research we’ve done with our members and partners.

Cc @odihq.bsky.social
Don’t miss #MLCommons Endpoints in San Diego, Dec 1–2!
Learn, connect, and shape the future of AI with top experts at Qualcomm Hall.
🗓 Dec 1–2 | 🎟 Free tickets available now!

www.eventbrite.com/e/mlcommons-...

#AI #MachineLearning #SanDiego
November 10, 2025 at 10:19 PM
Thanks to Babel Tech Reviews for this deep dive into the CORSAIR AI Workstation 300 and its handling of real-world AI workloads. It shows why tools like MLPerf Client matter: they give a clear view of how systems actually perform on common LLM & inference tasks.
babeltechreviews.com/the-corsair-...
The CORSAIR AI Workstation 300 Review – The Battle of the Mini PCs vs. the Maxi
CORSAIR sent us the AI Workstation 300, a remarkably compact system that is smaller than a shoebox. Designed primarily for AI/machine learning, it is a ...
babeltechreviews.com
November 20, 2025 at 5:22 PM
MLCommons benchmarks are now cited in #ISO/IEC AI testing standards - and we're helping shape what comes next. Contributing to standards on benchmark quality (42119-8) and AI management systems (42003).
Standards are made by those in the room. mlcommons.org/2025/11/iso-...
#AIStandards #MLCommons
From Community Benchmarks to Global Standards: MLCommons Shaping AI Governance - MLCommons
MLCommons is bridging its open-source AI benchmarks with international standards (ISO/IEC JTC 1/SC 42), defining best practices for AI testing, governance, and risk management globally.
mlcommons.org
November 19, 2025 at 4:38 PM
MLPerf Training v5.1 results are live!
Record participation: 20 organizations submitted 65 unique systems featuring 12 different accelerators. Multi-node submissions increased 86% over last year, showing the industry's focus on scale.
Results: mlcommons.org/2025/11/trai...
#MLPerf
1/3
November 12, 2025 at 4:06 PM
IEEE Spectrum's analysis shows how our benchmarks capture the real industry challenge - LLMs scale exponentially while hardware improves incrementally.
This pattern highlights why evolving benchmarks are important. Stay tuned for MLPerf Training v5.1, out on 11/12.

spectrum.ieee.org/mlperf-trends
AI Model Growth Outpaces Hardware Improvements
AI training races are heating up as benchmarks get tougher.
spectrum.ieee.org
November 3, 2025 at 5:54 PM
🚨 NEW: We tested 39 AI models for security vulnerabilities.

Not a single one was as secure as it was "safe."

Today, we're releasing the industry's first standardized jailbreak benchmark. Here's what we found 🧵1/6

mlcommons.org/2025/10/ailu...
October 15, 2025 at 7:33 PM
We've just released MLPerf Mobile version 5.0.2 on the Google Play Store and GitHub. This version adds support for devices based on the Samsung Exynos 2500 SoC.

play.google.com/store/apps/d...
MLPerf Mobile - Apps on Google Play
An AI benchmark for mobile devices
play.google.com
October 1, 2025 at 4:07 PM
How can LLMs reliably find, understand, & analyze datasets?
By combining:

📂 #Croissant – AI-ready #metadata
#MCP – agentic access to data & tools
Our new blog introduces Eclair: tools that let #LLMs discover, download & explore millions of #datasets.

mlcommons.org/2025/10/croi...
Metadata, Meet Datasets: Croissant and MCP in Action - MLCommons
mlcommons.org
October 1, 2025 at 3:31 PM
TinyML benchmarks finally address real-world deployment with MLCommons' new streaming benchmark in MLPerf Tiny v1.3. It tests 20 minutes of continuous wake-word detection while measuring power and duty cycle.
Technical deep dive: mlcommons.org/2025/09/mlpe... #MLPerf #TinyML #EdgeAI
A New TinyML Streaming Benchmark for MLPerf Tiny v1.3 - MLCommons
mlcommons.org
September 24, 2025 at 6:29 PM
New MLPerf Tiny v1.3 results!

New streaming wake-word detection test + 70 results measuring sub-100KB neural networks across hardware platforms.

Thanks to Qualcomm, ST Microelectronics, Syntiantcorp & Kai Jiang for pushing TinyML forward.

View results: mlcommons.org/2025/09/mlpe...
MLCommons New MLPerf Tiny 1.3 Benchmark Results Released - MLCommons
New data reveals advances in tiny neural network performance
mlcommons.org
September 17, 2025 at 3:54 PM
We have released MLPerf Mobile v5.0.1 with support for MediaTek Dimensity 9400 SoCs and a few other improvements.
The app is available for Android phones via the Google Play Store and the MLCommons GitHub repo.
Let us know what you think!
https://play.google.com/store/apps/details?id=org.mlcommons.android.mlperfbench
September 16, 2025 at 4:22 PM
Reposted by MLCommons
@mlcommons.org dropped MLPerf inference 5.1 with three key benchmarks for enterprise AI: a reasoning model (DeepSeek-R1), speech recognition, and a smaller LLM for summarization tasks.

Catch up on this essential reading for AI decisions: www.techarena.ai/content/ai-b...

#MLCommons #MLPerf
AI Benchmarking Hits New Heights with MLPerf Inference 5.1 Release
Three groundbreaking inference benchmarks debut reasoning models, speech recognition, and ultra-low latency scenarios as 27 organizations deliver record results.
www.techarena.ai
September 9, 2025 at 3:27 PM
MLPerf Inference v5.1 results are live!
Record 27 organizations submitted 1,472 performance results across new and established AI workloads.
Three new benchmarks debut:

- Reasoning with DeepSeek-R1
- Speech-to-text with Whisper
- Small LLM with Llama 3.1 8B

Read More: mlcommons.org/2025/09/mlpe...
September 9, 2025 at 6:15 PM
1/6
MLCommons & AVCC announce MLPerf Automotive v0.5 benchmark results—a major step for transparent, reproducible automotive AI performance data. mlcommons.org/2025/08/mlpe...
AVCC and MLCommons Release New MLPerf Automotive v0.5 Benchmark Results - MLCommons
AVCC® and MLCommons® announced new results for their new MLPerf® Automotive v0.5 benchmark
mlcommons.org
August 27, 2025 at 6:15 PM
1/ MLCommons just released results for the MLPerf Storage v2.0 benchmark—an industry-standard suite for measuring storage system performance in #ML workloads. This benchmark remains architecture-neutral, representative, and reproducible.
mlcommons.org/2025/08/mlpe...
New MLPerf Storage v2.0 Benchmark Results Demonstrate the Critical Role of Storage Performance in AI Training Systems - MLCommons
New checkpoint benchmarks provide “must-have” information for optimizing AI training
mlcommons.org
August 4, 2025 at 5:36 PM
MLPerf Client v1.0 is out! 🎉

The new benchmark for LLMs on PCs and client systems is now available—featuring expanded model support, new workload scenarios, and broad hardware integration.

Thank you to all submitters! #AMD, #Intel, @microsoft.com, #NVIDIA, #Qualcomm

mlcommons.org/2025/07/mlpe...
MLCommons Releases MLPerf Client v1.0: A New Standard for AI PC and Client LLM Benchmarking - MLCommons
MLCommons Releases MLPerf Client v1.0 with Expanded Models, Prompts, and Hardware Support, Standardizing AI PC Performance.
mlcommons.org
July 30, 2025 at 3:12 PM
MLCommons just launched MLPerf Mobile on the Google Play Store! 📱
Benchmark your Android device’s AI performance on real-world ML tasks with this free, open-source app.
Try it now: play.google.com/store/apps/d...
July 10, 2025 at 7:01 PM
Today, MLCommons is announcing a new collaboration with contributors from across academia, civil society, and industry to co-develop an open agent reliability evaluation standard to operationalize trust in agentic deployments.
🔗 https://mlcommons.org/2025/06/ares-announce/
1/3
MLCommons Builds New Agentic Reliability Evaluation Standard in Collaboration with Industry Leaders - MLCommons
MLCommons and partners unite to create actionable reliability standards for next-generation AI agents.
mlcommons.org
June 27, 2025 at 7:07 PM
Reposted by MLCommons
We're all about acceleration! 😉
Watch @priya-kasimbeg.bsky.social & @fsschneider.bsky.social speedrun an explanation of the AlgoPerf benchmark, rules, and results all within a tight 5 minutes for our #ICLR2025 paper video on "Accelerating Neural Network Training". See you in Singapore!
April 3, 2025 at 11:15 AM
Companies are deploying AI tools that haven't been pressure-tested, and it's already backfiring.

In her new op-ed, our President, Rebecca Weiss, breaks down how industry-led AI reliability standards can help executives avoid costly, high-profile failures.

📖 More: bit.ly/3FP0kjg

@fastcompany.com
AI is posing immediate threats to your business. Here’s how to protect yourself
The AI threats your business is facing right now, and how to prevent them
www.fastcompany.com
June 20, 2025 at 4:03 PM
Call for Submissions!

#MLCommons & @AVCConsortium are accepting submissions for the #MLPerf Automotive Benchmark Suite! Help drive fair comparisons & optimize AI systems in vehicles. Focus is on camera sensor perception.

📅 Submissions close June 13th, 2025

Join: mlcommons.org/community/su...
June 5, 2025 at 6:12 PM