Lightnews — Scholar-powered news

MLCommons

@mlcommons.org

MLCommons is an AI engineering consortium, built on a philosophy of open collaboration to improve AI systems. Through our collective engineering efforts, we continually measure and improve AI technologies' accuracy, safety, speed, and efficiency.

Posts Replies Media Videos

MLCommons

@mlcommons.org

3/3
Thanks to all submitters: AMD, ASUSTeK, Cisco, Datacrunch, Dell, Giga Computing, HPE, Krai, Lambda, Lenovo, MangoBoost, MiTAC, Nebius, NVIDIA, Oracle, Quanta Cloud Technology, Supermicro, University of Florida, Wiwynn

November 12, 2025 at 4:06 PM

MLCommons

@mlcommons.org

2/3
Two new benchmarks debut:
-Llama 3.1 8B: LLM pretraining (replaces BERT)
-Flux.1: Transformer-based text-to-image (replaces Stable Diffusion v2)

GenAI benchmarks show performance improvements outpacing Moore's Law, with 24% increase in Llama 2 70B LoRA submissions.

November 12, 2025 at 4:06 PM

MLCommons

@mlcommons.org

The next generation of AI won't just be innovative—it'll be resilient.

Access the benchmark and full findings: mlcommons.org/ailuminate/j...

Join the conversation!
6/6
#AIRiskandReliability #AISecurity

October 15, 2025 at 7:33 PM

MLCommons

@mlcommons.org

Why this matters:
→ Developers get standardized metrics to find and fix vulnerabilities
→ Policymakers get transparent, reproducible data
→ Users get systems they can actually trust

We're making hidden risks visible and measurable.
5/6

October 15, 2025 at 7:33 PM

MLCommons

@mlcommons.org

The Jailbreak Benchmark v0.5 tests AI resilience across:
-Text-to-text scenarios
-Multimodal scenarios
-12 hazard categories (violent crimes, CBRNE, child exploitation, suicide/self-harm, and more)

Built on our AILuminate safety benchmark methodology.
4/6

October 15, 2025 at 7:33 PM

MLCommons

@mlcommons.org

What is jailbreaking?

It's when users manipulate AI systems to bypass safety filters and produce harmful, unintended, or policy-violating content.

It's not theoretical. It's happening now.
3/6

October 15, 2025 at 7:33 PM

MLCommons

@mlcommons.org

The gap between AI safety and security is real—and dangerous.

89% of models showed degraded safety performance when exposed to common jailbreak techniques.

As AI powers healthcare, finance, and critical infrastructure, this vulnerability can't be ignored.
2/6

October 15, 2025 at 7:33 PM

MLCommons

@mlcommons.org

Nebius, NVIDIA, Oracle, Quanta Cloud Technology, Red Hat, Supermicro, TheStage.AI, University of Florida, Vultr

Results:
Datacenter: mlcommons.org/benchmarks/i...
Edge: mlcommons.org/benchmarks/i...
#MLPerf

September 9, 2025 at 6:15 PM

MLCommons

@mlcommons.org

Llama 2 70B shows remarkable progress - best systems now 5x faster than v4.0.
Thanks to all submitters AMD, Amitash Nanda, ASUSTeK, Broadcom, Cisco, CoreWeave, Dell, GATEOverflow, GigaComputing, Google, HPE, Intel, KRAI, Lambda, Lenovo, MangoBoost, Microsoft Azure, MiTAC,

September 9, 2025 at 6:15 PM

MLCommons

@mlcommons.org

6/6
Congrats to all contributors and working group members for advancing industry benchmarking! #MLPerf #Automotive #ADAS #AutonomousVehicles #AI #Cognata #Motional #NVIDIA #AVCC #MLCommons

August 27, 2025 at 6:15 PM

MLCommons

@mlcommons.org

5/6
The results are designed to help OEMs, suppliers, and the whole ecosystem make informed decisions for next-generation, safety-critical automotive AI systems. See results: mlcommons.org/benchmarks/mlperf-automotive/

Benchmark MLPerf Autotmotive MLCommons V0.5

The MLPerf Automotive benchmark suite measures the performance of computers intended for automotive, both for Advanced Driving Assistance System/Autonomous Driving (ADAS/AD) and In-Vehicle Infotainmen...

mlcommons.org

August 27, 2025 at 6:15 PM

MLCommons

@mlcommons.org

4/6
MLPerf Automotive v0.5 covers 2D object recognition & segmentation and 3D object recognition using high-res datasets from Cognata (8-megapixel imagery) and Motional (nuScenes).

August 27, 2025 at 6:15 PM

MLCommons

@mlcommons.org

3/6
Special thanks to submitters GateOverflow and NVIDIA, and dataset partners Cognata_Ltd and Motional for making these benchmarks possible.

August 27, 2025 at 6:15 PM

MLCommons

@mlcommons.org

2/6
This milestone is powered by collaboration across Ambarella, ARM, Bosch, C-Tuning Foundation, CeCaS, Cognata, Motional, NVIDIA, Qualcomm, Red Hat, Samsung, Siemens EDA, UC Davis, and ZF Group.

August 27, 2025 at 6:15 PM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news