Thanks to all submitters: AMD, ASUSTeK, Cisco, Datacrunch, Dell, Giga Computing, HPE, Krai, Lambda, Lenovo, MangoBoost, MiTAC, Nebius, NVIDIA, Oracle, Quanta Cloud Technology, Supermicro, University of Florida, Wiwynn
Thanks to all submitters: AMD, ASUSTeK, Cisco, Datacrunch, Dell, Giga Computing, HPE, Krai, Lambda, Lenovo, MangoBoost, MiTAC, Nebius, NVIDIA, Oracle, Quanta Cloud Technology, Supermicro, University of Florida, Wiwynn
Two new benchmarks debut:
-Llama 3.1 8B: LLM pretraining (replaces BERT)
-Flux.1: Transformer-based text-to-image (replaces Stable Diffusion v2)
GenAI benchmarks show performance improvements outpacing Moore's Law, with 24% increase in Llama 2 70B LoRA submissions.
Two new benchmarks debut:
-Llama 3.1 8B: LLM pretraining (replaces BERT)
-Flux.1: Transformer-based text-to-image (replaces Stable Diffusion v2)
GenAI benchmarks show performance improvements outpacing Moore's Law, with 24% increase in Llama 2 70B LoRA submissions.
Access the benchmark and full findings: mlcommons.org/ailuminate/j...
Join the conversation!
6/6
#AIRiskandReliability #AISecurity
Access the benchmark and full findings: mlcommons.org/ailuminate/j...
Join the conversation!
6/6
#AIRiskandReliability #AISecurity
→ Developers get standardized metrics to find and fix vulnerabilities
→ Policymakers get transparent, reproducible data
→ Users get systems they can actually trust
We're making hidden risks visible and measurable.
5/6
→ Developers get standardized metrics to find and fix vulnerabilities
→ Policymakers get transparent, reproducible data
→ Users get systems they can actually trust
We're making hidden risks visible and measurable.
5/6
-Text-to-text scenarios
-Multimodal scenarios
-12 hazard categories (violent crimes, CBRNE, child exploitation, suicide/self-harm, and more)
Built on our AILuminate safety benchmark methodology.
4/6
-Text-to-text scenarios
-Multimodal scenarios
-12 hazard categories (violent crimes, CBRNE, child exploitation, suicide/self-harm, and more)
Built on our AILuminate safety benchmark methodology.
4/6
It's when users manipulate AI systems to bypass safety filters and produce harmful, unintended, or policy-violating content.
It's not theoretical. It's happening now.
3/6
It's when users manipulate AI systems to bypass safety filters and produce harmful, unintended, or policy-violating content.
It's not theoretical. It's happening now.
3/6
89% of models showed degraded safety performance when exposed to common jailbreak techniques.
As AI powers healthcare, finance, and critical infrastructure, this vulnerability can't be ignored.
2/6
89% of models showed degraded safety performance when exposed to common jailbreak techniques.
As AI powers healthcare, finance, and critical infrastructure, this vulnerability can't be ignored.
2/6
Results:
Datacenter: mlcommons.org/benchmarks/i...
Edge: mlcommons.org/benchmarks/i...
#MLPerf
Results:
Datacenter: mlcommons.org/benchmarks/i...
Edge: mlcommons.org/benchmarks/i...
#MLPerf
Thanks to all submitters AMD, Amitash Nanda, ASUSTeK, Broadcom, Cisco, CoreWeave, Dell, GATEOverflow, GigaComputing, Google, HPE, Intel, KRAI, Lambda, Lenovo, MangoBoost, Microsoft Azure, MiTAC,
Thanks to all submitters AMD, Amitash Nanda, ASUSTeK, Broadcom, Cisco, CoreWeave, Dell, GATEOverflow, GigaComputing, Google, HPE, Intel, KRAI, Lambda, Lenovo, MangoBoost, Microsoft Azure, MiTAC,
Congrats to all contributors and working group members for advancing industry benchmarking! #MLPerf #Automotive #ADAS #AutonomousVehicles #AI #Cognata #Motional #NVIDIA #AVCC #MLCommons
Congrats to all contributors and working group members for advancing industry benchmarking! #MLPerf #Automotive #ADAS #AutonomousVehicles #AI #Cognata #Motional #NVIDIA #AVCC #MLCommons
The results are designed to help OEMs, suppliers, and the whole ecosystem make informed decisions for next-generation, safety-critical automotive AI systems. See results: mlcommons.org/benchmarks/mlperf-automotive/
The results are designed to help OEMs, suppliers, and the whole ecosystem make informed decisions for next-generation, safety-critical automotive AI systems. See results: mlcommons.org/benchmarks/mlperf-automotive/
MLPerf Automotive v0.5 covers 2D object recognition & segmentation and 3D object recognition using high-res datasets from Cognata (8-megapixel imagery) and Motional (nuScenes).
MLPerf Automotive v0.5 covers 2D object recognition & segmentation and 3D object recognition using high-res datasets from Cognata (8-megapixel imagery) and Motional (nuScenes).
Special thanks to submitters GateOverflow and NVIDIA, and dataset partners Cognata_Ltd and Motional for making these benchmarks possible.
Special thanks to submitters GateOverflow and NVIDIA, and dataset partners Cognata_Ltd and Motional for making these benchmarks possible.
This milestone is powered by collaboration across Ambarella, ARM, Bosch, C-Tuning Foundation, CeCaS, Cognata, Motional, NVIDIA, Qualcomm, Red Hat, Samsung, Siemens EDA, UC Davis, and ZF Group.
This milestone is powered by collaboration across Ambarella, ARM, Bosch, C-Tuning Foundation, CeCaS, Cognata, Motional, NVIDIA, Qualcomm, Red Hat, Samsung, Siemens EDA, UC Davis, and ZF Group.