Google Brain. ML Efficiency, LLMs,
@trustworthy_ml.
Paper out now!🔻
The @lmarena.bsky.social has become the go-to evaluation for AI progress.
Our release today demonstrates the difficulty in maintaining fair evaluations on the Arena, despite best intentions.
So how fair—and scientifically rigorous—is today’s most widely used evaluation benchmark?
We took a deep dive into Chatbot Arena to find out. 🧵
We have an awesome lineup of speakers who have made deep contributions to open-source in ML, e.g. @sarahooker.bsky.social, @chrisrackauckas.bsky.social, Matt Johnson, Tri Dao, @stellaathena.bsky.social, Evan Shelhamer.
codeml-workshop.github.io/codeml2025/
A comprehensive multimodal & multilingual benchmark for VLMs! It contains real questions from exams in different languages.
🌍 20,911 questions and 18 languages
📚 14 subjects (STEM → Humanities)
📸 55% multimodal questions
Releasing open weights helps to make breakthroughs in VLMs accessible to the research community.
@hf.co 🔥🔥
We launched open-weights with the goal of making VLM breakthroughs accessible to the research community - so exciting to see such a positive response.
huggingface.co/CohereForAI/...
Aya Vision adds breakthrough multimodal capabilities to our state-of-the-art multilingual 8B and 32B models. 🌿
Our policy primer explores ways to move towards more sustainable AI. 🌱
📜 cohere.com/research/pap...
In this work led by Sara Hooker, we seek to understand the viability of compute thresholds ⚖️ as a way to mitigate risk. 🦺
arxiv.org/abs/2407.05694
📜https://arxiv.org/abs/2410.10801
We are proud to work on global AI that is efficient and accessible 🔥
INCLUDE: Evaluating Multilingual LLMs with Regional Knowledge (arxiv.org/abs/2411.19799)
A benchmark of ~200k QA pairs across 44 languages, capturing real-world cultural nuances.
A collaborative effort led by @cohereforai.bsky.social, with contributors worldwide.
/1
We provide a taxonomy of open problem areas in TAIG organized by governance capacities and governance targets.
📜https://arxiv.org/pdf/2407.14981
This project focused on adapting educational materials to students’ skill levels, ensuring more effective and responsible AI integration in classrooms.
I think a summit is typically most valuable as a catalyst, not as a solution in itself.
But, will share some observations.
www.linkedin.com/posts/bgamaz...
Thanks to @baratunde.com for hosting the Head of Cohere For AI, @sarahooker.bsky.social, on the latest episode of Life with Machines.
Check out their full conversation on YouTube:
youtu.be/-BsobAoOJvk
Additionally, in some languages we're outperforming:
🔒proprietary models
🐘larger models
⛰️models built by more researchers with more infrastructure
Lots to be proud of today.
This paper is part of a larger research agenda where we have focused on how to better represent the long tail = making AI work for almost all real world distributions.
Check out our cross-institutional collaboration, which discusses intriguing & previously unknown generalisation properties of compression.
📜Learn more: arxiv.org/abs/2211.02738
Comprehensive and a good starting point for researchers working on efficiency.
In an era of ever larger models, work on efficiency is ever more important. This cross-institutional collaboration provides a survey of the field for practitioners and researchers alike ⚙️.
📜Learn more: arxiv.org/pdf/2209.000...