www.buzzsprout.com/1930383/epis...
Curious what the data says for Canada - might not be as bad, but I'd bet the trends are similar.
HellaSwag is currently one of the most widely used LLM benchmarks in the world. We introduce a new critical method to assess the validity of standard LLM evals and show that HellaSwag does not accurately measure common sense reasoning. arxiv.org/abs/2504.07825
I had the privilege of being a keynote speaker at the International Association for Safe & Ethical AI.
I focused on solutions and ideas for moving forward.
They made it public, and here it is.
Enjoy, and hope it's helpful! 🤗
oecdtv.webtv-solution.com/embed-or-en-...
buttondown.com/maiht3k/arch...
The challenge of doing X with as few resources as possible (memory, power, etc).
What are OpenAI + friends doing? The opposite.
"Let's assume we have unlimited resources & try to build a god."
No wonder they're losing their shit with the DeepSeek thing.
READ THIS OR GET LEFT BEHIND
www.bloodinthemachine.com/p/the-critic...