Lightnews — Scholar-powered news

How Much Energy Does AI Use? The People Who Know Aren’t Saying

Reposted by Yacine Jernite

Dr Sasha Luccioni @sashamtl.bsky.social · Jul 17

Friends!
Does anyone know of any model distillation with public logs (W&B or other)?
I'm trying to figure out the energy tradeoffs between model training and distillation.

Reposted by Yacine Jernite

Avijit Ghosh @evijit.io · Jul 15

New blog post alert! 🚨 "What is the Hugging Face Community Building?", with @yjernite.bsky.social and Irene Solaiman

The AI narrative focuses on big players, but the real story is happening in the open source AI ecosystem across 1.8M models, 450K datasets, and 560K apps, on @hf.co.

Reposted by Yacine Jernite

Dr Sasha Luccioni @sashamtl.bsky.social · Jun 19

One of the biggest frustrations I have is the lack of transparency around AI's energy use and environmental impacts. I know the numbers are out there... but somehow we're not seeing them 🫠

Thank you @wired.com for covering this topic in such depth and detail!

www.wired.com/story/ai-car...

A growing body of research attempts to put a number on energy use and AI—even as the companies behind the most popular models keep their carbon emissions a secret.

www.wired.com

Reposted by Yacine Jernite

Daniel van Strien @danielvanstrien.bsky.social · Jun 17

“AI Scraping Bots Are Breaking Open Libraries, Archives, and Museums” – interesting piece via @404media.co

Not a perfect fix, but making ML-ready datasets from collections can help.

If you want help getting your data on @hf.co, I'd be happy to help.

Screenshot of the header of the article with text:

AI Scraping Bots Are Breaking Open Libraries, Archives, and Museums

Reposted by Yacine Jernite

Daniel van Strien @danielvanstrien.bsky.social · Jun 16

Institutional Books: Massive Historical Text Corpus

- 983K books, 242B tokens, 386M pages
- 19th-20th century texts in 254 languages
- Refined OCR with quality scores & metadata
- Noncommercial early-access release

huggingface.co/datasets/ins...

institutional/institutional-books-1.0 · Datasets at Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

Open Source AI: A Cornerstone of Digital Sovereignty

Yacine Jernite @yjernite.bsky.social · Jun 11

Great blog post on *Digital Sovereignty and OS AI* led by the fantastic @frimelle.bsky.social!

Digital sovereignty for AI needs to properly account for:
📚 data
🧑‍🔬 technology
💽 infrastructure
⚖️ regulation

Open/transparent AI contributes to all, read for some concrete examples!
hf.co/blog/frimell...

A Blog post by Lucie-Aimée Kaffee on Hugging Face

Bigger isn't always better: how to choose the most efficient model for context-specific tasks 🌱🧑🏼‍💻

Reposted by Yacine Jernite

Lucie-Aimée Kaffee @frimelle.bsky.social · Jun 11

❗️New policy blogpost!
The EU is speaking a lot about sovereignty. A cornerstone of digital sovereignty is and has to be open source.
As AI becomes more central, the ability to govern, adapt, and understand these systems is no longer optional.

Reposted by Yacine Jernite

Alex Hanna-ween @alexhanna.bsky.social · Jun 6

Off-mark from @sanders.senate.gov.

We need progressive legislators to not buy into the clown show from AI CEOs. Labor replacement is not real, labor displacement is.

We need regulation to protect workers and anticipate the kind of worker speedups that employers buying into the hype will cause.

A tweet from Bernie Sanders. It reads:
The CEO of Anthropic (a powerful AI company) predicts that AI could wipe out HALF of entry-level white collar jobs in the next 5 years.

We must demand that increased worker productivity from AI benefits working people, not just wealthy stockholders on Wall St. AI IS A BIG DEAL.

Reposted by Yacine Jernite

Dr Sasha Luccioni @sashamtl.bsky.social · May 28

How can we make informed choices based on performance AND energy when using AI in real-life tasks like question answering? By evaluating them and picking the models that optimize both factors!
Check out my new blog post on the subject:
huggingface.co/blog/sasha/e...

A Blog post by Sasha Luccioni on Hugging Face
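
One way to read "picking the models that optimize both factors" is as a Pareto selection: keep every model that no other model beats on both accuracy and energy at once. The sketch below is only an illustration of that idea, not the method from the blog post; the `ModelScore` type, the model names, and all numbers are hypothetical placeholders rather than measurements.

```python
from typing import List, NamedTuple


class ModelScore(NamedTuple):
    name: str
    accuracy: float   # task score, higher is better
    energy_wh: float  # energy per query in watt-hours, lower is better


def pareto_front(models: List[ModelScore]) -> List[ModelScore]:
    """Return the models that no other model dominates on (accuracy, energy)."""
    front = []
    for m in models:
        dominated = any(
            o.accuracy >= m.accuracy
            and o.energy_wh <= m.energy_wh
            and (o.accuracy > m.accuracy or o.energy_wh < m.energy_wh)
            for o in models
        )
        if not dominated:
            front.append(m)
    # Sort from cheapest to most expensive for easier reading.
    return sorted(front, key=lambda m: m.energy_wh)


# Hypothetical candidates: placeholder values, not real benchmark results.
candidates = [
    ModelScore("model-a", 0.80, 1.0),
    ModelScore("model-b", 0.85, 5.0),
    ModelScore("model-c", 0.78, 3.0),  # worse than model-a on both axes
    ModelScore("model-d", 0.85, 8.0),  # same accuracy as model-b, more energy
]

front = pareto_front(candidates)
for m in front:
    print(m.name, m.accuracy, m.energy_wh)
```

The remaining models are the real trade-offs: moving along the front buys accuracy at the cost of energy, and which point is right depends on the task.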

AI Personas: The Impact of Design Choices

Yacine Jernite @yjernite.bsky.social · May 7

I'm consistently impressed by @giadapistilli.com's extensive insights into AI technology 🤗

Her latest blog on design factors of AI "companions" shows that those go way beyond model performance and gives some nice hands-on tools to do your own analysis - must read!

huggingface.co/blog/giadap/...

AI Personas: The Impact of Design Choices

Reposted by Yacine Jernite

Giada Pistilli @giadapistilli.com · May 7

Ever notice how some AI assistants feel like tools while others feel like companions? Turns out, it's not always about fancy tech upgrades, because sometimes it's just clever design.

huggingface.co/blog/giadap/...

Consent by Design: Approaches to User Data in Open AI Ecosystems

Reposted by Yacine Jernite

Dr Sasha Luccioni @sashamtl.bsky.social · Apr 29

We just integrated the new Qwen3-8B into Chat UI Energy and asked it to do a simple multiplication problem.
What's the energy cost?
→ Without reasoning: wrong (😅), but low energy use
→ With reasoning: correct (!!)… but using 42x more energy!

Reposted by Yacine Jernite

Giada Pistilli @giadapistilli.com · Apr 17

🤗 New from us! Just published a blog post exploring how we're rethinking consent in the AI ecosystem.

Here's what we're seeing in the @hf.co Hub that differs from traditional closed systems...

Yacine Jernite @yjernite.bsky.social · Apr 17

128k context windows + code specific were the deciding factors, and 32B caught a lot more than 7B!

Space Privacy - a Hugging Face Space by yjernite

Yacine Jernite @yjernite.bsky.social · Apr 16

The app comes with a bunch of pre-reviewed apps/Spaces, great to see how many process data locally or through (private) HF endpoints 🤗

Note that this is a POC, lots of exciting work to do to make it more robust, so:
- try it: hf.co/spaces/yjern...
- reach out to collab: hf.co/spaces/yjern...

4/4 🧵

Analysing privacy concerns in deployed Spaces

Yacine Jernite @yjernite.bsky.social · Apr 16

The app works in four stages:
1. Download all code files
2. Use the Code LM to generate a detailed report pointing to code where data is transferred/(AI-)processed (screen 1)
3. Summarize the app's main functionality and data journeys (screen 2)
4. Build a Privacy TLDR with those inputs

3/4 🧵
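
The stages listed in this post can be sketched as a small pipeline. This is a rough illustration under stated assumptions, not the Space's actual code: `review_space`, `fetch_files`, `generate`, and the prompt wording are all hypothetical names introduced here, and the code LM is passed in as a plain callable so the sketch runs without any model or network access.

```python
from typing import Callable, Dict


def review_space(
    space_id: str,
    fetch_files: Callable[[str], Dict[str, str]],  # hypothetical: returns {path: source text}
    generate: Callable[[str], str],                # hypothetical: wraps a code LM call
) -> Dict[str, str]:
    # Stage 1: download all code files and join them into one prompt context.
    files = fetch_files(space_id)
    code = "\n\n".join(
        f"# file: {path}\n{text}" for path, text in sorted(files.items())
    )

    # Stage 2: detailed report pointing to code where data is transferred or processed.
    detailed = generate(
        "Point to the code where user data is transferred or AI-processed:\n\n" + code
    )

    # Stage 3: summary of the app's main functionality and data journeys.
    summary = generate(
        "Summarize the app's main functionality and data journeys:\n\n" + code
    )

    # Stage 4: privacy TLDR built from the two previous outputs.
    tldr = generate(
        "Write a short privacy TLDR from these inputs:\n\n" + detailed + "\n\n" + summary
    )
    return {"detailed_report": detailed, "summary": summary, "tldr": tldr}


# Usage with stand-in functions (no model is called here).
calls = []


def fake_generate(prompt: str) -> str:
    calls.append(prompt)
    return f"output {len(calls)}"


result = review_space(
    "user/demo-space",  # placeholder Space id
    lambda _space_id: {"app.py": "print('hi')"},
    fake_generate,
)
print(result["tldr"])
```

Passing the model in as a callable keeps the stage order easy to test and leaves the choice of model and hosting to the caller.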

A sample from the detailed report with references to specific code snippets from the app

A sample from the summary describing the Space functionality, AI services, and data journeys

Space Privacy - a Hugging Face Space by yjernite

Yacine Jernite @yjernite.bsky.social · Apr 16

That requires actually reading the code though, which isn't always easy or quick!
Good news: code LMs have gotten pretty good at automatic review, so we can offload some of the work - here I'm using Qwen2.5-Coder to generate reports and it works pretty OK, have a look 👇
hf.co/spaces/yjern...

2/4 🧵

Analysing privacy concerns in deployed Spaces

Empowering Public Organizations: Preparing Your Data for the AI Era

Yacine Jernite @yjernite.bsky.social · Apr 16

Today in Privacy & AI Tooling - introducing a nifty new tool to examine where data goes in open-source apps on @hf.co 🤗

HF Spaces have tons (100Ks!) of cool demos leveraging or examining AI systems - and because most of them are OSS we can see exactly how they handle user data 📚🔍

1/4 🧵

Interface of Space Privacy Analyzer app, describing how it reviews Hugging Face Spaces for data privacy concerns, with a pre-loaded example for the Hugging Face demo app for SmolVLM2

A TLDR report generated by the Spaces Privacy app outlining the different types of data used and where they go when using the app

Reposted by Yacine Jernite

Avijit Ghosh @evijit.io · Apr 15

Thrilled to share that our paper:
"It's not a representation of me": Examining Accent Bias and Digital Exclusion in Synthetic AI Voice Services - has been accepted at @facct.bsky.social 2025! - with @shiramichel.bsky.social, Sufi Kaur, Sarah Gillespie, Jeffrey Gleason and Dr. Christo Wilson.

Yacine Jernite @yjernite.bsky.social · Apr 10

New blog post led by @evijit.io on wrangling public data for AI - and helping public orgs have more control over how AI systems serve their mission by shaping how their data is used 📚
Have a read especially if your org's being asked to do more AI (common theme these days 🤗)

hf.co/blog/evijit/...

A Blog post by Avijit Ghosh on Hugging Face