narsilou.bsky.social
@narsilou.bsky.social
Me: This function is too slow. Find a faster algorithm.
Cursor: Hold my beer.

Me: *Slacking off with colleagues*
Cursor: Ping.

Me: 🤯
June 11, 2025 at 2:18 PM
Reposted
We just released text-generation-inference 3.3.0. This release adds prefill chunking for VLMs 🚀. We also made Gemma 3 faster and reduced its VRAM usage by switching to flashinfer for prefills with images.

github.com/huggingface/...
Release v3.3.0 · huggingface/text-generation-inference
Notable changes Prefill chunking for VLMs. What's Changed Fixing Qwen 2.5 VL (32B). by @Narsil in #3157 Fixing tokenization like https://github.com/huggingface/text-embeddin… by @Narsil in #3156...
github.com
May 9, 2025 at 3:39 PM
Hot take: Rust is really good for vibe coding, much better than Python or JS. Why? The compiler will not let crap pass.

Yes, the LLM can still get it wrong and fail.

The elegant error messages will nudge the LLM, so I don't have to do it constantly.
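A toy illustration of the kind of diagnostic I mean (a hypothetical snippet, not from any real session):

```rust
fn main() {
    let prompt = String::from("Find a faster algorithm");
    // Without the `.clone()` below, rustc rejects the program with
    // error[E0382]: borrow of moved value: `prompt`, points at the
    // exact move site, and suggests the fix. That is a signal an LLM
    // can act on without a human in the loop.
    let owned = prompt.clone();
    println!("{prompt} / {owned}");
}
```

A Python or JS runtime would only surface the equivalent mistake when (and if) that path executes.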
May 8, 2025 at 2:13 PM
Dipping my toe into vibe coding myself to get the gist of it.

My first project is writing something like superwhisper, because I couldn't find anything that worked well enough for Wayland.

github.com/Narsil/whisp...
It also works on Mac for kicks (Whisper.cpp-backed).
GitHub - Narsil/whispering
Contribute to Narsil/whispering development by creating an account on GitHub.
github.com
May 8, 2025 at 2:12 PM
Want to run DeepSeek R1?

Text-generation-inference v3.1.0 is out and supports it out of the box.

Both on AMD and Nvidia!
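Once the server is up, querying it is one HTTP call against TGI's /generate endpoint. A minimal sketch in Rust, assuming a local instance mapped to port 8080 (the port and prompt are placeholders; reqwest with the "blocking" and "json" features plus serde_json are the only dependencies):

```rust
use serde_json::json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::blocking::Client::new();
    // TGI exposes a plain /generate endpoint taking `inputs` + `parameters`.
    let response = client
        .post("http://localhost:8080/generate")
        .json(&json!({
            "inputs": "Why is the sky blue?",
            "parameters": { "max_new_tokens": 128 }
        }))
        .send()?
        .error_for_status()?;
    println!("{}", response.text()?);
    Ok(())
}
```

The server answers with JSON carrying a `generated_text` field; recent versions also expose an OpenAI-compatible /v1/chat/completions route if you prefer that client shape.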
January 31, 2025 at 2:25 PM
Text-generation-inference v3.0.2 is out.

Basically, we can run transformers models (those that support flash attention) at roughly the same speed as native TGI ones.
What this means is broader model support.

Today it unlocks Cohere2, Olmo, Olmo2, and Helium.

Congrats, Cyril Vallez!

github.com/huggingface/...
Release v3.0.2 · huggingface/text-generation-inference
Tl;dr New transformers backend supporting flashattention at roughly same performance as pure TGI for all non officially supported models directly in TGI. Congrats @Cyrilvallez New models unlocked: ...
github.com
January 24, 2025 at 2:55 PM
Performance leap: TGI v3 is out. It processes 3x more tokens and runs 13x faster than vLLM on long prompts. Zero config!
December 10, 2024 at 10:08 AM
Reposted
We just deployed Qwen/QwQ-32B-Preview on HuggingChat! It's Qwen's latest experimental reasoning model.

It's super interesting to see the reasoning steps, and the results are really impressive too. Feel free to try it out here: huggingface.co/chat/models/...

I'd love to get your feedback on it!
Qwen/QwQ-32B-Preview - HuggingChat
Use Qwen/QwQ-32B-Preview with HuggingChat
huggingface.co
November 28, 2024 at 8:20 PM
Reposted
I'm disheartened by how toxic and violent some responses were here.

There was a mistake, a quick follow-up to mitigate it, and an apology. I worked with Daniel for years, and he is one of the people most concerned with the ethical implications of AI. Some replies are at Reddit levels of toxicity. We need empathy.
I've removed the Bluesky data from the repo. While I wanted to support tool development for the platform, I recognize this approach violated principles of transparency and consent in data collection. I apologize for this mistake.
First dataset for the new @huggingface.bsky.social @bsky.app community organisation: one-million-bluesky-posts 🦋

📊 1M public posts from Bluesky's firehose API
🔍 Includes text, metadata, and language predictions
🔬 Perfect for experimenting with ML on Bluesky 🤗

huggingface.co/datasets/blu...
November 27, 2024 at 11:09 AM
Reposted
It's pretty sad to see the negative sentiment towards Hugging Face on this platform due to a dataset posted by one of its employees. I want to write a small piece. 🧵

Hugging Face empowers everyone to use AI to create value and is against the monopolization of AI; it's a hosting platform above all.
November 27, 2024 at 3:23 PM