Leire Aguirre
leireaguirre.hf.co
Building Argilla @ Hugging Face 🤗
Reposted by Leire Aguirre
💫 Generate RAG data with the Synthetic Data Generator to improve your RAG system!

1️⃣ Generate from your documents, dataset, or dataset description.
2️⃣ Configure it.
3️⃣ Generate the synthetic dataset.
4️⃣ Fine-tune the retrieval and reranking models.
5️⃣ Build a RAG pipeline.
January 20, 2025 at 4:42 PM
Reposted by Leire Aguirre
The finish line is near! We're building FineWeb-Edu for many languages and need your help 🤗

Many FineWeb-C languages are close to 1,000 annotations!

Assamese is 99.4% done, French needs 64 more annotations, Tamil: 216.

Please help us reach the goal: huggingface.co/spaces/data-...
January 6, 2025 at 2:32 PM
Reposted by Leire Aguirre
💥 Ending 2024: A full data annotation journey on the Hugging Face Hub—from raw data to training-ready datasets!

With Argilla 2.6.0, you can push your data to the Hub from the UI.

Let’s make 2025 the year anyone can build more transparent and accountable AI—no coding or model skills needed.
December 20, 2024 at 11:14 AM
Reposted by Leire Aguirre
🚀 Argilla v2.6.0 is here! 🎉

Let me show you how EASY it is to export your annotated datasets from Argilla to the Hugging Face Hub. 🤩

Take a look at this quick demo 👇

💁‍♂️ More info about the release at github.com/argilla-io/a...

#AI #MachineLearning #OpenSource #DataScience #HuggingFace #Argilla
December 19, 2024 at 12:39 PM
I've just contributed 10 examples to this dataset:

data-is-better-together-fineweb-c.hf.space/share-your-p...
Join and contribute to the dataset eus - euskara - Basque
December 11, 2024 at 11:48 AM
Reposted by Leire Aguirre
In a couple of minutes, we’ll officially kick off the FineWeb 2 Annotation Sprint.

🎶 Go at your own pace, and do what you can.
🤏 There is no minimum.
👐 Every contribution is welcome.

The more of us there are, the better the result will be.
December 10, 2024 at 11:52 AM
Reposted by Leire Aguirre
🙌 I just wanted to share a few thoughts about the latest Argilla release, 2.5.0, as it's a pretty big one!

Argilla now has full support for webhooks, which means you can do some pretty cool stuff, like model training on the fly as annotations are created. 🤯

#MachineLearning #NLP #DataLabeling
December 2, 2024 at 11:14 AM
Reposted by Leire Aguirre
For anyone interested in fine-tuning or aligning LLMs, I’m running this free and open course called smol course. It’s not a big deal, it’s just smol.

🧵>>
December 3, 2024 at 9:21 AM