unrahul.bsky.social
@unrahul.bsky.social
Reposted
I did a 1 hr speed-run on multimodal computer vision (VLMs, multimodal retrieval, zero-shot vision) in MIT AI Visions

it's up on youtube by popular demand www.youtube.com/embed/_TlhKH...
November 1, 2024 at 5:51 PM
Reposted
Baguettotron is now taught in the classroom.
January 19, 2026 at 11:00 PM
Reposted
Been pretty heads-down finishing Chapter 6 on implementing RLVR via GRPO. Just finished, and it might be my favorite chapter so far.

Code notebook: github.com/rasbt/reason...

(And it should be added to the early access soon.)

The next chapter adds stability and performance improvements to GRPO.
January 18, 2026 at 2:58 PM
Reposted
In the spirit of NanoGPT, we created Picotron: The minimalist & most-hackable repository for pre-training Llama-like models with 4D Parallelism (Data, Tensor, Pipeline, Context parallel)
GitHub - huggingface/picotron: Minimalistic 4D-parallelism distributed training framework for education purpose
buff.ly
January 29, 2025 at 5:00 PM
Reposted
Building datasets to train smaller, task-focused models used to be incredibly time-consuming.

Very excited to see SAM3 massively lower that barrier. Describe the class you want to detect and get annotated datasets automatically!

Try it yourself: huggingface.co/datasets/uv-...!
November 21, 2025 at 1:30 PM
Reposted
New study suggests that metastatic cancer cells may evade the immune system by stealing mitochondria from immune cells and setting up a ‘shield’ that protects them from being killed by immune cells. Wily little bastards.

#Science 🧪
Cancer might evade immune defences by stealing mitochondria
Hijacking the energy-producing organelles from immune cells seems to help tumours in mice to infiltrate lymph nodes.
www.nature.com
January 17, 2026 at 1:39 PM
Reposted
Introducing DroPE: Extending Context by Dropping Positional Embeddings

We found embeddings like RoPE aid training but bottleneck long-sequence generalization. Our solution’s simple: treat them as a temporary training scaffold, not a permanent necessity.

arxiv.org/abs/2512.12167
pub.sakana.ai/DroPE
January 12, 2026 at 4:07 AM