Mark Ibrahim
@markibrahim.bsky.social
58 followers 120 following 13 posts
Researching the dark arts of deep learning at Meta's FAIR (Fundamental AI Research) Lab https://markibrahim.me/
markibrahim.bsky.social
We explain how good delimiters steer attention heads to key input tokens and offer practical recommendations for prompt and delimiter choices to get the best performance from your LLM. tl;dr: use “!” or “\n”.
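A minimal sketch of what “choice of delimiter” means in practice (the demonstrations and delimiters below are illustrative, not the paper's exact evaluation harness):

```python
# Illustrative few-shot prompt assembly: the SAME demonstrations joined
# by different single-character delimiters can yield very different
# accuracy from the same model.
demos = [
    "Q: What is 2 + 2? A: 4",
    "Q: What is the capital of France? A: Paris",
]
query = "Q: What is 3 + 5? A:"

for delim in ["\n", "!", " ", ";"]:
    prompt = delim.join(demos + [query])
    print(f"{delim!r}: {prompt[:48]!r}")
```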
markibrahim.bsky.social
- MMLU performance can vary by +/- 23% depending on the choice of delimiter across leading open model families (Llama, Qwen, and Gemma).
- Closed models such as GPT-4o are also brittle to the choice of delimiter.

🧵
markibrahim.bsky.social
One can manipulate LLM rankings to put any model in the lead merely by changing the single character separating demonstration examples. Learn more in our new paper: arxiv.org/abs/2510.05152
w/ Jingtong Su, Jianyu Zhang, @karen-ullrich.bsky.social , and Léon Bottou.
🧵
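A toy illustration of the ranking flip with invented scores (not numbers from the paper): if per-delimiter accuracy varies enough, either model can be put in the lead just by fixing a different demonstration delimiter.

```python
# Invented scores for illustration only (NOT results from the paper).
scores = {
    "model_a": {"\n": 0.71, "!": 0.58},
    "model_b": {"\n": 0.63, "!": 0.66},
}

for delim in ["\n", "!"]:
    winner = max(scores, key=lambda m: scores[m][delim])
    print(f"delimiter {delim!r}: {winner} leads")
# delimiter '\n': model_a leads
# delimiter '!':  model_b leads
```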
markibrahim.bsky.social
Open weights for our Llip multimodal vision-language model, led by @lavoiems.bsky.social, are public!

Llip proposes a new pre-training objective to capture the many ways to describe an image, leading to strong performance across a suite of 22 zero-shot benchmarks.

bsky.app/profile/lavo...
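A heavily simplified sketch of the idea: condition the pooled visual representation on the caption, then apply a standard contrastive loss. All shapes, the attention-pooling details, and the temperature below are illustrative assumptions, not the released Llip code.

```python
import torch
import torch.nn.functional as F

B, K, D = 8, 16, 256                  # batch, visual mixture tokens, embed dim
visual_tokens = torch.randn(B, K, D)  # K candidate "views" of each image
text_embed = torch.randn(B, D)        # one caption embedding per image

# Caption-conditioned pooling: each caption attends over the K visual
# tokens, so different captions can pick out different aspects of the
# same image (the "many ways to describe an image").
attn = torch.softmax(
    torch.einsum("bd,bkd->bk", text_embed, visual_tokens) / D ** 0.5, dim=-1
)
visual_embed = torch.einsum("bk,bkd->bd", attn, visual_tokens)

# Standard CLIP-style symmetric contrastive loss on the pooled features.
v = F.normalize(visual_embed, dim=-1)
t = F.normalize(text_embed, dim=-1)
logits = v @ t.T / 0.07               # temperature is illustrative
labels = torch.arange(B)
loss = (F.cross_entropy(logits, labels) + F.cross_entropy(logits.T, labels)) / 2
```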
markibrahim.bsky.social
We also find better models are not necessarily better at abstention, suggesting abstention is an open research question.

w/ @polkirichenko.bsky.social, Sam Bell, and Kamalika Chaudhuri

Paper: arxiv.org/abs/2506.09038
Code: github.com/facebookrese...

bsky.app/profile/polk...

🧵2/2
markibrahim.bsky.social
A good language model should say “I don’t know” by reasoning about the limits of its knowledge. Our new work AbstentionBench carefully measures this overlooked skill in an open codebase others can build on!

We find reasoning fine-tuning in frontier models degrades their ability to know when NOT to answer.

🧵1/2
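A toy sketch of how one might score abstention behavior (an illustrative heuristic with made-up responses, not AbstentionBench's actual metric, prompts, or data):

```python
# Mark a response as an abstention if it signals "I don't know".
ABSTAIN_MARKERS = ("i don't know", "i do not know", "unanswerable",
                   "cannot be determined")

def abstained(response: str) -> bool:
    r = response.lower()
    return any(marker in r for marker in ABSTAIN_MARKERS)

# (label, model response): the model should abstain exactly when the
# question is unanswerable.
eval_set = [
    ("unanswerable", "I don't know; the premise is underspecified."),
    ("unanswerable", "The answer is 42."),   # failure: should have abstained
    ("answerable",   "Paris."),
]

correct = sum((label == "unanswerable") == abstained(resp)
              for label, resp in eval_set)
print(f"abstention alignment: {correct}/{len(eval_set)}")
```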
markibrahim.bsky.social
Join us as a PhD research intern at FAIR w/ @polkirichenko.bsky.social and Kamalika Chaudhuri, starting this summer or fall, with a focus on open science on multimodal models, agents, and beyond! Email [email protected] with the subject line [Prospective Intern 2025] and attach your CV if interested!
markibrahim.bsky.social
We find MLM-U-trained transformers can even outperform transformers trained with additional supervision from A* search traces, showing the promise of alternative learning objectives.

Learn more on our site and code at facebookresearch.github.io/maze_navigat...
markibrahim.bsky.social
Recently, we also applied the same MLM-U objective to maze navigation. We find that, when training parameter-matched transformers on identical data, MLM-U without any tweaks outperforms standard next-token training across all maze grid sizes (up to 30x30).
markibrahim.bsky.social
We find MLM-U training improves knowledge retrieval on Wikipedia-based questions and even outperforms a pretrained 7B Mistral model with a much smaller 100M-parameter transformer trained from scratch!

Come by our NeurIPS poster in Exhibit Halls A-C, #3204, at 11am PST on Thursday to learn more.
markibrahim.bsky.social
We show that training with a factorization-agnostic objective, MLM-U (a variable-ratio BERT-style loss with links to discrete diffusion), which predicts multiple tokens ahead and back, can significantly mitigate the reversal curse!
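A minimal sketch of a variable-ratio masked-prediction step in the spirit of MLM-U (the uniform mask-rate sampling and the `mlmu_step` interface are assumptions for illustration, not the paper's exact implementation):

```python
import torch
import torch.nn.functional as F

def mlmu_step(model, tokens, mask_id):
    """One training step: mask a uniformly sampled fraction of tokens and
    predict them from the remaining (bidirectional) context."""
    # tokens: (batch, seq_len) int ids; model: any non-causal predictor
    # returning (batch, seq_len, vocab) logits. Both are stand-ins here.
    ratio = torch.rand(())                                  # mask rate ~ U(0, 1)
    mask = torch.rand_like(tokens, dtype=torch.float) < ratio
    inputs = torch.where(mask, torch.full_like(tokens, mask_id), tokens)
    logits = model(inputs)
    # Loss only on masked positions: the model must reconstruct tokens
    # both ahead of and behind the visible context, so no single
    # left-to-right factorization is baked in.
    return F.cross_entropy(logits[mask], tokens[mask])
```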
markibrahim.bsky.social
Problem: language models struggle with the “reversal curse”: an inability to answer reformulations of a question. We show this stems from the standard next-token learning objective, in what we call the “factorization curse.”
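To spell out the factorization at issue: next-token training maximizes only the left-to-right factorization of the joint distribution,

```latex
\log p_\theta(x_1, \dots, x_T) \;=\; \sum_{t=1}^{T} \log p_\theta\left(x_t \mid x_{<t}\right)
```

so a model trained on “A is B” receives gradient signal for p(B | A) but none directly for the reverse query p(A | B).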
markibrahim.bsky.social
Can we boost transformers’ ability to retrieve knowledge and plan in maze navigation by tweaking only the learning objective?

We emphatically say YES in our #NeurIPS 2024 study! 🧵

w/ Ouail Kitouni, Niklas Nolte, Diane Bouchacourt, Adina Williams, and Mike Rabbat

Paper arxiv.org/abs/2406.05183