Marek Suppa
@mrshu.bsky.social
𝗛𝗼𝗻𝗲𝘀𝘁𝗟𝗟𝗠

- Introduces 𝙃𝙊𝙉𝙀𝙎𝙀𝙏, a dataset with 930 queries in six categories to evaluate LLM honesty

- Proposes curiosity-driven prompting and two-stage fine-tuning for improving honesty and helpfulness

- Demonstrates up to a 124.7% improvement in honesty and helpfulness on models like Mistral-7b
December 6, 2024 at 9:06 PM
Reposted by Marek Suppa
This is my tiny hill I will die on.
November 29, 2024 at 9:10 PM
Multimodal Large Language Models Make Text-to-Image Generative Models Align Better

- VisionPrefer dataset captures diverse preferences (prompt-following, aesthetic, fidelity, harmlessness) using multimodal LLMs

- VP-Score model matches human accuracy in preference prediction, guiding model tuning
December 5, 2024 at 10:28 PM
The Super Weight in Large Language Models

Setting as few as a single weight to zero will make various LLMs go from generating coherent text to outputting gibberish.

arxiv.org/abs/2411.07191
Recent works have shown a surprising result: a small fraction of Large Language Model (LLM) parameter outliers are disproportionately important to the quality of the model. LLMs contain billions of pa...
November 28, 2024 at 9:16 AM