@riverdong.bsky.social
Overall, our work introduces a multi-faceted evaluation framework for LLM personalization. We hope our framework and empirical insights will guide the development of more robust, inclusive, and responsible personalization approaches that can better serve diverse global users.
⚖️ Personalization Can Protect Minority Viewpoints!
In diverse-user settings, personalization helps amplify underrepresented perspectives (User 8 in the Figure). Without personalization, models tend to default to majority opinions, sidelining minority viewpoints.
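A toy sketch of this effect (illustrative data, not the paper's code): a single shared preference model collapses to the majority vote, so the one minority user is always mispredicted, while conditioning on the user recovers their preference.

```python
from collections import Counter

# Hypothetical pairwise choices: seven users prefer response "A",
# one (the minority viewpoint, user_8) prefers "B".
user_prefs = {f"user_{i}": "A" for i in range(1, 8)}
user_prefs["user_8"] = "B"

# Non-personalized model: one shared prediction, i.e. the majority vote.
majority = Counter(user_prefs.values()).most_common(1)[0][0]
shared_preds = {u: majority for u in user_prefs}

# Personalized model: predicts each user's own preference.
personal_preds = dict(user_prefs)

def accuracy(preds):
    return sum(preds[u] == user_prefs[u] for u in user_prefs) / len(user_prefs)

print(accuracy(shared_preds))    # user_8 is always wrong
print(accuracy(personal_preds))  # every user, including user_8, is served
```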
⚠️ Personalization can hurt model safety & reasoning by up to 30%.
📊 Key Findings:
(1) Performance can vary by up to 36%
(2) Fine-tuning per user is a strong baseline
(3) Among recently proposed algorithms, Personalized Reward Modeling (PRM) achieves the best performance, while Group Preference Optimization (GPO) adapts fastest to new users.
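A minimal sketch of the idea behind personalized reward modeling (not the paper's PRM implementation): the reward head takes a learned per-user embedding alongside the response features, so the same response can receive user-dependent scores. All names and dimensions here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

n_users, d_user, d_resp = 4, 8, 16
user_emb = rng.normal(size=(n_users, d_user))   # one learned embedding per user
W = rng.normal(size=(d_user + d_resp,)) * 0.1   # linear reward head (stand-in for an MLP)

def reward(user_id: int, resp_feats: np.ndarray) -> float:
    """Score one response for one specific user."""
    x = np.concatenate([user_emb[user_id], resp_feats])
    return float(W @ x)

resp = rng.normal(size=d_resp)
# The same response gets a different reward per user:
scores = [reward(u, resp) for u in range(n_users)]
```

In training, `user_emb` and `W` would be fit jointly on each user's preference comparisons; a new user then only needs a fresh embedding rather than a full fine-tune.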
Dataset Characteristics Matter!
📌 OpenAI Reddit Summarization → Higher annotator agreement, minimal personalization needed.
📌 P-SOUPS → Strong user disagreement, useful for testing but unrealistic.
📌 Personal-LLM → Imbalance between majority & minority preferences.
🚀 Thrilled to share our paper: "A Multi-Faceted Analysis of Personalized Preference Learning." We introduce a multi-faceted framework to evaluate personalized preference learning algorithms in real-world conditions.
📄 Paper: arxiv.org/pdf/2502.19158
🚨New Paper Alert🚨
Many personalization methods optimize performance but ignore real-world impact.
We examine personalization's effects on:
✅ Performance
⚖️ Fairness: Can it represent minorities fairly?
⚠️ Unintended Effects: Does it harm safety?
🔄 Adaptability: Quickly adapt to new users?
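A toy illustration of evaluating two of these axes together (hypothetical numbers, not the paper's benchmark): alongside mean accuracy, track the worst-served user, since a high average can hide a large gap for minority-viewpoint users.

```python
# Hypothetical per-user accuracies for one personalization method.
per_user_accuracy = {
    "user_1": 0.92, "user_2": 0.88, "user_3": 0.90,
    "user_8": 0.61,   # a minority-viewpoint user
}

mean_acc = sum(per_user_accuracy.values()) / len(per_user_accuracy)
worst_acc = min(per_user_accuracy.values())
fairness_gap = mean_acc - worst_acc   # large gap = minority users underserved

print(f"mean={mean_acc:.3f} worst={worst_acc:.3f} gap={fairness_gap:.3f}")
```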