Lightnews — Scholar-powered news

Hongli Zhan ✈️ ICML @hongli-zhan.bsky.social · Jul 8

I will also be on IBM's Expo Talk Panel on Monday, Jul 14, to discuss how SPRI can be used in industry to generate high-quality synthetic data. You can find more details of the talk here: icml.cc/virtual/2025...

ICML Expo Talk Panel: Situating principles in context for synthetic data (ICML 2025)

Hongli Zhan ✈️ ICML @hongli-zhan.bsky.social · Jul 8

📜Link to the paper: icml.cc/virtual/2025...
👨🏻‍💻Code and data: github.com/honglizhan/S...

Shout out to an amazing team @jessyjli.bsky.social, @m-yurochkin.bsky.social, Muneeza Azmat & Raya Horesh! Also super grateful to the reviewers for their invaluable feedback!

#ICML2025 #LLMAlignment

ICML Poster: SPRI: Aligning Large Language Models with Context-Situated Principles (ICML 2025)

Hongli Zhan ✈️ ICML @hongli-zhan.bsky.social · Jul 8

1️⃣SPRI generates principles as effective as those written by psychologists for improving users’ well-being

2️⃣SPRI enables tailored rubrics for LLM-judges, matching human-crafted rubrics (e.g., BiGGen-Bench)

3️⃣SPRI-generated synthetic data boosts Llama/Mistral/Gemma (7~9B) on TruthfulQA, with no loss on other benchmarks

Hongli Zhan ✈️ ICML @hongli-zhan.bsky.social · Jul 8

🎯Motivation: Constitutional AI works great for aligning LLMs, but the principles can be too generic to apply. Can we guide responses with context-situated principles instead?

💡SPRI tackles this and rivals human oracle guidance in the three real-world use cases we tested 👇

Hongli Zhan ✈️ ICML @hongli-zhan.bsky.social · Jul 8

I'll be at #ICML to present SPRI next week! Come by our poster on Tuesday, July 15, 4:30pm, and let’s catch up on LLM alignment! 😃

🚀TL;DR: We introduce Situated-PRInciples (SPRI), a framework that automatically generates input-specific principles to align responses — with minimal human effort.

🧵

Hongli Zhan ✈️ ICML @hongli-zhan.bsky.social · May 2

I’m excited to share that our paper has been accepted at #ICML2025! 🎉🥳🎊

This work was done during my internship at IBM Research, and it wouldn’t have been possible without a top-notch team and my amazing advisor 👏

Reposted by Hongli Zhan ✈️ ICML

Jessy Li @jessyjli.bsky.social · May 2

To appear #ICML2025!! 🎉

Hongli Zhan ✈️ ICML @hongli-zhan.bsky.social · Feb 6

Constitutional AI works great for aligning LLMs, but the principles can be too generic to apply.

Can we guide responses with context-situated principles instead?

Introducing SPRI, a system that produces principles tailored to each query, with minimal to no human effort.

arxiv.org/pdf/2502.03397

Hongli Zhan ✈️ ICML @hongli-zhan.bsky.social · Feb 7

I definitely agree :) I think SPRI can help generate SFT data for their constitutional classifiers that extrapolate *beyond* the "chemical weapons" context that they show in Sec 5 and Appendix B.

Thanks for sharing this!

Reposted by Hongli Zhan ✈️ ICML

Jessy Li @jessyjli.bsky.social · Feb 6

The principles that LLMs align with should be specific to the task at hand! Check out @hongli-zhan.bsky.social’s latest work 👇

Hongli Zhan ✈️ ICML @hongli-zhan.bsky.social · Feb 6

Constitutional AI works great for aligning LLMs, but the principles can be too generic to apply.

Can we guide responses with context-situated principles instead?

Introducing SPRI, a system that produces principles tailored to each query, with minimal to no human effort.

arxiv.org/pdf/2502.03397

Hongli Zhan ✈️ ICML @hongli-zhan.bsky.social · Feb 6

[5/5] Code and model generations: github.com/honglizhan/S...

This project was carried out during my internship at IBM Research, and I’d like to highlight the support and mentorship from my amazing hosts Muneeza Azmat, Raya Horesh, @m-yurochkin.bsky.social and advisor @jessyjli.bsky.social!

Hongli Zhan ✈️ ICML @hongli-zhan.bsky.social · Feb 6

[4/5] In addition, when applying SPRI to generate SFT data for alignment, we observe substantial improvement on TruthfulQA.

Hongli Zhan ✈️ ICML @hongli-zhan.bsky.social · Feb 6

[3/5] We tested SPRI on 3 tasks: generating 1) cognitive reappraisals, 2) instance-specific rubrics for LLM-as-a-judge, and 3) SFT data for alignment.

SPRI turns out to work great for tasks that require complex principles, performing on par with expert-guided methods.

Hongli Zhan ✈️ ICML @hongli-zhan.bsky.social · Feb 6

[2/5] Short for Situated-PRInciples, SPRI involves 2 stages: 1) synthesizing context-situated principles, and 2) crafting principle-guided responses.

In each stage, a base model and a critic model are used to create principles and responses from scratch through critique-refine.
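
The two stages described above can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the function names, prompt wording, score scale, stopping threshold, and round limit are all assumptions, and the base and critic models are stood in for by plain Python callables.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical interfaces; the posts do not specify prompts, score scale,
# stopping rule, or number of rounds.
BaseModel = Callable[[str], str]                # prompt -> generated text
CriticModel = Callable[[str], Tuple[int, str]]  # prompt -> (score, feedback)


def critique_refine(task_prompt: str, base: BaseModel, critic: CriticModel,
                    threshold: int = 4, max_rounds: int = 3) -> str:
    """Draft with the base model, then loop: the critic scores, the base revises."""
    draft = base(task_prompt)
    for _ in range(max_rounds):
        score, feedback = critic(f"{task_prompt}\n\nDraft:\n{draft}")
        if score >= threshold:  # assumed stopping rule
            break
        draft = base(
            f"{task_prompt}\n\nPrevious draft:\n{draft}\n\n"
            f"Feedback:\n{feedback}\n\nRevise the draft."
        )
    return draft


@dataclass
class SituatedOutput:
    principles: str
    response: str


def spri_sketch(query: str, base: BaseModel, critic: CriticModel) -> SituatedOutput:
    # Stage 1: synthesize principles situated in this query's context.
    principles = critique_refine(
        f"Write guiding principles for responding to this input:\n{query}",
        base, critic,
    )
    # Stage 2: craft a response guided by those principles.
    response = critique_refine(
        f"Input:\n{query}\n\nPrinciples:\n{principles}\n\n"
        "Respond to the input, following the principles.",
        base, critic,
    )
    return SituatedOutput(principles, response)
```

Any two callables with these shapes can be passed in, for example thin wrappers around an LLM client; the same critique-refine loop is reused for both stages, which mirrors the description that a base model and a critic model are used in each stage.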
