Apurv Verma
@apurv-verma.bsky.social
Building safer, more aligned models 🧭 📐
PhD student, NJIT 🎓 | NLP at Bloomberg 🛠️
Website: vermaapurv.com/aboutme/
Ever wondered about watermarking's effect on model alignment? 🤔
We found it degrades alignment, shifting safety and helpfulness behavior. Our fix: generate 2-4 responses and pick the best one 🎯
"Watermarking Degrades Alignment in Language Models" 📄
arxiv.org/abs/2506.04462
#AIResearch #AISafety #Watermarking #LLMs
Watermarking Degrades Alignment in Language Models: Analysis and Mitigation
Watermarking techniques for large language models (LLMs) can significantly impact output quality, yet their effects on truthfulness, safety, and helpfulness remain critically underexamined. This paper...
arxiv.org
June 8, 2025 at 1:57 AM
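The mitigation described in the post is essentially best-of-N sampling. Below is a minimal sketch of that idea, assuming a hypothetical generate() call on a watermarked model and an alignment_score() judge (e.g. a reward model); the names and parameters are illustrative stand-ins, not the paper's code.

```python
# Minimal best-of-N sketch of the mitigation described in the post above.
# Assumptions (not from the paper): `generate` draws one watermarked
# completion for a prompt, and `alignment_score` is any judge of
# safety/helpfulness, such as a reward model. Both are hypothetical.
from typing import Callable


def best_of_n(
    prompt: str,
    generate: Callable[[str], str],                 # watermarked sampler
    alignment_score: Callable[[str, str], float],   # higher = better aligned
    n: int = 4,                                     # the post suggests 2-4 candidates
) -> str:
    """Sample n watermarked responses and return the highest-scoring one."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda resp: alignment_score(prompt, resp))
```

The intuition, as the post frames it, is that any single watermarked sample may drift in alignment, but choosing the best of a few candidates recovers much of that loss.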
Reposted by Apurv Verma
Very good (technical) explainer answering "How has DeepSeek improved the Transformer architecture?". Aimed at readers already familiar with Transformers.

epoch.ai/gradient-upd...
How has DeepSeek improved the Transformer architecture?
This Gradient Updates issue goes over the major changes that went into DeepSeek’s most recent model.
epoch.ai
January 30, 2025 at 9:07 PM
Reposted by Apurv Verma
Very interesting paper by Ananda Theertha Suresh et al.

For categorical/Gaussian distributions, they derive the rate at which a sample is forgotten to be 1/k after k rounds of recursive training (hence 𝐦𝐨𝐝𝐞𝐥 𝐜𝐨𝐥𝐥𝐚𝐩𝐬𝐞 happens more slowly than intuitively expected)
December 27, 2024 at 11:35 PM
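For intuition, here is a toy simulation of the recursive-training setup the post refers to: each round re-fits a categorical distribution by maximum likelihood on samples drawn from the previous round's model, and we track how often a given symbol's estimated probability hits zero (is "forgotten"). The 10-way uniform start, sample size, and trial count are illustrative assumptions of mine, not the paper's experiment.

```python
import numpy as np


def recursive_training_sim(n_samples=100, n_rounds=50, n_trials=2000, seed=0):
    """Toy recursive training on a categorical distribution.

    Each round draws n_samples from the current model and re-fits it by
    maximum likelihood (empirical frequencies). We record how often symbol 0
    has been forgotten (estimated probability exactly 0) by each round.
    All parameters are illustrative assumptions, not the paper's setup.
    """
    rng = np.random.default_rng(seed)
    k_categories = 10
    p0 = np.full(k_categories, 1.0 / k_categories)
    forgotten_by_round = np.zeros(n_rounds)
    for _ in range(n_trials):
        p = p0.copy()
        for r in range(n_rounds):
            counts = rng.multinomial(n_samples, p)
            p = counts / n_samples   # MLE re-fit on the model's own samples
            if p[0] == 0.0:          # symbol 0 is no longer generated: forgotten
                forgotten_by_round[r:] += 1
                break
    return forgotten_by_round / n_trials


if __name__ == "__main__":
    frac = recursive_training_sim()
    for r in (5, 10, 20, 40):
        print(f"round {r:2d}: fraction of runs with symbol 0 forgotten ≈ {frac[r - 1]:.2f}")
```

The slow growth of the forgetting fraction with the number of rounds is the qualitative behavior the post highlights; the precise 1/k rate is the analytical result derived in the paper, not something this sketch proves.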
I am an AI researcher working on safe AI. My most recent work can be found at arxiv.org/abs/2407.14937. I am trying to connect with other AI researchers on 🦋; follow me here, and I will follow you back.
arxiv.org
November 19, 2024 at 2:15 AM