Sachin Kumar

@shocheen.bsky.social

2K followers 250 following 11 posts

Assistant Professor at the Ohio State University (CSE). Hiring PhD Students (Fall'25). http://shocheen.com

Sachin Kumar @shocheen.bsky.social · Apr 15

Super excited for this workshop. Mark your calendars!

Tokenization Workshop (TokShop) @ICML2025 @tokshop.bsky.social · Apr 15

🚨 NEW WORKSHOP ALERT 🚨

We're thrilled to announce the first-ever Tokenization Workshop (TokShop) at #ICML2025 @icmlconf.bsky.social! 🎉

Submissions are open for work on tokenization across all areas of machine learning.

📅 Submission deadline: May 30, 2025
🔗 tokenization-workshop.github.io

Sachin Kumar @shocheen.bsky.social · Apr 8

We hope this paper encourages more thorough and diverse evaluations of interpretability and steering techniques going forward. (4/4)

Sachin Kumar @shocheen.bsky.social · Apr 8

A common theme we noticed across many methods we explored, and in much of the existing literature in this area, is the limited evaluation scope. Many such papers still use Pythia or Llama 1/2, which show very different trends from many of the newer models (for reasons we couldn't pin down). (3/4)

Sachin Kumar @shocheen.bsky.social · Apr 8

This project began nearly a year ago when I was at Ai2. Activation steering and related ideas were incredibly appealing, and we explored applying them to a range of problems. But none of the techniques we tried led to meaningful improvements, which prompted a deeper investigation. (2/4)

Sachin Kumar @shocheen.bsky.social · Apr 8

Really excited for this paper to be out, led by @patqdasilva.bsky.social 👇. Follow him for more exciting work coming soon. (1/4)

patqdasilva.bsky.social @patqdasilva.bsky.social · Apr 8

Steering language models by directly intervening on internal activations is appealing–but does it generalize?

We study 3 popular steering methods with 36 models from 14 families (1.5-70B), exposing brittle performance and fundamental flaws in underlying assumptions
🧵👇
(1/10)

Sachin Kumar @shocheen.bsky.social · Jan 22

I am looking for multiple emergency reviewers for December ARR for papers related to: disinformation, prompt engineering, reward modeling, and diffusion LMs. Please let me know if you can help!

Sachin Kumar @shocheen.bsky.social · Jan 21

We have queries like this in our recent paper: www.arxiv.org/abs/2407.12043

The Art of Saying No: Contextual Noncompliance in Language Models

Chat-based language models are designed to be helpful, yet they should not comply with every user request. While most existing work primarily focuses on refusal of "unsafe" queries, we posit that the ...

Sachin Kumar @shocheen.bsky.social · Dec 9

3 - Liwei Jiang leads the effort to scale jailbreaking tactics and build adversarially safer LMs (Friday 4.30pm PT):

Sachin Kumar @shocheen.bsky.social · Dec 9

2 - We build tokenizer-free multilingual LMs; led by @orevaahia.bsky.social (Thursday 4.30pm):

Sachin Kumar @shocheen.bsky.social · Dec 9

1 - In the D&B track, we study language model noncompliance beyond only safety; co-led with Faeze Brahman (Thursday 11am PT):

Sachin Kumar @shocheen.bsky.social · Dec 9

En route to Vancouver to attend #NeurIPS2024 and excited to be a part of the following papers 👇!

I am also recruiting multiple PhD students for Fall '25. DM me here or on Whova if you are interested in multilinguality, personalized alignment, or real-use-inspired evals (see website in bio for details).

Reposted by Sachin Kumar

Valentina Pyatkin @valentinapy.bsky.social · Dec 1

@shocheen.bsky.social and co will be at the Thursday poster session to present our paper on "Contextual Noncompliance"

Reposted by Sachin Kumar

Maria Antoniak @mariaa.bsky.social · Nov 19

I'm recruiting 1-2 PhD students to work with me at the University of Colorado Boulder! Looking for creative students with interests in #NLP and #CulturalAnalytics.

Boulder is a lovely college town 30 minutes from Denver and 1 hour from Rocky Mountain National Park 😎

Apply by December 15th!

A photo of Boulder, Colorado, shot from above the university campus and looking toward the Flatirons.
