Dilyara Bareeva

@dilya.bsky.social

770 followers 500 following 5 posts

PhD Candidate in Interpretability @FraunhoferHHI | 📍Berlin, Germany
dilyabareeva.github.io

Dilyara Bareeva

@dilya.bsky.social

Our lightweight adversarial fine-tuning attack lets you bend a feature to visualize any arbitrary concept. Off-manifold, we impose a hyperbolic activation landscape with its optimum at the target, while preserving on-distribution activations through a weighted two-term loss. 🕵️‍♀️

November 29, 2025 at 4:38 PM

Dilyara Bareeva

@dilya.bsky.social

✈️🇲🇽 Next Wednesday (Dec 3), 1–4 p.m. CST, I’ll be presenting Manipulating Feature Visualizations with Gradient Slingshots at NeurIPS 2025 in Mexico City!

Feature Visualization has long been a staple interpretability tool. Our work shows it’s far from reliable! 🚨

November 29, 2025 at 4:38 PM
