Dilyara Bareeva

@dilya.bsky.social

770 followers 500 following 5 posts

PhD Candidate in Interpretability @FraunhoferHHI | 📍Berlin, Germany
dilyabareeva.github.io

Dilyara Bareeva

@dilya.bsky.social

Paper: openreview.net/forum?id=Tgc...
Code: github.com/dilyabareeva...

Manipulating Feature Visualizations with Gradient Slingshots

Feature Visualization (FV) is a widely used technique for interpreting concepts learned by Deep Neural Networks (DNNs), which synthesizes input patterns that maximally activate a given feature....

November 29, 2025 at 4:38 PM

Dilyara Bareeva

@dilya.bsky.social

Huge thanks to my fantastic co-authors Marina MC Höhne, Alexander Warnecke, @lpirch.bsky.social, Klaus-Robert Müller, @rieck.mlsec.org, @slapuschkin.bsky.social, @kirillbykov.bsky.social, and to the UMI Lab, @aifraunhoferhhi.bsky.social, @xai-berlin.bsky.social and @bifold.berlin for the support!

November 29, 2025 at 4:38 PM

Dilyara Bareeva

@dilya.bsky.social

Our lightweight adversarial fine-tuning attack lets you bend a feature to visualize any arbitrary concept. Off-manifold, we impose a hyperbolic activation landscape with its optimum at the target, while preserving on-distribution activations through a weighted two-term loss. 🕵️‍♀️

November 29, 2025 at 4:38 PM
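
The weighted two-term loss described in the post above might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `target_landscape`, `two_term_loss`, `alpha`, `peak`, and `curvature` are hypothetical, and the landscape used here is a simple quadratic stand-in that peaks at the target, not the hyperbolic form the post mentions, whose exact definition is not given here.

```python
from statistics import fmean
from typing import Callable, Sequence

Vector = Sequence[float]


def target_landscape(x: Vector, target: Vector,
                     peak: float = 1.0, curvature: float = 0.5) -> float:
    """Stand-in activation landscape (assumption): highest at `target`,
    decreasing with squared distance from it."""
    sq_dist = sum((xi - ti) ** 2 for xi, ti in zip(x, target))
    return peak - curvature * sq_dist


def two_term_loss(feature: Callable[[Vector], float],
                  original_feature: Callable[[Vector], float],
                  off_manifold: Sequence[Vector],
                  on_distribution: Sequence[Vector],
                  target: Vector,
                  alpha: float = 0.5) -> float:
    """Weighted sum of a manipulation term and a preservation term.

    Manipulation: off-manifold activations should follow the imposed landscape.
    Preservation: on-distribution activations should match the original model.
    """
    manipulation = fmean(
        (feature(x) - target_landscape(x, target)) ** 2 for x in off_manifold
    )
    preservation = fmean(
        (feature(x) - original_feature(x)) ** 2 for x in on_distribution
    )
    return alpha * manipulation + (1.0 - alpha) * preservation


if __name__ == "__main__":
    target = [1.0, 2.0]
    flat = lambda x: 0.0  # toy feature that never activates
    # Only the manipulation term contributes in this toy setup.
    print(two_term_loss(flat, flat, [target], [[0.0, 0.0]], target))
```

In this sketch, `alpha` trades off how strongly the off-manifold landscape is imposed against how closely the original on-distribution behavior is kept. In an actual fine-tuning setup the loss would be minimized over the model's parameters with an autodiff framework.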