Matteo Saponati
@matteosaponati.bsky.social
200 followers 86 following 15 posts
I am a research scientist in Machine Learning and Neuroscience. I am fascinated by life and intelligence, and I like to study complex systems. I love to play music and dance. Postdoctoral Research Scientist @ ETH Zürich ↳ https://matteosaponati.github.io
matteosaponati.bsky.social
really great work! nice to see some feedback control :)
matteosaponati.bsky.social
Oh! Very interesting, great work! Nice to see that feedback control approaches are gaining traction :)
matteosaponati.bsky.social
Take our short 5-min anonymous survey on the Neuromorphic field’s current state & future:

📋 tinyurl.com/3jkszrnr
🗓️ Open until May 12, 2025

Results will be shared openly and submitted for publication. Your input will help us understand how interdisciplinary trends are shaping the field.
Neuromorphic Questionnaire
This form collects valuable information from the Neuromorphic Community as part of a project led by Matteo Saponati, Laura Kriener, Sebastian Billaudelle, Filippo Moro, and Melika Payvand. The goal is...
tinyurl.com
Reposted by Matteo Saponati
jeromelecoq.bsky.social
How does our brain predict the future? Our review of predictive processing + research program is now on arXiv arxiv.org/abs/2504.09614
50+ neuroscientists distributed across the world worked together to create this unique community project.
Reposted by Matteo Saponati
elisadonati.bsky.social
🌟 Paper out in npj Unconventional Computing!
www.nature.com/articles/s44...

A system built with just a few neurons, yet able to solve a complex task — not by stacking layers or going deeper, but by embracing unconventional thinking.

This is neuromorphic to me!
A neuromorphic multi-scale approach for real-time heart rate and state detection - npj Unconventional Computing
www.nature.com
Reposted by Matteo Saponati
giacomoi.bsky.social
I'm extremely proud of this work, which shows how using the physics of analog electronic circuits helps us understand learning and computational principles of cortical neural networks and build efficient neural processing systems that can complement and outperform AI accelerators in edge computing!
biorxivpreprint.bsky.social
A canonical cortical electronic circuit for neuromorphic intelligence https://www.biorxiv.org/content/10.1101/2025.03.28.646019v1
matteosaponati.bsky.social
#preprint #machinelearning #transformers #selfattention #ml #deeplearning
matteosaponati.bsky.social
7/ I would like to thank Pascal Sager for all the training, the writing, the discussions, and whatnot; Pau Vilimelis Aceituno for the hours spent refining the math; Thilo Stadelmann and Benjamin Grewe for their great contributions and supervision; and all the people at INI.

cheers 💜
ALT: a cartoon of two robots standing next to each other and the word "bye".
media.tenor.com
matteosaponati.bsky.social
6/ TL;DR

- Self-attention matrices in Transformers show universal structural differences based on training.
- Bidirectional models → Symmetric self-attention
- Autoregressive models → Directional, column-dominant
- Using symmetry as an inductive bias improves training.

⬇️
matteosaponati.bsky.social
5/ Finally, we leveraged symmetry to improve Transformer training.

- Initializing self-attention matrices symmetrically improves training efficiency for bidirectional models, leading to faster convergence.

This suggests that imposing structure at initialization can enhance training dynamics (see the sketch below).

⬇️
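For the curious, here is a minimal PyTorch sketch of one way to impose symmetry at initialization: tie W_K to W_Q so that the combined matrix W_Q W_K^T starts out symmetric. This is an illustrative choice of mine, not necessarily the exact scheme used in the paper:

```python
import torch
import torch.nn as nn

def symmetric_qk_init(d_model: int, d_head: int, seed: int = 0):
    """Return query/key projections whose product W_Q @ W_K.T is
    symmetric at initialization (here by simply tying W_K to W_Q;
    the two parameters remain free to diverge during training)."""
    g = torch.Generator().manual_seed(seed)
    W_Q = torch.randn(d_model, d_head, generator=g) / d_model**0.5
    W_K = W_Q.clone()
    return nn.Parameter(W_Q), nn.Parameter(W_K)

W_Q, W_K = symmetric_qk_init(d_model=768, d_head=64)
W_QK = W_Q @ W_K.T
assert torch.allclose(W_QK, W_QK.T)  # symmetric at initialization
```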
matteosaponati.bsky.social
4/ We validate our analysis empirically, showing that these patterns consistently emerge across different models and input modalities:

- ModernBERT, GPT, LLaMA3, Mistral, etc.
- Text, vision, and audio models
- Different model sizes and architectures

⬇️
matteosaponati.bsky.social
3/ We demonstrate that the self-attention matrices behave differently under different training objectives (see the sketch below):

- Bidirectional training (BERT-style) induces symmetric self-attention structures.
- Autoregressive training (GPT-style) induces directional structures with column dominance.

⬇️
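To make "symmetric" vs "column-dominant" concrete, here is a minimal numpy sketch of one way to quantify both properties of the combined matrix W_QK = W_Q W_K^T. The score definitions below are illustrative choices of mine, not the exact metrics used in the paper:

```python
import numpy as np

def symmetry_score(M):
    """Fraction of M's energy in its symmetric part.

    Decompose M = S + A with S = (M + M.T) / 2 and A = (M - M.T) / 2.
    Returns ||S||_F^2 / (||S||_F^2 + ||A||_F^2): 1.0 for a perfectly
    symmetric matrix, about 0.5 for an unstructured random one.
    """
    S, A = 0.5 * (M + M.T), 0.5 * (M - M.T)
    s2, a2 = np.sum(S**2), np.sum(A**2)
    return s2 / (s2 + a2)

def column_dominance(M):
    """Variance of column means relative to variance of row means.

    Values well above 1 indicate that a few columns carry most of the
    structure; values near 1 indicate no clear directional preference.
    """
    return np.var(M.mean(axis=0)) / (np.var(M.mean(axis=1)) + 1e-12)

# toy example: random query/key projections and their combined matrix
rng = np.random.default_rng(0)
W_Q = rng.standard_normal((768, 64))
W_K = rng.standard_normal((768, 64))
W_QK = W_Q @ W_K.T

print(symmetry_score(W_QK))    # roughly 0.5 for this random, untrained example
print(column_dominance(W_QK))  # roughly 1 for this random, untrained example
```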
matteosaponati.bsky.social
2/ Self-attention is the backbone of Transformer models, but how does training shape the internal structure of self-attention matrices?

We introduce a mathematical framework to study these matrices and uncover fundamental differences in how they are updated during gradient descent.

⬇️
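To make the object of study concrete: the attention logits depend on W_Q and W_K only through their product, so one can analyze the combined matrix W_QK = W_Q W_K^T directly. A toy PyTorch check of this standard identity (variable names are mine, not from the paper):

```python
import torch

torch.manual_seed(0)
T, d_model, d_head = 5, 16, 4        # toy sizes
X = torch.randn(T, d_model)          # one sequence of token embeddings
W_Q = torch.randn(d_model, d_head)   # query projection
W_K = torch.randn(d_model, d_head)   # key projection

# usual formulation: project to queries/keys, then take inner products
logits_qk = (X @ W_Q) @ (X @ W_K).T

# equivalent bilinear form with the combined query-key matrix
W_QK = W_Q @ W_K.T
logits_combined = X @ W_QK @ X.T

assert torch.allclose(logits_qk, logits_combined, atol=1e-5)
```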
matteosaponati.bsky.social
1/ I am very excited to announce that our paper "The underlying structures of self-attention: symmetry, directionality, and emergent dynamics in Transformer training" is available on arXiv 💜

arxiv.org/abs/2502.10927

How is information encoded in self-attention matrices? How can we interpret it?

⬇️
The underlying structures of self-attention: symmetry, directionality, and emergent dynamics in Transformer training
Self-attention is essential to Transformer architectures, yet how information is embedded in the self-attention matrices and how different objective functions impact this process remains unclear. We p...
arxiv.org
matteosaponati.bsky.social
Hey Dan! I would like to be added :)