Stefan Lattner
@stefanlattner.bsky.social
Research Leader @ Sony CSL Paris
🎉 New ISMIR 2025 paper!

Autoregressive Diffusion Models estimate musical surprisal more effectively than GIVT — capturing pitch expectations & segment boundaries 🎶

📜 arxiv.org/abs/2508.05306

#ListenerModels #Diffusion #ISMIR2025 @sonycsl-paris.bsky.social
August 20, 2025 at 2:01 PM
🎶 New paper alert!
Do AI audio embeddings *hear* timbre like we do?
➡️ Benchmarked 18 representations against 2.6K human ratings (21 datasets)
🏅 Style embeddings from CLAP & our sound-matching model are best aligned!
Paper: arxiv.org/abs/2507.07764
#ISMIR2025 #MIR #AudioAI #SonyCSLMusic
Assessing the Alignment of Audio Representations with Timbre Similarity Ratings
Psychoacoustical so-called "timbre spaces" map perceptual similarity ratings of instrument sounds onto low-dimensional embeddings via multidimensional scaling, but suffer from scalability issues and a...
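The benchmarking idea above can be sketched in a few lines: compute pairwise cosine similarities in an embedding space and correlate them with human similarity ratings. This is an illustrative sketch only, not the paper's exact protocol; the function name and data shapes are assumptions.

```python
import numpy as np
from scipy.stats import spearmanr

def alignment_score(embeddings, human_ratings):
    """Spearman correlation between embedding cosine similarities and
    human pairwise similarity ratings.

    embeddings: (N, D) array, one row per sound.
    human_ratings: (N, N) symmetric matrix of perceptual similarities.
    """
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    cos = normed @ normed.T                      # model similarity matrix
    iu = np.triu_indices(len(embeddings), k=1)   # unique unordered pairs
    rho, _ = spearmanr(cos[iu], human_ratings[iu])
    return rho                                   # higher = better aligned
```

A rank correlation (rather than Pearson) is a natural choice here, since human ratings are ordinal and embedding distances need not be on the same scale.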
July 11, 2025 at 2:23 PM
As Sony Techhub went offline, here is the direct link to DrumGAN:

drumgan.csl.sony.fr
DrumGAN is able to generate audio content from scratch, or make variations of a user’s content.
May 16, 2025 at 11:00 AM
🔥Visit our talks and posters at #ICASSP2025! 👀

Music2Latent2: Audio Compression with Summary Embeddings and Autoregressive Decoding
M. Pasini, S. Lattner, G. Fazekas
Wednesday, April 9 (pm): Deep generative models I
April 8, 2025 at 6:40 AM
🤩 From our series "@ieeeICASSP paper released", we announce that "Zero-shot Musical Stem Retrieval with Joint-Embedding Predictive Architectures" is online!

📜 Paper: arxiv.org/pdf/2411.19806

Thanks to my colleagues Alain Riou, Geoffroy Peeters, Gaetan Hadjeres and Antonin Gagneré!

🎶 SonyCSLMusic 🎶
January 29, 2025 at 1:34 PM
Our #ICASSP paper "Hybrid Losses for Hierarchical Embedding Learning" by Haokun Tian et al. is now online! 💫

We assess the organization of a hierarchical embedding space using different (combinations of) losses and improve on the SOTA.

📜 Paper: arxiv.org/pdf/2501.12796
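One way to picture a "hybrid" loss over a hierarchy is a weighted sum of triplet terms at different levels. The sketch below is illustrative only and not the paper's loss; function names, margins, and weights are invented for the example.

```python
import numpy as np

def triplet(anchor, positive, negative, margin):
    # standard Euclidean triplet loss term
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negative)
    return max(0.0, d_ap - d_an + margin)

def hybrid_loss(anchor, pos_fine, neg_fine, pos_coarse, neg_coarse,
                w_fine=1.0, w_coarse=0.5):
    # fine term: positives share the leaf class; coarse term: positives
    # share only the parent category, enforced with a larger margin
    return (w_fine * triplet(anchor, pos_fine, neg_fine, 0.2)
            + w_coarse * triplet(anchor, pos_coarse, neg_coarse, 0.4))
```

Weighting the terms differently lets one trade off how strongly the fine-grained vs. the coarse structure shapes the embedding space.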

#SonyCSLParis
January 24, 2025 at 1:42 PM
Recently, I had the honour of giving a keynote speech on Audio Representation Learning and Generation at the DMRN+ workshop at @c4dm at Queen Mary University. 💫

🎬🎙️ Recording:
echo360.org.uk/media/f037dc...

🎶 More Info:
www.qmul.ac.uk/dmrn/dmrn19/
January 22, 2025 at 1:15 PM
Our #ICASSP paper "Estimating Musical Surprisal in Audio" is now online. 😯 <- surprised 😁

Great work by Mathias Bjare and Giorgia Cantisani! 👏

We use an autoregressive transformer and Gaussian mixture models to estimate the information content in music2latent representations. 🧵👇
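The core quantity here, information content (surprisal), can be sketched as the negative log-likelihood of a latent frame under the mixture density the model predicts. A minimal sketch, assuming a diagonal-covariance Gaussian mixture; this is not the paper's exact model or code:

```python
import numpy as np

def _logsumexp(a):
    # numerically stable log-sum-exp
    m = np.max(a)
    return m + np.log(np.sum(np.exp(a - m)))

def gmm_surprisal(x, log_weights, means, log_stds):
    """Surprisal (in nats) of a latent frame x under a Gaussian mixture.

    x: (D,) latent vector; log_weights: (K,); means, log_stds: (K, D),
    as an autoregressive model might predict them from the context.
    """
    stds = np.exp(log_stds)
    # log N(x | mu_k, diag(sigma_k^2)) for each component k
    log_comp = -0.5 * np.sum(
        ((x - means) / stds) ** 2 + 2.0 * log_stds + np.log(2.0 * np.pi),
        axis=1,
    )
    return -_logsumexp(log_weights + log_comp)  # -log p(x | context)
```

Frames with high surprisal are the ones the model found unexpected, which is what makes this estimate useful as a proxy for listener expectations.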
January 21, 2025 at 3:20 PM
🎶✨ New Paper Announcement! ✨🎶
We present "Improving Musical Accompaniment Co-creation via Diffusion Transformers" 🎹🎸—a study advancing our Diff-A-Riff stem generator through improved quality, efficiency, and control.

📜Read the full paper here: arxiv.org/pdf/2410.23005 🧵👇
January 20, 2025 at 1:42 PM
🧑‍🎓 Our #ISMIR Conference Tutorial "Deep Learning 101 for Audio-based MIR" provides a broad introduction to music audio processing, analysis, and generation.

📘 The book and jupyter notebooks:
geoffroypeeters.github.io/deeplearning...

🎥 The recording of the tutorial:
us02web.zoom.us/rec/share/Qz...
January 15, 2025 at 2:19 AM
😃 Accepted #ICASSP papers of Sony CSL Music Team:

Accompaniment Prompt Adherence: A Measure for Evaluating Music Accompaniment Systems
M. Grachten, J. Nistal

Estimating Musical Surprisal in Audio
M. Bjare, G. Cantisani, S. Lattner and G. Widmer
January 14, 2025 at 12:53 PM