Autoregressive Diffusion Models estimate musical surprisal more effectively than GIVT — capturing pitch expectations & segment boundaries 🎶
📜 arxiv.org/abs/2508.05306
#ListenerModels #Diffusion #ISMIR2025 @sonycsl-paris.bsky.social
Do AI audio embeddings *hear* timbre like we do?
➡️ Benchmarked 18 representations vs 2.6K human ratings (21 datasets)
🏅 Style embeddings from CLAP & our sound-matching model are best aligned!
Paper: arxiv.org/abs/2507.07764
#ISMIR2025 #MIR #AudioAI #SonyCSLMusic
Music2Latent2: Audio Compression with Summary Embeddings and Autoregressive Decoding
M. Pasini, S. Lattner, G. Fazekas
Wednesday, April 9 (pm): Deep generative models I
📜 Paper: arxiv.org/pdf/2411.19806
Thx to my colleagues Alain Riou, Geoffroy Peeters, Gaetan Hadjeres and Antonin Gagneré!
🎶 SonyCSLMusic 🎶
We assess the organization of a hierarchical embedding space using different (combinations of) losses and improve on the SOTA.
📜 Paper: arxiv.org/pdf/2501.12796
#SonyCSLParis
🎬🎙️ Recording:
echo360.org.uk/media/f037dc...
🎶 More Info:
www.qmul.ac.uk/dmrn/dmrn19/
Great work by Mathias Bjare and Giorgia Cantisani! 👏
We use an autoregressive transformer and Gaussian mixture models to estimate the information content in music2latent representations. 🧵👇
We present "Improving Musical Accompaniment Co-creation via Diffusion Transformers" 🎹🎸—a study advancing our Diff-A-Riff stem generator through improved quality, efficiency, and control.
📜 Read the full paper here: arxiv.org/pdf/2410.23005 🧵👇
📘 The book and Jupyter notebooks:
geoffroypeeters.github.io/deeplearning...
🎥 The recording of the tutorial:
us02web.zoom.us/rec/share/Qz...
Accompaniment Prompt Adherence: A Measure for Evaluating Music Accompaniment Systems
M. Grachten, J. Nistal
Estimating Musical Surprisal in Audio
M. Bjare, G. Cantisani, S. Lattner and G. Widmer