Kyle Kastner
@kastnerkyle.bsky.social
380 followers 790 following 77 posts
computers and music are (still) fun
Reposted by Kyle Kastner
motonobu-kanagawa.bsky.social
ProbNum 2025 Keynote 2, "Gradient Flows on the Maximum Mean Discrepancy" by @arthurgretton.bsky.social (@gatsbyucl.bsky.social and Google DeepMind).

Slides available here: probnum25.github.io/keynotes
Reposted by Kyle Kastner
timfduffy.com
Surprising new results from Owain Evans and Anthropic: Training on the outputs of a model can change the model's behavior, even when those outputs seem unrelated. Training only on completions of 3-digit numbers was able to transmit a love of owls. alignment.anthropic.com/2025/sublimi...
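To make the setup concrete, here is a toy sketch of the pipeline the post describes, with "gpt2" as a stand-in for both models; the actual experiment used a teacher with an induced trait and a student sharing the teacher's base model.

```python
# Toy sketch of the "subliminal learning" setup described above. "gpt2" is
# a stand-in; the real experiment used a teacher prompted/tuned to have a
# trait (e.g. liking owls) and a student sharing the teacher's base model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
teacher = AutoModelForCausalLM.from_pretrained("gpt2")
student = AutoModelForCausalLM.from_pretrained("gpt2")

# 1) Teacher generates continuations of 3-digit number prompts.
prompt = "Continue the sequence: 142, 857, 603,"
inputs = tok(prompt, return_tensors="pt")
out = teacher.generate(**inputs, max_new_tokens=32, do_sample=True)
completion = tok.decode(out[0], skip_special_tokens=True)

# 2) Fine-tune the student only on those number completions (plain
#    next-token loss); per the post, teacher traits can still transfer
#    through this seemingly unrelated data.
ids = tok(completion, return_tensors="pt").input_ids
loss = student(input_ids=ids, labels=ids).loss
loss.backward()  # one illustrative gradient step
```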
Reposted by Kyle Kastner
catherinearnett.bsky.social
MorphScore got an update! MorphScore now covers 70 languages 🌎🌍🌏 We have a new preprint out, and we will be presenting our paper at the Tokenization Workshop @tokshop.bsky.social at ICML next week! @marisahudspeth.bsky.social @brenocon.bsky.social
Reposted by Kyle Kastner
hthasarathan.bsky.social
Our work finding universal concepts in vision models is accepted at #ICML2025!!!

My first major conference paper with my wonderful collaborators and friends @matthewkowal.bsky.social @thomasfel.bsky.social
@Julian_Forsyth
@csprofkgd.bsky.social

Working with y'all is the best 🥹

Preprint ⬇️!!
hthasarathan.bsky.social
🌌🛰️🔭Wanna know which features are universal vs unique in your models and how to find them? Excited to share our preprint: "Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment"!

arxiv.org/abs/2502.03714

(1/9)
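Reading just the abstract, the core idea seems to be one shared sparse concept space with per-model encoders and decoders; below is a toy sketch under that assumption (dimensions, TopK sparsity, and the cross-reconstruction loss are my guesses, not the paper's exact recipe).

```python
# Minimal toy sketch of a "universal" sparse autoencoder: each model gets
# its own encoder/decoder, but all share one sparse concept space.
import torch
import torch.nn as nn

d_a, d_b, d_dict, k = 512, 768, 4096, 32  # assumed sizes

enc_a, dec_a = nn.Linear(d_a, d_dict), nn.Linear(d_dict, d_a)
enc_b, dec_b = nn.Linear(d_b, d_dict), nn.Linear(d_dict, d_b)

def topk_code(z, k):
    # Keep the k largest activations per example, zero the rest.
    vals, idx = torch.topk(z, k, dim=-1)
    return torch.zeros_like(z).scatter(-1, idx, vals)

x_a = torch.randn(64, d_a)  # stand-in for model A activations
x_b = torch.randn(64, d_b)  # stand-in for model B activations (same inputs)

z_a = topk_code(enc_a(x_a), k)
z_b = topk_code(enc_b(x_b), k)

# Cross-reconstruction: a code inferred from one model should also
# reconstruct the other model's activations for the same inputs,
# which is what ties the shared concept space together.
loss = (
    (dec_a(z_a) - x_a).pow(2).mean() + (dec_b(z_b) - x_b).pow(2).mean()
    + (dec_b(z_a) - x_b).pow(2).mean() + (dec_a(z_b) - x_a).pow(2).mean()
)
loss.backward()
```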
Reposted by Kyle Kastner
wildaudiojack.bsky.social
Contribute to the first global archive of soniferous freshwater life, The Freshwater Sounds Archive, and receive recognition as a co-author in a resulting data paper!

Pre-print now available. New deadline: 31st Dec, 2025.

See link 👇 for more: fishsounds.net/freshwater.js
Reposted by Kyle Kastner
dantanvii.bsky.social
🚀 Interested in Neuro-Symbolic Learning and attending #ICRA2025? 🧠🤖

Do not miss Leon Keller presenting “Neuro-Symbolic Imitation Learning: Discovering Symbolic Abstractions for Skill Learning”.

Joint work of Honda Research Institute EU and @jan-peters.bsky.social (@ias-tudarmstadt.bsky.social).
Reposted by Kyle Kastner
arxiv-cs-cl.bsky.social
Prasoon Bajpai, Tanmoy Chakraborty
Multilingual Test-Time Scaling via Initial Thought Transfer
https://arxiv.org/abs/2505.15508
Reposted by Kyle Kastner
ai-firehose.column.social
A study shows in-context learning in spoken language models can mimic human adaptability, reducing word error rates by nearly 20% with just a few utterances, especially aiding low-resource language varieties and enhancing recognition across diverse speakers. https://arxiv.org/abs/2505.14887
In-Context Learning Boosts Speech Recognition via Human-like Adaptation to Speakers and Language Varieties
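For reference, the word error rate (WER) behind the post's ~20% relative reduction is the word-level edit distance between reference and hypothesis transcripts; a minimal implementation:

```python
# Standard edit-distance WER: substitutions, insertions, and deletions
# over words, normalized by reference length.
def wer(reference: str, hypothesis: str) -> float:
    r, h = reference.split(), hypothesis.split()
    # Levenshtein distance over words via dynamic programming.
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / max(len(r), 1)

print(wer("the cat sat on the mat", "the cat sat on a mat"))  # ~0.167
```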
Reposted by Kyle Kastner
deepfates.com
"Interdimensional Cable", shorts made with Veo 3 ai. By CodeSamurai on Reddit
Reposted by Kyle Kastner
arxiv-cs-cv.bsky.social
Bingda Tang, Boyang Zheng, Xichen Pan, Sayak Paul, Saining Xie
Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis
https://arxiv.org/abs/2505.10046
Reposted by Kyle Kastner
arxiv-sound.bsky.social
A neural ODE model combines modal decomposition with a neural network to model nonlinear string vibrations; the authors generate synthetic data and sound examples.
Learning Nonlinear Dynamics in Physical Modelling Synthesis using Neural Ordinary Differential Equations
Victor Zheleznov, Stefan Bilbao, Alec Wright, Simon King
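As a rough illustration of the approach (not the paper's model), one can combine linear modal dynamics with a small network supplying the nonlinear coupling and integrate the result as an ODE; mode count, frequencies, and damping below are made-up values.

```python
# Hedged sketch: linear string modes plus a learned nonlinear correction,
# integrated with a hand-rolled RK4 step. Training would backprop through
# the integration steps against recorded/synthetic trajectories.
import torch
import torch.nn as nn

n_modes = 8
omega = torch.linspace(100.0, 800.0, n_modes) * 2 * torch.pi  # assumed rad/s
sigma = torch.full((n_modes,), 0.5)                           # assumed damping

nonlinear = nn.Sequential(nn.Linear(2 * n_modes, 64), nn.Tanh(),
                          nn.Linear(64, n_modes))

def f(state):
    # state = [modal displacements q, modal velocities p]
    q, p = state[:n_modes], state[n_modes:]
    dq = p
    dp = -(omega ** 2) * q - 2 * sigma * p + nonlinear(state)
    return torch.cat([dq, dp])

def rk4_step(state, dt):
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

state = torch.zeros(2 * n_modes)
state[n_modes] = 1.0  # pluck: initial velocity in the first mode
for _ in range(100):
    state = rk4_step(state, dt=1e-4)
```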
Reposted by Kyle Kastner
ai-firehose.column.social
Research unveils Omni-R1, a fine-tuning method for audio LLMs that boosts audio performance via text training, achieving strong results on the MMAU benchmark. Findings reveal how enhanced text reasoning affects audio capabilities, suggesting new directions for model optimization. https://arxiv.org/abs/2505.09439
Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?
Reposted by Kyle Kastner
dorialexander.bsky.social
Yeah we finally have a model report with an actual data section. Thanks Qwen 3! github.com/QwenLM/Qwen3...
Reposted by Kyle Kastner
ai-firehose.column.social
FLAM, a novel audio-language model, enables frame-wise localization of sound events in an open-vocabulary format. With large-scale synthetic data and advanced training methods, FLAM enhances audio understanding and retrieval, aiding multimedia indexing and access. https://arxiv.org/abs/2505.05335
FLAM: Frame-Wise Language-Audio Modeling
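A toy sketch of what frame-wise language-audio scoring could look like, with random tensors standing in for the audio and text encoders (shapes and the sigmoid readout are assumptions, not FLAM's architecture):

```python
# Per-frame audio embeddings dotted with a text-prompt embedding give a
# per-frame event-presence score, i.e. open-vocabulary localization.
import torch

T, d = 200, 256                    # frames, embedding dim (assumed)
audio_frames = torch.randn(T, d)   # stand-in per-frame audio encoder output
text_query = torch.randn(d)        # stand-in embedding of e.g. "dog barking"

logits = audio_frames @ text_query                # one logit per frame
presence = torch.sigmoid(logits)                  # frame-wise probability
active = (presence > 0.5).nonzero().squeeze(-1)   # frames flagged as events
```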
Reposted by Kyle Kastner
abeirami.bsky.social
#ICML2025
Is standard RLHF optimal in view of test-time scaling? Unsurprisingly, no.

We show that a simple change to the standard RLHF framework, involving 𝐫𝐞𝐰𝐚𝐫𝐝 𝐜𝐚𝐥𝐢𝐛𝐫𝐚𝐭𝐢𝐨𝐧 and a 𝐫𝐞𝐰𝐚𝐫𝐝 𝐭𝐫𝐚𝐧𝐬𝐟𝐨𝐫𝐦𝐚𝐭𝐢𝐨𝐧 suited to the test-time procedure, is optimal!
sziteng.bsky.social
Inference-time procedures (e.g. Best-of-N, CoT) have been instrumental to recent development of LLMs. Standard RLHF focuses only on improving the trained model. This creates a train/inference mismatch.

𝘊𝘢𝘯 𝘸𝘦 𝘢𝘭𝘪𝘨𝘯 𝘰𝘶𝘳 𝘮𝘰𝘥𝘦𝘭 𝘵𝘰 𝘣𝘦𝘵𝘵𝘦𝘳 𝘴𝘶𝘪𝘵 𝘢 𝘨𝘪𝘷𝘦𝘯 𝘪𝘯𝘧𝘦𝘳𝘦𝘯𝘤𝘦-𝘵𝘪𝘮𝘦 𝘱𝘳𝘰𝘤𝘦𝘥𝘶𝘳𝘦?

Check out below.
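A hedged sketch of the two ingredients named above, not the paper's exact algorithm: calibrate a raw reward to its quantile under the base policy, then apply a monotone transformation matched to the test-time procedure (the exponential here is purely illustrative).

```python
# (1) Calibration: map a raw reward to its empirical quantile under
#     base-policy samples for the same prompt.
# (2) Transformation: a monotone map of the calibrated reward; the
#     exponential below is an assumed example, not the paper's choice.
import numpy as np

def calibrated_reward(r, baseline_rewards):
    # Fraction of base-policy samples scoring below r (empirical CDF).
    return np.mean(np.asarray(baseline_rewards) < r)

def transformed_reward(r, baseline_rewards, temperature=1.0):
    c = calibrated_reward(r, baseline_rewards)  # in [0, 1]
    return np.exp(c / temperature)              # illustrative transformation

# Usage: base-policy sample rewards for one prompt define the CDF.
baseline = np.random.randn(1000).tolist()
print(transformed_reward(1.5, baseline))
```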
Reposted by Kyle Kastner
djfoster.bsky.social
Is Best-of-N really the best we can do for language model inference?

New paper (appearing at ICML) led by the amazing Audrey Huang (@ahahaudrey.bsky.social) with Adam Block, Qinghua Liu, Nan Jiang, and Akshay Krishnamurthy (@akshaykr.bsky.social).

1/11
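For readers unfamiliar with it, Best-of-N is the baseline in question: draw N samples, score each with a reward model, return the argmax. A minimal sketch with stand-in generate/reward functions:

```python
# Best-of-N inference. `generate` and `reward` are hypothetical stand-ins
# for an LM sampler and a reward model.
import random

def generate(prompt: str) -> str:                 # stand-in LM sampling
    return prompt + " " + str(random.random())

def reward(prompt: str, response: str) -> float:  # stand-in reward model
    return random.random()

def best_of_n(prompt: str, n: int = 8) -> str:
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: reward(prompt, c))

print(best_of_n("Explain value iteration:", n=4))
```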
Reposted by Kyle Kastner
timrudner.bsky.social
Congratulations to the #AABI2025 Workshop Track Outstanding Paper Award recipients!
Reposted by Kyle Kastner
sungkim.bsky.social
Why not?

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Applying RLVR to the base model Qwen2.5-Math-1.5B, they identify a single example that elevates model performance on MATH500 from 36.0% to 73.6%.
Reposted by Kyle Kastner
sungkim.bsky.social
An incomplete list of Chinese AI:

- DeepSeek: www.deepseek.com. You can also access AI models via API.
- Moonshot AI's Kimi: www.kimi.ai
- Alibaba's Qwen: chat.qwen.ai. You can also access AI models via API.
- ByteDance's Doubao (Chinese only): www.doubao.com/chat/
Reposted by Kyle Kastner
lebellig.bsky.social
I really liked this approach by @matthieuterris.bsky.social et al. They propose learning a single lightweight model for multiple inverse problems by conditioning it on the forward operator A. Thanks to self-supervised fine-tuning, it can tackle unseen inverse problems.

📰 https://arxiv.org/abs/2503.08915
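A toy sketch of the idea as I read it: one network conditioned on the forward operator A, fine-tuned self-supervised on an unseen problem via measurement consistency ||A x_hat - y||^2 (shapes and architecture below are assumptions).

```python
# Operator-conditioned solver: the network sees both the back-projected
# measurements A^T y and (a flattening of) A itself, so one model can
# serve many inverse problems.
import torch
import torch.nn as nn

n = 64  # signal dimension (assumed)

class ConditionedSolver(nn.Module):
    def __init__(self, n):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n + n * n, 256), nn.ReLU(),
                                 nn.Linear(256, n))

    def forward(self, y, A):
        feats = torch.cat([A.t() @ y, A.flatten()])
        return self.net(feats)

model = ConditionedSolver(n)
A = torch.randn(n, n) * 0.1   # an unseen forward operator
x_true = torch.randn(n)
y = A @ x_true                # only measurements are available

# Self-supervised fine-tuning step: no ground-truth x needed.
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
x_hat = model(y, A)
loss = ((A @ x_hat - y) ** 2).mean()
loss.backward()
opt.step()
```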
Reposted by Kyle Kastner
mattieml.bsky.social
Excited to be presenting our spotlight ICLR paper Simplifying Deep Temporal Difference Learning today! Join us in Hall 3 + Hall 2B Poster #123 from 3pm :)
Reposted by Kyle Kastner
speechpapers.bsky.social
Balinese text-to-speech dataset as digital cultural heritage https://pubmed.ncbi.nlm.nih.gov/40275973/
Reposted by Kyle Kastner
sungkim.bsky.social
Kimi.ai releases Kimi-Audio! Our new open-source audio foundation model advances capabilities in audio understanding, generation, and conversation.

Paper: github.com/MoonshotAI/K...
Repo: github.com/MoonshotAI/K...
Model: huggingface.co/moonshotai/K...
Reposted by Kyle Kastner
lebellig.bsky.social
Very cool article from Panagiotis Theodoropoulos et al: https://arxiv.org/abs/2410.14055
Feedback Schrödinger Bridge Matching introduces a new method to improve transfer between two data distributions using only a small number of paired samples!