This fascinating work applies CoT inspired by human “thinking while listening”, training models to find the inflection point when reasoning starts.
📄 arxiv.org/abs/2510.07497
This new work dives into 6 SLU tasks and reveals some interesting takeaways!
arxiv.org/abs/2508.17863
This paper arxiv.org/abs/2505.19937 proposes a new metric to measure layer-wise correlation between the two, with a focus on SLU tasks. 🔍🗣️📄
Find out the best positioning for speech and text—and the novel adapter that aligns speech and text modalities!
arxiv.org/abs/2412.01145