Principal SWE @ Google DeepMind.
♊🌊 Gemini Audio and Astra core team.
http://rjryan.me/ https://google.github.io/tacotron
We study learning and memory in minds, brains, and machines. I am open to collaborations and am hiring a lab technician (lab manager/junior specialist). Job ad & application here: recruit.ap.uci.edu/JPF09400.
Long-Form Speech Generation with Spoken Language Models
https://arxiv.org/abs/2412.18603
(Did I mention we are hiring on the Generative Media team, btw 👀)
blog.google/technology/g...
Please consider applying if you have expertise in the domain or related areas such as multimodal models, video generation 📹, etc.
boards.greenhouse.io/deepmind/job...
They're so useful! You just need to be aware of those limitations, just like with any tool.
Very Attentive Tacotron: Robust and Unbounded Length Generalization in Autoregressive Transformer-Based Text-to-Speech
https://arxiv.org/abs/2410.22179
pdf ❌
abs ✅