RJ Skerry-Ryan
rjsr.bsky.social
🌮🤖 Speech and language modeling researcher.
Principal SWE @ Google DeepMind.
♊🌊 Gemini Audio and Astra core team.

http://rjryan.me/ https://google.github.io/tacotron
Reposted by RJ Skerry-Ryan
I have recently launched the relational cognition lab at UC Irvine: relcoglab.org!
We study learning and memory in minds, brains, and machines. I am open to collaborations and hiring a lab technician (lab manager/junior specialist). Job ad & application here: recruit.ap.uci.edu/JPF09400.
January 12, 2025 at 8:46 PM
Earnest Q: What's the connection between this (very impressive) blinkenlight drone swarm (srsly, I am dying with envy of whoever got to build this) and military applications of drones? The drones the US has been bombing people with for decades now have nothing to do with this type of drone, no?
January 1, 2025 at 5:36 AM
Reposted by RJ Skerry-Ryan
In 1994, a math professor discovered that Intel's Pentium chip sometimes gave the wrong answer when dividing. Fixing this "FDIV" bug cost Intel $475 million. I analyzed the Pentium chip and found the bug. 1/N
December 28, 2024 at 6:57 PM
Reposted by RJ Skerry-Ryan
Se Jin Park, Julian Salazar, Aren Jansen, Keisuke Kinoshita, Yong Man Ro, RJ Skerry-Ryan
Long-Form Speech Generation with Spoken Language Models
https://arxiv.org/abs/2412.18603
December 25, 2024 at 5:15 AM
Reposted by RJ Skerry-Ryan
Here's Veo 2, the latest version of our video generation model, as well as a substantial upgrade for Imagen 3 🧑‍🍳🚢

(Did I mention we are hiring on the Generative Media team, btw 👀)

blog.google/technology/g...
State-of-the-art video and image generation with Veo 2 and Imagen 3
We’re rolling out a new, state-of-the-art video model, Veo 2, and updates to Imagen 3. Plus, check out our new experiment, Whisk.
December 16, 2024 at 5:35 PM
Reposted by RJ Skerry-Ryan
🚨🚨My team @GoogleDeepMind in Tokyo is looking for a talented research scientist to work on audio generative models! 🔊
Please consider applying if you have expertise in the domain or related areas such as multimodal models, video generation 📹, etc.
boards.greenhouse.io/deepmind/job...
December 6, 2024 at 7:09 AM
Totally true for domains where you are a good enough verifier (and you don't get lulled into a false sense of security with it), but a problem I've seen is that you end up trusting it in domains where you're not a verifier because it tends to be correct in the domains where you are one.
LLM-based tools have plenty of limitations, but "I don't use them because they can make stuff up" feels really short-sighted. And I see this take a lot in programming communities.

They're so useful! You just need to be aware of those limitations, just like with any tool.
November 28, 2024 at 5:48 PM
Thanksgiving shout out to this legend, as I wait in line at the pharmacy to pick up antibiotics for the second time in 2 weeks.
November 28, 2024 at 5:39 PM
Reposted by RJ Skerry-Ryan
Eric Battenberg, RJ Skerry-Ryan, Daisy Stanton, Soroosh Mariooryad, Matt Shannon, Julian Salazar, David Kao
Very Attentive Tacotron: Robust and Unbounded Length Generalization in Autoregressive Transformer-Based Text-to-Speech
https://arxiv.org/abs/2410.22179
October 30, 2024 at 9:30 AM
Reposted by RJ Skerry-Ryan
Arxiv sharing reminder

pdf ❌
abs ✅
November 26, 2024 at 8:42 AM