Will Whitney
wfwhitney.bsky.social
RS at DeepMind.
willwhitney.com
I’m so ready for the robot X Games
December 23, 2024 at 8:11 PM
Likewise, voice mode is a qualitatively different interaction
December 14, 2024 at 1:37 AM
Oh I love that! Having an always-on voice channel for meta-interactions is a clear part of what I’m envisioning
December 14, 2024 at 1:27 AM
If you're around NeurIPS and want to chat about this stuff, hit me up.
December 14, 2024 at 1:14 AM
Under the hood, the model will interpret every click and update the UI. In the limiting case, your whole computer is just a large model: it generates pixels, reads your clicks and taps, sends some API calls, then generates the next frame. Generative UI is a whole different computing paradigm.
December 14, 2024 at 1:14 AM
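A rough sketch of the loop described in the post above: the model reads an input event, issues some API calls, and generates the next frame. Everything here is hypothetical and for illustration only. `StubModel`, `Event`, `Step`, and `run` are invented names, the "frame" is a string standing in for pixels, and the API calls are only collected in a list, not sent anywhere.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    kind: str  # e.g. "click" or "tap"
    x: int
    y: int


@dataclass
class Step:
    frame: str  # stand-in for a generated frame of pixels
    api_calls: list = field(default_factory=list)


class StubModel:
    """Hypothetical stand-in for a large model; it only echoes the event."""

    def step(self, history, event):
        # A real model would condition on the full history of frames.
        frame = f"frame {len(history) + 1}: {event.kind} at ({event.x}, {event.y})"
        return Step(frame=frame, api_calls=[f"log:{event.kind}"])


def run(model, events):
    """Read each event, collect the model's API calls, keep the next frame."""
    history = []  # frames generated so far
    sent = []     # API calls the model asked for (collected, not executed)
    for event in events:
        step = model.step(history, event)
        sent.extend(step.api_calls)
        history.append(step.frame)
    return history, sent


if __name__ == "__main__":
    frames, calls = run(StubModel(), [Event("click", 1, 2), Event("tap", 3, 4)])
    for frame in frames:
        print(frame)
    print(calls)
```

The point of the sketch is only the shape of the loop: there is no separate application logic between the input event and the next frame, just the model.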
Models that generate UI will build tools that help us communicate with them. DALL-E generates sliders for your image that control its outputs. A coding model generates a WYSIWYG editor for the web page it built. Instant feedback and rich interaction.
December 14, 2024 at 1:14 AM
Right now we're a little stuck thinking about AI as a person, which comes with the baggage of how we interact with people. But large models don't have the same limitations as humans.
December 14, 2024 at 1:14 AM
Seems like a close relative of the Janus problem, but I was always suspicious that guidance was the root cause of that. I don’t expect guidance to have a preference for left/right orientation in general… I bet big-ish models do fine on this
November 25, 2024 at 9:58 PM