Yonatan Bisk
@ybisk.me
Assistant Professor confused by the concept of consciousness but talkingtorobots.com in the meantime
Reposted by Yonatan Bisk
We are getting closer to having agents operating in the real physical world. However, can we trust frontier models to make embodied decisions 🎮 aligned with human norms 👩‍⚖️?

With EgoNormia, a 1.8k ego-centric video 🥽 QA benchmark, we show that this is surprisingly challenging!
March 4, 2025 at 4:32 AM
How many (checks calendar) decades do people keep around backups of data from their thesis? Am I a digital hoarder?
January 5, 2025 at 5:00 AM
Reposted by Yonatan Bisk
Recently, papers have been published in prestigious journals (Nature Human Behaviour, PNAS) claiming that large language models (e.g., ChatGPT) solve the "false belief" task (a task requiring Theory of Mind abilities).

What is the false belief task? ->
December 17, 2024 at 8:36 AM
Reposted by Yonatan Bisk
This article really spoke to me; all the science I've enjoyed and that I thought came out well has been done with a colleague that I was talking to every day and almost every couple of hours
Doing good science is 90% finding a science buddy to constantly talk to about the project.
November 17, 2024 at 2:32 PM
Reposted by Yonatan Bisk
Hello, computational linguistics/NLP world on Bluesky! We're creating the same accounts here on Bluesky that we have on other social media platforms! #NLProc
November 14, 2024 at 12:17 AM
Reposted by Yonatan Bisk
I am trying to create a robotics and AI starter pack on Bluesky: go.bsky.app/DfAoaJ1

It's very incomplete, so please comment with suggestions (or just if you're missing and want to be added!)
November 11, 2024 at 3:01 PM
#EMNLP2024

1. Tools Fail: Detecting Silent Errors in Faulty Tools

Are you using tools with your LLMs? Are you assuming your tools are perfect? Assuming the LLM can just handle any errors for you? 😬
Danger… 🚨 Models trust tools over their own “knowledge”, even for simple and well-trained cases.
November 10, 2024 at 6:34 PM
Hi from CoRL 👋
November 8, 2024 at 10:27 AM