https://papail.io
The fact that these "world models" are approximate/incomplete does not disqualify them.
The fact that these "world models" are approximate/incomplete does not disqualify them.
In "A Mathematical Theory of Communication", almost as an afterthought Shannon suggests the N-gram for generating English, and that word level tokenization is better than character level tokenization.
In "A Mathematical Theory of Communication", almost as an afterthought Shannon suggests the N-gram for generating English, and that word level tokenization is better than character level tokenization.
I think 2-3 years ago, I said I will not work on two ML sub-areas. RL was one of them. I am happy to say that I am not strongly attached to my beliefs.
I think 2-3 years ago, I said I will not work on two ML sub-areas. RL was one of them. I am happy to say that I am not strongly attached to my beliefs.
Every eval chokes under hill climbing. If we're lucky, there’s an early phase where *real* learning (both model and community) can occur. I'd argue that a benchmark’s value lies entirely in that window. So the real question is what did we learn?
Every eval chokes under hill climbing. If we're lucky, there’s an early phase where *real* learning (both model and community) can occur. I'd argue that a benchmark’s value lies entirely in that window. So the real question is what did we learn?
In Greek, συκοφάντης (sykophántēs) most typically refers to a malicious slanderer, someone spreading lies, not flattery!
Every time you use it, you’re technically using it wrong :D
In Greek, συκοφάντης (sykophántēs) most typically refers to a malicious slanderer, someone spreading lies, not flattery!
Every time you use it, you’re technically using it wrong :D
We're hiring at the Senior Researcher level (eg post phd).
Please drop me a DM if you do!
jobs.careers.microsoft.com/us/en/job/17...
We're hiring at the Senior Researcher level (eg post phd).
Please drop me a DM if you do!
jobs.careers.microsoft.com/us/en/job/17...
link to our paper: arxiv.org/abs/2502.01612
link to our paper: arxiv.org/abs/2502.01612