One Book 📚
One Movie 🎥
One Album 💿
One TV Show 📺
The principle on which Unix was founded, and which guides the building of any software system with a degree of complexity.
Can this be replicated in AI systems? Here are some thoughts I had on this 👇
An end-to-end LALM designed for industry-strength audio understanding and speech interaction.
✨ Emotion-aware reasoning
✨ Switch timbres with natural language
✨ Intelligent speech conversation
More soon 👀
brb 👀👀👀👀👀👀
Anthropic just dropped this paper. They can steer models quite effectively, and even detect training data that elicits a particular persona (e.g. an evil one).
arxiv.org/abs/2507.21509
More soon 👀
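The core steering trick, as I understand it, is adding a "persona" direction to a layer's activations at inference time. Here's a minimal sketch of that idea (not the paper's code): model, layer index, strength, and the random stand-in vector are all placeholder assumptions; the paper extracts its directions from model activations rather than at random.

```python
# Minimal sketch of activation steering: add a precomputed "persona"
# direction to one transformer layer's hidden states during generation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder small model, just to show the mechanics
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

layer_idx = 6                    # which layer to steer (assumption)
alpha = 4.0                      # steering strength (assumption)
d_model = model.config.hidden_size
persona_dir = torch.randn(d_model)           # stand-in for a real persona vector
persona_dir = persona_dir / persona_dir.norm()

def steer(module, inputs, output):
    # GPT-2 blocks return a tuple; hidden states are the first element.
    hidden = output[0] + alpha * persona_dir.to(output[0].dtype)
    return (hidden,) + output[1:]

handle = model.transformer.h[layer_idx].register_forward_hook(steer)
ids = tok("I think people are", return_tensors="pt")
out = model.generate(**ids, max_new_tokens=30, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
handle.remove()  # remove the hook to get the unsteered model back
```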
They are often far worse at getting AI to do stuff than those with a liberal arts or social science bent. LLMs are built from the vast corpus of human expression, and knowing the history & obscure corners of human works lets you do far more with AI & understand its limits.
Annotating an interview he gave at NeurIPS 2015 with my basic reflections on what works today and how people should approach working in deep learning (or getting started).
buff.ly/APz3IDj
Decomposes the judging task into thoughts, scores, and judgments, and optimizes it with GRPO. Outperforms all baselines at the 8B & 70B scale, as well as o1-mini and, on some benchmarks, even R1.
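For context, GRPO's key move is dropping the learned critic: sample a group of responses per prompt, score them, and normalize each reward against its own group. A rough sketch of that piece below, with dummy 0/1 rewards standing in for the paper's judge scores; this is the general GRPO recipe, not the paper's training loop.

```python
# Sketch of GRPO's group-relative advantage and clipped policy loss.
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """rewards: (num_prompts, group_size) scores for G sampled responses per prompt.
    Each reward is normalized against its own group's mean and std."""
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    return (rewards - mean) / (std + eps)

def grpo_policy_loss(logprobs, old_logprobs, advantages, clip_eps: float = 0.2):
    """PPO-style clipped surrogate, applied per response with the group advantage."""
    ratio = torch.exp(logprobs - old_logprobs)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()

# Toy example: 2 prompts, 4 sampled judgments each, verifiable 0/1 rewards.
rewards = torch.tensor([[1., 0., 1., 1.],
                        [0., 0., 1., 0.]])
print(grpo_advantages(rewards))
```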
In "A Mathematical Theory of Communication", almost as an afterthought Shannon suggests the N-gram for generating English, and that word level tokenization is better than character level tokenization.
Perhaps the "Law of ROUS."
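A toy version of that word-level suggestion: count bigrams over a corpus and sample each next word in proportion to its observed frequency. The corpus string here is just a placeholder.

```python
# Word-level bigram text generation in the spirit of Shannon's suggestion.
import random
from collections import defaultdict, Counter

corpus = "the quick brown fox jumps over the lazy dog and the quick dog sleeps"  # placeholder
words = corpus.split()

# Estimate P(next | current) from raw co-occurrence counts.
bigrams = defaultdict(Counter)
for w1, w2 in zip(words, words[1:]):
    bigrams[w1][w2] += 1

def generate(start: str, length: int = 15) -> str:
    out = [start]
    for _ in range(length):
        nxt = bigrams.get(out[-1])
        if not nxt:
            break  # dead end: no observed continuation
        choices, counts = zip(*nxt.items())
        out.append(random.choices(choices, weights=counts)[0])
    return " ".join(out)

print(generate("the"))
```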