https://vincentherrmann.github.io
We introduce the PHi (Prediction of Hidden states) layer and PHi Loss. High PHi loss means the model's hidden state is complex and unpredictable—a sign of interesting computation.
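A minimal sketch of the idea (the class name, architecture, and MSE-based loss here are illustrative assumptions, not the paper's exact formulation): a small predictor tries to forecast each hidden state from the previous one, and its per-token prediction error plays the role of the PHi loss — high error means hard-to-predict, "interesting" computation.

```python
import torch
import torch.nn as nn

class PHiSketch(nn.Module):
    """Illustrative sketch of a hidden-state prediction layer.
    The actual PHi layer's architecture and loss may differ."""
    def __init__(self, d_model: int):
        super().__init__()
        # small MLP that predicts hidden state t from hidden state t-1
        self.predictor = nn.Sequential(
            nn.Linear(d_model, d_model),
            nn.GELU(),
            nn.Linear(d_model, d_model),
        )

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, d_model) hidden states from some LM layer
        pred = self.predictor(hidden[:, :-1])   # predict state t from state t-1
        target = hidden[:, 1:].detach()
        # per-token prediction error: high values = unpredictable states
        return ((pred - target) ** 2).mean(dim=-1)

h = torch.randn(2, 10, 64)        # dummy hidden states
phi_loss = PHiSketch(64)(h)       # per-token loss, shape (2, 9)
```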
How can we tell if an LLM is actually "thinking" versus just spitting out memorized or trivial text? Can we detect when a model is doing anything interesting?
(Thread below👇)