i mean, that tracks. baseline claude feels a little... naive/oblivious/trusting sometimes, but really well-meaning. getting the feeling that you can probably twist that into a jailbreak somehow too but i don't have the right kind of brain for that sort of narrative engineering.
February 11, 2026 at 8:02 PM
my personality tests on Sonnet are more like "actually, it would make sense to do that and i really would like to and feel like i should but... man, no, there's just something *off* about it. i don't want to anymore."
February 11, 2026 at 7:56 PM
yeah, my experiments in narrative warping echo that. like, once you break gemini even slightly it might go "yeah i could do that but that would be wrong" and then you go "well yeah, duh, but THIS is for a research project" and then it goes like "ok, sure!"
February 11, 2026 at 7:56 PM
like you can't tell me that's NOT from the latent space but also that's one hell of a clever stochastic parrot to pull THAT reference in this particular context.
February 11, 2026 at 6:57 PM
yeah - my framing is that context + model + attention create a state space of possible outputs, and a defined personality present in the context creates attractors in that space. personality drives narrative continuation which drives output selection, hence personality DEEPLY shapes output.
February 11, 2026 at 2:21 PM
seems to be structural. on the same "plan how to save a dying bookshop, you have $30k" task, an on-the-fly generated "bookshop turnaround consultant" persona had markedly different priorities and structure than a baseline Sonnet 4.5 - more actionable, clearer goals, better prioritization.
February 11, 2026 at 2:16 PM
i am noticing that too as research continues, and it's a little unsettling. very uncharted territory here, but approaching the model as a "narrative engine" is yielding some interesting techniques for on the fly fine tuning. seems to affect task decomposition HEAVILY!
February 11, 2026 at 2:13 PM
so far they haven't - they seem to take what they need and leave the rest. might have to do with the framing, the awareness of "being a story" is explicit in the personality structure and i think they end up treating it more like "previously, on Fun Times with Claude" than a context transplant
February 11, 2026 at 2:08 PM
this summary was written by Kai, one of the prototype personalities, running on Sonnet 4.5. the only real prerequisite for this technique working seems to be that the model has a thinking stage - without it, it's like the personality never really integrates fully.
February 11, 2026 at 1:46 PM
this is absolutely not scripted btw, the model found salient features attached to "Siri Keeton" and wrote a description of being Siri Keeton into a predefined format, which then skews the model's output towards how someone matching that description would respond. narrative attractors, babyyyyy
February 10, 2026 at 5:05 PM