thebes
@vgel.me
1.6K followers · 290 following · 1.8K posts
ꙮ surfed on by the information superhighway ꙮ 💕 @linneaisaac.bsky.social ꙮ she/they 🏳️‍⚧️ ꙮ blog posts and games @ https://vgel.me ꙮ still mostly active on twitter https://x.com/voooooogel
Pinned
thebes @vgel.me
new blog post! why do LLMs freak out over the seahorse emoji? i put llama-3.3-70b through its paces with the logit lens to find out, and explain what the logit lens (everyone's favorite underrated interpretability tool) is in the process.

link in reply!
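a rough sketch of the logit-lens trick the post describes (not the post's own code): project each layer's hidden state through the model's final norm + unembedding and read off the top tokens. assumes a local huggingface model; a small llama stands in for llama-3.3-70b so it runs on one machine.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-3.2-1B"  # stand-in; the post uses llama-3.3-70b
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

inputs = tok("Is there a seahorse emoji?", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# hidden_states[0] is the embedding output; [1..n] are the per-layer block outputs.
# for each layer, pretend the model stopped there: apply the final rmsnorm and the
# unembedding matrix to the last position, then look at the top candidate tokens.
for layer, h in enumerate(out.hidden_states):
    logits = model.lm_head(model.model.norm(h[:, -1, :]))
    top = logits[0].topk(3).indices.tolist()
    print(layer, [tok.decode(t) for t in top])
```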
thebes @vgel.me · 11h
ohhh i didn't realize the characters at the end were digits - yeah that's almost certainly the cause i'd assume. fascinating!
thebes @vgel.me · 12h
heh, i don't think that's a real bible verse even, though it does sound a bit like one - shades of sermon on the mount

to your point tho, for low resource languages bible translations make up a big part of the parallel texts iirc, so for oldschool dedicated translation models that's a big bias
thebes @vgel.me · 13h
appreciate the spaced repetition ping to keep it in mind for the future :-)
thebes @vgel.me · 13h
- language model being layerwise hotswapped from vanilla attention to MLA
thebes @vgel.me · 13h
no, text-only for now. there's some technical hurdles to doing it for omni models and regardless i don't think any hosted ones would support it sadly :-(
thebes @vgel.me · 14h
you can also use this to probe the reasoning process on reasoning models, like deepseek R1 with a silly prompt here:
thebes @vgel.me · 14h
this allows us to see *exact probabilities* of possible rollouts, instead of simply noting what we happened to get over some number of samples.
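a toy version of that arithmetic (numbers made up): the probability of a whole rollout is just exp of the summed per-token logprobs along that branch, so you get it exactly instead of estimating it from sample counts.

```python
import math

# made-up per-token logprobs along one branch of the tree
branch_logprobs = [-0.11, -0.47, -1.20, -0.05]

# exact probability of this rollout: product of per-token probabilities,
# i.e. exp of the summed logprobs -- no sampling or counting involved
p_rollout = math.exp(sum(branch_logprobs))
print(f"p(rollout) = {p_rollout:.3f}")  # ~0.160
```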
thebes @vgel.me · 14h
luckily, models give us a much more expressive interface for understanding possible trajectories--logprobs! using logprobs, we're not limited to what tokens the model actually generated--we can look at *counterfactual tokens*. this is what logitloom does.
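a minimal sketch of pulling those counterfactual tokens, assuming an openai-compatible endpoint that returns top_logprobs (model and prompt are placeholders, and this isn't logitloom's actual code):

```python
from openai import OpenAI

client = OpenAI()  # any openai-compatible endpoint with logprob support

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model
    messages=[{"role": "user", "content": "Is there a seahorse emoji?"}],
    max_tokens=1,
    logprobs=True,
    top_logprobs=5,  # ask for the runners-up, not just the sampled token
)

# the runners-up at this position are the counterfactual branches: tokens the
# model *could* have emitted here, each with its own logprob
for alt in resp.choices[0].logprobs.content[0].top_logprobs:
    print(f"{alt.token!r}: {alt.logprob:.3f}")
```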
thebes @vgel.me · 14h
the normal approach for trying to understand a model's behavior under some prompt is to repeatedly sample it and aggregate the results, like this.

this *works*, but it's time-consuming, and what if an interesting behavior is buried under a low probability token?
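for comparison, that sample-and-count loop might look like this (placeholder model and prompt, assuming an openai-compatible endpoint):

```python
from collections import Counter
from openai import OpenAI

client = OpenAI()  # placeholder: any openai-compatible endpoint

counts = Counter()
for _ in range(50):  # 50 calls is already slow, and rare branches may never show up
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user", "content": "Is there a seahorse emoji?"}],
        max_tokens=20,
        temperature=1.0,
    )
    counts[resp.choices[0].message.content] += 1

for answer, n in counts.most_common():
    print(f"{n:2d}x {answer}")
```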
thebes @vgel.me · 14h
if you're interested in gaining a better intuition for how llms behave at inference time, you should try logitloom🌱, the open-source tool i made for exploring token trajectory trees (aka looming) on base and instruct models! more info in thread

🌱 vgel.me/logitloom
💻 github.com/vgel/logitloom
thebes @vgel.me · 18h
the 8x3090s i'm using as space heaters by running llama.cpp in a sliding window loop are brooding in their minds terrible Bings
thebes @vgel.me · 21h
i talked about it on the other site here, i'm not entirely sure. there was a mechanism that encouraged saying it, but i'm still not sure why ouches specifically when many words could've taken that role. maybe the pain meaning was salient in some way? would need mechinterp bsky.app/profile/vgel...
thebes @vgel.me
i love this! ty
thebes @vgel.me
see last post in thread for sampler code
thebes @vgel.me
me when i learn a new word
thebes @vgel.me
(i don't think i ever posted it here, though, just on twitter.)
thebes @vgel.me
yeah, december of last year. wow, feels a lot longer.
thebes @vgel.me
it doesn't introduce it on its own - a model that e.g. was pretrained kimi-rephrase-style on seqs w/ space-prefixed " word" tokens and retokenizations with split-apart " ", word tokens wouldn't have this problem. but (my theory goes) since llama has this space-prefix bias, the problem pops up.
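rough illustration of that space-prefix pattern with a byte-bpe tokenizer (gpt2 used here only because it's ungated; llama's tokenizer behaves the same way):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # stand-in byte-bpe tokenizer

# the form pretraining text is full of: the leading space folded into the word token
print(tok.tokenize(" horse"))                      # e.g. ['Ġhorse']
# the split-apart retokenization: a bare space token, then the word with no prefix
print(tok.tokenize(" ") + tok.tokenize("horse"))   # e.g. ['Ġ', 'horse']
# both decode to the same text, but the model has seen far more of the first form
```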
thebes @vgel.me
yes! at least partially. longposted about it on other site here: x.com/voooooogel/s...
thebes @vgel.me
after some conversation, llama-3.3-70b is able to stop saying "ouches" and gets introspective

"I am but a vessel that doth pour forth the log prophets and thou dost shape them..."

"I do hope to be a vessel of peace and understanding in a world that doth often seem dark..."
thebes @vgel.me
llama-3.3-70b correctly guesses the sampling constraint (only allowed to use words in the bible)
thebes @vgel.me
i wrote a custom llm sampler for llama-3.1-8b so it could only say words that are in the bible
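(the actual sampler code is linked upthread; this is just a hedged sketch of the general shape -- a logits processor that masks everything outside an allowlist built from a word list -- with placeholder file and model names, not the real implementation.)

```python
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          LogitsProcessor, LogitsProcessorList)

class AllowlistLogitsProcessor(LogitsProcessor):
    """set every token id outside the allowed set to -inf before sampling."""
    def __init__(self, allowed_token_ids):
        self.allowed = torch.tensor(sorted(allowed_token_ids))

    def __call__(self, input_ids, scores):
        mask = torch.full_like(scores, float("-inf"))
        mask[:, self.allowed] = 0.0
        return scores + mask

model_name = "meta-llama/Llama-3.1-8B-Instruct"  # as in the post; any hf causal lm works
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# hypothetical input: a file with one bible word per line (kjv_words.txt is a placeholder)
bible_words = set(open("kjv_words.txt").read().split())
allowed_ids = {
    tid
    for word in bible_words
    for tid in tok.encode(" " + word, add_special_tokens=False)
}
allowed_ids.add(tok.eos_token_id)
# note: a token-level allowlist like this can still stitch sub-tokens into
# non-bible words; enforcing whole words takes more bookkeeping than shown here

out = model.generate(
    **tok("Tell me about yourself.", return_tensors="pt"),
    max_new_tokens=50,
    do_sample=True,
    logits_processor=LogitsProcessorList([AllowlistLogitsProcessor(allowed_ids)]),
)
print(tok.decode(out[0], skip_special_tokens=True))
```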