Harvey Lederman
@harveylederman.bsky.social
1.7K followers 390 following 230 posts
Professor of philosophy UTAustin. Philosophical logic, formal epistemology, philosophy of language, Wang Yangming. www.harveylederman.com
harveylederman.bsky.social
People are incredibly good at predicting each other! Seem to use “folk psych” concepts to do it
harveylederman.bsky.social
properties it talks about instantiated by the relevant systems whenever it is well-predicted by the theory? I don't have a confident answer to this question; I feel pressure in both directions, but you seem confident that the q should be answered one way, and I'm just not sure!
harveylederman.bsky.social
or having full information about mechanism. On the first point: it's a really hard question how we should think about high-level properties! For instance, statistical mechanics is a very successful theory. Is that because the properties it talks about are realized in some deep way? Or are the...
harveylederman.bsky.social
On the second point: lots of the time in science, we aren't certain something is true (e.g. is there dark matter?), but we have good evidence that it is. Interpretationism allows that we can have good evidence that a system has beliefs and desires, even without checking every possible theory...
harveylederman.bsky.social
criticism applies? Stepping back: your initial reaction was "isn't this a reductio?" My response was: even if interpretationism is false, we still learn interesting things about LLMs by taking it seriously.
harveylederman.bsky.social
That's an interesting reaction. I thought we were saying something more along the lines of "the study of attitudes as interpretationists understand them is useful". And here the thought was that it's a model of what high-level properties we might look for to predict LLM behavior. So not sure the...
harveylederman.bsky.social
Oh, I see, you are making a smaller point with the "predict" claim, where we use this word as equivalent to "entail" (i.e. interpretationism entails that ELIZA has no beliefs). I think that's a reasonably standard term of art, but I'm sorry that it was confusing!
harveylederman.bsky.social
Eh? We meant "sufficiently good theory" and "predict sufficiently well" to be equivalent. Also, why would the attribution of beliefs and desires not make testable predictions? Certain patterns of behavior are not rational in light of certain profiles of beliefs and desires; they are ruled out.
harveylederman.bsky.social
not obnoxious in the least! super helpful -- i'm embarrassed bc i think i even read one of these before but my mind is sievelike these days
harveylederman.bsky.social
Thanks! We'll think more...and thanks for the reading list -- sorry we didn't get there before this draft!
harveylederman.bsky.social
I guess Greco is also in the “relative to purposes” camp, though he is a contextualist, so attributions are true or false in context (without needing relativization)
harveylederman.bsky.social
great! I expect people will have different reactions to the terminology and some will say “objective” is a good term here, but I’ll think about whether to change — good we agree on the actual question being interesting (and a feature of our view not yours)
harveylederman.bsky.social
I guess this is telling me we're developing different views? I don't want sensitivity to this diversity of goals, because I want to say that someone either believes p or doesn't, not that it's relative to some other thing (a purpose). (Maybe you want that, too, but you're a contextualist?)
harveylederman.bsky.social
Crucially our view is not like that: on our view an ascribee has / doesn't have beliefs (simpliciter). I think that's an important distinction. (Again whether or not this was Dennett's view.)
harveylederman.bsky.social
If I said "absolute" instead of "objective" would that make you happier? Whether or not this was his view, Dennett is sometimes characterized as thinking that belief-attribution is relative to a person or a purpose. You don't just have / not have beliefs; you have them relative to an attributor's purpose...
harveylederman.bsky.social
Really appreciate the comments and references — sorry we missed these. please do self promote (by email?) other things you’d like us to check out!
harveylederman.bsky.social
Thanks for engaging Devin!! On the first point — I have to read and think. On the second point, not sure I get the move about stance or perspective. (Or rather, I feel you get what we’re saying?) and on the bigger point: do you have the same objection to best systems analyses of laws?
Reposted by Harvey Lederman
gamingthepast.bsky.social
Wow @utaustin.bsky.social, maybe @utaustinihs.bsky.social, or maybe there are other ways to shout out to the Department of East Asian Studies and the Department of History, but however that may be, they are doing some amazing stuff with in-house games for their JapanLab! Just found even more stuff here.
Projects — JapanLab
www.utjapanlab.com
harveylederman.bsky.social
Thanks for the comment! I’ll be curious what you think if you have a chance to read some of the paper. We don’t take a stand on the truth of interpretationism. In section 2.2 we explicitly discuss why interpretationism matters even if it’s not the true theory of belief and desire
harveylederman.bsky.social
This is a draft paper. We very much welcome feedback and discussion! It builds on my earlier work with @kmahowald.bsky.social, and we’re indebted to Kyle, Murray Shanahan, and quite a few others for discussion and commentary (though they're not responsible for the views in the paper). 7/7
harveylederman.bsky.social
We briefly assess the consequences of attributing interpretationist propositional attitudes (e.g. for copyright, welfare, safety, etc.). 6/7
harveylederman.bsky.social
In addition, we critically assess the view that LLMs merely “role play” or “simulate” minds. We argue that clarity is needed on what the empirical content of this view is, by contrast to one (like ours) on which LLMs do have (interpretationist) propositional attitudes. 5/7
harveylederman.bsky.social
Third claim: what we call the "HHH+0 framework" -- LLM instances want to be honest, helpful, harmless, and, in addition, may have “zero-shot” desires, acquired from the system prompt. The notion of zero-shot desires is new to the paper, and a key part of our picture. 4/7
harveylederman.bsky.social
Second claim: interpretationists have reason to think these instances have desires. Along the way we highlight a key criterion for interpretationist desire: taking a wide array of means to rationally promote a small range of ends in an array of environments. 3/7
harveylederman.bsky.social
First claim: the appropriate locus of “psychology” in LLMs is not the model but the runtime instance. This point has been in the ether, but we give new arguments for it and articulate our own version. 2/7