- Learning agents: starting with RNNs, scaling toward small transformers
I'm exploring this and documenting my findings (with code): abranti.com/the-key-prop...
It could be memes carried by brains,
companies maximising shareholder value, etc.
In my first paper I tried to make it general and replaced "gene" with "replicator".
Since the goal of the agents is to protect and replicate their genes, the agents' goal alignment is proportional to their kinship.
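The kinship idea above can be sketched in a few lines. This is a hypothetical illustration, not code from the project: the names `relatedness` and `inclusive_reward`, and the bit-vector genomes, are my assumptions, loosely following Hamilton-style inclusive fitness (an agent values a kin's payoff in proportion to their shared genes).

```python
def relatedness(genome_a: list[int], genome_b: list[int]) -> float:
    """Fraction of shared genes between two agents' genomes (assumed encoding)."""
    shared = sum(a == b for a, b in zip(genome_a, genome_b))
    return shared / len(genome_a)

def inclusive_reward(own_reward: float, kin_reward: float, r: float) -> float:
    """An agent that protects its genes also values a kin's payoff, weighted by r."""
    return own_reward + r * kin_reward

# Identical genomes -> goals fully aligned; disjoint genomes -> purely selfish.
r_clone = relatedness([1, 0, 1, 1], [1, 0, 1, 1])     # -> 1.0
r_stranger = relatedness([1, 0, 1, 1], [0, 1, 0, 0])  # -> 0.0
```

Under this sketch, alignment between two agents' effective objectives scales linearly with `r`, which is the sense in which goal alignment is proportional to kinship.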
1. Put those ideas to the test and measure the reduction in the Alignment Gap.
2. Scale the complexity of the environment (e.g. Age of Empires, but without hardcoded property rights) and the capabilities of the agents (e.g. use super tiny LLMs).
I'm interested in studying what characteristics the agents need to have to reduce this Alignment Gap.
Society gets stuck in sub-optimal states, and the agents can't simply all agree to return to their previous behaviour, where everyone was better off.
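The "stuck in a sub-optimal place" dynamic can be illustrated with the standard Prisoner's Dilemma payoff matrix. This is my own illustrative example, not taken from the thread: the specific payoffs (3, 0, 5, 1) are the textbook values.

```python
# payoff[my_action][their_action] -> my payoff; "C" = cooperate, "D" = defect
payoff = {
    "C": {"C": 3, "D": 0},
    "D": {"C": 5, "D": 1},
}

# Defecting is the best reply whatever the other agent does...
assert payoff["D"]["C"] > payoff["C"]["C"]
assert payoff["D"]["D"] > payoff["C"]["D"]
# ...so independently-optimising agents end up at mutual defection,
# even though both would be better off cooperating:
assert payoff["C"]["C"] > payoff["D"]["D"]
```

Once both agents defect, neither can unilaterally switch back without losing, which is exactly the sense in which the group can't just agree to return to the better outcome.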
However, this does not hold when optimising agents with different genes.
I am sharing my thoughts, experiments, code and results here: abranti.com/the-key-prop...
If anyone would like to collaborate in this direction, do reach out!
@jzleibo.bsky.social @eaduenez.bsky.social @karltuyls.bsky.social