We propose using structured negotiation games to evaluate language models.
LMs are being used to build *dynamic* agents, but the benchmarks that evaluate them have remained *static*. Structured negotiation games address this mismatch!