LLM Calculator | clem-benchmark

An open-source benchmark for evaluating chat-optimized language models as conversational agents through game play.

Code: clembench

Foundations of Computational Linguistics, Potsdam