15/10/2024
LLMs don’t reason. But that’s OK, neither do we.
https://arxiv.org/abs/2410.05229
Recent advancements in Large Language Models (LLMs) have sparked interest in their formal reasoning capabilities, particularly in mathematics. The GSM8K benchmark is widely used to assess the mathematical reasoning of models on grade-school-level questions. While the performance of LLMs on GSM8K has...