“The USA Math Olympiad is an extremely challenging math competition for the top US high school students… Hours after it was completed…a team of scientists gave the problems to some of the top large language models, whose mathematical and reasoning abilities have been loudly proclaimed… The results were dismal: None of the AIs scored higher than 5% overall”
—Ernest Davis & Gary Marcus, Reports of LLMs mastering math have been greatly exaggerated
https://garymarcus.substack.com/p/reports-of-llms-mastering-math-have
#mathematics #llms #llm #ai
“…the AIs were never able to recognize when they had not solved the problem. In every case, rather than give up, they confidently output a proof that had a large gap or an outright error.”
—Ernest Davis & Gary Marcus, Reports of LLMs mastering math have been greatly exaggerated
#mathematics #llm #llms #ai
“The refusal of these kinds of AI to admit ignorance…and their obstinate preference for generating incorrect but plausible-looking answers instead are one of their most dangerous characteristics. It is extremely easy for a user to pose a question to an LLM, get what looks like a valid answer, and then trust to it, without doing the careful inspection necessary to check that it is actually right.”
—Ernest Davis & Gary Marcus, Reports of LLMs mastering math have been greatly exaggerated
#llm #llms #ai
“If this kind of technology becomes commonly used to answer difficult questions before the problem of generating invalid answers is fixed, we will be in serious trouble. Getting AIs to answer “I don’t know” is one of the most important unsolved challenges facing the field.”
—Ernest Davis & Gary Marcus, Reports of LLMs mastering math have been greatly exaggerated
#llms #llm #ai