GenAI Arcade

Sort of Calculator

Ever asked AI to do simple maths and watched it confidently get the wrong answer?

That’s because LLMs aren’t actually very good calculators - they’re just really good at pretending to be. Usually, when an AI gets maths right, it’s because it’s secretly using external tools like code interpreters. But what happens when you don’t let it cheat?

In this widget, we’ve forced the model to behave like a simple calculator using only what it has learnt from its training data - no external tools, no writing and running code, and no built-in calculator.


You can try basic operations: addition (+), subtraction (−), multiplication (×), and division (÷). Expect it to get some things wrong, especially with bigger or more unusual numbers. You might be surprised by how good (or bad) this type of AI actually is with numbers.


Why Does This Happen?

Because LLMs don’t actually do maths. They generate text by guessing what words (or numbers) are likely to come next, based on patterns they’ve seen before.

They’ve seen “2 + 2 = 4” so many times that they’ve learnt to repeat it. But if you ask for something like “17 × 43” or “the square root of 242”, they might just guess — and guess incorrectly.

They’re not calculating. They’re predicting.
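
To make the difference between predicting and calculating concrete, here is a deliberately tiny sketch. It is not how a real LLM works (real models use neural networks over tokens, not lookup tables), and the training lines and function names are made up for illustration. The toy model only counts which answers it has seen and repeats the most common one.

```python
from collections import Counter

# Hypothetical "training data": lines of text the toy model has seen.
# One of them is wrong, as real-world text sometimes is.
TRAINING_TEXT = [
    "2 + 2 = 4",
    "2 + 2 = 4",
    "2 + 2 = 4",
    "2 + 2 = 5",
    "3 + 3 = 6",
]

def train(lines):
    """Count which answer followed each prompt, plus all answers overall."""
    by_prompt = {}
    overall = Counter()
    for line in lines:
        prompt, answer = line.split(" = ")
        by_prompt.setdefault(prompt, Counter())[answer] += 1
        overall[answer] += 1
    return by_prompt, overall

def predict(model, prompt):
    """Return the most frequently seen answer; nothing is ever computed."""
    by_prompt, overall = model
    seen = by_prompt.get(prompt, overall)  # unseen prompt: fall back to a guess
    return seen.most_common(1)[0][0]

model = train(TRAINING_TEXT)
print(predict(model, "2 + 2"))    # seen many times, so the familiar answer wins
print(predict(model, "17 × 43"))  # never seen, so it guesses a common answer
```

The toy model answers "2 + 2" correctly only because that pattern dominates its data. For "17 × 43" it still returns an answer with the same confidence, even though no multiplication ever happens.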


But I’ve Seen AI Get Maths Right Before…

If you’ve used ChatGPT or other AI systems and seen them solve maths problems correctly, that’s because they’re getting help behind the scenes. These systems quietly use external tools like code interpreters to actually do the calculations.

[Image: What it looks like when you ask ChatGPT a complex maths question]

It’s a bit like a magician with a hidden assistant - the performance looks impressive, but there’s more going on than meets the eye.

And here’s something else to consider: efficiency. A simple Casio calculator can do maths instantly using a tiny amount of energy, often from nothing more than a small solar panel. Meanwhile, an LLM uses vastly more electricity to do the same thing - poorly.

[Image: What ChatGPT does behind the scenes: it writes a program]
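
As a rough idea of what such a behind-the-scenes program might look like, here is a minimal sketch. The actual code a tool-using assistant writes is not shown in this page, so the `calculate` function and `OPERATIONS` table below are hypothetical. The point is that the answer comes from running real arithmetic rather than from predicting text.

```python
import math
import operator

# The four operations the widget supports, mapped to exact implementations.
OPERATIONS = {
    "+": operator.add,
    "−": operator.sub,
    "×": operator.mul,
    "÷": operator.truediv,
}

def calculate(left, symbol, right):
    """Apply one basic operation exactly, instead of predicting the answer."""
    return OPERATIONS[symbol](left, right)

print(calculate(17, "×", 43))  # 731
print(math.sqrt(242))          # roughly 15.556
```

Once the model hands the numbers to code like this, the result is computed the same way every time, which is why tool-assisted answers tend to be reliable where unaided guesses are not.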


So What?

This widget shows that LLMs are not reliable calculators on their own. Understanding this helps explain why AI can seem inconsistent: it excels at pattern recognition and language tasks but struggles with precise logical operations that require step-by-step reasoning rather than pattern matching.








