Why I Don't Use LLMs for Facts
A Simple Story About iPhones and Perplexity
A couple of mates and I have a little (private) Telegram group called The Nerd Garden (in Norwegian), where we talk about technical equipment, among other things. Recently, a friend asked for advice on buying a used iPhone. He said he was considering the 15 Pro vs. the regular 16, and shared a link to Perplexity.1
Part of the answer he got there was this table:
| Feature | iPhone 15 Pro | iPhone 16 |
|---|---|---|
| Build material | Titanium frame, matte glass back | Aluminium frame, glossy glass back |
| Display | 6.1-inch OLED, ProMotion (120Hz) | 6.1-inch OLED, standard 60Hz refresh rate |
| Processor | A17 Pro chip | A18 |
| Camera System | Triple-camera system with telephoto (3x zoom) | Dual-camera system (no telephoto lens) |
| Main Camera | 48 MP quad-pixel sensor | 48 MP quad-pixel sensor |
| Zoom Capabilities | Up to 3x optical zoom + digital zoom | Up to 2x lossless zoom (cropped from main sensor) |
| Video Recording | ProRes video support | Standard video |
| Action Button | Customisable Action Button | Standard mute |
| Battery Life | ~23 hours video playback | ~22 hours video |
| Price | Higher price due to Pro features | More affordable |
If you don’t know all the facts in this situation, nothing jumps out as being off here. However, not only does the iPhone 16 have an Action Button – it also has Camera Control. And I think it has slightly better battery life than the 15 Pro.
I continued in his Perplexity conversation, and asked: “Are you sure about the Action Button on the 16?” I got this response:
Yes, I am sure about the Action Button on the iPhone 16. The iPhone 16 series, including the base model, features an Action Button that replaces the traditional mute switch. This button is customizable and can perform a variety of functions beyond just toggling Silent Mode.
(…)
In contrast, the iPhone 15 Pro retains a traditional mute switch and does not have an Action Button like the iPhone 16 models.
— Perplexity
So, not only did it fail to recognise that it was wrong the first time, it then added a second false claim: the 15 Pro does have an Action Button.2
It also gave two sources for that last claim.
However, when I followed the links, they only led to pages (correctly) saying that the iPhone 16 has an Action Button.
The confidence is the problem
Without already knowing the facts, there was no way to pick out the errors in the table. Even a follow-up like “Are you sure about the Action Button?” isn’t something you can expect someone to ask.
And when it gives two sources, why shouldn’t the claim be trusted? How would my friend know that exactly these sources were the ones to check? (We can assume many other claims would come with the same number of sources.)
Two bags of peanuts
- Bag 1:
  - 50 peanuts
  - 4% are bad – but these are brown, and clearly visible
- Bag 2:
  - 100 peanuts
  - 1% are bad – but these are indistinguishable from the good ones
If eating a bad peanut would make you sick, which bag would you choose? Even though Bag 2 has a bunch of pros (more peanuts, and a lower share of bad ones), the cons outweigh them, if you ask me.
As you probably understood, Bag 2 is like LLMs for me. The opacity makes it difficult to trust anything.
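To put rough numbers on the two bags, here is a minimal Python sketch. It assumes peanuts are picked at random without replacement, that every visibly bad peanut in Bag 1 gets thrown away before eating, and that a handful is 10 peanuts. The function name and those assumptions are mine, purely for illustration.

```python
from fractions import Fraction
from math import comb


def risk_of_bad_peanut(total: int, bad: int, visible: bool, eaten: int) -> Fraction:
    """Chance of eating at least one bad peanut out of `eaten` random picks.

    If the bad peanuts are visible, we assume they are all discarded first.
    """
    edible = total - bad if visible else total
    if eaten > edible:
        raise ValueError("cannot eat more peanuts than the bag offers")
    if visible:
        # Every bad peanut was spotted and removed, so none can be eaten.
        return Fraction(0)
    good = total - bad
    if eaten > good:
        # More picks than good peanuts: at least one bad one is unavoidable.
        return Fraction(1)
    # P(at least one bad) = 1 - P(all picks are good), without replacement.
    return 1 - Fraction(comb(good, eaten), comb(total, eaten))


# Bag 1: 50 peanuts, 4% bad (2 peanuts), but visible.
print(risk_of_bad_peanut(50, 2, visible=True, eaten=10))    # 0
# Bag 2: 100 peanuts, 1% bad (1 peanut), indistinguishable.
print(risk_of_bad_peanut(100, 1, visible=False, eaten=10))  # 1/10
```

Under these assumptions, Bag 1 costs you two peanuts and nothing else, while a single handful from Bag 2 already carries a 10% chance of getting sick, and that risk grows with every peanut you eat.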
Gell-Mann amnesia effect / LL-Mann amnesia effect
_The Gell-Mann amnesia effect is a cognitive bias describing the tendency of individuals to critically assess media reports in a domain they are knowledgeable about, yet continue to trust reporting in other areas despite recognizing similar potential inaccuracies._
Obviously, the simple iPhone hallucination wasn’t that important. However, if I want to avoid the Gell-Mann amnesia effect (or LL-Mann amnesia effect as we can call it in this context ☺️), it doesn’t make sense to trust an LLM when dealing with subject matter I’m not knowledgeable about.
Alternatively, it needs to be for facts I will immediately check manually. For instance, I had forgotten what the Gell-Mann amnesia effect was called – so I used Claude to remind me! As someone who does know about the differences between the 15 Pro and the 16, I could also use it to create a first draft to send to my friend – as I would be better placed to spot the mistakes. Simple code that will either work or not is another example.3
LLMs can be great for learning, and do have their use cases – but you have to be vigilant. (I wrote more on this here.)
However, the iPhone situation shows why “LLMs as search engines” doesn’t make sense to me, at least not “full-time”. I’d rather use a great search engine – also because I enjoy surfing the web, and usually don’t need something to do it for me. I want to reach the people behind the information.