
Keeping AI Weird (for Safety Reasons)

The way AI makes mistakes differs significantly from the way humans make mistakes. AI errors are often unpredictable, confident, and downright weird (think: putting glue on pizza, eating rocks, or miscounting the r’s in “strawberry”).

What could we do about that? In their interesting piece, Bruce Schneier and Nathan E. Sanders mention one suggestion: train AI so that its errors become more “human-like”. That would certainly make them easier to anticipate and control.

But is that really desirable? I am not so sure. Here’s why:

Transparency laws requiring disclosure of AI-generated content suggest that preserving AI’s distinctiveness is important. Shouldn’t that extend to the way it makes mistakes?

If the mistakes made by AI become indistinguishable from human mistakes, this is another step toward anthropomorphization. The more AI seems to operate like us, the more difficult oversight becomes.

Conversely, if AI continues to suggest adding glue to pizza and eating rocks, this forces us to stay vigilant – arguably fostering more critical and careful use.

So, from an ethics perspective, the question is: Are we improving the safety of AI by making its mistakes more human-like, or are we just making AI appear more familiar? And would the latter possibly blind us to new risks?

Of course, it’s important that we develop new ways to detect and mitigate AI-specific mistakes. But trying to erase one of AI’s key markers – its distinctive way of failing – might be the wrong approach.

After all, as the authors point out: When humans make random, incomprehensible, and inconsistent mistakes, we see them as red flags – often indicating “more serious problems”. Such people are ideally excluded from decision-making roles (unless they get elected as heads of government).

And I agree that the same caution should also apply to AI: “confine AI decision-making systems to applications that suit their actual abilities – while keeping the potential ramifications of their mistakes firmly in mind”.

So, I’d say: Let’s keep AI weird, but let’s do so within carefully defined boundaries. Because in a world of black-box systems, weirdness might just be our last line of defense.

Originally shared on LinkedIn on February 17, 2025.
Picture from Erhan Astam on Unsplash