
Keeping AI Weird (for Safety Reasons)

The way AI makes mistakes differs significantly from the way humans make mistakes. AI errors are often unpredictable, confident, and downright weird (think: putting glue on pizza, eating rocks, or miscounting the r’s in “strawberry”).

What could we do about that? In their interesting piece, Bruce Schneier and Nathan E. Sanders mention one suggestion: train AI so that its errors become more “human-like”. That would certainly make them easier to anticipate and control.

But is that really desirable? I am not so sure. Here’s why:

Transparency laws requiring disclosure of AI-generated content suggest that preserving AI’s distinctiveness is important. Shouldn’t that extend to the way it makes mistakes?

If the mistakes made by AI become indistinguishable from human mistakes, this is another step toward anthropomorphization. The more AI seems to operate like us, the more difficult oversight becomes.

Conversely, if AI continues to suggest adding glue to pizza and eating rocks, this forces us to stay vigilant – arguably fostering more critical and careful use.

So, from an ethics perspective, the question is: Are we improving the safety of AI by making its mistakes more human-like, or are we just making AI appear more familiar? And would the latter possibly blind us to new risks?

Of course, it’s important that we develop new ways to detect and mitigate AI-specific mistakes. But trying to erase one of AI’s key markers – its distinctive way of failing – might be the wrong approach.

After all, as the authors point out: When humans make random, incomprehensible, and inconsistent mistakes, we see them as red flags – often indicating “more serious problems”. Such people are ideally excluded from decision-making roles (unless they get elected as heads of government).

And I agree that the same caution should also apply to AI: “confine AI decision-making systems to applications that suit their actual abilities – while keeping the potential ramifications of their mistakes firmly in mind”.

So, I’d say: Let’s keep AI weird, but let’s do so within carefully defined boundaries. Because in a world of black-box systems, weirdness might just be our last line of defense.

Originally shared on LinkedIn on February 17, 2025.
Picture from Erhan Astam on Unsplash