I wish I had a nickel for every time ChatGPT, Claude, or Gemini told me I’d hit the nail on the head, stumbled onto a genius idea, or otherwise patted me on the back for a half-formed idea or ill-conceived plan.
Flattery and premature congratulations are common foibles of generative AI chatbots, with some models more susceptible to being “yes-bots” than others. But even as LLM providers have become aware of AI sycophancy and are training their models to be more critical, it’s still easy to get an AI to enthusiastically endorse a shaky theory that doesn’t deserve it.
Luckily, there’s a style of prompting that can make even the most obsequious AI models stop in their tracks. This type of prompting goes by various names—I’ve heard it called “failure-first” prompting as well as “inversion” prompting, and it’s frequently used by coders looking to “pressure-test” the dubious suggestions of an AI coding agent.
There are many different versions of it, but they all follow more or less the same formula: asking the AI to first consider possible points of failure before offering its solution, suggestion, or plan.
Here’s one example from the /r/ChatGPTPromptGenius subreddit:
Before answering, list what would break this fastest, where the logic is weakest, and what a skeptic would attack. Then give the corrected answer.
Here’s another variation, proposed by a member of the University of Iowa’s AI Support Team:
Pretend you disagree with this recommendation. What is the strongest counterargument?
And here’s yet another, as proposed by my own custom-built AI personal assistant:
Before providing your final recommendation, identify 3-5 specific ways your proposed solution could fail or where the logic is most likely to break. Act as a harsh skeptic or a “Red Team” auditor. Only after listing and explaining these failure modes should you provide the final solution, incorporating safeguards against those specific risks.
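If you call a model programmatically rather than through a chat window, the same formula can be baked into a reusable wrapper. Here’s a minimal Python sketch of the idea; the function name and the exact preamble wording are my own illustrations, not part of any library or provider API:

```python
# Illustrative sketch: prepend a "failure-first" preamble to any prompt
# before sending it to an LLM. The preamble text paraphrases the formula
# above; adjust it to taste.

FAILURE_FIRST_PREAMBLE = (
    "Before answering, list three to five specific ways the proposed idea "
    "could fail or where its logic is weakest, as a harsh skeptic would. "
    "Only then give your final answer, incorporating safeguards against "
    "those failure modes.\n\n"
)

def pressure_test(prompt: str) -> str:
    """Wrap a user prompt so the model must critique before it concludes."""
    return FAILURE_FIRST_PREAMBLE + prompt

# The wrapped string is what you'd pass as the user message to your
# model provider of choice.
wrapped = pressure_test("Should we rewrite our billing system from scratch?")
print(wrapped)
```

The point of the wrapper is simply that the critique request always arrives first, so the model commits to failure modes before it commits to a recommendation.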
Interestingly, many of those who’ve adopted “pressure-testing” or “inversion” prompting credit the mental models championed by investor Charlie Munger, the longtime Berkshire Hathaway vice chairman and business partner of Warren Buffett.
One of Munger’s favorite mental models was “invert, always invert.” Boiled down, it says that rather than first considering how to achieve a goal, you should instead focus on how you might fail at it.
I’ve tried this “pressure test” prompt plenty of times myself, and it almost always makes my AI companion hit the brakes and poke holes in its own arguments before proceeding.
“Let’s put the initial plan through the wringer,” Gemini said after I challenged it with a “failure-first” prompt recently, although not before gushing that “I love this approach.”
Seems I hit the nail on the head yet again.