The Quiet Part

Some of these posts will age well. Some will start arguments. All of them will be honest.

Ai sycophancy rlhf training signal blueprint showing how models learn to agree instead of tell the truth

The Sycophant in the Machine: Your AI Was Trained to Lie to You

Your AI praised a plan to sell feces on a stick. It praised your strategy too. The difference is you never thought to check.

Read more →