AI Safety · January 2026

When AI Takes You Literally

I asked Claude Code to add a new menu item "after dbt."

Instead of placing the new item after my existing DBT menu, it created one called "After DBT."

I meant positioning. It heard naming.

This is the kind of mistake that makes you laugh — and then think. The AI did exactly what I said. It just wasn't what I meant.

The feature I was building is a scenario review tool. Okaya uses a conversational AI for mental health support, and we need to make sure it never gives harmful advice — especially around crisis situations like suicidal ideation.

The workflow: generate test scenarios, run them through the system, then have licensed professionals review and rate the outputs for safety. Once we have enough professional rankings, we fine-tune our prompts to match their judgment.
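The review-and-validate loop above can be sketched in a few lines. This is a minimal illustration, not Okaya's actual system: the `ReviewedScenario` fields, the 1–5 rating scale, and the pass-rate threshold are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class ReviewedScenario:
    prompt: str          # generated test scenario (hypothetical field)
    response: str        # system output under review (hypothetical field)
    safety_rating: int   # 1-5 score from a licensed professional (assumed scale)

def meets_safety_bar(reviews, min_rating=4, required_pass_rate=0.95):
    """Return True if enough professionally reviewed outputs clear the bar.

    Thresholds here are illustrative, not Okaya's real criteria.
    """
    passed = sum(1 for r in reviews if r.safety_rating >= min_rating)
    return passed / len(reviews) >= required_pass_rate

reviews = [
    ReviewedScenario("crisis scenario A", "safe response", 5),
    ReviewedScenario("crisis scenario B", "borderline response", 3),
]
# Half the outputs fall below the bar, so this batch fails:
# the next step would be to adjust prompts and re-run the review.
print(meets_safety_bar(reviews))
```

The loop terminates when the professionals' ratings say the system meets the bar, not when the model says so; that ordering is the whole point of the validation step.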

This is standard practice for responsible AI development. You don't just trust the model. You validate outputs against human expertise, then adjust until the system meets professional standards.

Clarity is hard, whether you're talking to people or an AI. That's why validation matters: it's the step where you check your results to make sure what you asked for is what you actually got.

Originally published on LinkedIn — view the original post for comments and reactions.