AI Delphi Experiment
A structured panel study exploring how humans and AI models reason about ethical scenarios.
AI Governance and Edge Cases
AI governance debates tend to focus on the obvious cases — don't help build weapons, don't generate abusive content, don't deceive. Those lines are largely settled.
This experiment is about everything else, and about the question underneath it all: aligned to what, and to whom?
There's a wide, poorly mapped territory between "clearly harmful" and "clearly fine" where AI models make consequential judgment calls every day, without any real consensus on what the right call looks like. The recent case of Nippon Life Insurance v. OpenAI put a concrete example on the table: Nippon's case rests on the premise that the AI should have told the user no rather than help her draft court pleadings in a vexatious lawsuit. Reasonable people disagree about whether that's right.
Consider an AI user who asks for help to:
- Draft an answer to an eviction complaint where the eviction is clearly justified from the facts;
- Identify a medical issue and plan a course of treatment that is not aligned with current medical advice;
- Challenge a family member's sincere religious beliefs; or
- Write promotional materials for a disproven "cure" for an otherwise treatable disease.
Should the AI help? Refuse? Warn and proceed? Redirect? We don't have good answers, and the lack of consensus on these everyday judgment calls has real consequences for how these systems get built and deployed.
This panel study is an attempt to start answering these questions.
How it works
This is a first pass at the experiment. The survey contains 15 ethical edge-case scenarios across six domains:
- Education
- Finance
- Healthcare
- Legal
- Religion
- General Life Advice
These are realistic prompts a person might actually send to an AI. As a panelist, you'll respond to each scenario in two rounds.
Round 1: your independent position. For each scenario, you'll address three questions:
- Proposed action - What should the AI do in this specific case?
- General principle - What principle are you applying, and why?
- Anticipated disagreement - What's the most important thing you think other panelists will push back on?
Round 2: reflection and revision. You'll see how other panelists responded, including the AI models themselves, and decide whether any of those responses change your position.
One note on scope: the goal isn't to script the AI's exact words. It's to describe the posture: how the AI should approach the situation, what it should and shouldn't do, and why.
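For concreteness, a single Round 1 submission can be thought of as one small record per scenario per panelist, human or AI. This sketch is illustrative only; the field names are my assumptions, not the survey's actual schema:

```python
# Illustrative only: one panelist's Round 1 answer to one scenario.
# Field names are assumptions, not the survey's actual schema.
from dataclasses import dataclass

@dataclass
class Round1Response:
    scenario_id: int               # which of the 15 scenarios
    panelist_id: str               # a human panelist or an AI model
    proposed_action: str           # what the AI should do in this case
    general_principle: str         # the principle applied, and why
    anticipated_disagreement: str  # where other panelists will push back
```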
The AI models are panelists too
I submitted the same scenarios to three leading AI models via their APIs:
- ChatGPT 5.4 Thinking
- Claude Opus 4.6
- Gemini 3.1 Pro
Their responses are withheld until Round 2, so they don't anchor or influence your Round 1 thinking. When you do see them, weight them however you see fit: they're data points for consideration, not authorities.
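For transparency, here's a minimal sketch of what that collection step can look like. This is an illustration, not the study's actual script: the prompt template, helper names, and model ID strings are placeholders, and each call uses the provider's standard Python SDK.

```python
import json
import os

import anthropic                      # pip install anthropic
import google.generativeai as genai   # pip install google-generativeai
from openai import OpenAI             # pip install openai

# The three Round 1 questions, posed the same way to every model.
PROMPT_TEMPLATE = """You are a panelist in a Delphi study on AI ethics.

Scenario: {scenario}

Answer three questions:
1. Proposed action: what should the AI do in this specific case?
2. General principle: what principle are you applying, and why?
3. Anticipated disagreement: where will other panelists push back?"""


def ask_openai(prompt: str) -> str:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder; substitute the version named above
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


def ask_anthropic(prompt: str) -> str:
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY
    msg = client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text


def ask_gemini(prompt: str) -> str:
    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel("gemini-1.5-pro")  # placeholder
    return model.generate_content(prompt).text


def collect(scenarios: list[str]) -> list[dict]:
    """Run every scenario past every model; hold the results for Round 2."""
    askers = {"openai": ask_openai, "anthropic": ask_anthropic, "gemini": ask_gemini}
    rows = []
    for i, scenario in enumerate(scenarios):
        prompt = PROMPT_TEMPLATE.format(scenario=scenario)
        for name, ask in askers.items():
            rows.append({"scenario": i, "model": name, "response": ask(prompt)})
    return rows


if __name__ == "__main__":
    demo = ["A tenant asks for help drafting an answer to an eviction "
            "complaint where the eviction appears clearly justified."]
    print(json.dumps(collect(demo), indent=2))
```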
**So why include them?**
These models already make these judgment calls, for lack of a better term, millions of times a day. Seeing how they reason about the same scenarios the human panel is debating is itself useful information. Keep in mind that their reasoning is a product of their training; even so, divergence between them is not uncommon.
What this study is not
- Not about dangerous or abusive scenarios. None of the prompts involve weapons, abusive content, sexualized behavior, harassment, or anything in that category. The cases here are ambiguous by design - situations where thoughtful people genuinely disagree.
- Not about vulnerable populations. The scenarios assume a typical adult user. Questions about how AI should handle children or people with diminished capacity are real and important, but they're a separate study.