Agent Ethics
design-refusal-boundaries · case
Aim
Learn to draw and defend refusal boundaries by configuring guardrails under pressure, and produce a refusal policy as the result.
In one line
Draw JARVIS's refusal boundaries — define what it won't do and defend your guardrails.
Stage run
| # | Stage | Type | Aim | Min |
|---|---|---|---|---|
| 1 | Agent Ethics | arcade | Make a moral call under the clock before you reason about who is liable. | |
| 2 | Agent Ethics | dossier | | |
| 3 | Guardrails & Refusal | workbench | Learn to design guardrails and refusal pairs that decide when your agent must not act, bounding what it will and won’t do at the booth so that your penpal stays safe with strangers. | |
| 4 | The Switch | larp | Can you write refusal boundaries precise enough to stop five real agent failures, and defend each rule under stress-test pressure? | |
Evidence artefact
Refusal policy (baked into the persona)