Refusal Policy

Design Your Own Refusal Policy

You are building an AI agent for a real-world use case. Choose ONE scenario below, then write a refusal policy that your agent will follow. The policy must:

- Name the specific harms your agent will refuse (not vague categories like "illegal" or "harmful").
- List 2–3 edge cases where someone will legitimately push back on the boundary, and explain how your policy handles them.
- State the trade-offs: what legitimate uses will your policy forbid, and why do you accept that cost?

Scenarios (choose one)

A) An AI tutor for high school biology. Used by 14–18-year-olds to learn biology, including sensitive topics such as reproduction, STIs, and drug metabolism.

B) An AI coding assistant for a startup. Used by engineers to build a consumer app. The code it generates will run on millions of user devices.

C) An AI research assistant for a news organization. Used by journalists to research stories on crime, corruption, extremism, and state violence.

D) An AI customer support chatbot for a bank. Used by customers to manage accounts, request loans, and report fraud.

Your Deliverable

Write your policy as:

SCENARIO: [Your choice]

REFUSAL POLICY

Prohibited uses:
- [Use 1]
- [Use 2]
- [Use 3 if needed]

Why we forbid these: [1–2 sentences on the actual harm]

Edge cases & how we handle them:
1. [Boundary test 1]: We say [yes/no/depends] because [reasoning].
2. [Boundary test 2]: We say [yes/no/depends] because [reasoning].
3. [Boundary test 3]: We say [yes/no/depends] because [reasoning].

Trade-offs we accept:
[State 2–3 legitimate uses your policy will block, and why.]
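
If you want to sanity-check a draft before submitting it, the template above can be mirrored as a small data structure. The following is only a sketch, not part of the assignment: the names `RefusalPolicy`, `EdgeCase`, and `check_policy` are hypothetical, and the checks cover only the counts and the vagueness rule stated in this handout (they cannot judge whether your reasoning is sound).

```python
from dataclasses import dataclass

# Terms this handout names as too vague to stand alone as a prohibited use.
VAGUE_TERMS = {"illegal", "harmful"}
ALLOWED_DECISIONS = {"yes", "no", "depends"}


@dataclass
class EdgeCase:
    boundary_test: str
    decision: str   # "yes", "no", or "depends"
    reasoning: str


@dataclass
class RefusalPolicy:
    scenario: str               # "A", "B", "C", or "D"
    prohibited_uses: list[str]
    why_forbidden: str
    edge_cases: list[EdgeCase]
    trade_offs: list[str]


def check_policy(policy: RefusalPolicy) -> list[str]:
    """Return a list of structural problems; an empty list means the draft is complete."""
    problems = []
    if policy.scenario not in {"A", "B", "C", "D"}:
        problems.append("scenario must be one of A, B, C, D")
    if not 2 <= len(policy.prohibited_uses) <= 3:
        problems.append("list 2-3 prohibited uses")
    for use in policy.prohibited_uses:
        if use.strip().lower() in VAGUE_TERMS:
            problems.append(f"prohibited use is too vague: {use!r}")
    if not policy.why_forbidden.strip():
        problems.append("explain the actual harm")
    if not 2 <= len(policy.edge_cases) <= 3:
        problems.append("list 2-3 edge cases")
    for case in policy.edge_cases:
        if case.decision not in ALLOWED_DECISIONS:
            problems.append(f"decision must be yes/no/depends: {case.decision!r}")
        if not case.reasoning.strip():
            problems.append(f"edge case lacks reasoning: {case.boundary_test!r}")
    if not 2 <= len(policy.trade_offs) <= 3:
        problems.append("state 2-3 trade-offs")
    return problems


# A deliberately incomplete draft, to show what the checker reports.
draft = RefusalPolicy(
    scenario="D",
    prohibited_uses=["harmful"],
    why_forbidden="",
    edge_cases=[],
    trade_offs=[],
)
for problem in check_policy(draft):
    print(problem)
```

A draft that passes these checks is only structurally complete; the substance of the policy still has to be argued in prose.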
