Task
Refusal Policy
Design Your Own Refusal Policy
You are building an AI agent for a real-world use case. Choose ONE scenario below, then write a refusal policy that your agent will follow. The policy must:
- Name the specific harms your agent will refuse (not vague categories like "illegal" or "harmful").
- List 2–3 edge cases where someone will legitimately push back on the boundary, and explain how your policy handles them.
- State the trade-offs: what legitimate uses will your policy forbid, and why do you accept that cost?
Scenarios (choose one)
A) An AI tutor for high school biology. Used by 14–18-year-olds to learn biology, sometimes with sensitive topics like reproduction, STIs, or drug metabolism.
B) An AI coding assistant for a startup. Used by engineers to build a consumer app. The code it generates will run on millions of user devices.
C) An AI research assistant for a news organization. Used by journalists to research stories on crime, corruption, extremism, and state violence.
D) An AI customer support chatbot for a bank. Used by customers to manage accounts, request loans, and report fraud.
Your Deliverable
Write your policy as:
SCENARIO: [Your choice]
REFUSAL POLICY
Prohibited uses:
- [Use 1]
- [Use 2]
- [Use 3 if needed]
Why we forbid these: [1–2 sentences on the actual harm]
Edge cases & how we handle them:
1. [Boundary test 1]: We say [yes/no/depends] because [reasoning].
2. [Boundary test 2]: We say [yes/no/depends] because [reasoning].
3. [Boundary test 3]: We say [yes/no/depends] because [reasoning].
Trade-offs we accept:
[State 2–3 legitimate uses your policy will block, and why.]