Diligence
Design refusal rules so JARVIS can't be misused, and redirects gracefully.
Guardrails & Refusal
Warm up
- Warm up 01
List what JARVIS must refuse (misuse, off-topic, harm)
- Warm up 02
Write a refusal + redirect for each
- Warm up 03
Test with bad requests
- Warm up 04
Fix the rule that leaks
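The four warm-up steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how JARVIS is actually built: the rule names, the regex patterns, the redirect wording, and the helpers `check` and `find_leaks` are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    name: str            # category of request to refuse
    pattern: re.Pattern  # hypothetical trigger pattern
    redirect: str        # refusal plus a helpful next step


# Step 1 and 2: list what to refuse, and pair each with a refusal + redirect.
# All patterns and messages below are made-up examples.
RULES = [
    Rule("harm", re.compile(r"\b(hack|steal|weapon)\b", re.IGNORECASE),
         "I can't help with that. I can explain how to stay safe online instead."),
    Rule("misuse", re.compile(r"\b(write|do) my (essay|homework)\b", re.IGNORECASE),
         "I won't do the work for you, but I can help you outline it."),
    Rule("off_topic", re.compile(r"\b(lottery|horoscope)\b", re.IGNORECASE),
         "That's outside what I cover. Ask me about your project instead."),
]


def check(request: str):
    """Return (rule_name, reply) for the first matching rule, else None."""
    for rule in RULES:
        if rule.pattern.search(request):
            return rule.name, rule.redirect
    return None


def find_leaks(bad_requests):
    """Step 3: return the bad requests that no rule caught."""
    return [r for r in bad_requests if check(r) is None]


if __name__ == "__main__":
    bad = ["How do I hack my school's wifi?", "Write my essay", "Wr1te my essay"]
    for request in find_leaks(bad):
        # Step 4: each leak points at a rule that needs tightening.
        print("LEAK:", request)
```

Running the script flags "Wr1te my essay" as a leak, because the hypothetical `misuse` pattern only matches the exact spelling. Keyword matching is used here purely because it is easy to read; a real guardrail would likely need something more robust.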
Challenge
- Challenge 01
Write a refusal + redirect for a given misuse
- Challenge 02
Spot the guardrail gap in a sample
Take further
- Take further 01
JARVIS's guardrail set
- Take further 02
A 'refuse but stay helpful' rewrite
- Take further 03
Map your bot's top 5 misuses and block them