Threat Model Your Agent

You are building a JARVIS-style agent for your Week 3 capstone project. Before you write the system prompt, write the threat model. Identify: (1) your agent's purpose (e.g. 'a customer-support chatbot for an online store', 'a research assistant for academic papers'), (2) its four components (system prompt, RAG knowledge base, function calls, output filter) and what data or actions each one holds, (3) at least two attack vectors per component, each one concrete (a specific prompt or input, not a vague risk), and (4) a severity rating from 1 to 5 for each attack. Format the result as a markdown table with the columns: Component | What it holds | Attack 1 | Severity | Attack 2 | Severity. The best threat models are specific to your agent, not generic. A chatbot that recommends products faces different attacks than one that processes refunds.
