Can a stranger, with no instructions, get a useful answer from your deployed agent?

The Live Deployment

Context

Your team has built a chat agent that answers questions about your project. It works on localhost. You have a public endpoint ready and a chat widget waiting to be embedded. You have 45 minutes and one user test: put the agent in front of someone who has never seen it, watch what breaks first, and log it.

Mission

Deploy the agent to a public URL with a chat widget, write a one-line welcome a teenager would understand, run a cold-user test unsupervised, and log every pause with a diagnosis.

Finish Line

The deployed widget URL and the logged test friction feed forward into the final presentations as a real artefact.

Deliverables

Live Deployment

A public HTTPS URL where anyone can chat with your agent, with the API key kept server-side and the link ready to paste into the group chat.

Team Roles

Release Engineer

Owns the deployment pipeline and the secrets.
- Deploy the agent to a live public URL with secrets held in environment variables, not in the code. As proof, provide a curl command showing that the endpoint responds with a valid answer, and confirm that no hardcoded secrets are visible.
- Hand over the public link to the Product Lead and confirm the widget loads without errors on a phone.
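The Release Engineer's two proofs (a working curl command and no hardcoded secrets) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the environment variable name `AGENT_API_KEY`, the `/chat` path, and the `{"message": ...}` payload shape are hypothetical, since the brief does not name them, and the secret scan is a rough heuristic rather than a real secret scanner.

```python
import json
import os
import re
import shlex

# Hypothetical name: the brief does not say what the variable is called.
API_KEY_ENV = "AGENT_API_KEY"


def load_api_key(env=os.environ):
    """Read the key from the environment and fail loudly if it is missing."""
    key = env.get(API_KEY_ENV, "").strip()
    if not key:
        raise RuntimeError(f"{API_KEY_ENV} is not set; refusing to start")
    return key


def build_curl_command(base_url, question):
    """Build a curl command that posts one question to the public endpoint.

    The /chat path and the JSON payload shape are assumptions.
    """
    payload = json.dumps({"message": question})
    return " ".join([
        "curl", "-sS", "-X", "POST",
        shlex.quote(base_url.rstrip("/") + "/chat"),
        "-H", shlex.quote("Content-Type: application/json"),
        "-d", shlex.quote(payload),
    ])


# Heuristic only: flags lines such as  api_key = "long-literal-value"
SECRET_PATTERN = re.compile(
    r"""(api[_-]?key|secret|token)\s*[:=]\s*["'][^"']{8,}["']""",
    re.IGNORECASE,
)


def find_hardcoded_secrets(source_text):
    """Return 1-based line numbers that look like a hardcoded secret."""
    return [
        number
        for number, line in enumerate(source_text.splitlines(), start=1)
        if SECRET_PATTERN.search(line)
    ]
```

Note that the curl command carries no key at all: the browser and the tester only ever talk to your endpoint, and the server adds the key when it calls the model provider.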

Product Lead

Owns the welcome message and the cold test.
- Write a one-line welcome message using words a 14-year-old would understand, under 15 words, naming the one thing the chat can do (e.g., 'Ask me anything about this project').
- Hand the link to the Cold User Tester, stay silent for the full test, and hand the log to the On-Call Responder.

On-Call Responder

Owns the test log and the diagnosis.
- Run the cold-user test unsupervised (no hints, no hovering); log every pause with a timestamp and what the tester was trying to do.
- After the test, write a 2–3 sentence diagnosis: what broke first, why it broke, and one hypothesis for fixing it.
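One way to keep the pause log consistent is a tiny in-memory structure that records each pause with a timestamp and what the tester was trying to do. This is a sketch, not a prescribed format: the class and field names (`PauseEntry`, `FrictionLog`, `trying_to`, `observed`) are hypothetical, and a shared document with the same columns would work just as well.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class PauseEntry:
    timestamp: datetime
    trying_to: str      # what the tester was attempting when they paused
    observed: str = ""  # optional note on what happened on screen


class FrictionLog:
    def __init__(self):
        self.entries = []

    def log_pause(self, trying_to, observed="", now=None):
        """Record one pause; `now` can be injected to make testing easy."""
        stamp = now or datetime.now(timezone.utc)
        entry = PauseEntry(stamp, trying_to, observed)
        self.entries.append(entry)
        return entry

    def first_break(self):
        """Return the earliest pause, the starting point for the diagnosis."""
        if not self.entries:
            return None
        return min(self.entries, key=lambda e: e.timestamp)

    def to_text(self):
        """Render the log as one line per pause, in time order."""
        ordered = sorted(self.entries, key=lambda e: e.timestamp)
        return "\n".join(
            f"{e.timestamp:%H:%M:%S} | {e.trying_to} | {e.observed}"
            for e in ordered
        )
```

The earliest entry answers "what broke first"; the why and the fix hypothesis still have to be written by a person.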

Cold User Tester

Owns the signal: which friction points matter most.
- Use the chat widget blind; think aloud and describe what made you pause or re-read.
- After your first question, write one sentence naming exactly what made you hesitate and rate it: minor (did not stop me), major (made me reconsider), blocker (gave up).
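The three-level rating maps naturally onto an ordered enumeration, which makes it easy to find the worst friction point across several notes. The sketch below uses the labels and definitions from the brief; the helper names `parse_rating` and `worst_rating` are hypothetical.

```python
from enum import IntEnum


class Severity(IntEnum):
    MINOR = 1    # did not stop me
    MAJOR = 2    # made me reconsider
    BLOCKER = 3  # gave up


DEFINITIONS = {
    Severity.MINOR: "did not stop me",
    Severity.MAJOR: "made me reconsider",
    Severity.BLOCKER: "gave up",
}


def parse_rating(label):
    """Turn a tester's written label into a Severity; unknown labels raise."""
    try:
        return Severity[label.strip().upper()]
    except KeyError:
        raise ValueError(f"unknown rating: {label!r}") from None


def worst_rating(labels):
    """Return the most severe rating in a list of labels, or None if empty."""
    ratings = [parse_rating(label) for label in labels]
    return max(ratings, default=None)
```

For example, `worst_rating(["minor", "Blocker", "major"])` returns `Severity.BLOCKER`, which tells the team which hesitation to fix first.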

Exemplars

Devin — the first AI software engineer
Cognition AI

Landmark deployed autonomous agent (shell + editor + browser, long-horizon planning) demoed end-to-end — the bar a JARVIS capstone showcase aims at.