Models Need Hands

Why Models Need Hands

The Sealed-Off Problem

A language model is a text-prediction machine. Give it "What is 2 + 2?" and it predicts "is 4" by pattern-matching billions of examples. It works for retrieval — questions whose answers live in its training data. It fails for action.

Imagine you're a student in Seoul and you want to book a flight to Busan on June 15. A model can tell you why you'd want to go to Busan, or what to expect when you get there. But it cannot buy the ticket. It has no hands. It cannot call the airline API, read the response, adjust the booking, and confirm the purchase. It can only predict the next word that a human who had already bought a ticket would type in a forum post.

When we ask a model to accomplish a real task — "Book me a flight" — it either:

Hallucinates: Writes plausible-sounding but fake text like "Booking confirmed. PNR: A8K2X9." The model has no way to know if that PNR exists.
Refuses: Admits it cannot act and asks the human to do it.

Neither solves the problem.

What a Tool Gives the Model

A tool is a contract that says: "Here is a function you can call. Tell me its name, the parameters it needs, and I will run it, give you the result, and you can respond based on what actually happened."

A tool definition has four parts:

Name — What the function is called. Example: search_flights. The model reads this name and decides: "I need to search for flights, so I'll call this." If you name it get_random_string, the model will pick it for the wrong reasons.
Description — What the tool does and when to use it. Example: "Searches available flights from a departure city to a destination city on a specific date. Returns a list of flights with prices, airlines, and departure times." A vague description like "Does something with flights" leaves the model guessing about when to use it.

Parameter Schema — A JSON schema that specifies exactly which inputs the tool accepts: their names, types, constraints, and defaults. Example:

{
  "type": "object",
  "properties": {
    "from_city": { "type": "string", "description": "IATA code, e.g. ICN" },
    "to_city": { "type": "string", "description": "IATA code, e.g. PUS" },
    "date": { "type": "string", "format": "YYYY-MM-DD", "description": "Departure date" }
  },
  "required": ["from_city", "to_city", "date"]
}

A schema protects against garbage input. The model cannot pass a person's name where a date belongs. The schema is the guardrail.

Return Type — What the tool sends back. Example: "An array of flight objects, each with: airline (string), departure_time (HH:MM 24-hour), arrival_time (HH:MM 24-hour), price_usd (number), stops (integer)." The model needs to know what it will receive so it can parse the result and respond sensibly.

The ReAct Loop

Once the model has tools, a ReAct loop (Reason-Act-Observe) runs like this:

Reason: Model thinks, "I need to book a flight. First, let me search for options."
Act: Model calls the tool: search_flights(from_city="ICN", to_city="PUS", date="2025-06-15").
Observe: The tool runs. It returns: [{"airline": "Asiana", "departure_time": "09:30", "price_usd": 45}, {...}].
Reason (again): Model sees the result and thinks, "There are three options. Asiana at 9:30 for 45 dollars is the cheapest. I should book that."
Act (again): Model calls a second tool: book_flight(airline="Asiana", departure_time="09:30") with the customer's details.
Observe: Tool returns confirmation or error.
Respond: Model tells the human: "Done. Your flight is booked for June 15 at 9:30 AM on Asiana. Your confirmation code is XYZ."

Without tools, step 2 never happens — the model just predicts what a booked person would write. With tools, the model can actually do it.

Real Cost: Vague Definitions Trap the Loop

Here's the hard part: the clarity of your tool definition determines whether the agent succeeds or fails.

Example: You write a tool called get_weather with a vague description: "Gets weather data." The schema requires a location (string), but does not say whether it accepts a city name, a latitude/longitude pair, or an IATA airport code.

Now the model sees a task: "What's the weather in Seoul?" It calls get_weather(location="Seoul"). Your tool receives a city name, but your backend expects coordinates. It returns an error or garbage. The model observes the failure. It might retry with a different input, or it might hallucinate a response ("The weather in Seoul is sunny and 22°C"), or it might give up.

In contrast, a precise tool definition prevents this:

{
  "name": "get_weather",
  "description": "Retrieves current weather for a location. Always use the city's full Korean name (한글) or its English name followed by 'Seoul'/'Busan'/'Daegu'.",
  "parameters": {
    "location_name": {
      "type": "string",
      "description": "City name in Korean or English (e.g., 서울, Seoul, Busan). Do not use coordinates."
    }
  }
}

Now when the model calls get_weather(location_name="Seoul"), your tool receives exactly what it expects.

Why This Matters

Tools are how autonomous AI systems work in the real world. A customer-service chatbot is sealed off until you give it tools to look up orders, check inventory, initiate refunds. A research assistant is powerless until you give it tools to query databases, fetch papers, run calculations.

You are not building a chatbot. You are building a reasoner with hands. The clarity of your hand (the tool definition) determines what it can actually do.