When Claude chooses which tool to call, it reads the tool descriptions you provided. That is it. There is no hidden routing layer, no intent classifier, no fallback lookup. If your descriptions are vague, overlapping, or missing edge-case guidance, the model will misroute, and it will do so silently, with full confidence.
This is the core subject of Domain 2 of the Claude Certified Architect - Foundations exam (Tool Design & MCP Integration, 18% weight). Domain 2 tests whether you can design tool boundaries that produce consistent, reliable selection behavior at scale.
Why minimal descriptions fail
Consider a tool registered with this description:
{
  "name": "search",
  "description": "Search for information.",
  "input_schema": { ... }
}
The model has no basis for choosing search over any other tool. "Search for information" could describe a large share of the tools in any realistic tool set. In a multi-tool environment, this produces selection behavior that looks random but is actually consistent: consistently based on nothing useful.
The minimum viable description answers four questions:
- What does this tool return? Not what it does, but what it returns.
- What inputs does it expect? Format, type, any constraints.
- When should it be used over alternatives? Explicit comparison to similar tools.
- When should it NOT be used? Edge cases and out-of-scope inputs.
A rewritten version:
{
  "name": "search_web",
  "description": "Returns a ranked list of URLs and page excerpts from a live web index. Use when the user's question requires current information (published within the last 90 days) or when the answer is not available in internal documents. Do NOT use for queries about internal company policies, product catalog data, or historical records; use search_internal_docs for those.",
  "input_schema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "Search query in natural language or keyword form. Do not include URLs."
      }
    },
    "required": ["query"]
  }
}
The added boundary explanation (Do NOT use for...) is the highest-leverage addition. It disambiguates selection when two tools overlap in apparent purpose.
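The four questions can be turned into a rough pre-registration check. The sketch below is only an illustration: the function name lint_tool_description, the 15-word threshold, and the keyword heuristics are assumptions made for this example, not rules from the exam guide or any SDK. It does not inspect input_schema, which is where the second question (expected inputs) is answered.

```python
import re

# Hypothetical checker: the heuristics are illustrative assumptions,
# not rules taken from the exam guide or from any SDK.
def lint_tool_description(tool: dict, other_tool_names: list) -> list:
    """Return a list of warnings for one tool definition dict."""
    description = tool.get("description", "")
    warnings = []

    # Very short descriptions rarely say what is returned or when to use the tool.
    if len(description.split()) < 15:
        warnings.append("description is under 15 words")

    # Question 1: does the description say what the tool returns?
    if not re.search(r"\breturns?\b", description, re.IGNORECASE):
        warnings.append("does not state what the tool returns")

    # Question 4: is there explicit negative boundary language?
    if not re.search(r"\bdo not use\b", description, re.IGNORECASE):
        warnings.append("no 'Do NOT use' boundary")

    # Question 3: does it point to at least one alternative tool by name?
    if other_tool_names and not any(n in description for n in other_tool_names):
        warnings.append("does not name an alternative tool")

    return warnings


vague = {"name": "search", "description": "Search for information."}
specific = {
    "name": "search_web",
    "description": (
        "Returns a ranked list of URLs and page excerpts from a live web index. "
        "Use when the user's question requires current information. "
        "Do NOT use for queries about internal company policies; "
        "use search_internal_docs for those."
    ),
}

print(lint_tool_description(vague, ["search_internal_docs"]))
print(lint_tool_description(specific, ["search_internal_docs"]))
```

A check like this only catches missing structure. Whether the boundary language is actually correct still needs a human review against the other tools in the set.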
Naming tools to prevent overlap (Task Statement 2.1)
Tool names contribute to selection before descriptions are fully read. Names should be distinct enough that a partial match does not create ambiguity. Two problematic patterns:
- Generic verbs: search, get, fetch, process. These names could apply to almost any tool in the set.
- Symmetric names: analyze_content and extract_content sound different to a human but share the same object and are semantically very close. The model may call either when the task involves content.
Preferred naming convention: {verb}_{domain}_{specificity}
- search_web_current: live web index
- search_internal_docs: internal document store
- extract_web_results: structured extraction from web output
- analyze_sentiment: sentiment analysis, not general analysis
The exam presents scenarios where tool name overlap causes selection failures and asks you to identify the fix. The fix is almost always renaming combined with description boundary language.
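The two problematic patterns above can be screened for mechanically. This is a minimal sketch under stated assumptions: find_naming_problems is a hypothetical helper, the generic-verb set is copied from the list above, and "differs only by verb" is a heuristic that will also flag some legitimate pairs, so treat its output as a prompt for review rather than a verdict.

```python
# Generic verbs taken from the list above; extend as needed.
GENERIC_NAMES = {"search", "get", "fetch", "process"}

def find_naming_problems(tool_names: list) -> list:
    """Flag bare generic verbs and names that differ only in their verb."""
    problems = []

    # Pattern 1: the whole name is a generic verb.
    for name in tool_names:
        if name in GENERIC_NAMES:
            problems.append(f"{name}: bare generic verb")

    # Pattern 2: group names by everything after the first underscore,
    # so analyze_content and extract_content land in the same group.
    by_suffix = {}
    for name in tool_names:
        _verb, _, rest = name.partition("_")
        if rest:
            by_suffix.setdefault(rest, []).append(name)
    for names in by_suffix.values():
        if len(names) > 1:
            problems.append(f"{', '.join(sorted(names))}: differ only by verb")

    return problems


risky = ["search", "analyze_content", "extract_content"]
preferred = [
    "search_web_current",
    "search_internal_docs",
    "extract_web_results",
    "analyze_sentiment",
]
print(find_naming_problems(risky))
print(find_naming_problems(preferred))
```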
Scoping tool sets per agent role (Task Statement 2.2)
Giving an agent 18 tools degrades reliability in two ways:
- Distraction: The model considers irrelevant tools on every turn, increasing the chance of spurious selection.
- Ambiguity expansion: Each additional tool with overlapping semantics widens the set of queries for which more than one tool looks like a plausible choice.
The reliable pattern is role-scoped tool sets: each agent receives only the tools needed for its specific function. A web research subagent gets search_web, fetch_page, extract_web_results. A document analysis subagent gets load_document, search_internal_docs, extract_document_section. Neither has access to the other's tools.
In the Claude Agent SDK, you implement this by passing different tools arrays to different agent instances. In MCP terms, you connect different MCP servers to different agents, or configure the same server with filtered tool exposure.
Empirical guideline from the exam guide: 4-5 tools per agent role maintains reliable selection. Beyond 8-10 tools in a single context, misrouting rates increase measurably on complex queries.
The exam tests this as a reliability question: "An agent with 15 tools is misrouting queries. What is the most effective architectural change?" The answer is decomposing into role-scoped subagents with smaller tool sets, not rewriting descriptions.
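One way to picture role scoping is as a filter over a shared tool registry. In the sketch below the role and tool names come from the text above, while ROLE_TOOLS, build_registry, tools_for_role, the placeholder descriptions, and the MAX_TOOLS_PER_ROLE guard are assumptions made for illustration. The list each call returns is what you would hand to that agent as its tool definitions; how that is wired up depends on your SDK or MCP setup.

```python
# Role-to-tool mapping from the text; everything else is an illustrative assumption.
ROLE_TOOLS = {
    "web_research": ["search_web", "fetch_page", "extract_web_results"],
    "document_analysis": [
        "load_document",
        "search_internal_docs",
        "extract_document_section",
    ],
}
MAX_TOOLS_PER_ROLE = 5  # upper end of the 4-5 guideline quoted above

def build_registry(names: list) -> dict:
    """Build placeholder tool definitions keyed by tool name."""
    return {
        name: {
            "name": name,
            "description": f"(full description for {name})",
            "input_schema": {"type": "object", "properties": {}},
        }
        for name in names
    }

def tools_for_role(role: str, registry: dict) -> list:
    """Return only the tool definitions that the given role may see."""
    names = ROLE_TOOLS[role]
    if len(names) > MAX_TOOLS_PER_ROLE:
        raise ValueError(f"{role} has {len(names)} tools; consider splitting the role")
    return [registry[name] for name in names]


all_names = [name for names in ROLE_TOOLS.values() for name in names]
registry = build_registry(all_names)
web_tools = tools_for_role("web_research", registry)
doc_tools = tools_for_role("document_analysis", registry)

print([tool["name"] for tool in web_tools])
print([tool["name"] for tool in doc_tools])
```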
tool_choice: auto, any, and forced (Task Statement 2.3)
The tool_choice parameter controls whether Claude must call a tool and which:
| Setting | Behavior | Use case |
|---|---|---|
| auto | Claude decides whether to call a tool or return text | Most agentic turns; default for open-ended tasks |
| any | Claude must call at least one tool | Steps where a tool call is mandatory but the specific tool is flexible |
| { type: "tool", name: "X" } | Claude must call tool X | Forced completion steps, structured submission, testing specific tool behavior |
Choosing the wrong tool_choice is a common exam answer trap. any does not guarantee the right tool is called; it only guarantees some tool is called. If you need a specific tool (like submit_result at the end of a workflow), use forced tool_choice with the tool name.
When to use any: multi-step retrieval where the agent should gather information (any retrieval tool is acceptable) before synthesis.
When to use forced: the final extraction step where the output must conform to a specific schema via a specific tool call.
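The three settings map onto three small dictionaries in a request body. The sketch below shows that mapping; the helper tool_choice_for_step and its step_kind labels are hypothetical names introduced here, while the returned shapes are the auto, any, and named-tool forms from the table above.

```python
from typing import Optional

def tool_choice_for_step(step_kind: str, required_tool: Optional[str] = None) -> dict:
    """Map a workflow step to a tool_choice value.

    step_kind is a hypothetical label used only in this sketch:
      "open"          -> the model may answer in text or call a tool
      "any_tool"      -> some tool call is mandatory, the model picks which
      "specific_tool" -> one named tool must be called
    """
    if step_kind == "open":
        return {"type": "auto"}
    if step_kind == "any_tool":
        return {"type": "any"}
    if step_kind == "specific_tool":
        if not required_tool:
            raise ValueError("specific_tool steps need required_tool")
        return {"type": "tool", "name": required_tool}
    raise ValueError(f"unknown step kind: {step_kind}")


print(tool_choice_for_step("open"))
print(tool_choice_for_step("any_tool"))
print(tool_choice_for_step("specific_tool", "submit_result"))
```

Keeping this decision in one place makes it harder to ship a final submission step that uses any when it needs the forced form.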
isError flag and structured error responses (Task Statement 2.4)
Tools that fail should not return empty results or generic strings. The MCP protocol provides the isError flag on the tool call result for this purpose (the corresponding field on a Messages API tool_result block is is_error). The result's content is a list of content blocks, so the structured error fields travel as serialized JSON inside a text block:
{
  "content": [
    {
      "type": "text",
      "text": "{\"errorCategory\": \"RATE_LIMIT\", \"isRetryable\": true, \"attemptedQuery\": \"latest AAPL earnings\", \"suggestedAlternative\": \"search_internal_docs\", \"message\": \"External search rate limit reached. Retry after 30 seconds or use internal document search for historical data.\"}"
    }
  ],
  "isError": true
}
Four fields that matter in structured error responses:
- errorCategory: Machine-readable type (RATE_LIMIT, NOT_FOUND, PERMISSION_DENIED, VALIDATION_ERROR). Enables coordinator-level routing logic.
- isRetryable: Boolean. Tells the coordinator whether retrying the same call makes sense.
- attemptedQuery: What was tried. Prevents the model from re-attempting the same failed query in different phrasing.
- suggestedAlternative: Which tool or approach to try instead. Enables intelligent recovery without coordinator-level hardcoding.
The isError: true flag is how the coordinator distinguishes an empty result (tool ran successfully, found nothing) from a failure (tool did not complete). Both can return content, but only failures set isError: true.
This distinction is tested directly in Domain 2 and Domain 5 questions: "A tool returns an empty array. Is this a success or failure?" Without isError, you cannot tell from the result alone.
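A coordinator can act on these fields with very little logic. The sketch below is a minimal illustration, assuming the error flag and the structured fields have already been decoded into Python values. The function plan_recovery, its action labels, and the max_retries default are hypothetical and not part of MCP or any SDK.

```python
def plan_recovery(is_error: bool, payload, attempts_so_far: int = 0,
                  max_retries: int = 2) -> dict:
    """Decide the next step from a decoded tool result.

    payload is the decoded result: a list of hits on success, or a dict with
    the structured error fields on failure (an assumption of this sketch).
    """
    if not is_error:
        # The tool ran. An empty payload means "found nothing", not "failed".
        if not payload:
            return {"action": "report_no_results"}
        return {"action": "use_result"}

    query = payload.get("attemptedQuery")

    # Retry only when the tool says it is worthwhile and budget remains.
    if payload.get("isRetryable") and attempts_so_far < max_retries:
        return {"action": "retry", "query": query}

    # Otherwise follow the tool's own suggestion, if it gave one.
    alternative = payload.get("suggestedAlternative")
    if alternative:
        return {"action": "switch_tool", "tool": alternative, "query": query}

    # Nothing actionable: surface the category to a human or a parent agent.
    return {"action": "escalate", "category": payload.get("errorCategory", "UNKNOWN")}


rate_limited = {
    "errorCategory": "RATE_LIMIT",
    "isRetryable": True,
    "attemptedQuery": "latest AAPL earnings",
    "suggestedAlternative": "search_internal_docs",
}
print(plan_recovery(False, []))                           # empty but successful
print(plan_recovery(True, rate_limited))                  # first failure
print(plan_recovery(True, rate_limited, attempts_so_far=2))  # retry budget spent
```

The first branch is the point of the flag: without it, the empty list and the failure would be indistinguishable to the coordinator.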
MCP configuration: project vs user scope
The .mcp.json file controls which MCP servers are available in a given context. Two scopes matter:
Project scope (.mcp.json in the project root): Server definitions committed to the repository. Every team member working on the project gets the same tool set. Use for tools that are part of the project's defined workflow (internal APIs, project-specific data sources).
User scope (~/.claude.json in Claude Code, or the equivalent for your client): Personal server definitions not shared with the team. Use for developer-local tools (local database connections, personal API keys, debugging tools that should not run in CI).
The exam tests this as a configuration question: "A team member adds a tool that should only run on their local machine during development. Which scope is correct?" Answer: user scope, not project scope. Committing development-only tools to .mcp.json exposes them in CI and to all team members.
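Conflict resolution when the same server name appears in more than one scope: Claude Code's documentation lists local scope first, then project scope, then user scope, so a user-scope definition does not override a project-scope server of the same name. A developer who needs a custom version of a project tool for local testing should define it at local scope (private to that developer and that project), and should confirm the precedence order against the current documentation.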
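For orientation, a project-scope file is a JSON object with a top-level mcpServers map. The sketch below builds one in Python and prints the JSON that would be committed as .mcp.json. The server name internal-api, the package name, and the environment variable are all hypothetical placeholders, and the ${VAR} reference assumes your client expands environment variables in this file.

```python
import json

# Hypothetical project-scope configuration; every name below is a placeholder.
project_config = {
    "mcpServers": {
        "internal-api": {
            "command": "npx",
            "args": ["-y", "@example/internal-api-mcp"],
            # Reference a variable instead of committing a secret to the repo.
            "env": {"INTERNAL_API_TOKEN": "${INTERNAL_API_TOKEN}"},
        }
    }
}

def server_names(config: dict) -> list:
    """List the server names a config file would expose."""
    return sorted(config.get("mcpServers", {}))


mcp_json_text = json.dumps(project_config, indent=2)
print(mcp_json_text)
print(server_names(project_config))
```

Anything that should stay on one developer's machine belongs in that developer's own configuration instead of in this committed file.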
What the exam actually tests in Domain 2
Task Statements 2.1-2.4 map to concrete scenarios:
- 2.1 (Tool naming and descriptions): Given a scenario with selection failures, identify whether the fix is renaming, adding boundary language, or splitting into separate tools.
- 2.2 (Tool set scoping): Given an agent with too many tools, design the correct subagent decomposition.
- 2.3 (tool_choice): Select the correct tool_choice value for a given workflow step.
- 2.4 (Error handling): Design a structured error response that enables coordinator recovery, or identify why a generic error response fails.
The customer support resolution agent scenario is the primary frame for Domain 2: tools like get_customer, lookup_order, process_refund, and escalate_to_human have clear role boundaries but require careful description writing to prevent misrouting (for example, calling process_refund when lookup_order was the correct first step).
Key takeaways
- Tool descriptions are the selection mechanism. Vague descriptions produce vague selection.
- Add explicit boundary language: what this tool does NOT handle, and which tool handles those cases instead.
- Name tools with verb-domain-specificity to prevent name-level ambiguity.
- Scope tool sets to 4-5 tools per agent role. Decompose multi-purpose agents into role-specific subagents.
- tool_choice: auto for open steps, any when a tool call is required but the specific tool is flexible, forced for specific tool requirements.
- Use isError: true with structured error fields (errorCategory, isRetryable, suggestedAlternative) to enable intelligent recovery.
- Project scope for team tools, user scope for developer-local tools.
This is a core topic in Domain 2 of the Claude Certified Architect - Foundations exam. Practice CCA scenario questions to test your ability to identify tool design failures in realistic agent architectures.
Exam domain weights and task statements are based on the Claude Certified Architect - Foundations Certification Exam Guide published by Anthropic Academy. Verify current content on Anthropic Academy before your exam date.