Best Practices

This page condenses what has worked — and what has failed — in real Agent Studio deployments. Each practice links to the recipe or reference page that carries the evidence, and the guidance applies to every delivery channel: the web UI, Slack, MCP clients, and the REST API.

| # | Practice | The receipts |
|---|----------|--------------|
| 1 | Start with the defaults: clone the closest built-in agent and use built-in tools before building custom | The typical build path |
| 2 | Put semantics in catalog and data-product metadata, not in the prompt | A team analyzed 500 real user questions: metadata fixes outperformed prompt fixes every time |
| 3 | Scope tightly: one data product per agent, domain-scoped context | A domain-scoped agent reached 100% on its eval set; unscoped agents over the whole catalog rarely get close. Multi-data-product agents answer noticeably less reliably |
| 4 | Don’t ship without a documented evaluation run: at least 10 human-validated pairs, target >90% accuracy, re-run on every metadata or prompt change | Evaluations, proven in the embedded chat recipe |
| 5 | Keep a human approval step in any agent that writes to the catalog | An early bulk-update run applied every change without review because the prompt didn’t require confirmation |
| 6 | Deliver into the channel your users already work in | Integrations overview; the Copilot/Teams recipe exists because business users live in Teams, not the catalog |
| 7 | Understand metering before scaling: most agents make 2–3 tool calls per request | Tool calls and metering and the usage page |
| 8 | Publish deliberately, and curate what each client sees with custom MCP servers | Fixed parameter bindings pin tools to a domain or data product, server-side |
| 9 | Match the agent pattern to the job | Query-style agents for data questions, context search for metadata questions, custom agents with write tools for curation |
| 10 | Use the Python SDK only when you need to build in your own environment; most teams need the hosted path | Hosted vs. local at a glance |
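
Practice 4 lends itself to an automated release gate. The sketch below is one possible shape for it, assuming evaluation results have been exported as a list of human-validated pairs with a pass/fail verdict. `EvalResult` and `evaluation_gate` are hypothetical names used for illustration; they are not part of Agent Studio or its SDK.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """One human-validated question/answer pair and whether the agent got it right."""
    question: str
    passed: bool


MIN_PAIRS = 10          # practice 4: at least 10 human-validated pairs
TARGET_ACCURACY = 0.90  # practice 4: accuracy must exceed 90%


def evaluation_gate(results: list[EvalResult]) -> tuple[bool, str]:
    """Return (ok, reason). Hypothetical helper, not an SDK function."""
    if len(results) < MIN_PAIRS:
        return False, f"only {len(results)} pairs; need at least {MIN_PAIRS}"
    accuracy = sum(r.passed for r in results) / len(results)
    if accuracy <= TARGET_ACCURACY:
        return False, f"accuracy {accuracy:.0%} does not exceed {TARGET_ACCURACY:.0%}"
    return True, f"accuracy {accuracy:.0%} on {len(results)} pairs"
```

Running a check like this in the same pipeline that changes metadata or prompts is one way to make the "re-run on every change" rule hard to skip.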

Every agent interaction follows the same loop, whatever the channel: the request arrives, the LLM reasons over its prompt and tool list, calls tools, and synthesizes a governed response. The patterns differ in how much orchestration sits on top of that loop.

Single agent, narrow task. Point lookups: a business term, a data product definition, a certification status. One or two tool calls.

Single agent, multi-tool. The most common production pattern: search the catalog, fetch the data product, run a query, summarize. Most agents make 2–3 tool calls per request; the highest observed for a single request is 10. Write the prompt to tell the agent when to stop calling tools and answer. Production-proven in the embedded chat recipe at more than 150,000 calls a month.

Multi-agent hierarchical. A parent agent routes sub-tasks to specialized child agents published as tools. Modular and reusable, but every child agent’s tool calls are metered individually, so action counts and latency stack with each level. Estimate before you build deep hierarchies.

Agentic pipeline. A flow or external orchestrator triggers agents on a schedule or event. Production-proven in the scheduled alerts recipe, where parameterized flows email dozens of suppliers weekly. For pipelines that write to the catalog, practice 5 is non-negotiable: gate the writes on review.
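
A review gate for catalog writes can be as simple as separating the agent's proposals from the code that applies them. The following is a minimal sketch under assumed names: `ProposedChange`, `approve`, and `write` are hypothetical stand-ins for however your pipeline represents changes, collects a reviewer's decision, and calls the catalog.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProposedChange:
    """A catalog update the agent wants to make (hypothetical shape)."""
    object_id: str
    field_name: str
    new_value: str


def apply_with_review(
    changes: list[ProposedChange],
    approve: Callable[[ProposedChange], bool],
    write: Callable[[ProposedChange], None],
) -> tuple[list[ProposedChange], list[ProposedChange]]:
    """Write only the changes a reviewer approves; return (applied, held)."""
    applied: list[ProposedChange] = []
    held: list[ProposedChange] = []
    for change in changes:
        if approve(change):
            write(change)
            applied.append(change)
        else:
            held.append(change)  # never written; left for follow-up
    return applied, held
```

Because the write function is only reachable through the approval callback, a prompt that forgets to ask for confirmation cannot by itself cause a bulk update.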

| Pattern | Best for | Watch out for |
|---------|----------|---------------|
| Single agent, narrow task | Lookups and point queries | Not suitable for multi-step reasoning |
| Single agent, multi-tool | Most production use cases | Tool-call count: monitor usage |
| Multi-agent hierarchical | Complex, modular workflows | Metered per child agent; latency and actions stack per level |
| Agentic pipeline | Automated curation, monitoring, scheduled tasks | Review gates for catalog writes; failure notification is email-only |

Agent design is a cost decision. Each metered action — an LLM call or an Alation base-tool call — is 0.25 ACU, so a workflow’s cost is roughly:

actions per run × 0.25 ACU × runs

Three things multiply that total, so account for them when you design:

  • Agent depth — a multi-agent hierarchy meters every child agent’s calls; actions stack with each level.
  • Fan-out and frequency — a scheduled flow run per segment costs segments × runs per period.
  • Per-row loops — an agent that acts on each object in a set multiplies by the row count.

Two things keep cost down: scope each agent to the smallest tool set it needs (fewer stray calls), and remember that custom HTTP and SMTP tools are not metered — only the Alation base-tool calls underneath them are. Always confirm real spend on the Usage page rather than trusting the estimate.
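
The formula and the three multipliers above can be sketched as a rough planning calculator. The 0.25 ACU rate comes from the text; the `AgentNode` structure, the function names, and all the numbers in the example are illustrative assumptions, and the sketch assumes each child agent is invoked once per run and that only metered actions are counted.

```python
from dataclasses import dataclass, field

ACU_PER_ACTION = 0.25  # from the text: each metered action costs 0.25 ACU


@dataclass
class AgentNode:
    """Hypothetical description of one agent in a hierarchy."""
    name: str
    own_actions: int  # metered actions this agent makes itself per run
    children: list["AgentNode"] = field(default_factory=list)


def actions_per_run(agent: AgentNode) -> int:
    """Agent depth: add up every child's actions, assuming one call per child."""
    return agent.own_actions + sum(actions_per_run(c) for c in agent.children)


def estimate_acu(actions: int, runs: int, segments: int = 1, rows: int = 1) -> float:
    """actions per run x 0.25 ACU x runs, scaled by fan-out and per-row loops."""
    return actions * ACU_PER_ACTION * runs * segments * rows


# Illustrative numbers only, not measurements.
router = AgentNode("router", 2, [AgentNode("query", 3), AgentNode("lookup", 2)])
actions = actions_per_run(router)                   # 2 + 3 + 2 = 7
print(estimate_acu(actions, runs=4, segments=40))   # 7 * 0.25 * 4 * 40 = 280.0
```

Treat the result as a design-time bound to compare alternatives, not as a prediction of the bill.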

  • Define fallback behavior in the prompt: what the agent should do when a tool returns nothing. The scheduled alerts recipe is the canonical example — “If none, reply NO ALERT” — instead of letting the agent invent an answer.
  • Use trust signals in the agent’s decision logic: the Copilot scoping recipe excludes deprecated objects with a flag_types filter so the agent can’t recommend them.
  • Put retry logic in the orchestration layer (n8n, the SDK), not in the prompt — the LLM should not make retry decisions.
  • Every tool call is logged; review interaction logs and capture traces in external pipelines.
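
Two of the points above, retries in the orchestration layer and an explicit fallback reply, can be sketched together. This is a minimal illustration, not SDK code: `call_agent` is a hypothetical zero-argument callable that wraps whatever client you use to invoke the agent, and the `NO ALERT` sentinel is the one quoted from the scheduled alerts recipe.

```python
import time
from typing import Callable

NO_ALERT = "NO ALERT"  # fallback reply the prompt instructs the agent to return


def call_with_retry(
    call_agent: Callable[[], str],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Retry a failing agent call with exponential backoff, outside the prompt."""
    for attempt in range(attempts):
        try:
            return call_agent()
        except Exception:  # in real code, narrow this to transient error types
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the pipeline
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


def should_notify(reply: str) -> bool:
    """Skip downstream delivery when the agent returned the fallback sentinel."""
    return reply.strip().upper() != NO_ALERT
```

Keeping the retry count and delays in code makes them deterministic and reviewable, which is the reason for not leaving retry decisions to the LLM.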

| Area | Confirm before production |
|------|---------------------------|
| Knowledge | Underlying catalog data is accurate, certified, and in the right domain |
| Evaluation | A documented evaluation run passed with >90% accuracy on representative queries |
| Tools | The agent carries only the tools its task requires; if you cloned the agent, re-check the copied tool bindings |
| Prompts | Scope, constraints, and error behavior are explicit, including the approval step for catalog writes |
| Access | The agent inherits Alation access controls; no permission bypass |
| Metering | Expected actions per request estimated against the 2–3 call baseline and approved |
| Logging | Tool-call logs retained per your policy |
| Curation writes | A review or approval step gates catalog updates; agents without one apply every change they decide on |