What Is an AI Agent? A Working Definition for 2026

What is an AI agent in practice, not theory? A working definition that separates them from chatbots and workflows, with the architecture we actually use.

What Is an AI Agent? A Working Definition for 2026

Updated May 2026. Rewritten with the architecture we now use to ship AI agents into production, the boundary that separates an agent from a workflow, and the cost numbers that actually hold up.

“What is an AI agent” is the most-asked question we get from new clients, and it is the one with the most muddled answers online. The marketing definition is broad enough to include every chatbot ever shipped. The academic definition is narrow enough to exclude most things people are buying. The practitioner definition, which is the one that matters when you are deciding whether to build one, sits in between.

We are a small AI and automation consultancy based in Brisbane and we have built and run AI agents in production for clients in healthcare, recruitment, and finance. This is the working definition of an AI agent we use, the architecture under it, where the boundary sits, and what we have learned about cost and reliability from shipping these things.

If you are looking for the build walkthrough rather than the conceptual answer, our Claude agent guide and Python agent stack guide cover the implementation side. This piece sticks to the question of what an AI agent actually is.


What Is an AI Agent: The Practitioner Definition

The shortest working definition we use: an AI agent is a system that uses a language model to choose what to do next, calls tools to take action, observes the result, and loops until a goal is met or a stopping condition fires.

That definition has four moving parts and you need all four for something to count as an AI agent:

  • A language model in the decision loop. Not just generating text at the end. Actually choosing the next step.
  • Tools the model can call. Functions, APIs, retrieval, code execution. The hands.
  • An observation step. The model sees what the tool returned and decides the next move.
  • A loop. Multiple iterations, not a single pre-defined chain of three steps.

A chatbot that answers questions from a knowledge base does not have the loop or the tools. It is a chatbot. A workflow that runs “extract data, call API, format response” in a fixed order does not have the loop either, even if it uses an LLM at one step. That is an LLM-augmented workflow. Neither is an AI agent in any sense that matters for how you build or operate it.

What Is an AI Agent Versus a Workflow?

This is the most useful distinction we draw with new clients, because it changes what gets built and how much it costs to run.

A workflow is a pre-defined chain of steps. The chain is the same every time. Even if one step uses an LLM to extract data or rewrite a paragraph, the structure is fixed by the person who designed it. Workflows are cheaper, easier to test, and easier to reason about. They cover roughly 70 percent of the “AI projects” being shipped right now.

An AI agent is what you reach for when the structure cannot be fixed in advance. The model has to choose which tool to call, in what order, until it figures out how to satisfy the goal. The model decides whether to keep going or stop. That flexibility is expensive: more tokens, more latency, harder to test, and a meaningful risk of the model wandering off into a dead end.

If your problem can be solved by a workflow, build a workflow. Most can. Anthropic and OpenAI both say this publicly and we agree. Agents are for the cases where the input shape varies enough that no fixed sequence works.

The Architecture of an AI Agent

Under the hood, every AI agent we have shipped looks structurally similar. Different models, different tools, same shape.

The Model Is the Decision Engine

In 2026 the practical options are Claude Sonnet 4.5, Claude Opus 4.5, GPT-4.1, or one of the open-source models when you need self-hosting. The model receives a system prompt that defines its role, the current state of the conversation, and the available tools. It outputs either a tool call (with arguments) or a final answer.

Tools Are the Action Layer

Tools are functions the model can call. Each one has a name, a description, and a parameter schema. A document-processing agent might have tools for “fetch document”, “extract fields”, “validate against rules”, “write to system of record”. The model picks which one to call based on what the current step needs.

The Loop Runs the Agent

A simple Python loop calls the model, parses the response, calls the requested tool, appends the result to the conversation, calls the model again. It keeps going until the model returns a final answer or hits a step budget. The loop is maybe 60 lines of code. The agent’s “intelligence” lives in the model and the tool design, not the loop.

The Stopping Conditions Are Everything

Without good stopping conditions, agents wander. We cap them at three things: a max steps count (usually 15 to 30), a max token budget per run, and a guardrail check between iterations that the agent is still on task. The guardrail is the difference between an agent that finishes in 8 steps and one that costs $20 USD because it spent an hour talking to itself.

What an AI Agent Actually Does in Production

The two AI agents we run with reference cases on the site illustrate the difference between agent-shaped and workflow-shaped problems.

The first is a resume formatting agent for a medical recruitment client. Different incoming CVs, different structures, different missing fields. The agent reads each one, decides what is missing, calls the right tool to fill the gap (search, ask user, infer from context), and produces a normalised output. We tried this as a workflow first. The workflow worked for 60 percent of inputs. The agent gets to 92 percent.

The second is a self-hosted RAG agent for legislation queries. Users ask questions that require pulling from multiple documents, comparing clauses, and reasoning across them. The agent searches, retrieves, reads, decides if it has enough information, searches again if not. A static “search-then-answer” workflow would miss the cases where the answer requires synthesising across three documents.

The shape of an agent-worthy problem: input varies, the path to the answer varies, and a deterministic chain would miss too much.

What Is an AI Agent Not?

Three things people call AI agents that are not, in our working definition:

  • A RAG-powered chatbot. Retrieves and answers. No loop, no tool use beyond retrieval, no goal-directed planning. Useful, but a chatbot.
  • A scheduled workflow with an AI step. Runs at 9am every day, calls an LLM to summarise something, writes the output to a database. That is a cron job with a smart step. Not an agent.
  • A prompt template called “Sales Agent”. A persona prompt that wraps an LLM call. The model produces sales copy. Useful, but it is a prompt, not an agent.

None of these are bad. They are often the right thing to build. They are just not AI agents, and treating them like agents leads to over-engineering. If a chatbot solves the problem, ship a chatbot. The category matters only because agents are more expensive to build, run, and maintain.

How Much Does an AI Agent Cost to Run?

The honest numbers from our own production agents:

  • Per-run cost. A typical agent run on Claude Sonnet 4.5 with 8 to 15 tool calls is between $0.04 and $0.20 USD in API costs. Heavier reasoning tasks on Opus 4.5 can hit $0.50 to $2 per run.
  • Monthly running cost. A document-processing agent handling 500 documents per day works out to $600 to $3,000 AUD per month in model costs, depending on which model you pick.
  • Build cost. A scoped agent build is 4 to 10 weeks of work. For a small consultancy doing this end-to-end, expect $30,000 to $120,000 AUD all-in for the first production version.
  • Maintenance cost. About 10 to 20 percent of build cost per year, mostly model upgrades, tool maintenance, and evaluation regression fixes when source systems change.

If the unit economics do not work at those numbers, the case for the agent does not work either. The conversation worth having is not “can we make the agent cheaper” but “is this problem actually agent-shaped”.

Things That Have Broken for Us Running AI Agents

Real production gotchas, not theoretical ones:

  • Tool descriptions matter more than tool implementations. The model picks tools based on the docstring. We rewrote one tool description three times before the agent stopped calling it inappropriately. The code never changed.
  • Models love to retry the same failing tool. Without an explicit “you already tried that, do something else” in the loop, models will call the same broken endpoint five times. We now include the recent history in the system prompt with a “consider what has already failed” note.
  • Long contexts get worse, not better. Past about 30 to 40 tool calls in a single run, the model starts losing track of what it has already done. We now design agents to break long tasks into discrete sub-goals with fresh context.
  • Evaluation never really stops. The single biggest debugging session we have lost time on across three years of running agents was discovering that a model upgrade subtly changed how the agent interpreted a tool result. We now run an eval suite on every new model release before switching.

When Not to Build an AI Agent

The honest list of cases where the answer is “do not build an agent”:

  • The process is deterministic. If you can write the steps as a list and they are the same every time, build a workflow.
  • The input variance is low. If 95 percent of inputs look the same, a workflow covers 95 percent and a small exception handler covers the rest. An agent does not earn the cost.
  • The cost of a wrong answer is high. Financial decisions, legal decisions, anything that cannot be reviewed before it ships. Use a workflow with explicit checkpoints, not an agent that figures it out on the fly.
  • Latency under one second matters. Agents are loops. Each step is a model call. Real-time interactive use cases need a different architecture.

When agents are right, they are very right. We have shipped agents that handled volume previously requiring two full-time staff. We have also turned down builds where the right answer was three ifs and a switch statement. The category matters.

AI Agents in Australia: The Regulatory Bit

For Australian businesses building agents that touch regulated data, the relevant frameworks are the Australian Privacy Principles (APP), APRA’s CPS 234 for financial services, and the My Health Records Act for health data. Practical implications: agents handling personal data must log what was accessed and by which call, the model provider’s data handling agreement needs to be reviewed for cross-border transfers, and you need human review checkpoints for any consequential decision.

The Anthropic and OpenAI data handling agreements both allow zero retention in their enterprise tiers, which is what we use for any client work that touches personal data. The AWS Bedrock and Azure AI Foundry options provide region locking to Sydney (ap-southeast-2) and other AU regions for additional data residency.

How to Start With an AI Agent Project

The fastest way to find out whether your problem needs an AI agent is to try to write the workflow first. Get a competent practitioner to sketch out the steps. If the sketch covers most cases cleanly, build the workflow. If the sketch keeps spawning “and then it depends on…”, you have an agent-shaped problem.

The second-fastest way is to run a 2-week build with a small team. One engineer, one domain expert, scoped to a single goal with a tight evaluation set. Either you have a working prototype at the end or you have learned that the problem is not what you thought. Both outcomes are useful. Our AI agent development service is structured around this kind of scoped exploration. If you want to talk through whether a particular process is agent-shaped, book a call.

Frequently Asked Questions About AI Agents

What is an AI agent in plain English?

An AI agent is a system that uses a language model to decide what to do next, calls tools to take action, looks at the result, and loops until the goal is met. The model is making decisions in a loop, not just generating text once.

What is the difference between an AI agent and a chatbot?

A chatbot answers questions. An AI agent takes action. Chatbots respond to a user. Agents pursue a goal, often without a user in the loop at all. The technical difference is that agents have tools they can call and a loop that runs multiple decisions in sequence.

What is the difference between an AI agent and automation?

Traditional automation follows a fixed script. An AI agent decides the script at runtime based on what it sees. For deterministic processes, traditional automation is cheaper and more reliable. For varied inputs and judgement calls, agents handle cases automation cannot reach.

How much does an AI agent cost?

Per run, $0.04 to $0.50 USD for typical workloads. Per month at production volume, $600 to $3,000 AUD. To build the first version, $30,000 to $120,000 AUD over 4 to 10 weeks. Maintenance is 10 to 20 percent of build cost per year.

Which language model is best for AI agents?

In 2026, Claude Sonnet 4.5 is our default for tool-using agents. Claude Opus 4.5 for harder reasoning tasks. GPT-4.1 is comparable for many use cases. For self-hosting requirements, Llama 3.3 70B or DeepSeek-V3 are credible options, with the caveat that tool-using performance still trails the frontier hosted models.

What are good use cases for AI agents?

Document processing where input shape varies (resumes, invoices, claims). Multi-system research tasks (pull from CRM, enrich from web, summarise). Customer support routing where the right action depends on intent and history. Coding assistants that have to read multiple files to answer one question. Anything where a fixed workflow would miss too many edge cases.

Are AI agents safe for regulated industries?

They can be, with the right architecture. Audit logging for every model call and tool call, zero-retention enterprise contracts with model providers, region locking for data residency, and human review for any consequential decision. We have shipped agents into healthcare and financial services builds in Australia, and they meet APP and APRA expectations when designed for it.

How long does it take to build an AI agent?

For a scoped first version, 4 to 6 weeks with a small team. For a production-grade build with evaluation suites, monitoring, and integration into existing systems, 8 to 12 weeks. The biggest variable is not the agent itself but the tools it has to call, since each tool is a real integration into a real system.


If you have a process that might be agent-shaped and you want to talk through whether it actually is, get in touch. We will tell you honestly when the right answer is a workflow instead.

Ready to streamline your operations?

Get in touch for a free consultation to see how we can streamline your operations and increase your productivity.