
What, exactly, is an 'AI Agent'? Here's a litmus test

andy99

Anthropic has a definition:

  Workflows are systems where LLMs and tools are orchestrated through predefined code paths.
  Agents, on the other hand, are systems where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks.
https://www.anthropic.com/engineering/building-effective-age...

While I know it's a marketing term, I think a good distinction is that agents have a loop in the execution graph and can choose, on each pass, whether to loop again or exit. Workflows are chained LLM calls where the LLM has no "choice" over the control flow.
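A minimal sketch of that difference (call_llm and execute_step are hypothetical stand-ins, not anything from the Anthropic post):

  def call_llm(prompt: str) -> str:
      """Hypothetical stand-in for a chat-completion API call."""
      raise NotImplementedError

  def execute_step(step: str) -> str:
      """Hypothetical executor for whatever step the model names."""
      raise NotImplementedError

  def workflow(doc: str) -> str:
      # Workflow: a fixed chain. The code owns the control flow;
      # the LLM is just a function applied at each node.
      summary = call_llm(f"Summarize: {doc}")
      return call_llm(f"Turn this summary into a report: {summary}")

  def agent(task: str, max_steps: int = 10) -> str:
      # Agent: a loop in the execution graph. The LLM decides on
      # each pass whether to act again or stop.
      state = task
      for _ in range(max_steps):
          decision = call_llm(f"State: {state}. Reply DONE, or NEXT: <step>")
          if decision.startswith("DONE"):
              break
          state = execute_step(decision.removeprefix("NEXT:").strip())
      return state

The workflow's graph is a straight line; the agent's graph has an edge pointing back into itself, and the model decides whether to take it.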

float4

I should have read this 12h ago! This afternoon I tried to create my first simple agent using LangChain. My aim was to repeatedly run a specific Python analysis function, performing a binary search to find the optimal result, and then compile the results into a markdown report and export it as a PDF.

However, I now realize that most of these steps don't require AI at all, let alone agents. I wrote the full algorithm (including the binary search!) in natural language for the LLM. And although it sometimes worked, the model often misunderstood and produced random errors out of the blue.

This is not what agents are for. The problem didn't require any agentic behavior; it was just a fixed workflow with one single AI step (generating the markdown report text).
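For the record, here's roughly what the fixed version looks like with the deterministic parts kept in plain code. analyze, the bounds, and the tolerance are made-up placeholders; only the report step touches the LLM:

  def call_llm(prompt: str) -> str:
      """Hypothetical LLM call."""
      raise NotImplementedError

  def analyze(x: float) -> float:
      """Placeholder for the domain-specific analysis function."""
      raise NotImplementedError

  def find_optimum(lo: float, hi: float, target: float, tol: float = 1e-6) -> float:
      # Plain-code binary search: deterministic, no LLM anywhere.
      # Assumes analyze() is monotonic on [lo, hi].
      while hi - lo > tol:
          mid = (lo + hi) / 2
          if analyze(mid) < target:
              lo = mid
          else:
              hi = mid
      return (lo + hi) / 2

  def run(lo: float, hi: float, target: float) -> str:
      best = find_optimum(lo, hi, target)
      # The single AI step: turning numbers into a markdown report.
      return call_llm(f"Write a markdown report: optimum={best}, target={target}")

The PDF export is deterministic too, so it stays as plain code as well.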

Oh well, nothing wrong with learning the hard way.

Terr_

That reminds me of another recent submission that seems relevant:

"Don’t let an LLM make decisions or execute business logic"

319 points, 168 comments, 1 day ago - https://news.ycombinator.com/item?id=43542259

DebtDeflation

And the LangChain definition is a further simplification of Anthropic's:

"An AI agent is a system that uses an LLM to decide the control flow of an application."

What gets left unsaid is whether current SOTA LLMs actually have the reasoning and planning capabilities to do this reliably. I would argue that, except for code-debugging tasks and simple research tasks (iterative Googling in a web browser, then formatting the results into a report), they do not. That may change in six months, but right now the hype has gotten ahead of the capability.

musicale

Reasoning, planning, and reliability do not seem to be strong features of current LLMs.

meta_ai_x

Agentic vs. workflow boils down to the age-old computing distinction between declarative and imperative.

kodablah

> Workflows are systems where LLMs and tools are orchestrated through predefined code paths

This definition keeps coming up, but it isn't accurate for workflows. Modern workflow systems are very dynamic in nature and direct their own process and tool usage (e.g. Temporal; disclaimer: my employer). You can even write workflows that eval code if you want, though for most people that's a step of flexibility too far to give to an LLM. Many workflows have LLMs tell them what to do next, sometimes via a bounded tool list, sometimes open-ended (e.g. process execution or code eval). There is no limit here. A better definition of a workflow is that it durably orchestrates things, not that the sequence or code is predefined.
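A generic sketch of the bounded-tool-list pattern (plain Python with hypothetical stubs, not Temporal's actual API; in a real durable workflow each tool call would be a journaled, retried activity):

  def call_llm(prompt: str) -> str:
      """Hypothetical LLM call."""
      raise NotImplementedError

  # The workflow author whitelists the tools; the LLM only picks among them.
  TOOLS = {
      "fetch_invoice": lambda ref: ...,  # hypothetical actions
      "approve": lambda ref: ...,
      "escalate": lambda ref: ...,
  }

  def dynamic_workflow(ref: str, max_steps: int = 5) -> None:
      for _ in range(max_steps):
          choice = call_llm(
              f"Next action for {ref}? Answer with one of {sorted(TOOLS)} or STOP"
          ).strip()
          if choice not in TOOLS:
              break  # the orchestrator, not the model, has the final say
          TOOLS[choice](ref)

The sequence isn't predefined, but the orchestration, durability, and bounds are still the workflow's job.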

So by a more accepted/modern definition of "workflow", agents are workflows that just happen to be more dynamic than some more rigid workflows.

manojlds

But they muddle it by saying all of them are "agentic systems".

iknownthing

Seems like more of a special case than a different thing altogether

marxplank

the LLM does have the ability to send garbage output in protest

ajcp

> Our litmus test for AI agents: Does the AI system perform actions under its own identity?

So service accounts are agents? This seems pretty thin.

In AI, an "agent" is simply any code/workflow/automation that uses an LLM to respond to broadly defined external/environmental stimuli and decide how to react, given broadly defined motivations and/or objectives.

Not agent: document comes in -> if it's an invoice, return key-value pairs to make an API call with.

Agent: document comes in -> "You're a finance professional: decide what to do with this document. Here are the tools/actions available to you: X, Y, Z."

Both use AI and can achieve the same thing, but one is "agentic" while the other is deterministic.
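In code, the contrast looks something like this (call_llm and post_to_billing_api are hypothetical stand-ins):

  def call_llm(prompt: str) -> str:
      """Hypothetical LLM call."""
      raise NotImplementedError

  def post_to_billing_api(fields: str) -> None:
      """Hypothetical downstream API client."""
      raise NotImplementedError

  def not_agent(document: str) -> None:
      # Deterministic: the code owns the control flow; the LLM is
      # used purely as an extraction function.
      fields = call_llm(f"If this is an invoice, return key-value pairs as JSON: {document}")
      post_to_billing_api(fields)  # fixed next step, chosen by code

  def agent(document: str) -> str:
      # Agentic: the model gets a role, the document, and a tool list,
      # and decides what happens next.
      return call_llm(
          "You're a finance professional: decide what to do with this "
          f"document. Tools available to you: X, Y, Z.\n\n{document}"
      )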

bhouston

"Does the AI system perform actions under its own identity?"

I don't agree with this definition.

I view an agent as having the ability to affect the world, then sense how it affected the world, and then choose to take additional actions. There is an act, sense, re-act feedback loop that does not require a human to mediate it. That, to me, is an agent.

"But why isn't, say, ChatGPT an agent?"

ChatGPT (the web app where you send it chats and it responds) by default doesn't act on the world and sense the changes it is making. But once you take the GPT o4 model and hook it up with tool calling that affects the world in a feedback loop, it is definitely an agent.

I believe this definition generally aligns with most people's definitions as well.

I wrote an essay about building an agentic coder and it really is when you establish the tool-calling feedback loop that things move from an assistant to an agent: https://benhouston3d.com/blog/building-an-agentic-code-from-...
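The shape of that loop, sketched with a hypothetical call_llm that returns either a final answer or a tool request (a real agent would sandbox the command execution rather than shelling out directly):

  import subprocess

  def call_llm(history: list[dict]) -> dict:
      """Hypothetical LLM call: returns a final answer or a tool request."""
      raise NotImplementedError

  def agent_loop(goal: str, max_steps: int = 20) -> str:
      history = [{"role": "user", "content": goal}]
      for _ in range(max_steps):
          msg = call_llm(history)
          history.append(msg)
          if "tool" not in msg:
              return msg["content"]  # no further action requested: done
          # Act: run the requested command against the real world.
          result = subprocess.run(msg["tool"], shell=True, capture_output=True, text=True)
          # Sense: feed the observed effect back to the model, which
          # then re-acts on the next iteration.
          history.append({"role": "tool", "content": result.stdout + result.stderr})
      return "step limit reached"

Remove the loop and you have an assistant; keep it and you have an agent.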

wongarsu

I agree with you. People really overcomplicate this.

From Wiktionary:

""" Agent (plural agents)

- One who exerts power, or has the power to act.

- One who acts for, or in the place of, another (the principal), by that person's authority; someone entrusted to act on behalf of or in behalf of another, such as to transact business for them.

- [various more specific definitions for real estate, biology, etc]

From Latin agēns, present active participle of agere (“to drive, lead, conduct, manage, perform, do”). """

An agent is simply someone or something that does something, usually for someone else. An AI agent is thus an AI that does something, usually for someone else. An AI assistant could be an AI agent, or it could be a glorified chatbot that merely offers you spoken or written words, possibly after reacting to real-world information (but not itself modifying it).

ninininino

The problem with just using that definition is that drawing the line of what it means to have "the power to act" or to "act for, or in the place of, another" is subjective.

Most would agree that a system or automation that could receive the instruction "do my entire job for me" and proceed to physically embody a bio-clone of me, walk to my office, impersonate me 40 hours a week, and keep my paycheck coming in while I play MMOs would satisfy the definition.

Most would also agree that a computer terminal receiving the command "git push origin main" doesn't qualify as an AI "agent". But in a very loose sense it does do the thing your definition says: it does some git work for me, on my behalf. So we'd argue about what exactly an AI is. Are we just using it as a stand-in for ML-model-enabled software agents now? Or for LLM/multi-modal-transformer-enabled models and systems?

Now pick 1000 points in between those two ends of the spectrum and you're going to find there is no single cut-off where everyone sees the transition from "is an AI agent" to "is not an AI agent".

Is an LLM that can take my request to find me a showing of the new Transformers movie next Thursday night, buy the ticket, and add it to my calendar an AI agent? Or is that just voice-activated, human-language-as-input Zapier/IFTTT? Is that just a regular ChatGPT prompt with an integration to my Fandango account and GCal?

Or would it need to monitor movie releases and, as new movies come out, proactively ask me whether I want it to go ahead, find time in my GCal, and buy a ticket?

Or does it need to be a software agent that is run by a movie studio and proactively posts content online to try to spread marketing for that movie ahead of its release?

Does it need to be a long-running software process instantiated (birthed) to a docker pod, given a single goal ("make the Transformers movie more profitable, focusing on marketing"), and then doing all the rest of the planning, execution, etc. itself?

Defining that cut-off is the hard part, or rather finding a definition that gives us a useful way to determine it. I'd argue your dictionary definition doesn't really do that.

wongarsu

It all comes down to your definition of "act", which maybe splits into at least two criteria: the "trigger" (is running "git push" every time I ask enough, or does it have to decide to do that on its own, for example by monitoring my workflow?) and the "action" (is running "git push" enough, or does it have to be able to order movie tickets?).

On the action my view is fairly lax. Anything that modifies the world counts, which does include a git push run on my computer. Tasks aren't less real just because they have a convenient command line interface.

The trigger is a bit trickier. We expect the agent to have some form of decision-making process (or at least something that looks and feels like one, to avoid the usual discussion about LLMs). If a human doesn't make decisions, they are a tool, not an agent. Same rule for AI agents. But defining the cut-off point here is indeed hard, and we will never agree on one. I'm not at all opposed to deciding that IFTTT is an agent, and that slapping some AI on it makes it an AI agent.

pixl97

The spectrum of behaviors is why we should probably have an agent classification system, where an agent falls into a particular category depending on its abilities.

Tteriffic

You're right, the "own identity" part is the problem. You can act with your own agency, or you can act as an agent for someone else.

AI today is only the second: we tell it what we want and it acts by our impetus, but what it does, or how it does it, is up to it.

simonw

Is ChatGPT with its Code Interpreter tool an agent?

bhouston

Good point. It is a bit of a grey area. It acts by executing the code, senses the results, and then makes changes. So in that sense it is an agent, but a little self-contained.

In a way, thinking itself is sort of agentic: the model is talking to itself, sensing that, and deciding what to think next...

janalsncm

It’s a bit amusing that so much ink has been spilled over what the definition of an “AI agent” is.

I don’t care. I care what your software can do. I don’t care if it’s called AI or machine learning or black magic. I care if it can accomplish a task reliably so that I don’t have to do it myself or pay someone to do it.

We had the same argument about three years ago when everyone started calling things "AI". Those products use LLMs to generate text, and usually they've outsourced all of the interesting technical work to a handful of providers backed by big Web 2.0 companies.

pixl97

>I don’t care.

The particular problem with poorly defined terms is that they cause a lot of spilled ink later on.

For example, the term AGI, or, even deeper, the definition of intelligence, gets debated again and again, with all the goalpost-dragging one expects these days.

Even breaking out simple categories can help, like:

Type I agent: Script driven, uses LLM for intelligent actions.

Type II agent: LLM driven, uses scripts and tools. May still need human input.

Type III agent: Builds a time machine to kill John Connor.

janalsncm

Now we’re talking. That’s a useful framework because it acknowledges there are gradations of independence. It’s not an all-or-nothing thing.

falcor84

It's fine that you as a user of these systems don't care, but nevertheless this is useful terminology for people looking to design such systems.

swyx

agree. a really good definition leads to a really good mental model, which leads to really good design. however, people can get into a penis measuring contest over definitions too, which is often not great

asdev

An AI agent, if autonomous, is a while loop that calls an LLM with some input and reacts by calling the LLM again with the processed output of the previous call.

TeMPOraL

> Does the AI system perform actions under its own identity? If it does, it’s an agent, and the audit logs will name the agent itself. And if it doesn’t – like most copilots or in-product assistants – it’s not.

God please no, let's not normalize this idea.

1. That's not really a good definition of an agent;

2. The only agents I care about are agents acting under my identity, for me, in my interest. You know, like browsers were supposed to; that's where the name of the "User-Agent" header comes from. In short: whether I'm accessing your service directly or using an agent (AI or otherwise) to do it for me is none of your business. Letting service providers differentiate based on that was a cardinal mistake of the early Web.

simonw

This Hacker News thread is already a great example of what always happens when this topic comes up: I count SIX definitions of agents in this thread already, each slightly different from the others, but each expressed with a confidence that suggests its author thinks their definition is clearly the correct one.

(The OP did that as well.)

1as

Hey, OP here. I’ve really enjoyed your work on AI agent definitions. So thank you for reading!

I don’t actually claim that ours is necessarily the correct answer for everyone – it’s our own. But I believe it is at least _an_ objective definition. Other definitions I’ve seen have been murky and subject to interpretation.

tedk-42

No such thing yet. Just marketing hype for a product people are creating.

It's currently a blanket term for gluing together a series of interactions via code and relying on LLMs for interpreting input or creating output data.

LLMs, no matter how clever, can't yet go off on their own and execute an API request (e.g. run something in a bash terminal like `curl -XPOST --data 'blah' <https://api-endpoint>`).

antonkar

The Big Bang is maximally agentic but has zero intelligence (it has maximal future potential intelligence though). Current definitions of agency are way too narrow and unphysical, so:

Agency is time-like, energy-like, choosing, changing shapes or geometry of the world and the agent itself. It’s GPU computations. Explosions have a lot of agency.

Intelligence is space-like, matter-like, the static geometric shape like an LLM (it’s basically a bunch of vectors). It’s a file. The static final 4D spacetime of our universe has a lot of intelligence but zero agency, because it’s static.

Maximal intelligence+agency is the static spacetime of multiverse (=max intelligence) which can change its shape in an instant (=max agency, shape-changing ability).

The same way we have E = mc^2, we have agency = intelligence * constant.

destedexplan

The correct definition is "who cares".

pixl97

No one really cared about the definition of AGI 40 years ago. Now a whole lot of people are debating it.

So all you're really saying is "I don't care". You're not saying "no one cares", because there are those who do.

Feathercrown

An AI agent has agency -- it can choose when to act.

tiffanyh

So an “on-behalf-of” service is not an “agent”?