
LLM Agents Are Simply Graph – Tutorial for Dummies

DebtDeflation

There are two competing definitions of agents being used in industry.

https://www.anthropic.com/engineering/building-effective-age...

"- Workflows are systems where LLMs and tools are orchestrated through predefined code paths.

- Agents, on the other hand, are systems where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks."

What Anthropic calls a "workflow" in the above definition is what most of the big enterprise software companies (Salesforce, ServiceNow, Workday, SAP, etc.) are building and calling AI Agents.

What Anthropic calls an "agent" in the above definition is what AI Researchers mean by the term. It's also something that mainly exists in their labs. Real world examples are fairly primitive right now, mainly stuff like Deep Research. That will change over time, but right now the hype far exceeds the reality.

kodablah

I think Anthropic's definition of workflows is inaccurate for modern uses of the term. Temporal, for instance (disclaimer: my employer), allows completely dynamic logic in agentic workflows, letting the LLM choose what to do next. It can even be very dynamic (e.g. eval some code), though you may want it to operate on a limited set of "tools" you make available.

The problem with all of these AI-specific workflow engines is that they are not durable: they are process-local, suffer crashes, cannot resume, don't have good visibility or distribution, etc. They often allow only limited orchestration instead of code freedom, support only one language, and so on.

MacsHeadroom

>The problem with all of these AI-specific workflow engines is that they are not durable: they are process-local, suffer crashes, cannot resume, don't have good visibility or distribution, etc. They often allow only limited orchestration instead of code freedom, support only one language, and so on.

So what's the solution?

kodablah

My biased answer, because I work at Temporal[0], is to use an existing workflow solution that solves all of these problems instead of reaching for a solution that doesn't help with any of these but happens to be AI-specific. Most agentic AI workflows are really just microservice orchestrations; the only "AI" involved is prompting an HTTP API that uses AI on its end. So use a good solution for "agentic X", whether that X is AI or any other orchestration need.
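
For a rough idea of the shape, here's a minimal sketch using Temporal's Python SDK (the call_llm activity and the prompts are illustrative stand-ins, not anything Temporal ships):

  from datetime import timedelta
  from temporalio import activity, workflow

  @activity.defn
  async def call_llm(prompt: str) -> str:
      # In reality this would POST to an LLM HTTP API; activities are
      # where non-deterministic work like network calls belongs.
      return f"(model output for: {prompt})"

  @workflow.defn
  class AgentWorkflow:
      @workflow.run
      async def run(self, task: str) -> str:
          # The workflow itself is durable: if the worker crashes between
          # these steps, Temporal replays history and resumes instead of
          # restarting from scratch.
          plan = await workflow.execute_activity(
              call_llm,
              f"Plan the steps for: {task}",
              start_to_close_timeout=timedelta(minutes=1),
          )
          return await workflow.execute_activity(
              call_llm,
              f"Execute this plan: {plan}",
              start_to_close_timeout=timedelta(minutes=5),
          )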

0 - https://temporal.io/solutions/ai

J_Shelby_J

It’s only a true agent if it chooses what to eat for breakfast everyday.

zh2408

"Workflow can be very dynamic" is a great summary!

infecto

What about workflows makes them not agents?

trash_cat

PocketFlow calls itself "agentic" due to its "agentic coding" paradigm (AI agents like Cursor building apps), but this is about development, not runtime behavior. At runtime, it's a workflow system. This stretches Anthropic's definition, where "agentic" implies dynamic LLM control during execution. I think this is where the misunderstanding stems from.

trash_cat

That was an LLM-generated response, and the part about agentic coding was pretty stupid. But it's still correct that PocketFlow does not align with Anthropic's definition of what an "Agent" is.

campbel

I follow Mr. Huang, read/watch his content, and also plan to use PocketFlow in some cases. That's a preamble, because I don't agree with this assessment. I think agents as nodes in a DAG workflow are _an_ implementation of an agentic system, but not the systems I most often interact with (e.g. Cursor, Claude + MCP).

Agentic systems can be simply the LLM + prompting + tools[1]. LLMs are more than capable (especially chain-of-thought models) of breaking down problems into steps, analyzing which tools to use, and then executing the steps in sequence. All of this is done with the model in the driver's seat.

I think the system described in the post needs a different name. It's a traditional workflow system with an agent operating on individual tasks. It's more rigid in that the workflow is set up ahead of time. Typical agentic systems are largely undefined or defined via prompting. For some use cases this rigidity is a feature.

[1] https://docs.anthropic.com/en/docs/build-with-claude/tool-us...

TeMPOraL

> Agentic systems can be simply the LLM + prompting + tools[1]. LLMs are more than capable (especially chain-of-thought models) of breaking down problems into steps, analyzing which tools to use, and then executing the steps in sequence. All of this is done with the model in the driver's seat.

Sort of, kind of. It's still a directed graph. Dynamically generated graph, but still a graph. Your prompted LLM is the decision/dispatch block. When the model decides to call a tool, that's going from the decision node to another node. The tool usually isn't another LLM call, but nothing stops it from being one.
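
To make it concrete, here's a hand-wavy sketch (llm and both tools are stand-ins, not any particular framework's API):

  # The prompted LLM is the decision/dispatch node; each tool call is
  # an edge out of it, and the loop closes the graph back on itself.
  def search_web(query: str) -> str:
      return f"(search results for {query})"   # stand-in tool

  def run_code(src: str) -> str:
      return f"(output of running {src})"      # stand-in tool

  TOOLS = {"search_web": search_web, "run_code": run_code}

  def agent(llm, task: str) -> str:
      context = [task]
      while True:
          # Decision node: the model picks which edge to follow next,
          # e.g. {"tool": "search_web", "args": {...}} or "finish".
          step = llm(context)
          if step["tool"] == "finish":
              return step["args"]["answer"]
          # Following the edge: run the chosen tool, feed the result back.
          context.append(TOOLS[step["tool"]](**step["args"]))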

The "traditional workflow" exists because even with best prompting, LLMs don't always stick to the expected plan. It's gotten better than it used to, so people are more willing to put the model in the driving seat. A fixed "ahead of time" workflow is still important for businesses powering products with LLMs, as they put up a facade of simplicity in front of the LLM agentic graph, and strongly prefer for it to have bounded runtime and costs.

(The other thing is that, in general, it's trickier to reason about code flow generated at runtime.)

infecto

Kind of. This explanation feels pedantic—like calling my morning routine a dynamically generated graph (which it technically is). Others have pointed this out, but the industry seems split. Workflows like those described in the article resemble Airflow jobs, making them, well, workflows.

Corporate buzzwords have co-opted "Agent" to describe workflows with an LLM in the loop. While these can be represented as graphs, I'm not convinced "Agent" is the right term, even if they exhibit agentic behavior. The key distinction is that workflows define specific rules and processes, whereas a true agent wouldn’t rely on a predetermined graph—it would simply be given a task in natural language.

You're right that reasoning about runtime is difficult for true agents due to their non-deterministic nature, but different groups are chipping away at the problem.

therealpygon

In my opinion, the split is between the people who want their tools to be called Agents so they can make more on AI hype, and the people who know better than to call a simple pre-defined software workflow an “agent”. It is harder to get large investments for “my program just calls an LLM” these days.

jwpapi

I have to agree this is a bit too simple to be anything of substance. That is not what agentic really means. This is basically implementing ChatGPT into Zapier.

When you work with agentic LLMs you should worry about prompt chaining, parallel execution, decision points, loops, and more of these complex decisions.
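
For example (a rough sketch; llm stands in for any async model call):

  import asyncio

  async def llm(prompt: str) -> str:
      return f"(answer to: {prompt})"   # stand-in for a real API call

  async def summarize(doc: str) -> str:
      # Prompt chaining: the first call's output feeds the second.
      outline = await llm(f"Outline this document: {doc}")
      # Parallel execution: expand every outline point concurrently.
      points = outline.split("\n")
      sections = await asyncio.gather(*(llm(f"Expand: {p}") for p in points))
      # Decision point: a final call judges the merged result.
      return await llm(f"Merge and sanity-check: {' '.join(sections)}")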

People who didn't know what was in the first article shouldn't use Pocketflow; they should go with n8n or even Zapier.

zh2408

I agree with what you said, except for the first sentence. The design of the graph is super important. Pocketflow is for those with a technical background.

zh2408

Let me clarify: this tutorial focuses on the technical internal implementation of agents (e.g., OpenAI Agents, Pydantic AI, etc.), rather than the UI/UX of the agent-based products that end users interact with.

yed

The newest generation of agents[0] aren't implemented this way; the model itself is trained to make decisions and a plan of action rather than follow an explicitly programmed workflow tree.

[0] https://openai.com/index/computer-using-agent/

zh2408

I think you’re referring to function calling: https://platform.openai.com/docs/guides/function-calling

This still returns a string. You need to explicitly program the branch to the right function. For example, check out how OpenAI Agents, released a week ago, rely on a workflow: https://github.com/openai/openai-agents-python/blob/48ff99bb...
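
Concretely, with the current OpenAI Python SDK, the dispatch looks roughly like this (the get_weather tool is a made-up example, and exact fields may differ by SDK version):

  import json
  from openai import OpenAI

  client = OpenAI()

  def get_weather(city: str) -> str:
      return f"Sunny in {city}"   # hypothetical tool

  resp = client.chat.completions.create(
      model="gpt-4o",
      messages=[{"role": "user", "content": "Weather in Paris?"}],
      tools=[{
          "type": "function",
          "function": {
              "name": "get_weather",
              "parameters": {
                  "type": "object",
                  "properties": {"city": {"type": "string"}},
              },
          },
      }],
  )

  # The model only *names* a function; your code does the branching.
  for call in resp.choices[0].message.tool_calls or []:
      if call.function.name == "get_weather":         # the explicit branch
          args = json.loads(call.function.arguments)  # args arrive as a JSON string
          print(get_weather(**args))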

campbel

That's what I am talking about as well. The low-level implementation of an agent isn't necessarily a rigid graph, and I'd actually argue it's explicitly not this.

zh2408

The current implementations of agents, e.g., OpenAI Agents released last week, are based on a graph (workflow): https://github.com/openai/openai-agents-python/blob/48ff99bb...

Not sure about the Cursor example you mentioned, as its agent is not open-sourced.

zh2408

Hey folks! I just posted a quick tutorial explaining how LLM agents (like OpenAI Agents, Pydantic AI, Manus AI, AutoGPT or PerplexityAI) are basically small graphs with loops and branches. For example:

OpenAI Agents (the workflow logic): https://github.com/openai/openai-agents-python/blob/48ff99bb...

Pydantic AI (organizes steps in a graph): https://github.com/pydantic/pydantic-ai/blob/4c0f384a0626299...

LangChain (demonstrates the loop structure): https://github.com/langchain-ai/langchain/blob/4d1d726e61ed5...

If all the hype has been confusing, this guide shows how they actually work under the hood, with simple examples. Check it out!

https://zacharyhuang.substack.com/p/llm-agent-internal-as-a-...
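
If you strip away the framework details, the whole idea fits in a toy sketch (no real library here, just plain Python):

  # An agent as a graph: nodes are steps, labeled edges are transitions,
  # and the edge from "act" back to "decide" is the loop.
  def decide(state):
      # Stand-in for an LLM call that picks the next action.
      return ("act" if "answer" not in state else "finish"), state

  def act(state):
      state["answer"] = "42"        # stand-in for a tool call
      return "done", state

  NODES = {"decide": decide, "act": act}
  EDGES = {
      ("decide", "act"): "act",
      ("decide", "finish"): None,   # terminal edge
      ("act", "done"): "decide",    # loop back to the decision node
  }

  def run_flow(start="decide"):
      node, state = start, {}
      while node:
          action, state = NODES[node](state)
          node = EDGES[(node, action)]
      return state

  print(run_flow())   # {'answer': '42'}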

godelski

Minor comment: do you mean "LLM Agents Are Simply Graphs". Personally, I'd drop the adjective to "LLM Agents are Graphs" as I think it sounds better, but the plural is needed.

zh2408

Oh, that’s embarrassing ... pardon my poor English, and thanks so much for pointing that out!

godelski

Simple mistake and easy to fix :)

pseudopersonal

Thanks for this write-up. It'll be inspiring my Ruby framework.

zh2408

Thank you!

bambax

This explanation and demo are super clear.

It would be interesting to dig deeper into the "thinking" part: how does an LLM know what it doesn't know / how to fight hallucinations in this context?

erichi

I like the minimalistic approach! How do you test such agents?

czbond

Thank you - a really interesting-looking read. Thanks for crafting the deep explanation, with links to actual internal code examples. Also, thanks for not putting it behind the Medium paywall.

zh2408

Thank you!!

_pdp_

It is hard to put a pin on this one because there are so many things wrong with this definition. There are agent frameworks that are not rebranded workflow tools, too. I don't think this article helps explain anything except putting the intended audience in the same box of mind we've been stuck in since the invention of programming - i.e. it does not help.

Forget about boxes and deterministic control and start thinking of error tolerance and recovery. That is what agents are all about.

ForTheKidz

> There are agent frameworks that are not rebranded workflow tools too.

To me "workflow" is just what agent means: the rules under which an automated action occurs. Without some central concept "agent" just a magic wand that does stuff that may or may not be what you want it to do. If we can't use state machines at all I'm just going to go out and say LLMs are a dead end. State machines are the bread and butter of reliable software.

> Forget about boxes and deterministic control and start thinking of error tolerance and recovery.

First you'd have to define what an error even is. Then you're just writing deterministic software again (a workflow), just with less confidence. Nice for stuff with low risk and low confidence to begin with (e.g. semantic analysis, whose errors tend to wash out in aggregate), but not for stuff acting on my behalf.

LLMs are cool bits of software, but I can't say I see much use for "agents" whose behavior is not well-defined and whose non-determinism is not formally bounded.

infecto

It’s getting pedantic, but the key idea is that Agents can solve problems traditional state machine-based workflows couldn't.

Your point is moot since many of these modern workflows already use LLMs as gating functions to determine the next steps.

It’s a different way of approaching problems, and while the future is uncertain, LLMs have moved beyond being just "cool software" to becoming genuinely useful in specific domains.

ForTheKidz

Hmm, maybe you are referring to something specific with "workflow". I'm envisioning a visual graph with a UI for each node and connection, or maybe a makefile on the other end of the spectrum. What are you envisioning?

Anyway, LLMs will remain "cool software", like other niche-specific patterns, until I see something general emerge. You'd have to pitch LLMs pretty savvily to show a clear value-add. Engineers are extremely expensive, so LLMs need to have a very low error rate to be integrated into the revenue path of a product without incurring higher costs or a lower-quality service. I still see text- and code-generation for immediate consumption by a human (or possibly classification to be reviewed by a human) as the only viable use cases today. It's just way too easy to manipulate them with standard English.

vincston

Why forget about boxes and deterministic control and start thinking of error tolerance and recovery? I know that LLMs are statistical models, but can you not use patterns to enforce a deterministic outcome? (Single responsibility for each agent, retrying LLM calls, rephrasing prompts, etc.)
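
Something like this, for example (a sketch; llm and is_valid are stand-ins):

  def is_valid(out: str) -> bool:
      # Stand-in for a real check, e.g. JSON-schema validation.
      return out.strip().startswith("{")

  def reliable_call(llm, prompt: str, retries: int = 3) -> str:
      for _ in range(retries):
          out = llm(prompt)
          if out is not None and is_valid(out):
              return out
          # Rephrase: feed the failure back so the next attempt can self-correct.
          prompt = f"{prompt}\nYour previous answer was invalid: {out!r}. Return valid JSON."
      raise ValueError("model failed validation after retries")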

zh2408

Hey, sorry for the confusion. This tutorial is focusing on the low-level internals of how agents are implemented—much like how intelligent large language models still boil down to matrix multiplications at their core.

godelski

  > This tutorial is focusing on the low-level internals of how agents are implemented
We have very different definitions of what "low-level" means. Exact opposites, in fact. "Low-level" means close to the inner workings: a low-level language is assembly (some consider C low-level, but this is debatable), whereas Python would be high-level.

I don't think this tutorial is "near the metal" of LLMs, nor do I think it should be, considering it is aimed at "Dummies". Low-level would really need to get into the inner workings of the processing, probing agents, and getting into the weeds.

ForTheKidz

> We have very different definitions of what "low-level" means.

Does it really matter if you can understand them? Waiting for strongly-opinionated engineers to finish their pedantic spiels (...even when they're wrong or there is no obvious standard of correctness) when everyone already understands each other is one of the most miserable parts of being in this industry.

I—and I emphatically don't include the above poster in this view, as it takes continual and repeated behavior to accrue such judgement—see this as a small tantrum, essentially, by people who never learned to regulate their emotions in professional spaces. I don't understand why this sort of bickering is considered acceptable behavior in the workplace or adjacent spaces. It's rude, arrogant, trivially avoidable with a slight change in tone and rhetoric, and it makes you look like an asshole unless you're 100% right and approach it in good humor.

windsignaling

I think "low-level" is relative to what's being discussed. Low-level for LLMs would have to do with how transformer layers are implemented (self-attention layer, layer norms, etc.) whereas low-level for agents would be the graph structure.

Although I personally don't think the graph implementation for agents is necessarily established or widely standardized, it's helpful to know why such an implementation was chosen and how it works.

> the inner workings of the processing, probing agents, and getting into the weeds

These feel to me like empty words... "inner workings of the processing"? You can say that about anything.

zh2408

By "low-level", I mean low-level with respect to the agent interface.

The original purpose is to help people understand how agent frameworks are internally implemented, like these:

OpenAI Agents: https://github.com/openai/openai-agents-python/blob/48ff99bb...

Pydantic Agents: https://github.com/pydantic/pydantic-ai/blob/4c0f384a0626299...

Langchain: https://github.com/langchain-ai/langchain/blob/4d1d726e61ed5...

LangGraph: https://github.com/langchain-ai/langgraph/blob/24f7d7c4399e2...

adamnemecek

Despite the memes, this reductivism is not exactly insightful. Why stop there? Matrix multiplication is just a bunch of dot products, which in turn are just cosines and magnitudes. What insights were generated from that?

ethanwillis

The reductionism is insightful when it comes to providing an implementation with those specific details in mind.

In the case of LLMs, knowing it boils down to matrix multiplication is insightful and useful, because then you know what kind of hardware is best suited to executing a model.

What is actually not insightful or useful is believing LLMs are AGI or conscious.

godelski

  > this reductivism is not exactly insightful.
I really agree with this. I think it has been bad for a lot of people's understanding when they have trivialized ML to "just matrix multiplications" (or GEMMs). This does not help differentiate AI/ML from... well... really any data-processing algorithm. Matrices are fairly general structures in mathematics, and you can formulate almost anything as one. In fact, this is a very common way to parallelize or speed up programs (e.g. numpy vectorization).

We wouldn't call least squares, even a bunch of them, ML, nor would we call rasterization or ray tracing ML. Fundamentally, all these things are "just GEMMs". The framing also obscures important distinctions between linear networks, CNNs, and Transformers. It brushes off a key element, the activation function, which is what lets neural nets perform non-linear transformations! And what about the residual units? These are one of the most important factors enabling deep learning, and they're "just" addition. So should we say it's all just matrix addition, since we can convert multiplication to addition?

There is such a thing as oversimplification, and I worry that we have hyper-optimized (over-optimized) for it. So I agree: saying they just "boil down to matrix multiplications" is fundamentally misleading. It provides no insight and only serves to mislead people.
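
The activation-function point is easy to see in plain numpy:

  import numpy as np

  rng = np.random.default_rng(0)
  W1, W2 = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
  x = rng.normal(size=4)

  # Two stacked linear layers collapse into a single matrix, so "just
  # matrix multiplication" buys no depth at all:
  assert np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x)

  # A nonlinearity in between breaks that collapse; this is what makes
  # a deep network more than one big linear map:
  relu = lambda v: np.maximum(v, 0)
  y = W2 @ relu(W1 @ x)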

zh2408

It’s kind of like the different levels of abstraction.

For example, for software projects, the algorithmic level is where most people focus because that’s typically where the biggest optimizations happen. But in some critical scenarios, you have to peel back those layers—down to how the hardware or compiler works—to make the best choices (like picking the right CPU/GPU).

Likewise, with agents, you can work with high-level abstractions for most applications. But if you need to optimize or compare different approaches (tool use vs. MCP vs. prompt-based, for instance), you have to dig deeper into how they’re actually implemented.

_factor

If you can reduce complex matrix multiplications into simpler terms, then you may be able to focus the training based on those constraints to increase performance/efficiency.

xg15

OK, then how would you do it?

heyitsguay

Could you give some examples of the agent frameworks you're referring to? I'd love to see some examples that go beyond the graph pattern! Thank you

_pdp_

The agentic AI capabilities of chatbotkit.com have nothing to do with workflows.

The graph rendering is simply for illustrative purposes, mostly to cater to people who think in terms of graphs, but the underlying mechanics are not nodes, edges, and a flow that goes from one to the next.

xg15

What exactly do you mean by "error tolerance and recovery"?

mentalgear

Everything that was previously just called automation or pipeline processing on top of LLMs is now given the buzzword "agents". The hype bubble needs constant feeding to keep from imploding.

zh2408

Thank you! I'm not against such hype TBH :)

jumploops

Anthropic[0] and Google[1] are both pushing for a clear definition of an “agent” vs. an “agentic workflow”

tl;dr from Anthropic:

> Workflows are systems where LLMs and tools are orchestrated through predefined code paths.

> Agents, on the other hand, are systems where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks.

Most “agents” today fall into the workflow category.

The foundation model makers are pushing their new models to be better at the second, “pure” agent, approach.

In practice, I’m not sure how effective the “pure” approach will work for most LLM-assisted tasks.

I liken it to a fresh intern who shows up with amnesia every day.

Even if you tell them what they did yesterday, they’re still liable to take a different path for today’s work.

My hunch is that we’ll see an evolution of this terminology, and agents of the future will still have some “guiderails” (note: not necessarily _guard_rails), that makes their behavior more predictable over long horizons.

[0]https://www.anthropic.com/engineering/building-effective-age...

[1]https://www.youtube.com/watch?v=Qd6anWv0mv0

zh2408

Let me clarify: we are discussing how the Agent is internally implemented, given LLM calls and tools. It can be built using a graph, where one node makes decisions that branch out to tools and can loop back.

The workflow can vary. For example, it can involve multiple LLM calls chained together without branching or looping. It can also be built using a graph.

I know the terms "graph" and "workflow" can be a bit confusing. It’s like we have a low-level 'cache' at the CPU level and then a high-level 'cache' in software.

jumploops

Yes, the difference is that in the “pure” agent approach, the model is the only thing directing what to do.

In a sense there’s still a graph of execution, but the graph isn’t known until the “agent” runs and decides what tools to use, in what order, and for how long.

There is no scaffold, just LLM + MCP (or w/e) in a loop.
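
In sketch form (llm and tools being stand-ins), the entire "scaffold" is:

  def pure_agent(llm, tools: dict, task: str):
      history = [task]
      while True:
          # The model alone decides which tool to call, in what order,
          # and when to stop; no graph exists until the run traces one.
          step = llm(history)
          if step["done"]:
              return step["answer"]
          history.append(tools[step["tool"]](**step["args"]))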

zh2408

Yes!!

miguelinho

Great write-up! In my opinion, your description is likely an accurate model of what AI agents are doing. Perhaps the graph could be static or dynamic. Either way, it makes sense! Also, thank you for removing the hype!

zh2408

Thank you!


nxpnsv

I found it understandable and clear. PocketFlow looks cool, although that magic with the - and >> operators seems a bit obtuse... Also, I think "simply" is a trap: an agent might be modeled by a graph, but that graph can be arbitrarily complex.

admiralrohan

Strangely, the original HN post on the framework got no comments, but this one is going viral! Good luck.

bckr

Anyone succeeding with agents in production? Other than cursor :)

v3ss0n

My experience is that Mistral Small, QwQ, and QwenCoder can build much better diagrams in Mermaid than those attempted by Mr. Huang.