When people first encounter ReAct (Reason + Act), they often think it's just adding three fields—Thought / Action / Observation—to the prompt.

But in reality, the core of ReAct isn't the prompt format. It's the Agent's State Machine.

This article explains, from an engineering perspective, how ReAct actually works inside an LLM, and how it relates to modern Function Calling and Tool Calling.

1. What Is ReAct?

ReAct (Reason + Act) comes from the 2022 paper ReAct: Synergizing Reasoning and Acting in Language Models, authored by Shunyu Yao et al., a collaboration between Princeton University and Google Research.

Its core idea is actually quite simple:

Let the LLM call external tools (Act) at any point during its reasoning (Reason), then continue reasoning based on what the tools return.

Here's an analogy. A traditional LLM is like a student taking a closed-book exam—once the question is given, it writes out the whole answer in one go, relying only on what it has memorized:

User
    │
    ▼
LLM
    │
    ▼
Answer

ReAct is more like a student taking an open-book exam who can also look things up online. Whenever it hits something uncertain, it first thinks "I need to check this," goes off to flip through a book, look up the weather, or run a calculation, and then continues writing once it has the result:

User
    │
    ▼
LLM
    │
Thought      ← what should I do
    │
Action       ← go check the weather
    │
Tool         ← the tool actually runs
    │
Observation  ← the result it gets back
    │
LLM
    │
Thought      ← keep reasoning based on the result
    │
Answer

Its biggest change is this:

The model no longer spits out the final answer all at once. Instead, it can "think → act → get feedback → think again."

2. The Biggest Misconception

Almost every introductory article draws a diagram like this:

Thought
   ↓
Action
   ↓
Observation

And so many people draw two conclusions:

Observation is part of Action;
Thought, Action, and Observation are all just different fields in the prompt.

Neither conclusion is accurate.

To explain it clearly, we first need to distinguish two completely different concepts:

Message: what's actually passed between the Agent and the outside world—a communication protocol.
State: the Agent's internal state, describing "which step it has reasoned to."

In the next few sections, we'll pull the problem apart along these two concepts.

3. Looking at ReAct from the Message Perspective

Suppose the user asks a very everyday question:

Is it good for running in Shanghai today?

Throughout the whole process, the Messages that are actually produced are these:

User Message                ← User: Is it good for running in Shanghai today?
        │
        ▼
Assistant Message #1        ← Model output
        │
        ├── Thought          I should check the weather first
        └── Action(weather)  call weather("Shanghai")
        │
        ▼
Tool Message                ← Tool returns
        │
        └── Observation      26℃, humidity 90%, rain
        │
        ▼
Assistant Message #2        ← Model output again
        │
        ├── Thought          rainy and humid, not great
        └── Final Answer     Not recommended, it's raining today

There are two key points here:

Thought and Action are usually in the same Assistant Message—they're two parts of a single model output.
Observation is not produced by the model—it's a separate Message returned by the Tool.

In other words, at the Message level, only three kinds of roles take part in the conversation: User, Assistant, and Tool.

4. Why Must Observation Be a Separate Message?

Let's first address a point that's easy to confuse: in terms of content, Observation really is the return value of Action.

For example, the model emits an action:

Action: weather("Shanghai")

After the tool executes, it returns:

26℃
Humidity: 90%
Rain: true

This return is the Observation.

So if it's the same thing content-wise, why does the paper still pull Observation out separately?

The key isn't the content—it's the source:

Assistant
    │
    └── Action       comes from the model (what the model "wants" to do)

Tool
    │
    └── Observation  comes from the outside world (what actually happened)

Action comes from the model, Observation comes from the real environment, and the two must never be generated by the same role.

Why be so strict about this? Because if Observation were also written by the model itself, the model could pretend the tool already executed successfully and fabricate a result that never actually happened.

For example, suppose the model wrote this all in one go:

Action:
Search("Apple CEO")

Observation:
Tim Cook

If Observation were also generated by the model, it could make things up entirely—even if the search never ran, it could still "find" a name, or even invent a wrong answer.

That's why modern Agents always insert the tool's real return into the context as a separate Message. Only then is the model forced to face the real result, instead of talking to itself.

5. Why Must Thought and Action Be Split Apart?

This is another spot that's easy to get tangled up in.

Since Thought and Action are in the same Assistant Message:

Assistant Message
    Thought
    Action

why does the paper still describe them separately?

The reason comes back to those two concepts:

Message is the communication protocol—it describes "what was sent out."
Thought / Action is the Agent's internal state—it describes "what's going on in its head."

They're talking about two different things. Thought and Action correspond to the two stages of decision-making:

Thought:  I want to know the weather   ← Decision (deciding what to do)
   ↓
Action:   weather("Shanghai")          ← the execution instruction the model emits

To distinguish them in one sentence:

Thought is "I decide what to do next";
Action is "the execution instruction I actually emit."

What the paper really wants to convey is how the LLM makes decisions step by step, not what the API looks like. So conceptually, it separates decision (Thought) from execution (Action).

An Often-Overlooked Detail: Action Actually Spans Two Roles

There's another layer here that many people miss: Action isn't a single action—it internally splits into two halves.

First half: the LLM proposes the action. The model merely outputs an intent like "I want to call weather("Shanghai")." It can't—and has no ability to—actually check the weather itself.
Second half: the Agent executes the action. The Agent runtime (that is, the code/framework we write) parses this intent and actually calls the weather API, runs the database query, or executes the shell command.

And Observation is the result that comes back after the second half, the "execution," runs.

Stringing the whole chain together by role makes it clearer:

LLM     │  Thought         I need to check the weather
        │  Action(intent)  I "want" to call weather("Shanghai")   ← just proposing
        ▼
Agent   │  execute Action  actually call the weather API           ← doing the real work
        │  Observation     26℃, rain                               ← execution result
        ▼
LLM     │  Thought         it's raining, not suitable

So "Action → Observation" is strictly speaking not done by the model alone: the model is responsible for proposing, and the Agent is responsible for executing and fetching the result. This also echoes Section 4—Observation must be independent, because it comes from the Agent's real execution, not the model's imagination.

Action Is a Logical Concept, Not Equal to Function Calling

One more thing worth emphasizing: Action is a logical concept in the paper. It is not "welded" into some function-call field of an AI message.

In the paper, Action is essentially the abstract behavior of "the Agent decides on and performs one external operation." It can be realized in many ways:

Early on, the model output a single line of text in a fixed format, like Search[Apple CEO], which the Agent then parsed with a regex and executed;
Today the mainstream approach is function calling / tool calling, where the model directly emits structured tool_calls;
It can also be the model outputting a block of code that the Agent runs in a sandbox (Code Act).

These are all different engineering implementations of the same Action concept. Function calling is merely the most popular one right now, not the definition of Action itself. Equating "Action" with "function calling" is exactly what happens when you only see the Prompt/Message layer and miss the State layer behind it.

6. State Is the True Core of ReAct

Once you understand the two sections above, you can see that real ReAct is essentially a state machine.

Thought
   │
   ▼
Action
   │
   ▼
Observation
   │
   ▼
Thought
   │
   ▼
Action
   │
   ▼
Observation
   │
   ▼
  ...

Written as code, it's roughly this loop:

while not finished:
    thought = llm(history)            # LLM: decide + propose action
    action = choose_tool(thought)     # pick the tool the model wants to call
    observation = run(action)         # Agent: actually execute, fetch result
    history.append(observation)       # append back to context, next iteration

The four elements each have their own job:

Thought: the Agent's current decision;
Action: the action the Agent requests to execute;
Observation: the feedback from the environment;
History: the continuously accumulating context.

The whole loop repeats until the model decides it can wrap up and outputs the final answer.

7. In Modern Function Calling, Where Did Thought Go?

If you've used the tool-calling features of OpenAI, Claude, or Gemini, you'll notice they actually no longer output text like this:

Thought:
...

Action:
...

Instead, they directly emit a structured tool call:

{"tool_calls":[{"function":"weather","arguments":{"city":"Shanghai"}}]}

After the program executes the tool, it stuffs the result back as a tool message:

{"role":"tool","content":"26℃, humidity 90%, rain"}

Finally it calls the LLM once more to get the final answer:

User
   ↓
Assistant(tool_call)
   ↓
Tool(result)
   ↓
Assistant(final answer)

Throughout this whole process, Thought is nowhere to be seen.

But that doesn't mean Thought disappeared:

Thought hasn't disappeared. It has simply moved from "written explicitly in the prompt" to "the model's internal Hidden Reasoning."

Modern models usually don't expose this reasoning process directly to developers (reasoning models put it in a separate reasoning field). The decision step still exists—it's just been tucked away inside the model.

8. ReAct Inside: The Whole Flow Seen from Inside the LLM

If we shift our viewpoint to inside the LLM, the whole flow can be drawn like this:

                +----------------+
                | User Message   |
                +--------+-------+
                         |
                         ▼
              +-------------------+
              | Internal Reasoning|
              | (Thought)         |
              +--------+----------+
                       |
                       ▼
              +-------------------+
              | Tool Selection    |
              | (Action)          |
              +--------+----------+
                       |
                       ▼
              +-------------------+
              | Tool Execution    |
              +--------+----------+
                       |
                       ▼
              +-------------------+
              | Observation       |
              | (Tool Message)    |
              +--------+----------+
                       |
                       ▼
              +-------------------+
              | Internal Reasoning|
              | (Thought)         |
              +--------+----------+
                       |
                       ▼
                 Final Answer

What's truly looping is these three actions:

Reason → Act → Observe → Reason → ...

and not, as many people assume:

Prompt → Prompt → Prompt → ...

In other words, the body of the loop is the flow of state, not a pile of stacked text formats.

9. Understanding ReAct at Three Levels

To pull together what we've covered, we can look at ReAct from three levels.

The first level is Prompt. The Thought / Action / Observation in the paper is just there to conveniently display the reasoning trace—a "display format" for humans to read.

The second level is Message. The messages a modern Agent actually exchanges come in only three kinds: User, Assistant, and Tool. This is the "communication protocol" that lands on the API.

The third level is State, and it's the true core. It describes the flow of the Agent's internal state:

Decision
   ↓
Execution
   ↓
Environment Feedback
   ↓
Decision

This state machine is the essence of ReAct.

10. Summary

ReAct in one sentence:

ReAct is not a prompt template—it's an Agent's state machine.

The key to understanding it is to separate three levels:

Prompt level: Thought / Action / Observation—just a display format for expressing the reasoning process.
Message level: User / Assistant / Tool—the actual API communication protocol.
State level: Thought → Action → Observation—the Agent's true internal state machine.

Although modern Function Calling no longer explicitly outputs Thought, underneath it still follows the same state transitions:

Reason → Act → Observe → Reason → ...

So we can understand the relationship between the two like this:

Function Calling is the engineering implementation of ReAct; ReAct is the design philosophy behind Function Calling.

If you found this article helpful, feel free to like, bookmark, and follow. I'll keep sharing more valuable content. Your support is my greatest motivation to create!

ReAct Inside — From Message to State, Understanding How AI Agents Really Work