Chapter 4: ReAct Framework
Reasoning + Acting
Learning Objectives
- Understand ReAct framework fundamentals
- Master the mathematical foundations
- Learn practical implementation
- Apply knowledge through examples
- Recognize real-world applications
ReAct Framework
What is ReAct?
ReAct (Reasoning + Acting) is a framework that combines reasoning and acting in language, allowing agents to think step-by-step and take actions based on their reasoning.
ReAct solves a critical problem:
- Traditional agents: Act without explicit reasoning (black box decisions)
- ReAct agents: Think out loud, then act based on reasoning (transparent, better decisions)
- Key innovation: Interleaves reasoning and acting in a single loop
ReAct Loop Visualization
💭 Thought - Reason about the task
⚡ Action - Execute a tool
👁️ Observation - See the result
Loop continues until goal achieved
The ReAct Loop
ReAct follows a simple but powerful pattern:
- Thought: Agent reasons about what to do next
- Action: Agent takes an action (tool call, API request, etc.)
- Observation: Agent observes the result
- Repeat: Use observation to inform next thought
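The pattern above can be sketched as a short loop. This is a minimal illustration, not the full implementation shown later in this chapter: `decide` and `execute` are hypothetical stand-ins for the LLM call and the tool runtime, stubbed out here so the loop can run on its own.

```python
def react_loop(question, decide, execute, max_steps=10):
    """Minimal ReAct loop: Thought -> Action -> Observation, repeated."""
    history = []
    for step in range(max_steps):
        thought, action = decide(question, history)     # Thought
        if action["type"] == "final_answer":            # agent decides to stop
            return action["answer"]
        observation = execute(action)                   # Action -> Observation
        history.append((thought, action, observation))  # informs the next Thought
    return "Reached maximum steps without completing the task."

# Stub decide/execute (hypothetical) to illustrate one tool call, then a final answer
def decide(question, history):
    if not history:  # nothing observed yet: look up the weather first
        return ("I need the weather.", {"type": "tool", "tool": "get_weather"})
    return ("I have the data.", {"type": "final_answer", "answer": "72°F and sunny"})

def execute(action):
    return {"temperature": 72, "condition": "sunny"}

answer = react_loop("What's the weather in New York?", decide, execute)
```

Note how the history list is what lets the second `decide` call reason differently from the first: each observation feeds the next thought.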
📚 Why ReAct Works
ReAct improves agent performance because:
- Explicit reasoning: Forces agent to think before acting
- Better decisions: Reasoning leads to better tool selection
- Transparency: You can see agent's thought process
- Error recovery: Agent can reason about failures and try alternatives
Key Concepts
🔑 The Three Components of ReAct
1. Thought (Reasoning)
The agent explicitly reasons about the task:
- Analyzes the current situation
- Considers what information is needed
- Decides what action to take
- Explains the reasoning in natural language
Example Thought: "I need to find the weather in New York. I should use the get_weather tool with city='New York'."
2. Action (Tool Execution)
The agent takes a concrete action:
- Calls a tool or function
- Specifies parameters
- Executes the action
Example Action: Action: get_weather(city="New York")
3. Observation (Result)
The agent observes the result:
- Receives tool output
- Processes the information
- Uses it to inform next thought
Example Observation: Observation: {"temperature": 72, "condition": "sunny"}
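The three components can be grouped into a single step record. A minimal sketch (the `ReActStep` name is illustrative, not part of any library):

```python
from dataclasses import dataclass

@dataclass
class ReActStep:
    thought: str      # reasoning written in natural language
    action: str       # the concrete tool call
    observation: str  # the result returned by the tool

# The weather example above as one step record
step = ReActStep(
    thought="I need to find the weather in New York. I should use the get_weather tool.",
    action='get_weather(city="New York")',
    observation='{"temperature": 72, "condition": "sunny"}',
)
```

A list of such records is exactly the history the agent carries between loop iterations.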
ReAct vs Other Approaches
ReAct vs Other Frameworks
| Feature | ReAct | Act-Only | Think-Only |
|---|---|---|---|
| Explicit Reasoning | ✓ Yes | ✗ No | ✓ Yes |
| Tool Use | ✓ Yes | ✓ Yes | ✗ No |
| Transparency | ✓ High | ✗ Low | ✓ High |
| Error Recovery | ✓ Good | ✗ Poor | ✗ N/A |
📝 ReAct Prompt Format
ReAct uses a specific prompt format that interleaves thoughts, actions, and observations:
ReAct Prompt Structure
Question: What's the weather in New York?
Thought: I need to find the weather in New York. I should use the get_weather tool.
Action: get_weather(city="New York")
Observation: {"temperature": 72, "condition": "sunny"}
Thought: I have the weather information. I can now answer the question.
Action: respond()
Observation: The weather in New York is 72°F and sunny.
Final Answer: The weather in New York is 72°F and sunny.
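Because every line of a ReAct transcript starts with a fixed label, the structure above is easy to parse mechanically. A small sketch using the standard `re` module:

```python
import re

TRANSCRIPT = """\
Thought: I need to find the weather in New York. I should use the get_weather tool.
Action: get_weather(city="New York")
Observation: {"temperature": 72, "condition": "sunny"}
Thought: I have the weather information. I can now answer the question.
Final Answer: The weather in New York is 72°F and sunny."""

# Split the transcript into (label, content) pairs, one per line
steps = re.findall(r'^(Thought|Action|Observation|Final Answer): (.+)$',
                   TRANSCRIPT, re.MULTILINE)
```

This label-per-line convention is what lets an agent framework extract the next action from raw LLM output.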
Mathematical Formulations
ReAct State Transition
(thought_t, action_t, observation_t) → (thought_{t+1}, action_{t+1}, observation_{t+1})
What This Measures
This formula describes how the ReAct (Reasoning + Acting) framework transitions from one step to the next. It shows how explicit reasoning thoughts, actions taken, and observations received evolve over time, with each observation informing the next reasoning step. This creates a transparent, adaptive decision-making loop.
Breaking It Down
- thought_t: Reasoning at step t - the explicit reasoning text the agent writes before taking action at step t. This is the agent's "thinking out loud" - it explains what it's considering, why it's choosing a particular action, and what it expects to learn. Thoughts are written in natural language and make the agent's reasoning transparent.
- action_t: Action taken at step t - the specific action the agent executes based on thought_t (e.g., "call get_weather(city='New York')", "search_web(query='quantum computing')", "Final Answer: ..."). Actions are concrete steps that move toward the goal.
- observation_t: Result observed at step t - what happened as a result of action_t (tool output, error message, environmental change). Observations provide feedback about whether the action was successful and what information was gained.
- → (transition): The arrow represents the transition mechanism - observation_t is used to inform thought_{t+1}. The agent reads the observation, reasons about what it means, and uses it to decide the next action. This creates a feedback loop where actions inform future reasoning.
- thought_{t+1}: Reasoning at step t+1 - the next reasoning step that incorporates observation_t. The agent explicitly writes how the observation changes its understanding and what it will do next.
- action_{t+1}, observation_{t+1}: The cycle continues with the next action and observation, building on previous steps until the goal is achieved.
Where This Is Used
This transition happens at every step of the ReAct loop. The agent: (1) writes a thought explaining its reasoning, (2) takes an action based on that thought, (3) observes the result, (4) uses the observation to inform the next thought, and (5) repeats until the goal is achieved or max_steps is reached. This creates a transparent, step-by-step reasoning process that's easy to understand and debug.
Why This Matters
The ReAct state transition enables transparent, adaptive reasoning. By explicitly writing thoughts, agents make their reasoning process visible and understandable. By using observations to inform next thoughts, agents can adapt their strategy based on what they learn. This is superior to "black box" agents that don't show reasoning - ReAct agents can explain their decisions, handle errors gracefully (by reasoning about what went wrong), and adapt their approach when initial actions don't work. The explicit thought-action-observation loop is what makes ReAct effective for complex, multi-step tasks.
Example Calculation
Step t:
- thought_t = "I need to find the weather for New York. I should use the get_weather tool."
- action_t = "call get_weather(city='New York')"
- observation_t = {"temp": 22, "condition": "sunny"}
Transition: observation_t informs next reasoning
Step t+1:
- thought_{t+1} = "I received the weather data: 22°C and sunny. The user asked about the weather, so I have enough information. I should provide a clear answer."
- action_{t+1} = "Final Answer: The weather in New York is 22°C (72°F) and sunny."
- observation_{t+1} = "Task complete"
Interpretation: The observation from step t (weather data) directly informed the thought at step t+1 (agent recognized it has enough info). The agent then took the final action (provided answer) and completed the task. This demonstrates how observations feed into reasoning, creating an adaptive loop that enables effective problem-solving.
ReAct Decision Function
action_t = LLM(question, thought_t, history_{1:t-1}, available_tools)
What This Measures
This function determines what action the ReAct agent should take at step t. It uses the LLM to process the original question, current reasoning thought, history of previous steps, and available tools, then generates the next action. This is the decision-making core of the ReAct framework.
Breaking It Down
- question: Original user question - the initial request or query that started the task (e.g., "What's the weather in New York?", "Research quantum computing"). This provides the goal and context that guides all decisions.
- thought_t: Current reasoning - the explicit reasoning text the agent wrote at step t, explaining what it's thinking and why it's considering certain actions. This makes the agent's reasoning process transparent and helps the LLM understand the current reasoning state.
- history_{1:t-1}: Previous thoughts, actions, and observations - the complete sequence of (thought, action, observation) tuples from steps 1 through t-1. This history provides context about: what the agent has already tried, what it learned from previous steps, what worked and what didn't, and the progression of reasoning. History enables the agent to avoid repeating mistakes and build on previous insights.
- available_tools: Tools the agent can use - the set of functions, APIs, or capabilities available (e.g., ["get_weather", "search_web", "calculate"]). The LLM uses tool descriptions to determine which tool (if any) to use, or whether to provide a final answer.
- LLM(...): Language model processing - the LLM takes all inputs and generates the next action. The LLM: understands the question, processes the current thought, reviews the history to understand what's been tried, evaluates available tools, and decides the best next action (use a tool, provide final answer, or ask for clarification).
- action_t: Generated action - the output decision, which could be: a tool call with parameters, a "Final Answer" with the response, or a clarification question. The action is then executed, and its result becomes observation_t.
Where This Is Used
This function is called at every step of the ReAct loop, specifically in the "Reason" phase. After writing thought_t, the agent uses this function to decide what action to take. The decision is based on: the original goal (question), current understanding (thought_t), past experience (history), and available capabilities (tools). This happens repeatedly until the agent decides to provide a final answer or reaches max_steps.
Why This Matters
This decision function is what makes ReAct effective. By considering the full history, the agent can: learn from previous steps (if a tool failed, try a different approach), build on previous insights (use information from earlier observations), avoid repetition (don't try the same failed action), and make informed decisions (understand the full context before acting). The explicit thought input makes reasoning transparent, and the history input enables adaptive behavior. Without this comprehensive decision-making, agents would make decisions in isolation without learning from experience.
Example Calculation
Given:
- question = "What's the weather in New York and should I bring an umbrella?"
- thought_t = "I need to get weather data for New York first, then determine if an umbrella is needed based on rain probability."
- history_{1:t-1} = [] (empty - this is the first step, so there are no previous thoughts, actions, or observations)
- available_tools = ["get_weather", "search_web"]
Step 1: LLM processes inputs - understands it needs weather data
Step 2: LLM evaluates tools - get_weather is most appropriate
Step 3: LLM generates action
Result: action_t = "call get_weather(city='New York')"
After observation: observation_t = {"temp": 22, "condition": "sunny", "rain_prob": "10%"}
Next step (t+1):
- thought_{t+1} = "I got the weather: 22°C, sunny, 10% rain probability. Since rain probability is only 10%, an umbrella is not needed."
- history_{1:t} = [(thought_t, action_t, observation_t)] - now contains the weather lookup from the previous step
- LLM processes: sees weather data, understands question about umbrella, has enough info
- action_{t+1} = "Final Answer: The weather in New York is 22°C and sunny with only 10% chance of rain. You don't need an umbrella."
Interpretation: The decision function used the question, current thought, and available tools to select the weather tool. After receiving the observation, the next decision used the history (previous step's result) to determine it had enough information and could provide the final answer. This demonstrates how history enables the agent to build on previous steps and make informed decisions.
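The decision function's inputs can be assembled into a single prompt string. A sketch under stated assumptions: `build_decision_prompt` is a hypothetical helper, and in a real system the returned string would be sent to an LLM, which completes the trailing "Action:" line.

```python
def build_decision_prompt(question, thought, history, available_tools):
    """Assemble the inputs of action_t = LLM(question, thought_t, history, tools)
    into one prompt; the LLM's completion of the final line is the next action."""
    lines = [f"Question: {question}",
             "Available tools: " + ", ".join(available_tools)]
    for i, (th, act, obs) in enumerate(history, start=1):
        lines += [f"Thought {i}: {th}", f"Action {i}: {act}", f"Observation {i}: {obs}"]
    lines += [f"Thought: {thought}", "Action:"]
    return "\n".join(lines)

# First step of the umbrella example: the history is still empty
prompt = build_decision_prompt(
    question="What's the weather in New York and should I bring an umbrella?",
    thought="I need to get weather data for New York first.",
    history=[],
    available_tools=["get_weather", "search_web"],
)
```

On later steps the same function is called with a non-empty history, which is how previous observations reach the LLM.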
ReAct Termination Condition
Terminate if (action_t = "Final Answer") OR (t ≥ max_steps)
What This Measures
This formula defines when the ReAct loop should stop executing. It specifies two termination conditions: (1) the agent decides it has enough information and provides a final answer, or (2) a maximum step limit is reached to prevent infinite loops. This ensures the agent completes tasks efficiently while avoiding endless execution.
Breaking It Down
- action_t = "Final Answer": The agent explicitly decides to terminate by outputting "Final Answer" followed by the response. This happens when the agent's reasoning determines it has: gathered sufficient information to answer the question, completed all necessary steps, or reached a point where it cannot proceed further (and provides what it knows). The "Final Answer" action is a special action type that signals completion.
- t ≥ max_steps: A safety mechanism that terminates the loop if the number of steps (t) reaches or exceeds a maximum limit (max_steps, typically 10-20 steps). This prevents infinite loops that could occur if: the agent gets stuck in a cycle, cannot find a solution, or keeps trying actions that don't lead to completion. When max_steps is reached, the agent typically returns the best answer it has so far or indicates it couldn't complete the task.
- Terminate: When either condition is met, the ReAct loop stops, and the agent returns its final response (either the "Final Answer" content or a timeout message if max_steps was reached).
Where This Is Used
This termination condition is checked after every action in the ReAct loop. After executing action_t and receiving observation_t, the agent checks: (1) if action_t was "Final Answer" → terminate and return answer, (2) if t ≥ max_steps → terminate and return best answer or timeout message, (3) otherwise → continue to next step (t+1). This check happens in the loop control logic that manages ReAct execution.
Why This Matters
Proper termination is essential for agent efficiency and safety. Without termination conditions, agents could: run indefinitely (wasting resources), get stuck in loops (repeating the same actions), or never provide answers (always trying more actions). The "Final Answer" condition enables agents to recognize when they've completed the task and stop efficiently. The max_steps condition provides a safety net to prevent infinite execution, ensuring agents always complete (even if with a timeout message) rather than running forever. This is crucial for production systems where resource limits and user experience matter.
Example Calculation
Scenario 1: Normal Termination
- t = 2 (second step)
- action_2 = "Final Answer: The weather in New York is 22°C and sunny. You don't need an umbrella."
- max_steps = 10
- Check: action_2 = "Final Answer" → Terminate
- Result: Loop stops, returns the final answer
Scenario 2: Max Steps Termination
- t = 10
- action_10 = "call search_web(query='...')" (still trying to find answer)
- max_steps = 10
- Check: t (10) ≥ max_steps (10) → Terminate
- Result: Loop stops, returns "Reached maximum steps. Best answer so far: [partial information]"
Scenario 3: Continue
- t = 3
- action_3 = "call get_weather(city='Boston')" (not Final Answer)
- max_steps = 10
- Check: action_3 ≠ "Final Answer" AND t (3) < max_steps (10) → Continue
- Result: Loop continues to step 4
Interpretation: Scenario 1 shows normal completion - the agent recognized it had enough information and provided a final answer. Scenario 2 shows safety termination - the agent reached the step limit and stopped to prevent infinite execution. Scenario 3 shows the loop continuing when neither termination condition is met. This demonstrates how termination conditions ensure efficient completion while preventing infinite loops.
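The termination condition reduces to a two-line check. A minimal sketch (the `should_terminate` name is illustrative), evaluated against the three scenarios above:

```python
def should_terminate(action: str, t: int, max_steps: int) -> bool:
    """Terminate when the agent emits a Final Answer or the step limit is hit."""
    return action.startswith("Final Answer") or t >= max_steps

# Scenario 1: normal termination at step 2
scenario_1 = should_terminate("Final Answer: 22°C and sunny.", 2, 10)    # True
# Scenario 2: safety termination at the step limit
scenario_2 = should_terminate("call search_web(query='...')", 10, 10)    # True
# Scenario 3: neither condition met, so the loop continues
scenario_3 = should_terminate("call get_weather(city='Boston')", 3, 10)  # False
```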
Detailed Examples
Example 1: ReAct for Weather Query - Complete Trace
Task: "What's the weather in New York and should I bring an umbrella?"
Complete ReAct Execution Trace
💭 Thought 1
I need to get the weather in New York. I'll use the get_weather tool.
⚡ Action 1
get_weather(city="New York")
👁️ Observation 1
{"temperature": 72, "condition": "sunny", "rain_probability": "10%"}
💭 Thought 2
The weather is sunny with only 10% rain probability. I don't need an umbrella. I have all the information to answer.
⚡ Action 2
Final Answer
✅ Final Answer
The weather in New York is 72°F and sunny with only 10% chance of rain. You don't need an umbrella.
Example 2: ReAct for Multi-Step Research
Task: "Research quantum computing breakthroughs in 2024"
Multi-Step ReAct Trace
| Step | Thought | Action | Observation |
|---|---|---|---|
| 1 | Need to search for quantum computing 2024 | search("quantum computing 2024") | Found 5 articles |
| 2 | Should read articles to extract key points | read_article(article_1) | Extracted 3 breakthroughs |
| 3 | Have enough info, can summarize | Final Answer | Summary generated |
Implementation
ReAct Agent Implementation
```python
from typing import Dict, Any, Optional
import re


class ReActAgent:
    """ReAct (Reasoning + Acting) Agent Implementation"""

    def __init__(self, llm, tools: Dict[str, Any], max_steps: int = 10):
        """
        Initialize ReAct agent

        Parameters:
            llm: Language model for reasoning
            tools: Dictionary of available tools
            max_steps: Maximum ReAct loop iterations
        """
        self.llm = llm
        self.tools = tools
        self.max_steps = max_steps
        self.history = []  # Store thought-action-observation history

    def format_tools(self) -> str:
        """Format tools for prompt"""
        tool_descriptions = []
        for name, tool in self.tools.items():
            desc = f"- {name}: {tool['description']}"
            if 'parameters' in tool:
                params = ', '.join(tool['parameters'].keys())
                desc += f" (params: {params})"
            tool_descriptions.append(desc)
        return '\n'.join(tool_descriptions)

    def format_history(self) -> str:
        """Format the thought-action-observation history for a prompt"""
        history_text = ""
        for i, entry in enumerate(self.history):
            history_text += f"\nThought {i+1}: {entry['thought']}\n"
            history_text += f"Action {i+1}: {entry['action']}\n"
            history_text += f"Observation {i+1}: {entry['observation']}\n"
        return history_text

    def extract_action(self, text: str) -> Optional[Dict]:
        """
        Extract action from LLM output

        Looks for patterns like:
        - Action: tool_name(params)
        - Action: Final Answer
        """
        # Pattern: Action: tool_name(param="value")
        action_pattern = r'Action:\s*(\w+)\((.*?)\)'
        match = re.search(action_pattern, text)
        if match:
            tool_name = match.group(1)
            params_str = match.group(2)
            # Parse parameters (simple parsing; use a proper parser in production)
            params = {}
            if params_str:
                # Handle key="value" pairs
                param_pattern = r'(\w+)="([^"]+)"'
                for param_match in re.finditer(param_pattern, params_str):
                    params[param_match.group(1)] = param_match.group(2)
            return {'type': 'tool', 'tool': tool_name, 'parameters': params}
        # Check for Final Answer (case-insensitive)
        if 'final answer' in text.lower():
            return {'type': 'final_answer'}
        return None

    def execute_action(self, action: Dict) -> Any:
        """Execute an action (tool call)"""
        if action['type'] == 'final_answer':
            return None
        tool_name = action['tool']
        if tool_name not in self.tools:
            return f"Error: Tool {tool_name} not found"
        tool = self.tools[tool_name]
        func = tool['function']
        params = action['parameters']
        try:
            return func(**params)
        except Exception as e:
            return f"Error: {str(e)}"

    def react_step(self, question: str, step: int) -> Dict:
        """
        Execute one ReAct step

        Returns:
            Dict with thought, action, observation, and whether to continue
        """
        # Build prompt with tool descriptions and history
        tools_text = self.format_tools()
        prompt = f"""Question: {question}

Available tools:
{tools_text}
{self.format_history()}
Thought {step+1}: Let me think about what to do next.
Action {step+1}:"""

        # Get LLM response
        llm_response = self.llm.generate(prompt)

        # Extract thought and action
        thought_match = re.search(r'Thought \d+:\s*(.+?)(?=Action|$)',
                                  llm_response, re.DOTALL)
        thought = thought_match.group(1).strip() if thought_match else "Thinking..."
        action = self.extract_action(llm_response)
        if not action:
            # No parseable action: treat the response as a final answer
            action = {'type': 'final_answer'}

        # Execute action
        if action['type'] == 'final_answer':
            observation = "Ready to provide final answer"
            should_continue = False
        else:
            observation = self.execute_action(action)
            should_continue = True

        return {
            'thought': thought,
            'action': action,
            'observation': observation,
            'continue': should_continue
        }

    def run(self, question: str) -> str:
        """
        Run ReAct agent

        Parameters:
            question: User's question

        Returns:
            Final answer
        """
        self.history = []
        for step in range(self.max_steps):
            # Execute ReAct step
            step_result = self.react_step(question, step)

            # Store in history
            self.history.append({
                'thought': step_result['thought'],
                'action': step_result['action'],
                'observation': step_result['observation']
            })

            # Check if done
            if not step_result['continue']:
                # Generate final answer from the accumulated history
                prompt = f"""Question: {question}
{self.format_history()}
Based on the above reasoning and observations, provide the final answer.
Final Answer:"""
                final_answer = self.llm.generate(prompt)
                return final_answer.strip()

        return "Agent reached maximum steps without completing the task."


# Example usage
def search_web(query: str) -> str:
    """Search the web"""
    # In a real implementation, call a search API
    return f"Search results for '{query}': Found 10 relevant articles"


def get_weather(city: str) -> str:
    """Get weather"""
    return f"Weather in {city}: 72°F, sunny"


tools = {
    'search_web': {
        'description': 'Search the web for information',
        'function': search_web,
        'parameters': {'query': 'string'}
    },
    'get_weather': {
        'description': 'Get current weather for a city',
        'function': get_weather,
        'parameters': {'city': 'string'}
    }
}

# agent = ReActAgent(llm=my_llm, tools=tools)
# answer = agent.run("What's the weather in New York?")
# print(answer)
```
ReAct Prompt Template
```python
from typing import Dict, Any

REACT_PROMPT_TEMPLATE = """Question: {question}

You can use the following tools:
{tools}

Use the following format:

Thought: [your reasoning about what to do]
Action: [tool_name(param1="value1", param2="value2")]
Observation: [tool result]
Thought: [your reasoning based on observation]
Action: [next tool or Final Answer]
Observation: [tool result or ready for final answer]
... (repeat Thought/Action/Observation as needed)

When you have enough information to answer, use:
Action: Final Answer
Final Answer: [your complete answer to the question]"""


def create_react_prompt(question: str, tools: Dict[str, Any]) -> str:
    """Create ReAct prompt from a question and a tools dictionary"""
    tools_text = '\n'.join(
        f"- {name}: {tool['description']}" for name, tool in tools.items()
    )
    return REACT_PROMPT_TEMPLATE.format(question=question, tools=tools_text)
```
Real-World Applications
🌍 ReAct in Production Systems
ReAct is used in many production agent systems:
1. LangChain ReAct Agents
- LangChain implements ReAct framework
- Used in production for customer support, research, automation
- Benefits: Transparent reasoning, better error handling
2. Research and Analysis Agents
- ReAct helps agents reason through complex research tasks
- Step-by-step reasoning improves accuracy
- Example: Financial analysis agents that research companies
3. Code Generation Agents
- ReAct helps agents plan code generation
- Reason about requirements before writing code
- Better code quality through explicit reasoning
ReAct Performance Benefits
ReAct vs Act-Only Performance
| Metric | Act-Only | ReAct |
|---|---|---|
| Task Accuracy | ~60% | ~85% |
| Error Recovery | Poor | Good |
| Transparency | Low | High |
| Tool Selection | ~70% correct | ~90% correct |