Chapter 4: ReAct Framework

Reasoning + Acting

Learning Objectives

  • Understand ReAct framework fundamentals
  • Master the mathematical foundations
  • Learn practical implementation
  • Apply knowledge through examples
  • Recognize real-world applications

ReAct Framework

What is ReAct?

ReAct (Reasoning + Acting) is a framework that combines reasoning and acting in language, allowing agents to think step-by-step and take actions based on their reasoning.

ReAct solves a critical problem:

  • Traditional agents: Act without explicit reasoning (black box decisions)
  • ReAct agents: Think out loud, then act based on reasoning (transparent, better decisions)
  • Key innovation: Interleaves reasoning and acting in a single loop

ReAct Loop Visualization

💭 Thought (reason about the task) → ⚡ Action (execute a tool) → 👁️ Observation (see the result) → back to Thought

Loop continues until goal achieved

The ReAct Loop

ReAct follows a simple but powerful pattern:

  1. Thought: Agent reasons about what to do next
  2. Action: Agent takes an action (tool call, API request, etc.)
  3. Observation: Agent observes the result
  4. Repeat: Use observation to inform next thought
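The four steps above can be sketched as a plain Python loop. This is a toy illustration with a hard-coded stub tool and scripted thoughts; in a real agent, an LLM would generate each thought and action:

```python
# A minimal, self-contained sketch of the Thought -> Action -> Observation loop.
# The tool and the "thoughts" are hard-coded stand-ins for LLM output.

def get_weather(city):
    """Stub tool standing in for a real weather API."""
    return {"temperature": 72, "condition": "sunny"}

TOOLS = {"get_weather": get_weather}

def react_loop(question, max_steps=10):
    observation = None
    trace = []
    for step in range(max_steps):
        # 1. Thought: reason about what to do next (scripted here)
        if observation is None:
            thought = "I need weather data, so I should call get_weather."
            action = ("get_weather", {"city": "New York"})
        else:
            thought = "I have the data, so I can answer."
            action = ("final_answer",
                      f"It is {observation['temperature']}°F and {observation['condition']}.")
        trace.append((thought, action))
        name, arg = action
        # 4. Repeat until the agent decides to answer
        if name == "final_answer":
            return arg, trace
        # 2. Action + 3. Observation: execute the tool, record the result
        observation = TOOLS[name](**arg)
    return "Reached max_steps", trace

answer, trace = react_loop("What's the weather in New York?")
# answer -> "It is 72°F and sunny."
```

Each iteration appends one (thought, action) pair to the trace, and the observation from one iteration is the only input that changes the next thought.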

📚 Why ReAct Works

ReAct improves agent performance because:

  • Explicit reasoning: Forces agent to think before acting
  • Better decisions: Reasoning leads to better tool selection
  • Transparency: You can see agent's thought process
  • Error recovery: Agent can reason about failures and try alternatives

Key Concepts

🔑 The Three Components of ReAct

1. Thought (Reasoning)

The agent explicitly reasons about the task:

  • Analyzes the current situation
  • Considers what information is needed
  • Decides what action to take
  • Explains the reasoning in natural language

Example Thought: "I need to find the weather in New York. I should use the get_weather tool with city='New York'."

2. Action (Tool Execution)

The agent takes a concrete action:

  • Calls a tool or function
  • Specifies parameters
  • Executes the action

Example Action: Action: get_weather(city="New York")

3. Observation (Result)

The agent observes the result:

  • Receives tool output
  • Processes the information
  • Uses it to inform next thought

Example Observation: Observation: {"temperature": 72, "condition": "sunny"}
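One natural way to represent these three components in code is a single step record. The `ReActStep` type below is a hypothetical illustration, not part of any specific library:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class ReActStep:
    thought: str      # natural-language reasoning ("thinking out loud")
    action: str       # concrete step, e.g. a tool call with parameters
    observation: Any  # result that is fed back into the next thought

# The running weather example as one step record
step = ReActStep(
    thought="I need to find the weather in New York. I should use the get_weather tool.",
    action='get_weather(city="New York")',
    observation={"temperature": 72, "condition": "sunny"},
)
```

A full ReAct run is then just a list of such records, which is also what makes the agent's reasoning easy to log and inspect.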

ReAct vs Other Approaches

ReAct vs Other Frameworks

| Feature            | ReAct  | Act-Only | Think-Only |
|--------------------|--------|----------|------------|
| Explicit Reasoning | ✓ Yes  | ✗ No     | ✓ Yes      |
| Tool Use           | ✓ Yes  | ✓ Yes    | ✗ No       |
| Transparency       | ✓ High | ✗ Low    | ✓ High     |
| Error Recovery     | ✓ Good | ✗ Poor   | ✗ N/A      |

📝 ReAct Prompt Format

ReAct uses a specific prompt format that interleaves thoughts, actions, and observations:

ReAct Prompt Structure

Question: What's the weather in New York?

Thought: I need to find the weather in New York. I should use the get_weather tool.
Action: get_weather(city="New York")
Observation: {"temperature": 72, "condition": "sunny"}

Thought: I have the weather information. I can now answer the question.
Action: respond()
Observation: The weather in New York is 72°F and sunny.

Final Answer: The weather in New York is 72°F and sunny.

Mathematical Formulations

ReAct State Transition

\[(\text{thought}_t, \text{action}_t, \text{observation}_t) \rightarrow (\text{thought}_{t+1}, \text{action}_{t+1}, \text{observation}_{t+1})\]
What This Measures

This formula describes how the ReAct (Reasoning + Acting) framework transitions from one step to the next. It shows how explicit reasoning thoughts, actions taken, and observations received evolve over time, with each observation informing the next reasoning step. This creates a transparent, adaptive decision-making loop.

Breaking It Down
  • thought_t: Reasoning at step t - the explicit reasoning text the agent writes before taking action at step t. This is the agent's "thinking out loud" - it explains what it's considering, why it's choosing a particular action, and what it expects to learn. Thoughts are written in natural language and make the agent's reasoning transparent.
  • action_t: Action taken at step t - the specific action the agent executes based on thought_t (e.g., "call get_weather(city='New York')", "search_web(query='quantum computing')", "Final Answer: ..."). Actions are concrete steps that move toward the goal.
  • observation_t: Result observed at step t - what happened as a result of action_t (tool output, error message, environmental change). Observations provide feedback about whether the action was successful and what information was gained.
  • → (transition): The arrow represents the transition mechanism - observation_t is used to inform thought_{t+1}. The agent reads the observation, reasons about what it means, and uses it to decide the next action. This creates a feedback loop where actions inform future reasoning.
  • thought_{t+1}: Reasoning at step t+1 - the next reasoning step that incorporates observation_t. The agent explicitly writes how the observation changes its understanding and what it will do next.
  • action_{t+1}, observation_{t+1}: The cycle continues with the next action and observation, building on previous steps until the goal is achieved.
Where This Is Used

This transition happens at every step of the ReAct loop. The agent: (1) writes a thought explaining its reasoning, (2) takes an action based on that thought, (3) observes the result, (4) uses the observation to inform the next thought, and (5) repeats until the goal is achieved or max_steps is reached. This creates a transparent, step-by-step reasoning process that's easy to understand and debug.

Why This Matters

The ReAct state transition enables transparent, adaptive reasoning. By explicitly writing thoughts, agents make their reasoning process visible and understandable. By using observations to inform next thoughts, agents can adapt their strategy based on what they learn. This is superior to "black box" agents that don't show reasoning - ReAct agents can explain their decisions, handle errors gracefully (by reasoning about what went wrong), and adapt their approach when initial actions don't work. The explicit thought-action-observation loop is what makes ReAct effective for complex, multi-step tasks.

Example Calculation

Step t:

  • thought_t = "I need to find the weather for New York. I should use the get_weather tool."
  • action_t = "call get_weather(city='New York')"
  • observation_t = {"temp": 22, "condition": "sunny"}

Transition: observation_t informs next reasoning

Step t+1:

  • thought_{t+1} = "I received the weather data: 22°C and sunny. The user asked about the weather, so I have enough information. I should provide a clear answer."
  • action_{t+1} = "Final Answer: The weather in New York is 22°C (72°F) and sunny."
  • observation_{t+1} = "Task complete"

Interpretation: The observation from step t (weather data) directly informed the thought at step t+1 (agent recognized it has enough info). The agent then took the final action (provided answer) and completed the task. This demonstrates how observations feed into reasoning, creating an adaptive loop that enables effective problem-solving.
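The worked example above can be written out as data, making it explicit that observation_t is the input from which thought_{t+1} is produced. The `next_thought` helper is a scripted stand-in for the LLM:

```python
def next_thought(observation):
    """Scripted stand-in for the LLM writing thought_{t+1} from observation_t."""
    return (f"I received the weather data: {observation['temp']}°C and "
            f"{observation['condition']}. I have enough information to answer.")

# Step t
step_t = {
    "thought": "I need to find the weather for New York. I should use the get_weather tool.",
    "action": 'get_weather(city="New York")',
    "observation": {"temp": 22, "condition": "sunny"},
}

# Step t+1: the transition arrow, as code — observation_t becomes the input to thought_{t+1}
step_t1 = {
    "thought": next_thought(step_t["observation"]),
    "action": "Final Answer: The weather in New York is 22°C and sunny.",
    "observation": "Task complete",
}
```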

ReAct Decision Function

\[\text{action}_t = \text{LLM}(\text{question}, \text{thought}_t, \text{history}_{<t}, \text{available\_tools})\]
What This Measures

This function determines what action the ReAct agent should take at step t. It uses the LLM to process the original question, current reasoning thought, history of previous steps, and available tools, then generates the next action. This is the decision-making core of the ReAct framework.

Breaking It Down
  • question: Original user question - the initial request or query that started the task (e.g., "What's the weather in New York?", "Research quantum computing"). This provides the goal and context that guides all decisions.
  • thought_t: Current reasoning - the explicit reasoning text the agent wrote at step t, explaining what it's thinking and why it's considering certain actions. This makes the agent's reasoning process transparent and helps the LLM understand the current reasoning state.
  • history_{<t}: Previous thoughts, actions, and observations - the complete sequence of (thought, action, observation) tuples from steps 1 through t-1. This history provides context about: what the agent has already tried, what it learned from previous steps, what worked and what didn't, and the progression of reasoning. History enables the agent to avoid repeating mistakes and build on previous insights.
  • available_tools: Tools the agent can use - the set of functions, APIs, or capabilities available (e.g., ["get_weather", "search_web", "calculate"]). The LLM uses tool descriptions to determine which tool (if any) to use, or whether to provide a final answer.
  • LLM(...): Language model processing - the LLM takes all inputs and generates the next action. The LLM: understands the question, processes the current thought, reviews the history to understand what's been tried, evaluates available tools, and decides the best next action (use a tool, provide final answer, or ask for clarification).
  • action_t: Generated action - the output decision, which could be: a tool call with parameters, a "Final Answer" with the response, or a clarification question. The action is then executed, and its result becomes observation_t.
Where This Is Used

This function is called at every step of the ReAct loop, specifically in the "Reason" phase. After writing thought_t, the agent uses this function to decide what action to take. The decision is based on: the original goal (question), current understanding (thought_t), past experience (history), and available capabilities (tools). This happens repeatedly until the agent decides to provide a final answer or reaches max_steps.

Why This Matters

This decision function is what makes ReAct effective. By considering the full history, the agent can: learn from previous steps (if a tool failed, try a different approach), build on previous insights (use information from earlier observations), avoid repetition (don't try the same failed action), and make informed decisions (understand the full context before acting). The explicit thought input makes reasoning transparent, and the history input enables adaptive behavior. Without this comprehensive decision-making, agents would make decisions in isolation without learning from experience.

Example Calculation

Given:

  • question = "What's the weather in New York and should I bring an umbrella?"
  • thought_t = "I need to get weather data for New York first, then determine if an umbrella is needed based on rain probability."
  • history_{<t} = [] (empty: this is the first step)
  • available_tools = ["get_weather", "search_web"]

Step 1: LLM processes inputs - understands it needs weather data

Step 2: LLM evaluates tools - get_weather is most appropriate

Step 3: LLM generates action

Result: action_t = "call get_weather(city='New York')"

After observation: observation_t = {"temp": 22, "condition": "sunny", "rain_prob": 0.10}

Next step (t+1):

  • thought_{t+1} = "I got the weather: 22°C, sunny, 10% rain probability. Since rain probability is only 10%, an umbrella is not needed."
  • history_{<t+1} = [(thought_t, action_t, observation_t)] (now includes the weather result from step t)
  • LLM processes: sees weather data, understands question about umbrella, has enough info
  • action_{t+1} = "Final Answer: The weather in New York is 22°C and sunny with only 10% chance of rain. You don't need an umbrella."

Interpretation: The decision function used the question, current thought, and available tools to select the weather tool. After receiving the observation, the next decision used the history (previous step's result) to determine it had enough information and could provide the final answer. This demonstrates how history enables the agent to build on previous steps and make informed decisions.
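As a rough sketch, the decision function's inputs can be flattened into a single LLM prompt string. The `decision_prompt` helper and its exact layout are illustrative assumptions, not a fixed format:

```python
def decision_prompt(question, thought, history, available_tools):
    """Flatten the inputs of action_t = LLM(question, thought_t, history_{<t},
    available_tools) into one prompt string (hypothetical layout)."""
    lines = [f"Question: {question}",
             "Available tools: " + ", ".join(available_tools)]
    # Replay the full history so the LLM can build on previous steps
    for i, (th, act, obs) in enumerate(history, start=1):
        lines += [f"Thought {i}: {th}", f"Action {i}: {act}", f"Observation {i}: {obs}"]
    # Current thought, then leave the next action for the LLM to complete
    t = len(history) + 1
    lines += [f"Thought {t}: {thought}", f"Action {t}:"]
    return "\n".join(lines)

prompt = decision_prompt(
    question="What's the weather in New York and should I bring an umbrella?",
    thought="I need to get weather data for New York first.",
    history=[],
    available_tools=["get_weather", "search_web"],
)
```

The prompt ends at "Action 1:", so the LLM's completion is exactly the next action to execute.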

ReAct Termination Condition

\[\text{Terminate if: } \text{action}_t = \text{"Final Answer"} \text{ OR } t \geq \text{max\_steps}\]
What This Measures

This formula defines when the ReAct loop should stop executing. It specifies two termination conditions: (1) the agent decides it has enough information and provides a final answer, or (2) a maximum step limit is reached to prevent infinite loops. This ensures the agent completes tasks efficiently while avoiding endless execution.

Breaking It Down
  • action_t = "Final Answer": The agent explicitly decides to terminate by outputting "Final Answer" followed by the response. This happens when the agent's reasoning determines it has: gathered sufficient information to answer the question, completed all necessary steps, or reached a point where it cannot proceed further (and provides what it knows). The "Final Answer" action is a special action type that signals completion.
  • t ≥ max_steps: A safety mechanism that terminates the loop if the number of steps (t) reaches or exceeds a maximum limit (max_steps, typically 10-20 steps). This prevents infinite loops that could occur if: the agent gets stuck in a cycle, cannot find a solution, or keeps trying actions that don't lead to completion. When max_steps is reached, the agent typically returns the best answer it has so far or indicates it couldn't complete the task.
  • Terminate: When either condition is met, the ReAct loop stops, and the agent returns its final response (either the "Final Answer" content or a timeout message if max_steps was reached).
Where This Is Used

This termination condition is checked after every action in the ReAct loop. After executing action_t and receiving observation_t, the agent checks: (1) if action_t was "Final Answer" → terminate and return answer, (2) if t ≥ max_steps → terminate and return best answer or timeout message, (3) otherwise → continue to next step (t+1). This check happens in the loop control logic that manages ReAct execution.

Why This Matters

Proper termination is essential for agent efficiency and safety. Without termination conditions, agents could: run indefinitely (wasting resources), get stuck in loops (repeating the same actions), or never provide answers (always trying more actions). The "Final Answer" condition enables agents to recognize when they've completed the task and stop efficiently. The max_steps condition provides a safety net to prevent infinite execution, ensuring agents always complete (even if with a timeout message) rather than running forever. This is crucial for production systems where resource limits and user experience matter.

Example Calculation

Scenario 1: Normal Termination

  • t = 2 (second step)
  • action_2 = "Final Answer: The weather in New York is 22°C and sunny. You don't need an umbrella."
  • max_steps = 10
  • Check: action_2 = "Final Answer" → Terminate
  • Result: Loop stops, returns the final answer

Scenario 2: Max Steps Termination

  • t = 10
  • action_10 = "call search_web(query='...')" (still trying to find answer)
  • max_steps = 10
  • Check: t (10) ≥ max_steps (10) → Terminate
  • Result: Loop stops, returns "Reached maximum steps. Best answer so far: [partial information]"

Scenario 3: Continue

  • t = 3
  • action_3 = "call get_weather(city='Boston')" (not Final Answer)
  • max_steps = 10
  • Check: action_3 ≠ "Final Answer" AND t (3) < max_steps (10) → Continue
  • Result: Loop continues to step 4

Interpretation: Scenario 1 shows normal completion - the agent recognized it had enough information and provided a final answer. Scenario 2 shows safety termination - the agent reached the step limit and stopped to prevent infinite execution. Scenario 3 shows the loop continuing when neither termination condition is met. This demonstrates how termination conditions ensure efficient completion while preventing infinite loops.
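The termination check itself is small enough to write in one line. The sketch below (with a hypothetical `should_terminate` helper) reproduces the three scenarios:

```python
def should_terminate(action, t, max_steps=10):
    """Terminate if the action is a Final Answer OR the step budget is used up."""
    return action.strip().lower().startswith("final answer") or t >= max_steps

# Scenario 1: normal termination at step 2
assert should_terminate("Final Answer: The weather is 22°C and sunny.", t=2) is True
# Scenario 2: step budget reached
assert should_terminate('call search_web(query="...")', t=10) is True
# Scenario 3: neither condition met, so the loop continues
assert should_terminate('call get_weather(city="Boston")', t=3) is False
```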

Detailed Examples

Example 1: ReAct for Weather Query - Complete Trace

Task: "What's the weather in New York and should I bring an umbrella?"

Complete ReAct Execution Trace

💭 Thought 1

I need to get the weather in New York. I'll use the get_weather tool.

⚡ Action 1

get_weather(city="New York")

👁️ Observation 1

{"temperature": 72, "condition": "sunny", "rain_probability": 0.10}

💭 Thought 2

The weather is sunny with only 10% rain probability. I don't need an umbrella. I have all the information to answer.

⚡ Action 2

Final Answer

✅ Final Answer

The weather in New York is 72°F and sunny with only 10% chance of rain. You don't need an umbrella.

Example 2: ReAct for Multi-Step Research

Task: "Research quantum computing breakthroughs in 2024"

Multi-Step ReAct Trace

| Step | Thought                                     | Action                           | Observation               |
|------|---------------------------------------------|----------------------------------|---------------------------|
| 1    | Need to search for quantum computing 2024   | search("quantum computing 2024") | Found 5 articles          |
| 2    | Should read articles to extract key points  | read_article(article_1)          | Extracted 3 breakthroughs |
| 3    | Have enough info, can summarize             | Final Answer                     | Summary generated         |

Implementation

ReAct Agent Implementation

from typing import Dict, List, Any, Optional
import re
import json

class ReActAgent:
    """ReAct (Reasoning + Acting) Agent Implementation"""
    
    def __init__(self, llm, tools: Dict[str, Any], max_steps: int = 10):
        """
        Initialize ReAct agent
        
        Parameters:
        llm: Language model for reasoning
        tools: Dictionary of available tools
        max_steps: Maximum ReAct loop iterations
        """
        self.llm = llm
        self.tools = tools
        self.max_steps = max_steps
        self.history = []  # Store thought-action-observation history
    
    def format_tools(self) -> str:
        """Format tools for prompt"""
        tool_descriptions = []
        for name, tool in self.tools.items():
            desc = f"- {name}: {tool['description']}"
            if 'parameters' in tool:
                params = ', '.join(tool['parameters'].keys())
                desc += f" (params: {params})"
            tool_descriptions.append(desc)
        return '\n'.join(tool_descriptions)
    
    def extract_action(self, text: str) -> Optional[Dict]:
        """
        Extract action from LLM output
        
        Looks for patterns like:
        - Action: tool_name(params)
        - Action: Final Answer
        """
        # Pattern: Action: tool_name(param="value"), allowing an optional step
        # number as used in the prompt format, e.g. "Action 2: get_weather(...)"
        action_pattern = r'Action(?:\s+\d+)?:\s*(\w+)\((.*?)\)'
        match = re.search(action_pattern, text)
        
        if match:
            tool_name = match.group(1)
            params_str = match.group(2)
            
            # Parse parameters (simple parsing, in production use proper parser)
            params = {}
            if params_str:
                # Handle key="value" pairs
                param_pattern = r'(\w+)="([^"]+)"'
                for param_match in re.finditer(param_pattern, params_str):
                    key = param_match.group(1)
                    value = param_match.group(2)
                    params[key] = value
            
            return {'type': 'tool', 'tool': tool_name, 'parameters': params}
        
        # Check for Final Answer (case-insensitive)
        if 'final answer' in text.lower():
            return {'type': 'final_answer'}
        
        return None
    
    def execute_action(self, action: Dict) -> Any:
        """Execute an action (tool call)"""
        if action['type'] == 'final_answer':
            return None
        
        tool_name = action['tool']
        if tool_name not in self.tools:
            return f"Error: Tool {tool_name} not found"
        
        tool = self.tools[tool_name]
        func = tool['function']
        params = action['parameters']
        
        try:
            result = func(**params)
            return result
        except Exception as e:
            return f"Error: {str(e)}"
    
    def react_step(self, question: str, step: int) -> Dict:
        """
        Execute one ReAct step
        
        Returns:
        Dict with thought, action, observation, and whether to continue
        """
        # Build prompt with history
        history_text = ""
        for i, entry in enumerate(self.history):
            history_text += f"\nThought {i+1}: {entry['thought']}\n"
            history_text += f"Action {i+1}: {entry['action']}\n"
            history_text += f"Observation {i+1}: {entry['observation']}\n"
        
        tools_text = self.format_tools()
        
        prompt = f"""Question: {question}

Available tools:
{tools_text}

{history_text}

Thought {step+1}:"""
        
        # Get LLM response: the model continues after "Thought N:", writing its
        # reasoning followed by an "Action N: ..." line
        llm_response = self.llm.generate(prompt)
        
        # Extract thought: everything the model wrote before the Action line
        thought_match = re.search(r'^\s*(.+?)(?=Action(?:\s+\d+)?:|$)', llm_response, re.DOTALL)
        thought = thought_match.group(1).strip() if thought_match else "Thinking..."
        
        action = self.extract_action(llm_response)
        if not action:
            # Try to extract from full response
            action = {'type': 'final_answer'}
        
        # Execute action
        if action['type'] == 'final_answer':
            observation = "Ready to provide final answer"
            should_continue = False
        else:
            observation = self.execute_action(action)
            should_continue = True
        
        return {
            'thought': thought,
            'action': action,
            'observation': observation,
            'continue': should_continue
        }
    
    def run(self, question: str) -> str:
        """
        Run ReAct agent
        
        Parameters:
        question: User's question
        
        Returns:
        Final answer
        """
        self.history = []
        
        for step in range(self.max_steps):
            # Execute ReAct step
            step_result = self.react_step(question, step)
            
            # Store in history
            self.history.append({
                'thought': step_result['thought'],
                'action': step_result['action'],
                'observation': step_result['observation']
            })
            
            # Check if done
            if not step_result['continue']:
                # Generate final answer
                history_text = ""
                for i, entry in enumerate(self.history):
                    history_text += f"\nThought {i+1}: {entry['thought']}\n"
                    history_text += f"Action {i+1}: {entry['action']}\n"
                    history_text += f"Observation {i+1}: {entry['observation']}\n"
                
                prompt = f"""Question: {question}

{history_text}

Based on the above reasoning and observations, provide the final answer.
Final Answer:"""
                
                final_answer = self.llm.generate(prompt)
                return final_answer.strip()
        
        return "Agent reached maximum steps without completing the task."

# Example usage
def search_web(query: str) -> str:
    """Search the web"""
    # In real implementation, call search API
    return f"Search results for '{query}': Found 10 relevant articles"

def get_weather(city: str) -> str:
    """Get weather"""
    return f"Weather in {city}: 72°F, sunny"

tools = {
    'search_web': {
        'description': 'Search the web for information',
        'function': search_web,
        'parameters': {'query': 'string'}
    },
    'get_weather': {
        'description': 'Get current weather for a city',
        'function': get_weather,
        'parameters': {'city': 'string'}
    }
}

# agent = ReActAgent(llm=my_llm, tools=tools)
# answer = agent.run("What's the weather in New York?")
# print(answer)

ReAct Prompt Template

REACT_PROMPT_TEMPLATE = """Question: {question}

You can use the following tools:
{tools}

Use the following format:

Thought: [your reasoning about what to do]
Action: [tool_name(param1="value1", param2="value2")]
Observation: [tool result]

Thought: [your reasoning based on observation]
Action: [next tool or Final Answer]
Observation: [tool result or ready for final answer]

... (repeat Thought/Action/Observation as needed)

When you have enough information to answer, use:
Action: Final Answer
Observation: [your final answer]

Final Answer: [your complete answer to the question]"""

from typing import Any, Dict

def format_tools(tools: Dict[str, Any]) -> str:
    """Format tool descriptions for the prompt"""
    return '\n'.join(f"- {name}: {tool['description']}"
                     for name, tool in tools.items())

def create_react_prompt(question: str, tools: Dict[str, Any]) -> str:
    """Create ReAct prompt"""
    return REACT_PROMPT_TEMPLATE.format(
        question=question,
        tools=format_tools(tools)
    )

Real-World Applications

🌍 ReAct in Production Systems

ReAct is used in many production agent systems:

1. LangChain ReAct Agents

  • LangChain implements ReAct framework
  • Used in production for customer support, research, automation
  • Benefits: Transparent reasoning, better error handling

2. Research and Analysis Agents

  • ReAct helps agents reason through complex research tasks
  • Step-by-step reasoning improves accuracy
  • Example: Financial analysis agents that research companies

3. Code Generation Agents

  • ReAct helps agents plan code generation
  • Reason about requirements before writing code
  • Better code quality through explicit reasoning

ReAct Performance Benefits

ReAct vs Act-Only Performance

| Metric         | Act-Only     | ReAct        |
|----------------|--------------|--------------|
| Task Accuracy  | ~60%         | ~85%         |
| Error Recovery | Poor         | Good         |
| Transparency   | Low          | High         |
| Tool Selection | ~70% correct | ~90% correct |

Test Your Understanding

Question 1: What are the three components of the ReAct loop?

A) While reasoning and action are important, ReAct's key innovation is the explicit thought-action-observation loop that enables transparent reasoning and adaptive behavior
B) Thought, Action, Observation
C) Action and observation only
D) ReAct (Reasoning + Acting) is a framework where agents explicitly write reasoning thoughts before taking actions, observe the results of those actions, and use observations to inform subsequent reasoning steps, creating a transparent and adaptive decision-making loop that enables better error handling and strategic thinking

Question 2: What is the key advantage of ReAct over act-only agents?

A) While reasoning and action are important, ReAct's key innovation is the explicit thought-action-observation loop that enables transparent reasoning and adaptive behavior
B) ReAct (Reasoning + Acting) is a framework where agents explicitly write reasoning thoughts before taking actions, observe the results of those actions, and use observations to inform subsequent reasoning steps, creating a transparent and adaptive decision-making loop that enables better error handling and strategic thinking
C) Explicit reasoning leads to better decisions and transparency
D) Only reasoning

Question 3: How does ReAct handle errors?

A) ReAct (Reasoning + Acting) is a framework where agents explicitly write reasoning thoughts before taking actions, observe the results of those actions, and use observations to inform subsequent reasoning steps, creating a transparent and adaptive decision-making loop that enables better error handling and strategic thinking
B) Only action
C) ReAct combines reasoning (explicit thoughts) with action execution, but also includes observation of results to inform the next reasoning step
D) Agent reasons about the error in next thought and tries alternative approach

Question 4: Interview question: "Explain the ReAct framework and why explicit reasoning helps."

A) ReAct = Reasoning + Acting. Agent explicitly writes thoughts before actions, allowing it to reason about observations, handle errors, and make transparent decisions
B) ReAct (Reasoning + Acting) is a framework where agents explicitly write reasoning thoughts before taking actions, observe the results of those actions, and use observations to inform subsequent reasoning steps, creating a transparent and adaptive decision-making loop that enables better error handling and strategic thinking
C) Only reasoning
D) ReAct combines reasoning (explicit thoughts) with action execution, but also includes observation of results to inform the next reasoning step

Question 5: In the ReAct state transition formula, what does the arrow (→) represent?

A) The transition from current state to next state, where observation informs the next thought
B) While reasoning and action are important, ReAct's key innovation is the explicit thought-action-observation loop that enables transparent reasoning and adaptive behavior
C) Action and observation only
D) ReAct (Reasoning + Acting) is a framework where agents explicitly write reasoning thoughts before taking actions, observe the results of those actions, and use observations to inform subsequent reasoning steps, creating a transparent and adaptive decision-making loop that enables better error handling and strategic thinking

Question 6: What is the termination condition for a ReAct loop?

A) When agent outputs "Final Answer" or max_steps limit is reached
B) Only reasoning
C) ReAct combines reasoning (explicit thoughts) with action execution, but also includes observation of results to inform the next reasoning step
D) ReAct (Reasoning + Acting) is a framework where agents explicitly write reasoning thoughts before taking actions, observe the results of those actions, and use observations to inform subsequent reasoning steps, creating a transparent and adaptive decision-making loop that enables better error handling and strategic thinking

Question 7: Interview question: "How would you implement a ReAct agent that can handle multi-step reasoning?"

A) Implement Thought-Action-Observation loop with state tracking, maintain history of all steps, use LLM to generate thoughts and actions, execute tools and observe results, continue until Final Answer or max_steps
B) Only action
C) While reasoning and action are important, ReAct's key innovation is the explicit thought-action-observation loop that enables transparent reasoning and adaptive behavior
D) ReAct (Reasoning + Acting) is a framework where agents explicitly write reasoning thoughts before taking actions, observe the results of those actions, and use observations to inform subsequent reasoning steps, creating a transparent and adaptive decision-making loop that enables better error handling and strategic thinking

Question 8: What makes ReAct different from traditional prompt engineering?

A) ReAct explicitly structures reasoning and actions in a loop, allowing iterative refinement and error recovery, while traditional prompts are single-shot
B) ReAct (Reasoning + Acting) is a framework where agents explicitly write reasoning thoughts before taking actions, observe the results of those actions, and use observations to inform subsequent reasoning steps, creating a transparent and adaptive decision-making loop that enables better error handling and strategic thinking
C) While reasoning and action are important, ReAct's key innovation is the explicit thought-action-observation loop that enables transparent reasoning and adaptive behavior
D) Action and observation only

Question 9: In the formula action_t = LLM(question, thought_t, history_{<t}, available_tools), what does history_{<t} represent?

A) Although agents may use different model architectures, the number of parameters doesn't define what makes an agent different
B) Agents have more parameters
C) While processing speed and model size can vary between implementations, the fundamental distinction between agents and traditional LLMs is their ability to autonomously use tools, access real-time data, make decisions, and take actions that affect the environment, not just generate text responses
D) All previous thoughts, actions, and observations from steps before time t

Question 10: Interview question: "How do you prevent a ReAct agent from getting stuck in infinite loops?"

A) ReAct combines reasoning (explicit thoughts) with action execution, but also includes observation of results to inform the next reasoning step
B) Only action
C) ReAct (Reasoning + Acting) is a framework where agents explicitly write reasoning thoughts before taking actions, observe the results of those actions, and use observations to inform subsequent reasoning steps, creating a transparent and adaptive decision-making loop that enables better error handling and strategic thinking
D) Set max_steps limit, detect repeated states/actions, implement timeout, track goal progress, add loop detection in thought generation

Question 11: What is the advantage of making reasoning explicit in ReAct?

A) ReAct (Reasoning + Acting) is a framework where agents explicitly write reasoning thoughts before taking actions, observe the results of those actions, and use observations to inform subsequent reasoning steps, creating a transparent and adaptive decision-making loop that enables better error handling and strategic thinking
B) While reasoning and action are important, ReAct's key innovation is the explicit thought-action-observation loop that enables transparent reasoning and adaptive behavior
C) Transparency, debugging, error recovery, and better decision-making through iterative refinement
D) Only reasoning

Question 12: Interview question: "How would you optimize token usage in a ReAct agent?"

A) While reasoning and action are important, ReAct's key innovation is the explicit thought-action-observation loop that enables transparent reasoning and adaptive behavior
B) ReAct (Reasoning + Acting) is a framework where agents explicitly write reasoning thoughts before taking actions, observe the results of those actions, and use observations to inform subsequent reasoning steps, creating a transparent and adaptive decision-making loop that enables better error handling and strategic thinking
C) Action and observation only
D) Summarize history instead of full context, use concise thought prompts, limit observation length, implement context window management, cache common reasoning patterns