Chapter 7: Prompt Engineering

Guiding Model Behavior

Learning Objectives

  • Understand prompt engineering fundamentals
  • Master the mathematical foundations
  • Learn practical implementation
  • Apply knowledge through examples
  • Recognize real-world applications

Prompt Engineering

Introduction

Guiding Model Behavior

This chapter provides comprehensive coverage of prompt engineering, including detailed explanations, mathematical formulations, code implementations, and real-world examples.

📚 Why This Matters

Understanding prompt engineering is crucial for mastering modern AI systems. This chapter breaks down complex concepts into digestible explanations with step-by-step examples.

Key Concepts

Prompt Engineering Fundamentals

What is prompt engineering: Crafting input text to guide LLM behavior and improve performance without modifying model weights.

Key principles:

  • Clarity: Be specific and unambiguous
  • Context: Provide relevant background
  • Examples: Show desired format (few-shot)
  • Structure: Use clear formatting and organization

Prompting Strategies

Zero-shot: Just describe the task. Model uses its pre-trained knowledge.

Few-shot: Provide examples in the prompt. Model learns the pattern from examples.

Chain-of-thought: Ask model to show reasoning steps. Improves complex reasoning tasks.

Role-playing: "Act as a..." helps model adopt specific perspective or expertise.

Prompt Components

Effective prompts include:

  • Task description: What you want the model to do
  • Context: Relevant background information
  • Examples: Demonstrations of desired behavior
  • Constraints: Limitations or requirements
  • Output format: How you want the response structured

Mathematical Formulations

Prompt-based Prediction

\[P(y | \text{prompt}, x) = P(y | [\text{task\_desc}, \text{examples}, x])\]
Where:
  • \(y\): Desired output
  • \(\text{prompt}\): Crafted input including task description and examples
  • \(x\): Actual input to process
  • Model conditions on entire prompt context

Few-shot Learning

\[P(y | x, \{(x_i, y_i)\}_{i=1}^{k}) = P(y | [x_1, y_1, \ldots, x_k, y_k, x])\]
Where:
  • \(\{(x_i, y_i)\}\): k examples in prompt
  • Model learns pattern from examples
  • Applies pattern to new input x
  • k typically 1-5 examples

Chain-of-Thought Prompting

\[P(\text{answer} | \text{question}) = P(\text{reasoning} | \text{question}) \times P(\text{answer} | \text{reasoning})\]

By explicitly modeling reasoning steps, the model breaks down complex problems into simpler sub-problems, improving accuracy on reasoning tasks.

Detailed Examples

Example: Zero-shot vs Few-shot

Zero-shot prompt:

Classify the sentiment of this review: "This movie was amazing!"
Sentiment:

Few-shot prompt:

Review: "Great product!"
Sentiment: Positive

Review: "Terrible quality."
Sentiment: Negative

Review: "This movie was amazing!"
Sentiment:

Result: Few-shot typically performs better as model learns the pattern from examples.

Example: Chain-of-Thought

Without CoT:

Q: A store has 15 apples. They sell 6. How many left?
A:

With CoT:

Q: A store has 15 apples. They sell 6. How many left?
A: Let me think step by step.
The store starts with 15 apples.
They sell 6 apples.
So remaining = 15 - 6 = 9 apples.
Therefore, 9 apples are left.

Result: CoT improves accuracy on math and reasoning problems.

Implementation

Few-shot Prompting Function

def few_shot_classify(text, examples, model, tokenizer):
    """
    Classify text using few-shot prompting
    """
    # Build prompt with examples
    prompt = ""
    for example_text, example_label in examples:
        prompt += f"Text: {example_text}\nLabel: {example_label}\n\n"
    
    # Add input text
    prompt += f"Text: {text}\nLabel:"
    
    # Generate
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        inputs.input_ids,
        max_length=inputs.input_ids.shape[1] + 10,
        temperature=0.3,
        do_sample=True
    )
    
    result = tokenizer.decode(outputs[0], skip_special_tokens=True)
    label = result.split("Label:")[-1].strip()
    return label

# Example usage
examples = [
    ("Great product!", "Positive"),
    ("Terrible quality.", "Negative")
]
text = "This movie was amazing!"
label = few_shot_classify(text, examples, model, tokenizer)
print(label)  # "Positive"

Chain-of-Thought Prompting

def chain_of_thought_reasoning(question, model, tokenizer):
    """
    Use chain-of-thought prompting for reasoning
    """
    prompt = f"""Q: {question}
A: Let me think step by step.
"""
    
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        inputs.input_ids,
        max_length=inputs.input_ids.shape[1] + 200,
        temperature=0.7,
        do_sample=True
    )
    
    result = tokenizer.decode(outputs[0], skip_special_tokens=True)
    reasoning = result.split("A:")[-1].strip()
    return reasoning

# Example
question = "A store has 15 apples. They sell 6. How many left?"
reasoning = chain_of_thought_reasoning(question, model, tokenizer)
print(reasoning)

Real-World Applications

Prompt Engineering in Practice

Chatbot development:

  • Craft system prompts to define chatbot personality
  • Use few-shot examples to show desired conversation style
  • Iterate on prompts to improve responses

Content generation:

  • Use role-playing prompts: "Act as a marketing expert..."
  • Provide examples of desired writing style
  • Specify output format and constraints

Task automation:

  • Format conversion tasks
  • Data extraction from unstructured text
  • Code generation with specific requirements

Prompt Engineering Best Practices

Do:

  • Be specific and clear
  • Provide context and examples
  • Specify output format
  • Test and iterate
  • Use chain-of-thought for complex reasoning

Don't:

  • Be vague or ambiguous
  • Assume model knows context
  • Use overly complex prompts
  • Ignore prompt injection risks

Test Your Understanding

Question 1: What is prompt engineering?

A) The practice of crafting input text to guide LLM behavior and improve performance without modifying model weights, using techniques like clarity, context, examples, and structure
B) A method to reduce model size
C) A training technique that updates model weights
D) A way to speed up inference

Question 2: What is the difference between zero-shot, few-shot, and chain-of-thought prompting?

A) Zero-shot: just describe the task. Few-shot: provide examples in the prompt. Chain-of-thought: ask the model to show reasoning steps, which improves complex reasoning tasks
B) They are all identical
C) Zero-shot uses examples while few-shot doesn't
D) Chain-of-thought doesn't use reasoning

Question 3: What is the mathematical formulation for prompt-based prediction?

A) \(P(y | \text{prompt}, x) = P(y | [\text{task\_desc}, \text{examples}, x])\) where the model conditions on the entire prompt context
B) \(P(y | x) = \text{constant}\)
C) \(P(y | \text{prompt}) = P(y)\)
D) \(P(y) = \text{random()}\)

Question 4: What is few-shot learning in the context of prompting?

A) \(P(y | x, \{(x_i, y_i)\}_{i=1}^{k}) = P(y | [x_1, y_1, \ldots, x_k, y_k, x])\) where k examples are provided in the prompt, and the model learns the pattern from these examples
B) Using zero examples
C) Using thousands of examples
D) Training the model weights

Question 5: What is chain-of-thought prompting and why is it effective?

A) Asking the model to explicitly show reasoning steps, which breaks down complex problems into simpler sub-problems. Formally: \(P(\text{answer} | \text{question}) = P(\text{reasoning} | \text{question}) \times P(\text{answer} | \text{reasoning})\)
B) A method to reduce prompt length
C) A technique that doesn't improve reasoning
D) Only for simple tasks

Question 6: What are the key principles of effective prompt engineering?

A) Clarity (be specific and unambiguous), context (provide relevant background), examples (show desired format), and structure (use clear formatting and organization)
B) Use vague and ambiguous prompts
C) Never provide examples
D) Use complex, unstructured prompts

Question 7: What is role-playing in prompt engineering?

A) Using prompts like "Act as a..." to help the model adopt a specific perspective or expertise, which guides its behavior and responses
B) A method to reduce model parameters
C) A training technique
D) A way to speed up inference

Question 8: What components should effective prompts include?

A) Task description (what you want the model to do), context (relevant background), examples (demonstrations), constraints (limitations), and output format (how you want the response structured)
B) Only the task description
C) Only random text
D) No specific components needed

Question 9: What are some best practices for prompt engineering?

A) Be specific and clear, provide context and examples, specify output format, test and iterate, and use chain-of-thought for complex reasoning. Avoid being vague, assuming context, using overly complex prompts, and ignoring prompt injection risks
B) Always use vague prompts
C) Never test prompts
D) Use the same prompt for all tasks

Question 10: What are common applications of prompt engineering?

A) Chatbot development (define personality and conversation style), content generation (role-playing prompts, style examples), and task automation (format conversion, data extraction, code generation)
B) Only image processing
C) Only speech recognition
D) Only model training

Question 11: How does few-shot prompting typically compare to zero-shot prompting?

A) Few-shot prompting typically performs better because the model learns the pattern from the provided examples, making it more likely to produce the desired output format and style
B) Zero-shot always performs better
C) They always perform identically
D) Few-shot is never useful

Question 12: What is in-context learning?

A) The ability of LLMs to learn from examples provided in the prompt context without updating model weights, enabling the model to adapt to new tasks through prompting alone
B) A training method that updates weights
C) A way to reduce model size
D) A technique for fine-tuning