# Practice Lab: Master Prompt Engineering

::: {.callout-important icon="false"}
## 🧪 Lab Overview
**Duration:** 90–120 minutes | **Difficulty:** ⭐⭐☆☆☆
**Goal:** Build a prompt comparison tool and benchmark different techniques on real tasks.
:::

## What You'll Build

A Prompt Benchmarking Toolkit that tests multiple prompt strategies against the same task and scores them.


## Step 1: Prompt Comparison Engine

```python
# file: prompt_benchmark.py
from openai import OpenAI
from dataclasses import dataclass
import json

client = OpenAI()

@dataclass
class PromptResult:
    technique: str
    prompt: str
    response: str
    tokens_used: int

def test_prompt(technique: str, messages: list) -> PromptResult:
    """Run a single prompt and capture results."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        temperature=0.3
    )
    return PromptResult(
        technique=technique,
        prompt=str(messages),
        response=response.choices[0].message.content,
        tokens_used=response.usage.total_tokens
    )

# Define the task
TASK = "A customer says: 'Your app crashed and I lost all my data. Fix it NOW!'"

# Test all techniques
prompts_to_test = [
    ("Zero-Shot", [
        {"role": "user", "content": f"Respond to this customer complaint: {TASK}"}
    ]),
    ("Role + Context", [
        {"role": "system", "content": "You are a senior customer success manager at a tech startup. You are empathetic, solution-focused, and professional."},
        {"role": "user", "content": f"Draft a response to: {TASK}"}
    ]),
    ("Chain-of-Thought", [
        {"role": "user", "content": f"""
Draft a customer service response to this complaint.

Think through this step by step:
1. What is the customer's emotional state?
2. What is the core problem?
3. What immediate action can we take?
4. What is the ideal tone?

Then write the response.

Complaint: {TASK}
"""}
    ]),
    ("Few-Shot", [
        {"role": "user", "content": f"""
Here are examples of excellent customer service responses:

Complaint: "The delivery was late!"
Response: "I sincerely apologise for the delayed delivery. I've flagged this as urgent and your order will be prioritised. You'll receive a 10% discount on your next order as our apology."

Complaint: "I was charged twice for my subscription."
Response: "I'm so sorry for this billing error — that should never happen. I've immediately reversed the duplicate charge and you'll see the refund within 3-5 business days."

Now respond to this complaint using the same empathetic, action-oriented style:
{TASK}
"""}
    ])
]

results = []
for technique, messages in prompts_to_test:
    result = test_prompt(technique, messages)
    results.append(result)
    print(f"\n{'='*60}")
    print(f"Technique: {result.technique} ({result.tokens_used} tokens)")
    print(f"Response:\n{result.response}")

# Save results for analysis
with open("benchmark_results.json", "w") as f:
    json.dump([vars(r) for r in results], f, indent=2)
```
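With the results on disk, it is worth comparing cost alongside quality, since few-shot prompts typically consume far more tokens than zero-shot ones. A minimal sketch (the `rank_by_tokens` helper and the token counts below are illustrative placeholders, not real benchmark output — in practice you would load the list saved to `benchmark_results.json`):

```python
def rank_by_tokens(results: list[dict]) -> list[tuple[str, int]]:
    """Return (technique, tokens_used) pairs, cheapest first."""
    return sorted(
        ((r["technique"], r["tokens_used"]) for r in results),
        key=lambda pair: pair[1],
    )

# Hypothetical sample data standing in for benchmark_results.json
sample = [
    {"technique": "Zero-Shot", "tokens_used": 180},
    {"technique": "Few-Shot", "tokens_used": 410},
    {"technique": "Chain-of-Thought", "tokens_used": 320},
]
for technique, tokens in rank_by_tokens(sample):
    print(f"{technique}: {tokens} tokens")
```

A technique that scores only marginally higher but costs twice the tokens may not be worth it at scale.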

## Step 2: Automated Prompt Evaluator

Use an LLM to score each response against your criteria:

```python
def evaluate_response(response: str, criteria: list[str]) -> dict:
    """Use GPT-4o to evaluate a response against criteria."""
    eval_prompt = f"""
Evaluate this customer service response against each criterion.
Score each from 1-10 and give a one-line reason.

Response to evaluate:
"{response}"

Criteria:
{json.dumps(criteria, indent=2)}

Return JSON only:
{{
  "criteria_name": {{"score": X, "reason": "..."}},
  ...
  "overall_score": X
}}
"""
    eval_response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": eval_prompt}],
        response_format={"type": "json_object"},
        temperature=0
    )
    return json.loads(eval_response.choices[0].message.content)

# Evaluate all results
criteria = ["Empathy", "Actionability", "Professionalism", "Clarity"]

print("\n📊 EVALUATION SCORES")
print("=" * 60)
for result in results:
    scores = evaluate_response(result.response, criteria)
    print(f"\n{result.technique}:")
    for criterion in criteria:
        s = scores.get(criterion, {})
        print(f"  {criterion}: {s.get('score', 'N/A')}/10 — {s.get('reason', '')}")
    print(f"  Overall: {scores.get('overall_score', 'N/A')}/10")
```
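The per-technique scores can then be reduced to a single winner. A sketch, assuming each technique's evaluator output has been collected into one dict keyed by technique name (the scores shown are made up for illustration, not real evaluator output):

```python
def best_technique(all_scores: dict[str, dict]) -> str:
    """Return the technique with the highest overall_score."""
    return max(all_scores, key=lambda t: all_scores[t].get("overall_score", 0))

# Hypothetical aggregated evaluator output
all_scores = {
    "Zero-Shot": {"overall_score": 6},
    "Role + Context": {"overall_score": 8},
    "Chain-of-Thought": {"overall_score": 9},
    "Few-Shot": {"overall_score": 8},
}
print(f"Winner: {best_technique(all_scores)}")
```

Because LLM-as-judge scores are noisy, rerunning the evaluation a few times and averaging before declaring a winner is a sensible precaution.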

## Step 3: Build a Prompt Template Library

````python
# file: prompt_library.py
from string import Template
from dataclasses import dataclass

@dataclass
class PromptTemplate:
    name: str
    description: str
    template: str
    required_vars: list[str]

    def fill(self, **kwargs) -> str:
        t = Template(self.template)
        return t.safe_substitute(**kwargs)

PROMPT_LIBRARY = {
    "executive_summary": PromptTemplate(
        name="Executive Summary",
        description="Create a concise executive summary",
        template="""You are a senior business analyst.
Create a 3-bullet executive summary of the following content.
Each bullet should be one clear sentence.
Audience: ${audience}

Content: ${content}

Executive Summary:""",
        required_vars=["audience", "content"]
    ),

    "swot_analysis": PromptTemplate(
        name="SWOT Analysis",
        description="Generate a SWOT analysis",
        template="""You are a strategic business consultant.
Analyse ${company} in the ${industry} industry.
Provide a concise SWOT analysis with 3 points per category.

Format as:
STRENGTHS:
- ...
WEAKNESSES:
- ...
OPPORTUNITIES:
- ...
THREATS:
- ...""",
        required_vars=["company", "industry"]
    ),

    "code_review": PromptTemplate(
        name="Code Review",
        description="Review code for issues and improvements",
        template="""You are a senior ${language} developer.
Review the following code for:
1. Bugs and errors
2. Security issues
3. Performance improvements
4. Code style and best practices

Code:
```${language}
${code}
```

Provide specific, actionable feedback:""",
        required_vars=["language", "code"]
    )
}

# Usage example
summary_prompt = PROMPT_LIBRARY["executive_summary"].fill(
    audience="C-suite executives",
    content="Detailed report text here..."
)
print(summary_prompt)
````
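One caveat with `Template.safe_substitute`: if you forget to pass a variable, it quietly leaves the `${...}` placeholder in the prompt instead of raising an error. A small guard, sketched below, catches this before the prompt is sent (`validate_vars` is a hypothetical helper, not part of the library above):

```python
def validate_vars(required_vars: list[str], supplied: dict) -> list[str]:
    """Return the names of required variables that were not supplied."""
    # safe_substitute() never raises on missing keys, so check up front
    return [v for v in required_vars if v not in supplied]

missing = validate_vars(["audience", "content"], {"audience": "C-suite executives"})
print(missing)  # "content" was never supplied
```

Calling this in `fill()` and raising a `ValueError` when the list is non-empty would turn silent template bugs into immediate failures.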


---

## Lab Challenges 🏆

1. **Easy**: Add 3 more prompt templates to the library for your specific use case
2. **Medium**: Build a Streamlit or Gradio UI for the prompt comparison tool
3. **Hard**: Implement automatic prompt optimisation — use the evaluator scores to iteratively improve a prompt

---

::: {.callout-note icon="false"}
## ✅ Lab Complete!
You've mastered the fundamentals of prompt engineering. Next up: **Vector Databases** in Chapter 11.
:::




© 2026 Prof. Bongo Adi | Lagos Business School


