
Understanding Agent Playbooks

Learn how the Reflexio playbook system captures and analyzes user responses to drive continuous agent improvement.


The Reflexio playbook system operates at the agent level, capturing and analyzing how users respond to agent interactions to drive continuous improvement of agent performance across all users.

What Is an Agent Playbook?

Agent playbooks represent user responses that indicate satisfaction, dissatisfaction, or suggestions for improvement regarding agent behavior. Unlike user profiles (which are user-specific memory), playbooks are agent-level data that helps improve agent performance for all users.

Think of the playbook system as your agent's performance review system - it automatically identifies when users provide feedback about agent behavior and aggregates these insights to guide agent improvement.

Playbooks vs. Profiles: Key Differences

| Aspect | User Profiles | Agent Playbooks |
|---|---|---|
| Scope | User-specific memory | Agent-wide performance |
| Purpose | Personalization | Improvement |
| Data Level | Individual user characteristics | Collective user satisfaction |
| Usage | Customize responses per user | Enhance agent for all users |
| Lifetime | Evolves with user | Tied to agent versions |

The Playbook Extraction Process

1. Configurable Playbook Detection

Reflexio uses AI prompts that you configure to automatically identify playbook entries in user interactions:

PlaybookConfig(
    playbook_name="customer_service_playbook",
    playbook_definition_prompt="""
    Extract playbook entries about agent performance, including:
    - Complaints about agent responses or behavior
    - Suggestions for how the agent could improve
    - Praise for helpful or effective agent actions
    - Requests for different communication styles or approaches
    - Comments about agent understanding or misunderstanding
    - Reactions to agent recommendations or solutions

    Focus on actionable insights that could improve future agent interactions.
    """,
    playbook_aggregator_config=PlaybookAggregatorConfig(
        min_cluster_size=3  # Need 3+ similar user playbooks before aggregating
    )
)

2. Intelligent Playbook Recognition

The AI doesn't just look for explicit complaints - it understands subtle cues:

User says: "I prefer more concise answers from you"
Extracted user playbook: "Provide more concise answers and avoid lengthy explanations"

User says: "That's not quite what I was looking for..."
Extracted user playbook: "Agent response didn't match user intent, improve query understanding"

User says: "Perfect! That's exactly what I needed."
Extracted user playbook: "Response was accurate and helpful, maintain this approach"

Types of Agent Playbook Entries

Performance Playbook Entries

Playbook entries about how well the agent performs its tasks:

performance_playbook_examples = [
    "Agent provided accurate technical information",
    "Response was helpful but took too long to generate",
    "Agent missed the main point of the user's question",
    "Solution provided was exactly what was needed",
    "Agent should ask clarifying questions before responding"
]

Communication Style Playbook Entries

Playbook entries about how the agent communicates:

communication_playbook_examples = [
    "Agent responses are too formal for casual conversations",
    "Appreciate the friendly and approachable tone",
    "Explanations are too technical for non-expert users",
    "Agent should be more empathetic in support situations",
    "Love the concise and direct communication style"
]

Process and Workflow Playbook Entries

Playbook entries about the agent's approach to problem-solving:

workflow_playbook_examples = [
    "Agent should gather more context before providing solutions",
    "Step-by-step breakdown was very helpful",
    "Agent jumped to conclusions without understanding the problem",
    "Appreciate the follow-up questions to ensure understanding",
    "Should offer multiple options instead of just one solution"
]

Feature and Capability Playbook Entries

Playbook entries about what the agent can or should be able to do:

capability_playbook_examples = [
    "Wish the agent could handle image analysis",
    "Agent needs better integration with external tools",
    "Great that the agent remembers previous conversations",
    "Should be able to schedule appointments automatically",
    "Agent could benefit from real-time data access"
]

Learning from Expert Responses

In addition to extracting playbook entries from user behavior, Reflexio can learn directly from human expert responses. When you provide an expert's ideal response via the expert_content field, Reflexio compares it against the agent's actual response and generates targeted playbook entries.

How Expert-Derived Playbooks Work

  1. Publish with expert content: Include expert_content on agent turns where an expert has provided an ideal response
  2. Automatic detection: Reflexio detects expert content and switches to a specialized extraction pipeline
  3. Comparison analysis: The system pairs each expert response with the agent's response and the preceding user question
  4. SOP generation: Playbook entries are extracted as structured SOPs (Situation/Trigger, Instruction, and Pitfall) describing what the agent should do differently
  5. Standard aggregation: Expert-derived user playbooks enter the same aggregation pipeline as user-derived playbooks
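The SOP structure described in step 4 can be sketched as a simple dataclass. This is an illustrative shape only, not the actual Reflexio schema; the field names beyond Situation/Trigger, Instruction, and Pitfall are assumptions:

```python
from dataclasses import dataclass

@dataclass
class ExpertSOP:
    """Illustrative shape of an expert-derived playbook entry (not the real Reflexio schema)."""
    situation: str    # Situation/Trigger: when this rule applies
    instruction: str  # What the agent should do differently
    pitfall: str      # The shortfall observed in the agent's actual response

# Example derived from the FSA conversation below
sop = ExpertSOP(
    situation="User asks whether an expense is FSA-eligible",
    instruction="Confirm eligibility, then add documentation and plan-year expiry caveats",
    pitfall="Agent answered yes/no without documentation or expiry guidance",
)
```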

Example: Expert Review Workflow

# A domain expert reviews agent conversations and provides ideal responses
client.publish_interaction(
    user_id="customer_456",
    interactions=[
        InteractionData(
            role="User",
            content="Can I use my FSA card to pay for contact lenses?"
        ),
        InteractionData(
            role="Agent",
            content="Yes, you can use your FSA card for contact lenses.",
            expert_content=(
                "Yes, contact lenses are an FSA-eligible expense. You can use your FSA card "
                "directly at checkout. Keep your receipt and prescription as documentation. "
                "Note that FSA funds expire at the end of the plan year unless your employer "
                "offers a grace period or rollover option."
            ),
        ),
    ],
    source="expert_review",
    agent_version="v2.1.0",
    session_id="expert_batch_001"
)

When to use expert-derived playbooks:

  • When you have domain experts who can provide gold-standard responses
  • During agent onboarding to bootstrap playbooks before enough user interactions accumulate
  • For compliance-critical domains where accuracy must meet expert standards
  • As part of a quality assurance workflow to systematically improve agent responses

Playbook Lifecycle: From User Playbook to Agent Playbook

1. User Playbook Generation

When users provide signals about agent performance, Reflexio automatically:

# Example interaction containing playbook signals
client.publish_interaction(
    user_id="frustrated_customer",
    interactions=[
        InteractionData(
            role="User",
            content="I already told you my budget is $500. Why do you keep recommending products over $1000?"
        )
    ],
    source="customer_support",
    agent_version="v2.1.3",
    session_id="support_session_042"
)

# Reflexio automatically extracts a user playbook:
# "Agent not respecting user's stated budget constraints"

2. User Playbook Storage

Each user playbook includes:

class UserPlaybook:
    user_playbook_id: int          # Unique identifier
    agent_version: str             # Which agent version this applies to
    request_id: str                # Source interaction
    playbook_name: str             # Playbook category name
    created_at: int                # Timestamp
    content: str                   # The main actionable content
    trigger: str                   # Condition/context when this rule applies
    rationale: str                 # Why this playbook entry was extracted
    blocking_issue: BlockingIssue  # Root cause when agent couldn't complete action
    embedding: List[float]         # Vector embedding for similarity

3. Agent Playbook Aggregation

Multiple user playbooks are aggregated when patterns emerge:

# Multiple users provide similar signals:
user_playbooks = [
    "Agent should respect budget constraints mentioned by users",
    "Agent ignores price limits provided by customers",
    "Agent recommendations exceed stated budget requirements",
    "Agent needs to better consider financial constraints"
]

# After threshold is met, Reflexio creates an agent playbook:
agent_playbook = "Agent consistently fails to respect user budget constraints when making recommendations. Implement budget filtering in recommendation logic."
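Conceptually, aggregation groups user playbooks by embedding similarity and only promotes clusters that reach min_cluster_size. The following is a minimal sketch of that idea using greedy cosine-similarity clustering; the actual Reflexio pipeline and its similarity threshold are not documented here, so treat the algorithm and cutoff as assumptions:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def cluster_playbooks(playbooks, min_cluster_size=3, threshold=0.8):
    """Greedy single-pass clustering: each playbook joins the first cluster
    whose seed embedding is similar enough, else starts a new cluster.
    Only clusters meeting min_cluster_size are kept for aggregation."""
    clusters = []  # list of (seed_embedding, [playbook, ...])
    for pb in playbooks:
        for seed, members in clusters:
            if cosine(pb["embedding"], seed) >= threshold:
                members.append(pb)
                break
        else:
            clusters.append((pb["embedding"], [pb]))
    return [members for _, members in clusters if len(members) >= min_cluster_size]
```

With min_cluster_size=3, three near-identical budget complaints form a promotable cluster while a lone unrelated entry is filtered out.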

Playbook Configuration Strategies

Domain-Specific Playbook Systems

E-commerce Agent Playbook

ecommerce_playbook = PlaybookConfig(
    playbook_name="ecommerce_agent_playbook",
    playbook_definition_prompt="""
    Extract playbook entries about sales agent performance:
    - Product recommendation accuracy and relevance
    - Respect for budget constraints and price sensitivity
    - Understanding of customer needs and preferences
    - Communication style appropriateness for sales context
    - Helpfulness in product comparison and selection
    - Follow-up and customer service quality
    """,
    playbook_aggregator_config=PlaybookAggregatorConfig(min_cluster_size=5)
)

Educational Platform Playbook

education_playbook = PlaybookConfig(
    playbook_name="tutoring_agent_playbook",
    playbook_definition_prompt="""
    Extract playbook entries about tutoring agent effectiveness:
    - Clarity and quality of explanations
    - Appropriate difficulty level for student
    - Patience and encouragement in teaching style
    - Ability to adapt to different learning styles
    - Effectiveness of examples and practice problems
    - Progress tracking and assessment quality
    """,
    playbook_aggregator_config=PlaybookAggregatorConfig(min_cluster_size=3)
)

Technical Support Playbook

support_playbook = PlaybookConfig(
    playbook_name="technical_support_playbook",
    playbook_definition_prompt="""
    Extract playbook entries about technical support agent performance:
    - Accuracy of technical solutions provided
    - Speed and efficiency of problem resolution
    - Clarity of technical explanations for user level
    - Appropriate escalation and resource usage
    - Empathy and patience with frustrated users
    - Follow-through and issue closure quality
    """,
    playbook_aggregator_config=PlaybookAggregatorConfig(min_cluster_size=4)
)

Multi-Agent Playbook Systems

For systems with multiple agent types:

# Configure playbooks for different agent roles
playbook_configs = [
    PlaybookConfig(
        playbook_name="sales_agent_playbook",
        playbook_definition_prompt="Extract sales-specific playbook entries..."
    ),
    PlaybookConfig(
        playbook_name="support_agent_playbook",
        playbook_definition_prompt="Extract support-specific playbook entries..."
    ),
    PlaybookConfig(
        playbook_name="onboarding_agent_playbook",
        playbook_definition_prompt="Extract onboarding-specific playbook entries..."
    )
]

# Apply all playbook configurations
config = client.get_config()
config.playbook_configs = playbook_configs
client.set_config(config)

Using Playbooks for Agent Improvement

Playbook Analysis and Reporting

def generate_playbook_report(agent_version=None):
    """Generate comprehensive playbook analysis for agent improvement."""

    # Get user playbooks
    user_playbooks = client.get_user_playbooks()

    # Filter by agent version if specified
    if agent_version:
        relevant_playbooks = [
            pb for pb in user_playbooks.user_playbooks
            if pb.agent_version == agent_version
        ]
    else:
        relevant_playbooks = user_playbooks.user_playbooks

    # Analyze playbook patterns
    playbook_categories = {}
    for playbook in relevant_playbooks:
        category = playbook.playbook_name
        if category not in playbook_categories:
            playbook_categories[category] = []
        playbook_categories[category].append(playbook.content)

    # Generate report
    report = {
        "total_playbooks": len(relevant_playbooks),
        "agent_version": agent_version or "all_versions",
        "categories": {}
    }

    for category, playbooks in playbook_categories.items():
        report["categories"][category] = {
            "count": len(playbooks),
            "samples": playbooks[:5],  # First 5 examples
            "common_themes": extract_common_themes(playbooks)
        }

    return report

def extract_common_themes(playbooks):
    """Extract common themes from a playbook list (simplified keyword count)."""
    all_words = " ".join(playbooks).lower().split()
    word_count = {}
    for word in all_words:
        word = word.strip(".,!?:;\"'")  # drop surrounding punctuation
        if len(word) > 4:  # skip short, low-signal words
            word_count[word] = word_count.get(word, 0) + 1

    # Return the ten most frequent (word, count) pairs
    return sorted(word_count.items(), key=lambda x: x[1], reverse=True)[:10]

Version Comparison and Regression Detection

def compare_agent_versions(version_a, version_b):
    """Compare playbook entries between two agent versions."""

    user_playbooks = client.get_user_playbooks()

    version_a_playbooks = [
        pb for pb in user_playbooks.user_playbooks
        if pb.agent_version == version_a
    ]

    version_b_playbooks = [
        pb for pb in user_playbooks.user_playbooks
        if pb.agent_version == version_b
    ]

    comparison = {
        "version_a": {
            "version": version_a,
            "playbook_count": len(version_a_playbooks),
            "sample_playbooks": [pb.content for pb in version_a_playbooks[:3]]
        },
        "version_b": {
            "version": version_b,
            "playbook_count": len(version_b_playbooks),
            "sample_playbooks": [pb.content for pb in version_b_playbooks[:3]]
        }
    }

    # Identify potential regressions
    if len(version_b_playbooks) > len(version_a_playbooks) * 1.5:
        comparison["warning"] = "Significant increase in playbook volume - possible regression"

    return comparison

Continuous Improvement Workflows

class PlaybookDrivenImprovement:
    """System for continuous agent improvement based on playbooks."""

    def __init__(self, client, agent_version):
        self.client = client
        self.agent_version = agent_version
        self.improvement_threshold = 5  # Playbooks needed for action

    def identify_improvement_opportunities(self):
        """Identify specific areas needing improvement."""

        # Get user playbooks for current version
        user_playbooks = self.client.get_user_playbooks()
        version_playbooks = [
            pb for pb in user_playbooks.user_playbooks
            if pb.agent_version == self.agent_version
        ]

        # Group by theme
        improvement_areas = {}
        for playbook in version_playbooks:
            # Categorize (simplified)
            content_lower = playbook.content.lower()

            if "budget" in content_lower or "price" in content_lower:
                category = "budget_awareness"
            elif "communication" in content_lower or "tone" in content_lower:
                category = "communication_style"
            elif "accuracy" in content_lower or "correct" in content_lower:
                category = "response_accuracy"
            else:
                category = "general"

            if category not in improvement_areas:
                improvement_areas[category] = []
            improvement_areas[category].append(playbook.content)

        # Identify high-priority areas
        priority_areas = {
            category: playbooks for category, playbooks in improvement_areas.items()
            if len(playbooks) >= self.improvement_threshold
        }

        return priority_areas

    def track_improvement_progress(self, previous_version):
        """Track whether improvements have been effective."""

        current_playbooks = self.get_version_playbooks(self.agent_version)
        previous_playbooks = self.get_version_playbooks(previous_version)

        # Simple progress tracking
        progress = {
            "current_version": self.agent_version,
            "previous_version": previous_version,
            "playbook_reduction": len(previous_playbooks) - len(current_playbooks),
            "improvement_detected": len(current_playbooks) < len(previous_playbooks)
        }

        return progress

    def get_version_playbooks(self, version):
        """Get all user playbooks for a specific agent version."""
        user_playbooks = self.client.get_user_playbooks()
        return [
            pb for pb in user_playbooks.user_playbooks
            if pb.agent_version == version
        ]

# Usage
improvement_system = PlaybookDrivenImprovement(client, "v2.1.4")
opportunities = improvement_system.identify_improvement_opportunities()

print("Priority improvement areas:")
for area, playbooks in opportunities.items():
    print(f"- {area}: {len(playbooks)} playbook entries")
    print(f"  Sample: {playbooks[0]}")

Playbook-Driven Development Cycle

1. Deploy and Monitor

# Deploy new agent version with playbook tracking
new_version = "v2.2.0"
publish_interactions_with_version(client, new_version)

# Monitor playbook collection
playbook_monitor = PlaybookMonitor(client, new_version)
playbook_monitor.start_monitoring()
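PlaybookMonitor above is a placeholder, not a class Reflexio ships. A minimal polling implementation might look like the following; the method names, polling strategy, and interval are assumptions:

```python
import time

class PlaybookMonitor:
    """Minimal polling monitor for new user playbooks on one agent version.
    Illustrative sketch only; not part of the Reflexio SDK."""

    def __init__(self, client, agent_version, poll_seconds=300):
        self.client = client
        self.agent_version = agent_version
        self.poll_seconds = poll_seconds
        self.seen_ids = set()

    def poll_once(self):
        """Return playbooks for this version not seen in earlier polls."""
        playbooks = self.client.get_user_playbooks().user_playbooks
        new = [
            pb for pb in playbooks
            if pb.agent_version == self.agent_version
            and pb.user_playbook_id not in self.seen_ids
        ]
        self.seen_ids.update(pb.user_playbook_id for pb in new)
        return new

    def start_monitoring(self, max_polls=None):
        """Poll forever (or max_polls times), printing each new entry."""
        polls = 0
        while max_polls is None or polls < max_polls:
            for pb in self.poll_once():
                print(f"New playbook entry: {pb.content}")
            polls += 1
            time.sleep(self.poll_seconds)
```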

2. Analyze and Prioritize

# After sufficient interaction volume
playbook_analysis = generate_playbook_report(new_version)

# Prioritize improvements
high_priority = [
    category for category, data in playbook_analysis["categories"].items()
    if data["count"] >= 5  # Significant volume
]

3. Implement and Validate

# Implement improvements for next version
next_version = "v2.2.1"

# Deploy and compare
comparison = compare_agent_versions("v2.2.0", "v2.2.1")

if comparison.get("warning"):
    print("Potential regression detected - investigate immediately")
else:
    print("Playbook patterns stable or improved")

Best Practices for Playbook Systems

1. Configure Meaningful Playbook Categories

# Create specific, actionable categories
playbook_categories = {
    "response_relevance": "How well agent responses match user intent",
    "communication_clarity": "Clarity and understandability of agent communication",
    "problem_resolution": "Effectiveness at solving user problems",
    "efficiency": "Speed and conciseness of agent interactions",
    "empathy": "Emotional intelligence and user sensitivity"
}
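One way to put such a category map to work is to fold each description into a playbook_definition_prompt so the extractor names categories explicitly. A sketch, with the prompt wording being an assumption rather than a Reflexio convention:

```python
def build_category_prompt(categories):
    """Compose a playbook_definition_prompt that names each category explicitly."""
    lines = ["Extract playbook entries about agent performance, categorized as:"]
    for name, description in categories.items():
        lines.append(f"- {name}: {description}")
    lines.append("Focus on actionable insights that could improve future agent interactions.")
    return "\n".join(lines)

prompt = build_category_prompt({
    "response_relevance": "How well agent responses match user intent",
    "efficiency": "Speed and conciseness of agent interactions",
})
```

The resulting string can then be passed as the playbook_definition_prompt of a PlaybookConfig.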

2. Set Appropriate Aggregation Thresholds

# Balance signal vs. noise
aggregation_strategies = {
    "high_volume_system": PlaybookAggregatorConfig(min_cluster_size=10),
    "moderate_volume": PlaybookAggregatorConfig(min_cluster_size=5),
    "low_volume_system": PlaybookAggregatorConfig(min_cluster_size=2)
}
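If you'd rather derive the threshold from observed traffic than hard-code it per system, a simple heuristic works; the volume cutoffs below are illustrative choices, not Reflexio defaults:

```python
def pick_min_cluster_size(weekly_playbook_volume):
    """Heuristic: scale the aggregation threshold with signal volume so that
    high-traffic systems demand broader consensus before an agent playbook
    is created. Cutoffs are illustrative, not Reflexio defaults."""
    if weekly_playbook_volume >= 200:
        return 10  # high-volume system
    if weekly_playbook_volume >= 50:
        return 5   # moderate volume
    return 2       # low-volume system
```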

3. Regular Playbook Review Cycles

from datetime import datetime, timedelta

def weekly_playbook_review():
    """Regular review process for playbook analysis."""

    # Get user playbooks from last week
    week_ago = int((datetime.now() - timedelta(days=7)).timestamp())
    recent_playbooks = [
        pb for pb in client.get_user_playbooks().user_playbooks
        if pb.created_at >= week_ago
    ]

    if not recent_playbooks:
        print("No playbook entries to review this week")
        return

    # Generate insights
    print(f"Weekly Playbook Review: {len(recent_playbooks)} new entries")

    # Group by category and analyze trends
    for playbook in recent_playbooks[:5]:  # Show top 5
        print(f"- {playbook.content}")

    # Check for urgent issues (high-frequency negative playbook entries)
    urgent_issues = identify_urgent_playbook_issues(recent_playbooks)
    if urgent_issues:
        print(f"Urgent issues detected: {urgent_issues}")
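The identify_urgent_playbook_issues helper used above isn't defined by Reflexio; one simple sketch flags categories with repeated negative-leaning entries, where the keyword stems are assumptions you would tune for your domain:

```python
# Illustrative keyword stems for negative-leaning entries; tune for your domain
NEGATIVE_MARKERS = ("fail", "wrong", "ignore", "missed", "frustrat", "not ")

def identify_urgent_playbook_issues(playbooks, min_occurrences=3):
    """Flag playbook categories with repeated negative-leaning entries."""
    negative_by_category = {}
    for pb in playbooks:
        content = pb.content.lower()
        if any(marker in content for marker in NEGATIVE_MARKERS):
            negative_by_category.setdefault(pb.playbook_name, []).append(pb.content)
    # Only categories that cross the occurrence threshold count as urgent
    return {
        category: entries
        for category, entries in negative_by_category.items()
        if len(entries) >= min_occurrences
    }
```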

The playbook system transforms user responses into actionable insights that drive continuous agent improvement. By systematically collecting, analyzing, and acting on playbook entries, you create agents that become more effective and user-friendly over time.