A prompt template that instructs Ana to query your database and generate an interactive HTML report with filters.

Prompt

# ============================================================================
# PLAYBOOK CONSTRUCTION SKELETON - BEST PRACTICES (GENERALIZED)
# ============================================================================
# A comprehensive guide for building data-driven HTML reports with interactivity
# ============================================================================

## OVERALL STRUCTURE

1. CONFIGURATION & SETUP
2. DATA LOADING (External Files)
3. DATA QUERYING (Primary Data)
4. DATA QUERYING (Detail/Subcategory Data)
5. HELPER FUNCTIONS
6. DATA PROCESSING & TRANSFORMATION
7. METRIC DEFINITIONS
8. TABLE/REPORT BUILDING
9. DATA SERIALIZATION (to JavaScript)
10. HTML GENERATION
11. FILE OUTPUT

# ============================================================================
# STEP 1: CONFIGURATION & SETUP
# ============================================================================

Purpose: Define constants, parameters, and report scope
- Organization/entity identifier
- Data segment combinations (e.g., categories, regions, products)
- Date ranges
- Business logic constants

Best Practices:
- Use descriptive constant names in UPPER_CASE
- Group related configurations together
- Document any non-obvious business rules
- Make it easy to add/remove segments

# ============================================================================
# STEP 2: LOAD EXTERNAL DATA (Excel/CSV files)
# ============================================================================

Purpose: Load reference data from uploaded files (budgets, targets, benchmarks)
- Use try/except for error handling
- Create mappings between file data and database values
- Filter to relevant scope (organization, year, etc.)
- Handle missing files gracefully

Best Practices:
- Always use try/except when loading external files
- Create empty DataFrames as fallback
- Print clear success/error messages
- Validate data after loading (check for required columns)
- Map file names to database names explicitly
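A minimal sketch of these practices, assuming a hypothetical reference file `budgets.csv` and a `load_reference_file()` helper (names are illustrative, not prescribed):

```python
import pandas as pd

def load_reference_file(path, required_columns):
    """Load an external reference CSV, falling back to an empty DataFrame."""
    try:
        df = pd.read_csv(path)
    except FileNotFoundError:
        print(f"WARNING: {path} not found; continuing without reference data")
        return pd.DataFrame(columns=required_columns)
    # Validate after loading: check that the required columns are present
    missing = [c for c in required_columns if c not in df.columns]
    if missing:
        print(f"WARNING: {path} is missing columns {missing}; ignoring file")
        return pd.DataFrame(columns=required_columns)
    print(f"Loaded {len(df)} rows from {path}")
    return df

budgets = load_reference_file("budgets.csv", ["segment", "year", "budget"])
```

Because the fallback is an empty DataFrame with the expected columns, downstream lookups still run (they simply find no rows) instead of crashing.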

# ============================================================================
# STEP 3: QUERY PRIMARY DATA
# ============================================================================

Purpose: Pull main metrics from database
- Loop through each data segment combination
- Store results in dictionary keyed by combination
- Print progress for transparency
- Handle empty results gracefully

Best Practices:
- Use enumerate() for progress tracking
- Store in dictionary with tuple keys for multi-dimensional grouping
- Skip empty results but log them
- Print row counts for validation
- Use consistent filter structure across all queries
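The query loop above might look like this sketch, with `run_query()` as a hypothetical stand-in for the real database call and example segment names:

```python
# Hypothetical stand-in for the real database query
def run_query(category, region):
    sample = {
        ("Medical", "West"): [{"month": "2024-01", "value": 120.0},
                              {"month": "2024-02", "value": 115.0}],
    }
    return sample.get((category, region), [])

SEGMENTS = [("Medical", "West"), ("Medical", "East"), ("Pharmacy", "West")]

primary_data = {}  # keyed by (category, region) tuples for O(1) lookup
for i, (category, region) in enumerate(SEGMENTS, start=1):
    rows = run_query(category, region)
    if not rows:
        # Skip empty results, but log them for transparency
        print(f"[{i}/{len(SEGMENTS)}] {category}/{region}: no data, skipping")
        continue
    primary_data[(category, region)] = rows
    print(f"[{i}/{len(SEGMENTS)}] {category}/{region}: {len(rows)} rows")
```

Tuple keys make each segment combination addressable in one lookup, and the printed row counts give an immediate sanity check while the report builds.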

# ============================================================================
# STEP 4: QUERY DETAIL/SUBCATEGORY DATA
# ============================================================================

Purpose: Pull granular breakdowns (subcategories, line items, etc.)
- Loop structure similar to the primary query
- Include additional dimensions for granularity
- CRITICAL: Include parent category dimension to prevent duplication

Best Practices:
- Only query details for segments with primary data
- Always include parent category to prevent row duplication
- Use consistent filter structure across queries
- Log when detail data is missing
- Verify no duplicate rows in results
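Why the parent category matters can be shown with a toy example: two distinct line items that share a subcategory name silently merge if the parent dimension is dropped (the field names here are illustrative):

```python
detail_rows = [
    {"parent": "Inpatient",  "subcategory": "Other", "value": 100.0},
    {"parent": "Outpatient", "subcategory": "Other", "value": 40.0},
]

# Keyed by subcategory alone, the two distinct "Other" lines collapse into one:
by_sub = {}
for r in detail_rows:
    by_sub[r["subcategory"]] = by_sub.get(r["subcategory"], 0.0) + r["value"]

# Keyed by (parent, subcategory), each line item stays distinct:
by_parent_sub = {}
for r in detail_rows:
    key = (r["parent"], r["subcategory"])
    by_parent_sub[key] = by_parent_sub.get(key, 0.0) + r["value"]
```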

# ============================================================================
# STEP 5: DEFINE HELPER FUNCTIONS
# ============================================================================

Purpose: Reusable functions for calculations and formatting

Key functions to include:
- format_value() - Format numbers as currency, percent, decimal, etc.
- get_value() - Extract single period value
- get_annual_value() - Calculate annual aggregate (weighted if needed)
- get_ytd_value() - Calculate year-to-date aggregate (weighted if needed)
- get_detail_value() - Extract subcategory values
- get_reference_value() - Lookup from external file (targets, benchmarks)

Best Practices:
- Add docstrings to all functions
- Handle None/NaN values gracefully
- Use consistent parameter names
- Include type hints for clarity
- Test edge cases (empty data, division by zero)
- Return None for missing data, not 0
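Two of the listed helpers, sketched with these practices applied (the exact signatures are assumptions, not a fixed API):

```python
import math

def format_value(value, format_type):
    """Format a number for display; missing data renders as '-'."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return "-"
    if format_type == "currency":
        return f"${value:,.2f}"
    if format_type == "percent":
        return f"{value:.1%}"
    return f"{value:,.1f}"

def weighted_rate(values, weights):
    """Volume-weighted average of a rate; returns None (not 0) when there is no volume."""
    total = sum(weights)
    if total <= 0:
        return None
    return sum(v * w for v, w in zip(values, weights)) / total
```

Returning `None` for missing data lets `format_value()` render a dash instead of a misleading zero, which is exactly the edge-case behavior the best practices above call for.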

# ============================================================================
# STEP 6: PROCESS & TRANSFORM DATA
# ============================================================================

Purpose: Clean, calculate derived metrics, merge datasets
- Convert date columns to datetime
- Calculate derived metrics (ratios, normalized values)
- Merge detail data with primary data
- Store in organized dictionary structure

Best Practices:
- Convert dates early in the process
- Handle division by zero explicitly
- Use .copy() to avoid SettingWithCopyWarning
- Store related datasets together in nested dictionaries
- Validate calculations with spot checks
- Create all derived metrics before building tables
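A compact sketch of these transformation rules, assuming example columns `cost` and `members` and a derived per-member metric:

```python
import pandas as pd

raw = pd.DataFrame({
    "date": ["2024-01-01", "2024-02-01"],
    "cost": [1200.0, 800.0],
    "members": [100, 0],
})

processed = raw.copy()                                 # avoid SettingWithCopyWarning
processed["date"] = pd.to_datetime(processed["date"])  # convert dates early
# Guard the denominator explicitly instead of letting it produce inf/NaN:
processed["cost_pmpm"] = [
    c / m if m > 0 else None
    for c, m in zip(processed["cost"], processed["members"])
]
```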

# ============================================================================
# STEP 7: DEFINE COMPREHENSIVE METRICS LIST
# ============================================================================

Purpose: Create master list of all metrics to display
Structure: (display_name, column_name, format_type, is_indent, data_source, filter)

This creates a single source of truth for report structure.

Best Practices:
- Extract subcategories from actual data (don't hardcode)
- Use consistent naming conventions
- Group related metrics together
- Use indentation for hierarchical metrics
- Document any special calculation requirements
- Keep list in logical display order
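A sketch of the tuple structure, including subcategories extracted from the data rather than hardcoded (metric and field names are illustrative):

```python
# Each entry: (display_name, column_name, format_type, is_indent, data_source, filter)
detail_rows = [{"parent": "Outpatient"}, {"parent": "Inpatient"}, {"parent": "Inpatient"}]

# Extract subcategories from the actual data instead of hardcoding them:
subcategories = sorted({r["parent"] for r in detail_rows})

METRICS = [("Total Cost", "cost", "currency", False, "primary", None)]
for sub in subcategories:
    # Leading spaces in the display name mark hierarchical (indented) rows
    METRICS.append((f"  {sub}", "cost", "currency", True, "detail", sub))
METRICS.append(("Cost per Member", "cost_pmpm", "currency", False, "derived", None))
```

Because the table builder later iterates over this one list, adding or reordering a metric is a single-line change.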

# ============================================================================
# STEP 8: BUILD TABLES WITH DATA POPULATION
# ============================================================================

Purpose: Create final data tables for each segment combination
- Determine available time periods dynamically
- Loop through metrics list
- Populate each column (years, months, YTD, targets)
- Apply conditional formatting logic
- Store as DataFrame

Best Practices:
- Determine time periods dynamically from data
- Use consistent column naming patterns
- Handle parsing errors gracefully
- Consider metric directionality for conditional formatting
- Validate table structure before proceeding
- Apply conditional formatting after all data is populated
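Two of these practices sketched in isolation: deriving the reporting window from the data, and a directionality-aware formatting helper (`variance_class` is a hypothetical name):

```python
from datetime import date

rows = [{"date": date(2024, m, 1), "value": 100.0 + m} for m in range(1, 7)]

# Derive the reporting window from the data instead of hardcoding month ranges:
latest = max(r["date"] for r in rows)
months = [date(latest.year, m, 1) for m in range(1, latest.month + 1)]

def variance_class(actual, target, lower_is_better):
    """CSS class for conditional formatting, honoring metric directionality."""
    if actual is None or target is None:
        return ""
    favorable = actual <= target if lower_is_better else actual >= target
    return "favorable" if favorable else "unfavorable"
```

The `lower_is_better` flag is what keeps a below-budget cost and an above-target revenue both rendering as favorable.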

# ============================================================================
# STEP 9: CONVERT TO JAVASCRIPT DATA STRUCTURE
# ============================================================================

Purpose: Serialize Python data to JSON for HTML interactivity
- Organize by hierarchical structure (Category → Segment)
- Convert DataFrames to column-oriented dictionaries
- Serialize to JSON string for embedding

Best Practices:
- Use column-oriented format for efficient rendering
- Indent JSON for readability during debugging
- Validate JSON structure before embedding
- Consider data size (compress if needed)
- Test with special characters in data
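The serialization step can be as small as this sketch (segment names are examples):

```python
import json
import pandas as pd

table = pd.DataFrame({"Metric": ["Cost", "Revenue"], "Current YTD": [120.0, 200.0]})

# Column-oriented: {"Metric": [...], "Current YTD": [...]} is compact and easy
# to render column-by-column in JavaScript.
payload = {"Medical": {"West": table.to_dict(orient="list")}}
js_data = json.dumps(payload, indent=2)  # indent aids debugging; dumps escapes safely
```

`json.dumps()` handles quoting and special-character escaping, so the embedded string stays valid JavaScript even when metric names contain quotes or non-ASCII text.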

# ============================================================================
# STEP 10: GENERATE HTML REPORT
# ============================================================================

Purpose: Create interactive HTML with embedded data and styling

Key Components:
1. Version/Timestamp generation
2. CSS styling (colors, layout, responsive design)
3. HTML structure (header, filters, tables, footer)
4. JavaScript for interactivity
5. Embedded data

Best Practices:
- Use semantic HTML5 elements
- Include meta tags for caching and viewport
- Embed CSS and JavaScript for portability
- Use CSS custom properties for theming
- Make responsive with media queries
- Add accessibility features (ARIA labels)
- Include version in multiple places
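A stripped-down sketch of the assembly step, showing the version stamp, meta tags, and embedded data (the page content itself is placeholder):

```python
import json
from datetime import datetime, timezone

version = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M")
data_json = json.dumps({"Medical": {"Current YTD": [120.0, 115.0]}})

html = f"""<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Segment Report v{version}</title>
</head>
<body>
<h1>Segment Report <small>v{version}</small></h1>
<div id="tables"></div>
<script>
const REPORT_DATA = {data_json};  // embedded data: the file works offline
</script>
</body>
</html>"""
```

Embedding the CSS, JavaScript, and data in one file is what makes the report portable: it can be emailed or opened from disk with no server behind it.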

# ============================================================================
# STEP 11: FILE OUTPUT
# ============================================================================

Purpose: Save HTML file with proper naming convention

Best Practices:
- Use descriptive filenames with version
- Write with UTF-8 encoding to preserve special characters
- Log file size and key statistics
- Provide clear completion message
- Include next steps or usage instructions
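The output step in miniature, with an illustrative filename and a non-ASCII string to show why the explicit encoding matters:

```python
from pathlib import Path

html = "<!DOCTYPE html><html><body><p>Café Résumé €1,200</p></body></html>"
out_path = Path("report_v1.html")            # illustrative filename with version
out_path.write_text(html, encoding="utf-8")  # UTF-8 preserves special characters

size_kb = out_path.stat().st_size / 1024
print(f"Wrote {out_path} ({size_kb:.1f} KB). Open it in a browser to view the report.")
```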

# ============================================================================
# KEY DESIGN PATTERNS
# ============================================================================

1. Separation of Concerns
   - Data acquisition (Steps 2-4)
   - Data processing (Steps 5-6)
   - Presentation (Steps 9-10)
   - Each step has a single responsibility

2. Error Handling
   - Try/except for external dependencies
   - Check for empty results before proceeding
   - Graceful degradation (continue without optional data)
   - Clear error messages for debugging

3. Progress Tracking
   - Print statements at each major step
   - Row counts and success indicators
   - Enumeration for loop progress
   - Final summary with statistics

4. Reusability
   - Helper functions for common operations
   - Parameterized queries
   - Modular HTML/CSS/JS components
   - Configuration at the top

5. Maintainability
   - Clear section headers with separators
   - Inline comments for complex logic
   - Consistent naming conventions
   - Version tracking throughout

6. Performance
   - Batch queries by segment
   - Store results in dictionaries (O(1) lookup)
   - Minimize redundant calculations
   - Efficient data structures (column-oriented for JS)

7. Data Integrity
   - Include parent categories to prevent duplication
   - Use weighted averages for rates
   - Validate calculations with spot checks
   - Handle division by zero explicitly

# ============================================================================
# COMMON PITFALLS TO AVOID
# ============================================================================

## 1. DATA DUPLICATION

Problem: Querying subcategories without parent category causes duplicate rows
Solution: Always include parent_category dimension in detail queries

Problem: Using simple averages for rates/ratios
Solution: Use weighted averages by volume/member months

## 2. DATE HANDLING

Problem: Assuming dates are datetime objects
Solution: Convert to datetime early with pd.to_datetime()

Problem: Hardcoding month ranges
Solution: Determine dynamically from data (df['date'].max())

## 3. FORMATTING

Problem: Not checking for None/NaN before formatting
Solution: Always check: if value is None or pd.isna(value): return '-'

Problem: Inconsistent decimal places
Solution: Use format_type parameter consistently

## 4. CALCULATIONS

Problem: Division by zero errors
Solution: Check denominator > 0 before dividing

Problem: Wrong aggregation method for rates
Solution: Weighted average by volume, not simple average
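A two-segment example makes the difference concrete: with invented rates and volumes, the simple average overstates the blended rate by more than 2x because it ignores that 90% of the volume sits in the low-rate segment.

```python
# Per-member cost rates for two segments with very different volumes
rates   = [0.10, 0.50]
volumes = [900, 100]

simple   = sum(rates) / len(rates)                                    # 0.30
weighted = sum(r * v for r, v in zip(rates, volumes)) / sum(volumes)  # 0.14
```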

## 5. HTML/JAVASCRIPT

Problem: Not escaping special characters in strings
Solution: Use json.dumps() for safe serialization

Problem: Hardcoding column names in JavaScript
Solution: Iterate through data keys dynamically

## 6. CONDITIONAL FORMATTING

Problem: Assuming lower is always better
Solution: Consider metric directionality (cost vs revenue)

Problem: Applying formatting to wrong column
Solution: Target specific columns (e.g., Current YTD only)

# ============================================================================
# QUICK REFERENCE
# ============================================================================

Step 1: Configure (constants, segments)
Step 2: Load external files (budgets, targets)
Step 3: Query primary data (main metrics)
Step 4: Query detail data (subcategories with parent category)
Step 5: Define helpers (format, calculate, lookup)
Step 6: Process data (transform, derive, merge)
Step 7: Define metrics (master list with formatting)
Step 8: Build tables (populate, format, conditionally style)
Step 9: Serialize to JS (column-oriented JSON)
Step 10: Generate HTML (embed data, CSS, JavaScript)
Step 11: Output file (save with version)

Critical Success Factors:
- Include parent category in detail queries (prevents duplication)
- Use weighted averages for rates (not simple averages)
- Handle None/NaN before formatting (prevents errors)
- Test with multiple segments (ensures flexibility)
- Version everything (enables tracking)