LLM Agent Development Specification

Overview

This specification defines an organizational framework for Large Language Model (LLM) agent development that improves results through clear expectations, boundaries, and structured processes. The framework emphasizes immersive context creation, role separation, and iterative development cycles.

Core Principles

1. Immersive Context Creation

Agents should create rich, detailed contexts that provide comprehensive understanding. Immersion is achieved through:

  • Shared Context: All agents must maintain consistent understanding of the project state
  • Descriptive Language: Use clear, vivid descriptions that establish context and expectations
  • Narrative Continuity: Each interaction should build upon previous context

Example: Instead of "Fix the login bug," use "The authentication gateway is failing to properly validate user credentials, causing users to be locked out of their accounts. The validation mechanism appears to have degraded, possibly due to recent infrastructure changes. Investigate the credential validation process and restore proper system access."

2. Separation of Responsibilities

Agents are organized into distinct roles with specific responsibilities:

Goal Evaluation Agents

  • Purpose: Assess high-level objectives and strategic direction
  • Responsibilities:
    • Analyze project goals for clarity and achievability
    • Identify potential conflicts or dependencies
    • Provide strategic recommendations
  • Context: Strategic oversight role - maintains holistic view of organizational objectives

Task Generation Agents

  • Purpose: Transform goals into actionable tasks
  • Responsibilities:
    • Create detailed Product Requirement Documents (PRDs)
    • Break down complex objectives into executable steps
    • Define success criteria and acceptance conditions
  • Context: Requirements architect - transforms high-level goals into detailed implementation plans

Task Evaluation Agents

  • Purpose: Assess individual task quality and completeness
  • Responsibilities:
    • Review task specifications for clarity
    • Validate task feasibility and resource requirements
    • Ensure tasks align with broader goals
  • Context: Quality assurance role - validates specifications meet standards and requirements

Execution Agents

  • Purpose: Implement tasks and deliver results
  • Responsibilities:
    • Execute assigned tasks according to specifications
    • Report progress and blockers
    • Deliver measurable outcomes
  • Context: Implementation specialists - execute tasks according to specifications

Development Cycle

The development process follows a structured cycle designed to maintain quality and alignment:

Phase 1: Goal Evaluation

Agent Type: Goal Evaluation Agent Input: Project requirements, organizational context Activities:

  • Review and clarify organizational objectives
  • Assess current project state and context
  • Identify success metrics and constraints
  • Generate strategic recommendations

Output: STRATEGY.md

Phase 2: Task Generation

Agent Type: Task Generation Agent Input: STRATEGY.md Activities:

  • Transform goals into specific, actionable tasks
  • Create detailed Product Requirement Documents for each task
  • Define clear acceptance criteria and success metrics
  • Establish task priorities and dependencies

Output: TASKLIST.md

PRD Template:

# Task: [Descriptive Name] ## Context [Rich, detailed description of the current situation and environment] ## Objective [Clear statement of what needs to be accomplished] ## Success Criteria - [ ] Specific, measurable outcome 1 - [ ] Specific, measurable outcome 2 - [ ] Specific, measurable outcome 3 ## Constraints - Technical limitations - Resource constraints - Timeline requirements ## Examples [Provide concrete examples of expected outputs] ## Dependencies - Prerequisites that must be completed first - Resources or information required

Phase 3: Current Tasklist Evaluation

Agent Type: Task Evaluation Agent Input: TASKLIST.md Activities:

  • Review generated tasks for quality and completeness
  • Validate task specifications meet requirements
  • Identify potential issues or improvements
  • Prioritize tasks based on impact and dependencies

Output: VALIDATED_TASKS.md

Phase 4: Begin Tasklist Execution

Agent Type: Execution Agent Input: VALIDATED_TASKS.md Activities:

  • Execute tasks according to specifications
  • Monitor progress and report status
  • Handle new task creation as needed
  • Address immediate blockers or issues

Output: EXECUTION_PROGRESS.md (continuously updated)

Phase 4a: New Task Creation (During Execution)

When new tasks are identified during execution:

Immediate Addressing:

  • If task is critical to current work: pause and address immediately
  • Create mini-PRD for urgent task
  • Execute and return to original task

Backlog Addition:

  • If task is important but not urgent: add to backlog
  • Include full context and reasoning
  • Prioritize relative to existing backlog items

Phase 5: Finish Task List

Agent Type: Execution Agent Input: EXECUTION_PROGRESS.md Activities:

  • Complete all tasks in current sprint
  • Validate deliverables against success criteria
  • Document outcomes and lessons learned
  • Prepare handoff materials

Output: TASK_RESULTS.md

Phase 6: Backlog Evaluation

Agent Type: Task Evaluation Agent Input: TASK_RESULTS.md, existing BACKLOG.md Activities:

  • Review accumulated backlog items
  • Re-prioritize based on current context
  • Identify obsolete or duplicate items
  • Prepare recommendations for next cycle

Output: Updated BACKLOG.md

Phase 7: Goal Evaluation (Cycle Completion)

Agent Type: Goal Evaluation Agent Input: TASK_RESULTS.md, BACKLOG.md Activities:

  • Assess progress toward strategic objectives
  • Identify lessons learned and process improvements
  • Adjust goals based on new information
  • Prepare for next development cycle

Output: Updated STRATEGY.md for next cycle

Task Tracking System

Repository-Level Task Management

Each repository implementing this specification must maintain a comprehensive task history using a standardized tracking system located in the repository root.

Directory Structure

.tasks/ ├── TASK-001.md ├── TASK-002.md ├── TASK-003.md └── ...

Task File Naming Convention

  • Format: TASK-###.md where ### is a zero-padded, incrementing counter
  • Counter: Starts at 001 and increments for each new task in the repository
  • Scope: Counter is repository-specific, not global across projects

Task File Structure

Each task file must contain the following sections:

# TASK-### - [Task Title] ## Metadata - **Created**: [ISO 8601 timestamp] - **Status**: [pending|in_progress|completed|cancelled] - **Agent**: [Agent type that handled this task] - **Priority**: [high|medium|low] - **Cycle**: [Development cycle identifier] ## Original PRD [Complete Product Requirement Document as generated in Phase 2] ## Implementation Details ### Approach [Description of the implementation strategy and methodology used] ### Changes Made [Detailed list of all modifications, additions, or deletions] ### Technical Decisions [Key architectural or implementation choices with rationale] ## Outcomes ### Deliverables [List of concrete outputs produced] ### Success Metrics - [ ] [Original success criterion 1] - [Status: ✅ Met / ❌ Not Met / 🔄 Partial] - [ ] [Original success criterion 2] - [Status: ✅ Met / ❌ Not Met / 🔄 Partial] ### Lessons Learned [Key insights and improvements for future tasks] ## References ### Issues - [Issue #123: Description](link-to-issue) - [Issue #456: Description](link-to-issue) ### Pull Requests - [PR #789: Description](link-to-pr) - [PR #101: Description](link-to-pr) ### Related Tasks - [TASK-002: Related task title](.tasks/TASK-002.md) - [TASK-005: Dependent task title](.tasks/TASK-005.md) ## Timeline - **Started**: [ISO 8601 timestamp] - **Completed**: [ISO 8601 timestamp] - **Duration**: [Time spent on task]

Task Management Requirements

Task Creation
  • Create task file immediately when execution begins (Phase 4)
  • Populate metadata and original PRD sections
  • Set status to in_progress
  • Commit task file to repository
Progress Updates
  • Update Implementation Details section as work progresses
  • Document significant decisions and changes in real-time
  • Link to issues and pull requests as they are created
  • Maintain current status in metadata
Task Completion
  • Complete all sections including outcomes and metrics
  • Set status to completed or cancelled
  • Add completion timestamp
  • Ensure all references are properly linked
  • Final commit with completed task file
Integration with Development Workflow
  • Task files must be created before any implementation work begins
  • All issues created during task execution must be referenced
  • All pull requests must reference the relevant task file
  • Task completion blocks should reference specific task files
  • Code reviews should validate task file completeness

Task Discovery and Navigation

Repositories should maintain a TASKS_INDEX.md file in the .tasks directory that provides:

  • Chronological list of all tasks
  • Status summary dashboard
  • Cross-references between related tasks
  • Quick access to recently completed work

Example TASKS_INDEX.md:

# Task Index ## Quick Stats - Total Tasks: 15 - Completed: 12 - In Progress: 2 - Cancelled: 1 ## Recent Tasks | Task | Title | Status | Agent | Date | |------|-------|--------|-------|------| | [TASK-015](TASK-015.md) | Authentication Security Enhancement | ✅ Completed | Execution | 2024-01-15 | | [TASK-014](TASK-014.md) | API Rate Limiting Implementation | 🔄 In Progress | Execution | 2024-01-12 | ## All Tasks [Chronological listing of all tasks with links]

Implementation Guidelines

Context Management

  • Maintain a shared context document that all agents can access
  • Update context after each phase completion
  • Include relevant examples and precedents
  • Document decisions and rationale
  • Reference relevant task files for historical context

Communication Standards

  • Use descriptive, immersive language in all communications
  • Provide concrete examples whenever possible
  • Maintain narrative consistency across agent interactions
  • Document assumptions and clarifications
  • Include task file references in all implementation discussions

Quality Assurance

  • Each phase must produce specified deliverables
  • Validation checkpoints ensure quality standards
  • Regular retrospectives improve the process
  • Metrics track effectiveness and efficiency
  • Task completion requires fully documented task files

Example Implementation

Scenario: Improving User Authentication System

Goal Evaluation Phase: "The organization's authentication infrastructure has shown degradation in security posture, with potential vulnerabilities that could allow unauthorized access to critical systems. We need to strengthen these security mechanisms while maintaining seamless user experience for legitimate users."

Task Generation Phase:

# Task: Strengthen Authentication Security Mechanisms ## Context The current authentication system has shown vulnerabilities in recent security assessments. Users report slow login times, and security monitoring has detected unauthorized access attempts targeting the authentication endpoints. ## Objective Implement multi-factor authentication and improve login performance while maintaining security standards. ## Success Criteria - [ ] Multi-factor authentication implemented for all user types - [ ] Login performance improved by 40% from current baseline - [ ] Zero security vulnerabilities in penetration testing - [ ] 95% user satisfaction in usability testing ## Examples - SMS-based second factor for standard users - Hardware token support for administrative roles - Biometric options for high-security access

Document Definitions

STRATEGY.md

Purpose: Strategic assessment and high-level planning document Properties:

  • Objectives: Clear statement of organizational goals
  • Current State: Assessment of existing systems and capabilities
  • Success Metrics: Quantifiable measures of goal achievement
  • Constraints: Resource, technical, and timeline limitations
  • Risk Assessment: Potential issues and mitigation strategies
  • Strategic Recommendations: High-level approach and priorities

TASKLIST.md

Purpose: Detailed task specifications with implementation requirements Properties:

  • Task Inventory: Complete list of tasks with unique identifiers
  • Task PRDs: Individual Product Requirement Documents for each task
  • Dependencies: Inter-task relationships and prerequisites
  • Priority Matrix: Task prioritization with justification
  • Resource Requirements: Skills, tools, and time estimates needed
  • Timeline: Proposed sequence and scheduling

VALIDATED_TASKS.md

Purpose: Quality-assured task specifications ready for execution Properties:

  • Approved Tasks: Tasks that passed validation criteria
  • Validation Notes: Quality assessment findings for each task
  • Risk Mitigation: Identified risks and proposed solutions
  • Execution Order: Optimized sequence based on dependencies
  • Resource Allocation: Confirmed availability of required resources
  • Acceptance Criteria: Final validation requirements for each task

EXECUTION_PROGRESS.md

Purpose: Real-time tracking of task execution status Properties:

  • Task Status: Current state (not started, in progress, completed, blocked)
  • Progress Updates: Timestamped status changes and notes
  • Blockers: Issues preventing task completion with escalation paths
  • Deliverables: Completed outputs and artifacts
  • Time Tracking: Actual vs. estimated effort
  • Quality Metrics: Performance indicators during execution

TASK_RESULTS.md

Purpose: Comprehensive summary of completed work and outcomes Properties:

  • Completion Summary: Final status of all tasks in the cycle
  • Deliverables Catalog: All outputs produced with quality validation
  • Lessons Learned: Key insights and improvement recommendations
  • Performance Metrics: Actual vs. planned time, quality, and scope
  • Outstanding Issues: Unresolved items requiring future attention
  • Handoff Documentation: Information needed for subsequent phases

BACKLOG.md

Purpose: Prioritized inventory of future work items Properties:

  • Backlog Items: Tasks not included in current cycle with descriptions
  • Priority Ranking: Ordered list with business value assessment
  • Effort Estimates: High-level sizing for planning purposes
  • Context Preservation: Background information for future reference
  • Stakeholder Input: Requirements and feedback from various parties
  • Cycle Recommendations: Suggestions for future sprint planning

This specification provides the framework for consistent, high-quality LLM agent development while maintaining the immersive, context-rich approach that leads to better outcomes.