@gtg7784 (Contributor) commented Jan 5, 2026

Summary

This PR enhances Sisyphus's orchestration capabilities by incorporating structural patterns inspired by HKUDS/DeepTutor, an AI-based tutoring system with sophisticated multi-agent coordination.

Changes

1. Enhanced Request Type Classification Matrix

  • Added Min Parallel Calls column to enforce thorough exploration
  • Split request types into more specific categories: Conceptual, Implementation, Debugging, Refactoring
  • Each type now has explicit tool strategy guidance
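
As an illustration only: the PR expresses this matrix as prompt text inside the agent definition, not as executable code. A minimal sketch of the same idea as a lookup table might look like the following. The names `REQUEST_MATRIX` and `hasExploredEnough` are hypothetical. The thread mentions a range of 3 to 5+ calls and a minimum of 4 for Implementation; every other number and all strategy strings below are assumptions.

```typescript
// Hypothetical sketch; the PR encodes this matrix in the prompt, not in code.
type RequestType = "Conceptual" | "Implementation" | "Debugging" | "Refactoring";

interface MatrixRow {
  minParallelCalls: number; // lower bound on parallel exploration calls
  toolStrategy: string; // illustrative wording, not taken from the PR
}

// Only the Implementation minimum (4) is stated in the review thread.
// The remaining values are placeholders within the reported 3-5 range.
const REQUEST_MATRIX: Record<RequestType, MatrixRow> = {
  Conceptual: { minParallelCalls: 3, toolStrategy: "read docs and entry points" },
  Implementation: { minParallelCalls: 4, toolStrategy: "find existing patterns first" },
  Debugging: { minParallelCalls: 4, toolStrategy: "trace references and diagnostics" },
  Refactoring: { minParallelCalls: 5, toolStrategy: "map every usage before editing" },
};

// True once enough parallel exploration calls were launched for this type.
function hasExploredEnough(type: RequestType, callsLaunched: number): boolean {
  return callsLaunched >= REQUEST_MATRIX[type].minParallelCalls;
}
```

A table like this makes the "under-exploring" failure mode checkable: a run that launches two searches for an Implementation request would fail the check.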

2. Tool Selection Strategy by Phase

  • Early phase: Prioritize fast local tools (grep, glob, read)
  • Middle phase: Use LSP and AST tools for pattern discovery
  • Late phase: Escalate to expensive external tools only when needed
  • Added tool cost awareness (FREE → CHEAP → EXPENSIVE)
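
The escalation order can be sketched as a tier lookup. This is a hedged illustration, not the PR's implementation (the PR describes the strategy in prompt text). The tool-to-tier assignments follow the review summary in this thread (local tools FREE, explore/librarian CHEAP, oracle/websearch EXPENSIVE); the phase-to-tier mapping and the names `TOOL_TIERS`, `MAX_TIER_BY_PHASE`, and `toolsForPhase` are assumptions.

```typescript
// Hypothetical sketch of cost-aware tool selection by phase.
type CostTier = "FREE" | "CHEAP" | "EXPENSIVE";
type Phase = "early" | "middle" | "late";

// Tiers listed from cheapest to most expensive.
const TIER_ORDER: CostTier[] = ["FREE", "CHEAP", "EXPENSIVE"];

// Tool-to-tier assignments as described in the review summary.
const TOOL_TIERS: Record<string, CostTier> = {
  grep: "FREE",
  glob: "FREE",
  read: "FREE",
  lsp_find_references: "FREE",
  explore: "CHEAP",
  librarian: "CHEAP",
  oracle: "EXPENSIVE",
  websearch: "EXPENSIVE",
};

// Assumption: each phase unlocks one more tier than the previous one.
const MAX_TIER_BY_PHASE: Record<Phase, CostTier> = {
  early: "FREE",
  middle: "CHEAP",
  late: "EXPENSIVE",
};

// Returns every tool whose tier is at or below the phase's ceiling.
function toolsForPhase(phase: Phase): string[] {
  const maxRank = TIER_ORDER.indexOf(MAX_TIER_BY_PHASE[phase]);
  return Object.keys(TOOL_TIERS).filter(
    (name) => TIER_ORDER.indexOf(TOOL_TIERS[name]) <= maxRank
  );
}
```

Keeping cheaper tools available in later phases reflects the "escalate only when needed" wording: a later phase widens the set rather than replacing it.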

3. Sufficiency Check Gate

  • Added 5 mandatory checkpoints before proceeding to implementation:
    • Context (3+ sources gathered?)
    • Patterns (existing code patterns understood?)
    • Dependencies (imports/dependencies identified?)
    • Edge Cases (potential issues identified?)
    • Scope (change scope clearly defined?)
  • Requires 80% checkpoint pass rate to proceed
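
With five checkpoints, an 80% pass rate means at least four must pass. A minimal sketch of that gate, assuming each checkpoint is a simple yes/no (the PR states the rule in the prompt; the names `SufficiencyChecks`, `passedCount`, and `canProceed` are hypothetical):

```typescript
// Hypothetical sketch of the sufficiency gate; checkpoint names mirror the list above.
interface SufficiencyChecks {
  context: boolean; // 3+ sources gathered?
  patterns: boolean; // existing code patterns understood?
  dependencies: boolean; // imports/dependencies identified?
  edgeCases: boolean; // potential issues identified?
  scope: boolean; // change scope clearly defined?
}

const REQUIRED_PASS_PERCENT = 80;

// Counts how many of the five checkpoints passed.
function passedCount(checks: SufficiencyChecks): number {
  const results = [
    checks.context,
    checks.patterns,
    checks.dependencies,
    checks.edgeCases,
    checks.scope,
  ];
  return results.filter((passed) => passed).length;
}

// Integer comparison avoids floating-point rounding at the 80% boundary.
function canProceed(checks: SufficiencyChecks): boolean {
  const total = 5;
  return passedCount(checks) * 100 >= REQUIRED_PASS_PERCENT * total;
}
```

Under this reading, one missed checkpoint still allows implementation to start, while two missed checkpoints send the agent back to exploration.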

4. Iteration Limits (Per-Task Guardrails)

  • Fix attempts per task: max 3
  • Same file edits: max 5
  • Consecutive failures: max 2
  • Time on single task: ~15 min
  • Automatic escalation to Oracle or user when limits exceeded
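
These guardrails amount to a set of counters compared against fixed ceilings. The sketch below is an illustration under assumptions: the limits come from the list above, but the PR enforces them through prompt instructions, and the names `TaskCounters`, `LIMITS`, `exceededLimits`, and `shouldEscalate` are hypothetical. How escalation is routed between Oracle and the user is not specified in the description, so the sketch only reports that escalation is due.

```typescript
// Hypothetical sketch of per-task iteration guardrails.
interface TaskCounters {
  fixAttempts: number;
  sameFileEdits: number;
  consecutiveFailures: number;
  elapsedMinutes: number;
}

// Ceilings taken from the PR description (the time cap is approximate there).
const LIMITS: TaskCounters = {
  fixAttempts: 3,
  sameFileEdits: 5,
  consecutiveFailures: 2,
  elapsedMinutes: 15,
};

// Returns the names of every limit the task has gone past.
function exceededLimits(counters: TaskCounters): (keyof TaskCounters)[] {
  const keys = Object.keys(LIMITS) as (keyof TaskCounters)[];
  return keys.filter((key) => counters[key] > LIMITS[key]);
}

// Escalation is due as soon as any single limit is exceeded.
function shouldEscalate(counters: TaskCounters): boolean {
  return exceededLimits(counters).length > 0;
}
```

Treating "max 3" as "escalate on the fourth attempt" is one reading; a stricter variant would use `>=` and escalate as soon as the ceiling is reached.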

Motivation

These changes address common issues:

  • Rushing to implementation without sufficient context
  • Under-exploring (launching fewer parallel searches than needed)
  • Infinite fix loops that waste tokens
  • Jumping to expensive tools when local search would suffice

Testing

  • ✅ TypeCheck passed
  • ✅ Agent tests passed (6 tests)
  • ✅ Build successful

Summary by cubic

Strengthens Sisyphus orchestration with DeepTutor-inspired structure to enforce disciplined exploration and safe implementation. Improves tool selection, reduces wasted tokens, and prevents infinite fix loops.

  • New Features

    • Request type matrix with Min Parallel Calls and tool guidance (Conceptual, Implementation, Debugging, Refactoring, GitHub Work, Ambiguous).
    • Phase-based tool strategy (Early/Middle/Late) with cost tiers (FREE/CHEAP/EXPENSIVE) to favor local tools first.
    • Sufficiency Check gate with 5 checkpoints (Context, Patterns, Dependencies, Edge Cases, Scope); requires 80% to proceed.
    • Iteration guardrails per task (3 fix attempts, 5 same-file edits, 2 consecutive failures, ~15 min cap) with clear escalation paths.
  • Bug Fixes

    • Corrected tool name from lsp_references to lsp_find_references.

Written for commit ac08891. Summary will update on new commits.

- Add Request Type Classification Matrix with min parallel calls enforcement
- Add Tool Selection Strategy by exploration phase (Early/Middle/Late)
- Add Sufficiency Check Gate with 5 checkpoints before implementation
- Add Iteration Limits per task to prevent infinite loops

Inspired by HKUDS/DeepTutor's structured approach to AI tutoring.

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 1 file

Confidence score: 4/5

  • Tool invocation in src/agents/sisyphus.ts uses lsp_references, so the agent can’t call the actual lsp_find_references tool and any reference lookup requests will fail.
  • Despite the naming slip, the rest of the change looks straightforward, so merging should be safe once the tool name is aligned.
  • Pay close attention to src/agents/sisyphus.ts - tool invocation uses lsp_references but available tool is lsp_find_references.
Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/agents/sisyphus.ts">

<violation number="1" location="src/agents/sisyphus.ts:60">
P2: Incorrect tool name: `lsp_references` should be `lsp_find_references` to match the actual tool definition.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

greptile-apps bot commented Jan 5, 2026

Greptile Summary

This PR enhances Sisyphus's orchestration capabilities with structural patterns inspired by DeepTutor to enforce more disciplined exploration and prevent common failure modes like premature implementation and infinite fix loops.

Key Improvements

  • Request Classification Matrix: Added "Min Parallel Calls" column (3-5+ depending on complexity) to prevent under-exploration. Split request types into more granular categories (Conceptual, Implementation, Debugging, Refactoring) with explicit tool strategy guidance for each.

  • Phase-Based Tool Strategy: Introduces cost-aware tool selection (FREE → CHEAP → EXPENSIVE) with clear guidance to exhaust local tools (grep, glob, read, lsp_*) before escalating to agents (explore, librarian) or expensive external resources (oracle, websearch).

  • Sufficiency Check Gate: Adds mandatory 5-checkpoint validation before proceeding to implementation: Context (3+ sources?), Patterns (understood?), Dependencies (identified?), Edge Cases (considered?), Scope (clearly defined?). Requires 80% pass rate to proceed.

  • Iteration Limits: Introduces per-task guardrails to prevent infinite loops: max 3 fix attempts per task, max 5 same-file edits, max 2 consecutive failures, ~15 min time cap. Clear escalation paths to Oracle or user when limits exceeded.

These changes directly address token waste from premature implementation, under-exploration, and infinite fix loops while maintaining backward compatibility with existing prompt structure.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • All changes are prompt engineering improvements with no breaking changes to code structure. Tests pass, typecheck passes, and the changes add guardrails that prevent problematic behavior rather than introducing new logic that could fail
  • No files require special attention

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/agents/sisyphus.ts | Added DeepTutor-inspired structure: request classification matrix with min parallel calls, phase-based tool strategy with cost tiers, sufficiency check gate with 5 checkpoints, and per-task iteration limits with escalation paths |

Sequence Diagram

sequenceDiagram
    participant User
    participant Sisyphus
    participant Tools as Local Tools<br/>(grep/glob/read/lsp)
    participant Agents as Agents<br/>(explore/librarian)
    participant Oracle

    User->>Sisyphus: Request (e.g., "Add feature X")
    
    Note over Sisyphus: Phase 0: Classify Request Type
    Sisyphus->>Sisyphus: Check classification matrix<br/>Type: Implementation → Min 4+ parallel calls
    
    Note over Sisyphus: Phase 2A: Exploration (Early)
    Sisyphus->>Sisyphus: Tool Strategy: Start with FREE tools
    par Parallel Exploration
        Sisyphus->>Tools: grep patterns (FREE)
        Sisyphus->>Tools: glob files (FREE)
        Sisyphus->>Tools: read relevant files (FREE)
        Sisyphus->>Tools: lsp_find_references (FREE)
    end
    Tools-->>Sisyphus: Results
    
    alt If gaps remain (Middle Phase)
        Sisyphus->>Agents: background explore (CHEAP)
        Sisyphus->>Agents: background librarian (CHEAP)
        Agents-->>Sisyphus: Additional context
    end
    
    Note over Sisyphus: Sufficiency Check Gate
    Sisyphus->>Sisyphus: Validate 5 checkpoints:<br/>Context, Patterns, Dependencies,<br/>Edge Cases, Scope
    
    alt Score < 80%
        Sisyphus->>Tools: Continue exploration
    else Score >= 80%
        Note over Sisyphus: Phase 2B: Implementation
        Sisyphus->>Sisyphus: Create detailed todo list
        Sisyphus->>Sisyphus: Track iteration count
        
        loop Max 3 attempts per task
            Sisyphus->>Tools: Implement changes
            Sisyphus->>Tools: lsp_diagnostics
            
            alt Attempt > 3
                Sisyphus->>Oracle: Consult with full context (EXPENSIVE)
                Oracle-->>Sisyphus: Strategic guidance
            end
        end
    end
    
    Note over Sisyphus: Phase 3: Completion
    Sisyphus->>Tools: Final verification
    Sisyphus-->>User: Task complete with evidence

@code-yeongyu (Owner) commented

Sorry, at the moment I'm a bit worried about changing the way Sisyphus behaves. Can you share your experience trying this out?
