feat(sisyphus): Add DeepTutor-inspired structural improvements #504

gtg7784 · 2026-01-05T09:57:11Z

Summary

This PR enhances Sisyphus's orchestration capabilities by incorporating structural patterns inspired by HKUDS/DeepTutor, an AI-based tutoring system with sophisticated multi-agent coordination.

Changes

1. Enhanced Request Type Classification Matrix

- Added a Min Parallel Calls column to enforce thorough exploration
- Split request types into more specific categories: Conceptual, Implementation, Debugging, Refactoring
- Each type now has explicit tool strategy guidance
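
As a rough illustration of how a matrix with a minimum-parallel-calls column could be checked, here is a minimal sketch. The names `REQUEST_MATRIX` and `meetsMinParallelCalls` are hypothetical, and the actual matrix is prompt text in src/agents/sisyphus.ts rather than code. Only the Implementation value (4) appears in this thread; the other numbers are placeholders chosen from the 3-5+ range mentioned in review.

```typescript
// Hypothetical sketch: the real matrix is part of the Sisyphus prompt, not a data structure.
type RequestType = "Conceptual" | "Implementation" | "Debugging" | "Refactoring";

interface MatrixRow {
  // Minimum number of exploration calls to launch in parallel for this request type.
  minParallelCalls: number;
}

// Only Implementation = 4 comes from the PR discussion; the rest are assumed values.
const REQUEST_MATRIX: Record<RequestType, MatrixRow> = {
  Conceptual: { minParallelCalls: 3 },     // assumed
  Implementation: { minParallelCalls: 4 }, // "Min 4+ parallel calls"
  Debugging: { minParallelCalls: 4 },      // assumed
  Refactoring: { minParallelCalls: 5 },    // assumed
};

// Returns true when enough parallel calls were launched for the given request type.
function meetsMinParallelCalls(type: RequestType, launched: number): boolean {
  return launched >= REQUEST_MATRIX[type].minParallelCalls;
}
```

A caller would classify the request first, then compare the number of searches it is about to launch against the row for that type.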

2. Tool Selection Strategy by Phase

- Early phase: Prioritize fast local tools (grep, glob, read)
- Middle phase: Use LSP and AST tools for pattern discovery
- Late phase: Escalate to expensive external tools only when needed
- Added tool cost awareness (FREE → CHEAP → EXPENSIVE)
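
The cost tiers can be pictured as a ceiling that rises with each phase. The sketch below is illustrative only: the tier assigned to each tool follows the review summary in this thread (local tools FREE, explore/librarian CHEAP, oracle/websearch EXPENSIVE), while the phase-to-ceiling mapping and the names `TOOLS` and `toolsForPhase` are assumptions, not the prompt's actual wording.

```typescript
type CostTier = "FREE" | "CHEAP" | "EXPENSIVE";
type Phase = "early" | "middle" | "late";

const TIER_RANK: Record<CostTier, number> = { FREE: 0, CHEAP: 1, EXPENSIVE: 2 };

// Assumed mapping: each phase allows tools up to and including this tier.
const PHASE_CEILING: Record<Phase, CostTier> = {
  early: "FREE",
  middle: "CHEAP",
  late: "EXPENSIVE",
};

// Tool tiers as described in the review summary of this PR.
const TOOLS: ReadonlyArray<{ name: string; tier: CostTier }> = [
  { name: "grep", tier: "FREE" },
  { name: "glob", tier: "FREE" },
  { name: "read", tier: "FREE" },
  { name: "lsp_find_references", tier: "FREE" },
  { name: "explore", tier: "CHEAP" },
  { name: "librarian", tier: "CHEAP" },
  { name: "oracle", tier: "EXPENSIVE" },
  { name: "websearch", tier: "EXPENSIVE" },
];

// Lists the tool names whose cost tier does not exceed the ceiling for the phase.
function toolsForPhase(phase: Phase): string[] {
  const ceiling = TIER_RANK[PHASE_CEILING[phase]];
  return TOOLS.filter((t) => TIER_RANK[t.tier] <= ceiling).map((t) => t.name);
}
```

With this shape, escalation is simply a move to the next phase, which widens the allowed set without ever removing the cheaper tools.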

3. Sufficiency Check Gate

Added 5 mandatory checkpoints before proceeding to implementation:
- Context (3+ sources gathered?)
- Patterns (existing code patterns understood?)
- Dependencies (imports/dependencies identified?)
- Edge Cases (potential issues identified?)
- Scope (change scope clearly defined?)
Requires an 80% checkpoint pass rate (at least 4 of the 5) to proceed
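
To make the threshold concrete, here is a small sketch of the gate as a pure function. The type and function names are hypothetical; in the PR the gate is expressed as prompt instructions, so this only illustrates the arithmetic (with five checkpoints, 80% means at least four must pass).

```typescript
// The five checkpoints named in the PR description.
type Checkpoint = "context" | "patterns" | "dependencies" | "edgeCases" | "scope";

type SufficiencyResult = Record<Checkpoint, boolean>;

const PASS_THRESHOLD_PERCENT = 80;

// Returns true when the share of passed checkpoints reaches the threshold.
// Integer arithmetic avoids floating-point comparison issues.
function passesSufficiencyGate(result: SufficiencyResult): boolean {
  const keys = Object.keys(result) as Checkpoint[];
  const passed = keys.filter((k) => result[k]).length;
  return passed * 100 >= keys.length * PASS_THRESHOLD_PERCENT;
}

// Example: edge cases not yet considered, everything else satisfied (4 of 5).
const example: SufficiencyResult = {
  context: true,
  patterns: true,
  dependencies: true,
  edgeCases: false,
  scope: true,
};
const mayProceed = passesSufficiencyGate(example);
```

If two or more checkpoints fail, the gate returns false and exploration should continue before implementation starts.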

4. Iteration Limits (Per-Task Guardrails)

- Fix attempts per task: max 3
- Same file edits: max 5
- Consecutive failures: max 2
- Time on single task: ~15 min
- Automatic escalation to Oracle or user when limits are exceeded
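
The limits above could be tracked with a handful of counters. The following sketch is an assumption about how such tracking might look if done in code; `TaskCounters` and `exceededLimits` are hypothetical names, and the PR itself enforces these limits through prompt instructions. Which limit escalates to Oracle and which to the user is not specified here, so the function only reports what was exceeded.

```typescript
interface TaskCounters {
  fixAttempts: number;
  editsByFile: Record<string, number>; // file path -> number of edits in this task
  consecutiveFailures: number;
  elapsedMinutes: number;
}

// Limits as listed in the PR description (the time cap is approximate there).
const LIMITS = {
  fixAttempts: 3,
  sameFileEdits: 5,
  consecutiveFailures: 2,
  elapsedMinutes: 15,
} as const;

// Returns the names of all limits the task has gone past; empty means keep going.
function exceededLimits(c: TaskCounters): string[] {
  const exceeded: string[] = [];
  if (c.fixAttempts > LIMITS.fixAttempts) exceeded.push("fixAttempts");
  for (const file of Object.keys(c.editsByFile)) {
    if ((c.editsByFile[file] ?? 0) > LIMITS.sameFileEdits) {
      exceeded.push("sameFileEdits");
      break; // one offending file is enough to trip the limit
    }
  }
  if (c.consecutiveFailures > LIMITS.consecutiveFailures) exceeded.push("consecutiveFailures");
  if (c.elapsedMinutes > LIMITS.elapsedMinutes) exceeded.push("elapsedMinutes");
  return exceeded;
}
```

A non-empty result would be the signal to stop retrying and escalate instead of attempting another fix.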

Motivation

These changes address common issues:

- Rushing to implementation without sufficient context
- Under-exploring (launching fewer parallel searches than needed)
- Infinite fix loops that waste tokens
- Jumping to expensive tools when local search would suffice

Testing

✅ TypeCheck passed
✅ Agent tests passed (6 tests)
✅ Build successful

Summary by cubic

Strengthens Sisyphus orchestration with DeepTutor-inspired structure to enforce disciplined exploration and safe implementation. Improves tool selection, reduces wasted tokens, and prevents infinite fix loops.

New Features
- Request type matrix with Min Parallel Calls and tool guidance (Conceptual, Implementation, Debugging, Refactoring, GitHub Work, Ambiguous).
- Phase-based tool strategy (Early/Middle/Late) with cost tiers (FREE/CHEAP/EXPENSIVE) to favor local tools first.
- Sufficiency Check gate with 5 checkpoints (Context, Patterns, Dependencies, Edge Cases, Scope); requires 80% to proceed.
- Iteration guardrails per task (3 fix attempts, 5 same-file edits, 2 consecutive failures, ~15 min cap) with clear escalation paths.
Bug Fixes
- Corrected tool name from lsp_references to lsp_find_references.

Written for commit ac08891. Summary will update on new commits.

- Add Request Type Classification Matrix with min parallel calls enforcement
- Add Tool Selection Strategy by exploration phase (Early/Middle/Late)
- Add Sufficiency Check Gate with 5 checkpoints before implementation
- Add Iteration Limits per task to prevent infinite loops

Inspired by HKUDS/DeepTutor's structured approach to AI tutoring.

cubic-dev-ai

1 issue found across 1 file

Confidence score: 4/5

- Tool invocation in src/agents/sisyphus.ts uses lsp_references, so the agent can't call the actual lsp_find_references tool and any reference lookup requests will fail.
- Despite the naming slip, the rest of the change looks straightforward, so merging should be safe once the tool name is aligned.
- Pay close attention to src/agents/sisyphus.ts: the tool invocation uses lsp_references but the available tool is lsp_find_references.
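
A mismatch like this can be caught mechanically by comparing the tool names mentioned in a prompt against the registered ones. The sketch below is a hypothetical check, not something that exists in the repository; `findUnknownLspTools` and the list of known names passed to it are assumptions for illustration.

```typescript
// Scans prompt text for identifiers starting with "lsp_" and returns those
// that are not in the list of known tool names (each reported once).
function findUnknownLspTools(prompt: string, knownTools: readonly string[]): string[] {
  const mentioned = prompt.match(/\blsp_[a-z_]+\b/g) ?? [];
  const unknown: string[] = [];
  for (const name of mentioned) {
    if (knownTools.indexOf(name) === -1 && unknown.indexOf(name) === -1) {
      unknown.push(name);
    }
  }
  return unknown;
}

// Example with the two LSP tool names that appear in this thread.
const unknownTools = findUnknownLspTools(
  "Use lsp_references to find usages, then run lsp_diagnostics.",
  ["lsp_find_references", "lsp_diagnostics"],
);
```

Run as part of the agent tests, a check of this kind would flag `lsp_references` before the prompt ships.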

Prompt for AI agents (all issues)


Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/agents/sisyphus.ts">

<violation number="1" location="src/agents/sisyphus.ts:60">
P2: Incorrect tool name: `lsp_references` should be `lsp_find_references` to match the actual tool definition.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

greptile-apps · 2026-01-05T10:03:21Z

Greptile Summary

This PR enhances Sisyphus's orchestration capabilities with structural patterns inspired by DeepTutor to enforce more disciplined exploration and prevent common failure modes like premature implementation and infinite fix loops.

Key Improvements

- Request Classification Matrix: Added "Min Parallel Calls" column (3-5+ depending on complexity) to prevent under-exploration. Split request types into more granular categories (Conceptual, Implementation, Debugging, Refactoring) with explicit tool strategy guidance for each.
- Phase-Based Tool Strategy: Introduces cost-aware tool selection (FREE → CHEAP → EXPENSIVE) with clear guidance to exhaust local tools (grep, glob, read, lsp_*) before escalating to agents (explore, librarian) or expensive external resources (oracle, websearch).
- Sufficiency Check Gate: Adds mandatory 5-checkpoint validation before proceeding to implementation: Context (3+ sources?), Patterns (understood?), Dependencies (identified?), Edge Cases (considered?), Scope (clearly defined?). Requires 80% pass rate to proceed.
- Iteration Limits: Introduces per-task guardrails to prevent infinite loops: max 3 fix attempts per task, max 5 same-file edits, max 2 consecutive failures, ~15 min time cap. Clear escalation paths to Oracle or user when limits are exceeded.

These changes directly address token waste from premature implementation, under-exploration, and infinite fix loops while maintaining backward compatibility with existing prompt structure.

Confidence Score: 5/5

- This PR is safe to merge with minimal risk.
- All changes are prompt engineering improvements with no breaking changes to code structure. Tests pass, typecheck passes, and the changes add guardrails that prevent problematic behavior rather than introducing new logic that could fail.
- No files require special attention.

Important Files Changed

Filename	Overview
src/agents/sisyphus.ts	Added DeepTutor-inspired structure: request classification matrix with min parallel calls, phase-based tool strategy with cost tiers, sufficiency check gate with 5 checkpoints, and per-task iteration limits with escalation paths

Sequence Diagram

sequenceDiagram
    participant User
    participant Sisyphus
    participant Tools as Local Tools<br/>(grep/glob/read/lsp)
    participant Agents as Agents<br/>(explore/librarian)
    participant Oracle

    User->>Sisyphus: Request (e.g., "Add feature X")
    
    Note over Sisyphus: Phase 0: Classify Request Type
    Sisyphus->>Sisyphus: Check classification matrix<br/>Type: Implementation → Min 4+ parallel calls
    
    Note over Sisyphus: Phase 2A: Exploration (Early)
    Sisyphus->>Sisyphus: Tool Strategy: Start with FREE tools
    par Parallel Exploration
        Sisyphus->>Tools: grep patterns (FREE)
        Sisyphus->>Tools: glob files (FREE)
        Sisyphus->>Tools: read relevant files (FREE)
        Sisyphus->>Tools: lsp_find_references (FREE)
    end
    Tools-->>Sisyphus: Results
    
    alt If gaps remain (Middle Phase)
        Sisyphus->>Agents: background explore (CHEAP)
        Sisyphus->>Agents: background librarian (CHEAP)
        Agents-->>Sisyphus: Additional context
    end
    
    Note over Sisyphus: Sufficiency Check Gate
    Sisyphus->>Sisyphus: Validate 5 checkpoints:<br/>Context, Patterns, Dependencies,<br/>Edge Cases, Scope
    
    alt Score < 80%
        Sisyphus->>Tools: Continue exploration
    else Score >= 80%
        Note over Sisyphus: Phase 2B: Implementation
        Sisyphus->>Sisyphus: Create detailed todo list
        Sisyphus->>Sisyphus: Track iteration count
        
        loop Max 3 attempts per task
            Sisyphus->>Tools: Implement changes
            Sisyphus->>Tools: lsp_diagnostics
            
            alt Attempt > 3
                Sisyphus->>Oracle: Consult with full context (EXPENSIVE)
                Oracle-->>Sisyphus: Strategic guidance
            end
        end
    end
    
    Note over Sisyphus: Phase 3: Completion
    Sisyphus->>Tools: Final verification
    Sisyphus-->>User: Task complete with evidence

code-yeongyu · 2026-01-05T13:51:44Z

Sorry, at the moment I'm a bit worried about changing the way Sisyphus behaves. Can you share your experience trying this out?

fix(sisyphus): correct tool name lsp_references → lsp_find_references

ac08891
