ADR-00XX: Add Cursor-compatible MDC rule ingestion to goagent
Context. Cursor rules are stored under .cursor/rules as .mdc files with YAML front matter keys like description, globs, and alwaysApply. These keys define three inclusion modes: Always, Auto-Attached (when files matching globs are referenced), and Agent-Requested. The rule file body contains the actual guidance text.
Decision. Implement a rules loader that scans .goagent/rules and .cursor/rules by default, parses .mdc (and .md for convenience) with YAML front matter, and builds a normalized ruleset. At run time, goagent attaches rules in this order: 1) all alwaysApply: true; 2) rules whose globs match any “referenced file” for the invocation; 3) rules explicitly requested via CLI flag or prompt directive (Agent-Requested). Front matter is preserved and exposed to the model for transparency/debugging. This mirrors Cursor semantics while fitting a non-IDE CLI.
Options considered. (a) Ignore globs and always attach: simplest but noisy. (b) Attach by scanning repo for any glob match: over-inclusive on large repos. (c) Attach only when referenced files intersect globs: closest to Cursor “Auto-Attached” and the option chosen. Community write-ups and forum posts reinforce the front-matter contract and .cursor/rules location.
Rationale. Deterministic ordering + minimal surface area keeps security and reproducibility high. Using two default roots supports project-local rules (.cursor/rules) and goagent-specific rules (.goagent/rules). YAML front matter is widely documented for MDC.
Migration. No breaking changes. Scanning the default roots is on by default; opt out with --no-default-rules, or point the loader at other roots with --rules-dir.
High-level design
flowchart TD
A[Start invocation] --> B[Collect referenced files]
B --> C[Scan rule roots\ndefault: .goagent/rules, .cursor/rules]
C --> D[Parse *.mdc/*.md\nsplit YAML front matter\n+ body markdown]
D --> E[Normalize Rule objects\nname, path, fm: description, globs, alwaysApply, ..., body]
E --> F[Select rules]
F -->|alwaysApply: true| G[Include]
F -->|globs ∩ referenced != ∅| G
F -->|agentRequested list| G
G --> H[Stable sort by sourceRoot, relPath]
H --> I[Compose model context blocks\nfront matter summary + body]
I --> J[Call model]
Public interfaces (Go)
Rule file struct and loader contract.
type RuleFrontMatter struct {
	Description string   `yaml:"description,omitempty"`
	Globs       []string `yaml:"globs,omitempty"` // allow string or list; normalize
	AlwaysApply bool     `yaml:"alwaysApply,omitempty"`
	// Pass-through for future keys (kept in Extras):
	Extras map[string]any `yaml:"-"`
}

type Rule struct {
	Name       string // filename sans extension
	SourceRoot string // ".goagent/rules" or ".cursor/rules"
	RelPath    string // relative path under SourceRoot
	Front      RuleFrontMatter
	Body       string // markdown text after front matter
}

type RuleResolveInput struct {
	RuleDirs        []string // roots to scan
	ReferencedFiles []string // absolute or repo-relative
	AgentRequested  []string // rule names or glob of rule paths
}

type RuleSet struct {
	All     []Rule
	Used    []Rule
	Skipped []Rule
}

type RuleLoader interface {
	Load(ctx context.Context, in RuleResolveInput) (RuleSet, error)
}
CLI flags and env.
--rules-dir PATH (repeatable); --no-default-rules; --rules-include NAME_OR_PATH (repeatable, agent-requested); --files PATH (repeatable, referenced files). Env mirrors: GOAGENT_RULES_DIRS, GOAGENT_RULES_INCLUDE. Deterministic precedence: flag > env > default roots.
Front matter parsing. Detect initial ---…---. Parse with yaml.v3. Accept globs as string or list; normalize to []string. Preserve unknown keys into Extras. If no front matter, treat whole file as Body with zeroed Front.
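The splitting step described above could look roughly like this. It is a stdlib-only sketch: splitFrontMatter is a hypothetical name, and treating an unterminated fence as "no front matter" is an assumption rather than something this ADR specifies. The returned front string would then be handed to the YAML decoder.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFrontMatter separates a leading "---" fenced block from the body.
// It strips a UTF-8 BOM and normalizes CRLF to LF first. When the file has
// no opening fence, or the fence is never closed (assumption), ok is false
// and the whole normalized input is returned as body.
func splitFrontMatter(src []byte) (front, body string, ok bool) {
	s := strings.TrimPrefix(string(src), "\uFEFF")
	s = strings.ReplaceAll(s, "\r\n", "\n")
	lines := strings.Split(s, "\n")
	if strings.TrimRight(lines[0], " \t") != "---" {
		return "", s, false
	}
	for i := 1; i < len(lines); i++ {
		if strings.TrimRight(lines[i], " \t") == "---" {
			front = strings.Join(lines[1:i], "\n")
			body = strings.Join(lines[i+1:], "\n")
			return front, body, true
		}
	}
	return "", s, false // unterminated fence
}

func main() {
	src := []byte("---\ndescription: demo\nalwaysApply: true\n---\nRule body.\n")
	front, body, ok := splitFrontMatter(src)
	fmt.Printf("ok=%v\nfront=%q\nbody=%q\n", ok, front, body)
}
```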
Glob matching. Use github.com/bmatcuk/doublestar/v4 for ** patterns against repo-relative paths. A rule is “auto-attached” when any normalized ReferencedFiles matches any globs. This mimics Cursor’s “Auto Attached” rule type.
Stable ordering. Sort Used by (SourceRoot, RelPath) to ensure deterministic context.
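The ordering rule can be sketched with sort.SliceStable. The Rule type here is a trimmed-down stand-in and sortRules is a hypothetical helper. One consequence worth noting: a plain byte-wise comparison places .cursor/rules before .goagent/rules, which is the reverse of the order in which the default roots are listed.

```go
package main

import (
	"fmt"
	"sort"
)

// Rule keeps only the two fields that take part in ordering.
type Rule struct {
	SourceRoot string
	RelPath    string
}

// sortRules orders rules by SourceRoot, then RelPath, comparing bytes.
func sortRules(rules []Rule) {
	sort.SliceStable(rules, func(i, j int) bool {
		if rules[i].SourceRoot != rules[j].SourceRoot {
			return rules[i].SourceRoot < rules[j].SourceRoot
		}
		return rules[i].RelPath < rules[j].RelPath
	})
}

func main() {
	rules := []Rule{
		{".goagent/rules", "b.mdc"},
		{".cursor/rules", "z.mdc"},
		{".cursor/rules", "a.mdc"},
	}
	sortRules(rules)
	for _, r := range rules {
		fmt.Println(r.SourceRoot, r.RelPath)
	}
}
```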
Security & limits. Max rule file size (default 256 KiB per file), max included rules (default 64), and total token budget guardrails (truncate bodies with an ellipsis when budget exceeded). No code execution. Reject symlink escapes that traverse outside the rule root.
Minimal example: coding-standards.mdc
---
description: "Enforce our Go standards: go fmt, go vet, golangci-lint, error wrapping with %w"
globs:
- "**/*.go"
alwaysApply: false
---
When writing Go code:
1) Run `go fmt` and `go vet`.
2) Use `golangci-lint` locally before commit.
3) Wrap errors with `%w` and return context.
4) Prefer `context.Context` as the first param in exported funcs.
This front matter + markdown body matches the documented MDC shape (YAML front matter keys + rule content).
Test plan (deterministic, local)
1) Loader parses front matter variants: only alwaysApply, only globs, both, none. Mixed string/list for globs. DoD: table tests cover all variants; golden Rule structs equal expected.
2) Directory resolution: defaults to .goagent/rules and .cursor/rules; flags override; env respected. DoD: tests assert scan order, duplicates resolved by stable sort, and .mdc + .md both accepted.
3) Glob attach: provide --files list; assert only matching rules are selected; doublestar semantics verified (**/*.go, negations unsupported in v1). DoD: tests pass with edge cases (nested dirs, Windows paths normalized).
4) Agent-requested: --rules-include coding-standards includes the rule even without globs/alwaysApply. DoD: test asserts inclusion.
5) Limits: enforce file size and count caps; assert truncation markers when token budget exceeded (budget simulated).
6) Security: symlink traversal test, where a symlinked file outside the root is skipped. DoD: test asserts skip with warning.
7) Ordering: assert stable order across runs and filesystems.
8) Error reporting: malformed YAML produces a precise error (file, line, col) without aborting entire load (soft-fail with skip). DoD: unit test inspects aggregated errors.
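The "mixed string/list for globs" case from the first test item could be driven by a table like the one in this sketch. normalizeGlobs is a hypothetical helper that works on the value a YAML decoder yields when decoding into any (a string or a []any). Treating a string as a single pattern, without comma splitting, is an assumption.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// normalizeGlobs converts a decoded globs value into []string.
// A string is kept as one pattern (assumption: no comma splitting).
func normalizeGlobs(v any) ([]string, error) {
	switch t := v.(type) {
	case nil:
		return nil, nil
	case string:
		if t = strings.TrimSpace(t); t == "" {
			return nil, nil
		}
		return []string{t}, nil
	case []any:
		out := make([]string, 0, len(t))
		for _, e := range t {
			s, ok := e.(string)
			if !ok {
				return nil, fmt.Errorf("globs: expected string, got %T", e)
			}
			if s = strings.TrimSpace(s); s != "" {
				out = append(out, s)
			}
		}
		return out, nil
	default:
		return nil, fmt.Errorf("globs: unsupported type %T", v)
	}
}

func main() {
	cases := []struct {
		name string
		in   any
		want []string
	}{
		{"absent", nil, nil},
		{"string", "**/*.go", []string{"**/*.go"}},
		{"list", []any{"**/*.go", "docs/**"}, []string{"**/*.go", "docs/**"}},
	}
	for _, c := range cases {
		got, err := normalizeGlobs(c.in)
		pass := err == nil && reflect.DeepEqual(got, c.want)
		fmt.Printf("%s: pass=%v\n", c.name, pass)
	}
}
```

In the real test suite the same table would live in a _test.go file and call t.Errorf instead of printing.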
Implementation checklist (GitHub-style)
- Add module internal/mdc with a Loader implementing RuleLoader; accept defaults .goagent/rules and .cursor/rules; DoD: go test ./... green with unit tests for parsing and selection, and README in internal/mdc documents usage.
- Implement front-matter splitter: read bytes, detect first ---\n block, parse YAML between ---…---, remainder is body; support CRLF; DoD: tests cover no front matter, malformed fence, BOM handling.
- YAML parsing with yaml.v3: map known keys to struct; unknown keys into Extras; normalize globs to []string; DoD: tests verify normalization and extras preservation.
- Glob matcher via doublestar/v4: normalize ReferencedFiles to repo-relative paths; DoD: tests confirm ** behavior and case sensitivity on Unix.
- CLI integration: add flags --rules-dir, --no-default-rules, --rules-include, --files; env overrides; update help/README; DoD: argument precedence tests pass and README shows examples.
- Deterministic sort and caps: sort by (SourceRoot, RelPath); enforce MaxRules, MaxRuleBytes; DoD: tests verify deterministic selection and capping behavior.
- Context composer: prepend a compact “header” summarizing front matter (description + inclusion reason) then the body; DoD: snapshot test of final message array includes selected rules in order with headers.
- Telemetry/logging: debug logs show why each rule was included or skipped; DoD: unit test uses logger hook to assert messages.
- Sample rules & docs: add docs/rules/ with 2–3 example .mdc files and guidance on alwaysApply vs globs vs agent-requested; DoD: docs rendered and linked from main README.
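The context-composer item, together with the inclusion reasons it reports, might be sketched as follows. Everything here is illustrative: the Rule type is a trimmed-down stand-in, GlobMatched is assumed to be computed by the glob step, and the header layout is hypothetical since this ADR does not fix one.

```go
package main

import (
	"fmt"
	"strings"
)

// Rule is a reduced stand-in for the ADR's Rule type.
type Rule struct {
	Name        string
	Description string
	AlwaysApply bool
	GlobMatched bool // result of the glob check, computed elsewhere
	Body        string
}

// inclusionReason returns why a rule is attached, or "" if it is skipped.
// The checks follow the order in the Decision section:
// alwaysApply, then glob match, then explicit request.
func inclusionReason(r Rule, requested map[string]bool) string {
	switch {
	case r.AlwaysApply:
		return "alwaysApply"
	case r.GlobMatched:
		return "auto-attached"
	case requested[r.Name]:
		return "agent-requested"
	}
	return ""
}

// compose renders each included rule as a compact header plus its body.
func compose(rules []Rule, requested map[string]bool) string {
	var b strings.Builder
	for _, r := range rules {
		reason := inclusionReason(r, requested)
		if reason == "" {
			continue // skipped rules contribute nothing to the context
		}
		fmt.Fprintf(&b, "## rule: %s (%s)\n", r.Name, reason)
		if r.Description != "" {
			fmt.Fprintf(&b, "description: %s\n", r.Description)
		}
		b.WriteString(strings.TrimSpace(r.Body))
		b.WriteString("\n\n")
	}
	return b.String()
}

func main() {
	rules := []Rule{
		{Name: "coding-standards", Description: "Go standards", GlobMatched: true, Body: "Run go vet.\n"},
		{Name: "unused", Body: "Never attached.\n"},
	}
	fmt.Print(compose(rules, map[string]bool{}))
}
```

Returning the reason as a value also gives the logging item something concrete to record for each included or skipped rule.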
Notes on semantics vs Cursor
Cursor documents .mdc files with YAML front matter (description, globs, alwaysApply) and applies them as Always, Auto-Attached, or Agent-Requested. We mirror this mapping while exposing agent-requested selection via --rules-include for a non-IDE workflow. If later we want parity with Cursor’s “Agent Requested by AI,” we can allow the model to return a tool call that names rules to include in a second pass.
Possible follow-up: draft a small JSON schema for the front matter and add a linter subcommand that validates .mdc files before runs.