Open
Description
(Starlark is designed for hermetic, deterministic execution; `go.starlark.net/starlark` executes from a string with no FS/net/clock unless you expose it.)
- Implement tool contract and handler for `code.sandbox.starlark.run` that executes source from memory: create `internal/tools/starlarkrun/handler.go` exposing `Name() string { return "code.sandbox.starlark.run" }` and `Call(ctx, json.RawMessage) (ToolResult, error)`, which accepts `{source:string,input:string,limits:{wall_ms:int,output_kb:int},caps:{}}`; parse JSON, run Starlark with `ExecFile` using a `Thread` and `predeclared` functions `read_input()->str` and `emit(str)`; enforce wall-time via `context.WithTimeout` (the interpreter does not observe `ctx` on its own, so cancel the `Thread` when the context expires) and cap emitted bytes with a bounded buffer; return `{stdout:string}`; DoD: unit test passes showing a script `emit(read_input())` returns input, times out on `while True: pass`, and output > limit is truncated with a clear error. (Note: go.starlark.net rejects `while` loops and top-level loops unless the dialect options enable them, so the timeout test must enable them or use a long-running loop inside a function.)
- Add dependency and wiring: append `require go.starlark.net vX` to `go.mod`, create `internal/tools/starlarkrun/module.go` registering the tool into the tool registry (constructor + dependency-free init), and update the `README.md` “Tools” table with usage and example input/output; DoD: `go build ./...` succeeds locally, registry lists the tool, README shows a runnable curl example.
- Harden capabilities (deny-by-default): ensure only `emit` and `read_input` are available (no FS, net, clock); do not bind any `os`, `time`, or custom builtins; add negative tests that attempt to import or access such capabilities and expect failure; DoD: tests demonstrate no ambient side effects are reachable and only declared builtins exist.
- Determinism test: add a table test running the same `source` + `input` 100× and asserting identical output and errors; DoD: flaky rate 0/100 locally; test name includes “deterministic”; comment cites Starlark determinism; tests green locally; DoD documented in test.
- Structured errors & shared schema: return standardized errors `{code:string,message:string,details?:object}` for timeouts (`TIMEOUT`), output limit exceeded (`OUTPUT_LIMIT`), and evaluation failures (`EVAL_ERROR`); update shared error schema doc and example in README; DoD: unit tests assert JSON shape for each failure mode and README section is present.
- Observability: add structured logs (trace id, tool name, wall_ms, bytes_out) and emit OpenTelemetry span attributes; DoD: local run shows JSON logs with those fields and a span named `tools.starlark.run`.
- Contract examples: add `docs/interfaces/code.sandbox.starlark.run.md` with request/response examples (valid, timeout, error), security notes, and performance caveats; DoD: doc renders and is linked from main docs.