Skip to content

Conversation

@cristianoc
Copy link
Collaborator

Overview

  • The current change teaches the analysis server’s go-to-definition path to surface usable locations for module aliases that ultimately resolve into the shipped stdlib. It does that by introducing a fallbackLocFromSource helper in analysis/src/References.ml (lines 302-347) that scans the relevant .res file with a regexp when the location stored in the cmt is ghost or points at an interface stub.
  • This was enough to fix the user-facing “jump to definition lands on (0, 0) for stdlib symbols” regression and makes the new alias repro in tests/analysis_tests/tests/src/Definition.res pass.

Why the regexp fallback is a hack

  1. We are bypassing semantic data:

    • The fallback ignores the typed/compiled location metadata entirely and instead trawls the textual source to guess where the definition starts.
    • Any divergence between the compiled shape and surface syntax (preprocessor rewrites, alternate formatting, code generated by macros) will cause the heuristic to either miss definitions or return the wrong span.
  2. The heuristic only knows about let and type:

    • The helper hardcodes prefixes for value and type declarations. Constructors, module aliases, external, class, exception, etc. are invisible to it.
    • Even for let, it assumes the name immediately follows the keyword and accepts at most an optional rec. Patterns such as destructuring, attribute annotations, or inline comments break the match.
  3. It assumes the runtime is available as source:

    • The stdlib can be shipped as .cmi + .cmj artifacts without an accessible .res snapshot. In those environments Files.readFile returns None, so we fall straight back to a ghost location.
  4. Concurrency and perf concerns:

    • We synchronously read large source files on the definition hot path every time an alias hops into the stdlib. There is no caching or memoisation, so repeated lookups thrash the file system.
  5. It is brittle with formatting details:

    • A single indentation tweak or switching between spaces and tabs can move the match around or prevent it from triggering entirely. We have implicitly coupled navigation correctness to the exact printing style of runtime/ sources.

Direction for a proper implementation

  1. Make the build preserve definition locations

    • Investigate why the stdlib .cmi produced by runtime/Makefile drops location spans even when -keep-locs is threaded through the CLI.
    • Audit the dune rules (e.g. runtime/dune, the rescript bootstrap pipeline) to confirm whether -bin-annot or any stripping step erases locations during rescript to .cmj/.cmi compilation.
    • Ideally we ensure the emitted .cmt[cmi] files already contain correct Location.t entries and stop relying on downstream heuristics.
  2. Extend the analysis format to ship explicit aliases

    • If keeping locs is impossible for legacy reasons, teach the analysis’ ProcessCmt layer to emit synthetic source spans for alias definitions during extraction time. We already walk Declared.t records; we could rehydrate the location from the original AST before desugaring.
  3. Cache and index runtime metadata

    • Build a precomputed index for the stdlib containing modulePath → location tuples baked into the analysis bundle. This can be generated once during packaging and read by the server in O(1) at runtime, avoiding regex scans entirely.
  4. Treat module aliases separately

    • Instead of falling back to textual search, thread enough information through resolveModuleReference so that when we hit a GlobalMod, we can surface the alias site together with the resolved file and stamp. That preserves user expectations (“go to definition shows me the implementation”) while staying within typed data.
  5. Testing and regression coverage

    • Add targeted tests at each layer (runtime build artifact inspection, analysis extraction, go-to-definition integration) to ensure we do not regress location fidelity again.
    • Document the required compiler flags (-keep-locs, -bin-annot) in runtime/README.md or Makefile comments so future maintainers do not unknowingly strip locs.

Extremely detailed change log for the PR

  1. Runtime / build systemunchanged by the patch, but the hack shows we need to revisit how the stdlib is compiled and packaged. The investigation proved that the shipped .cmi lacks location data even when the compiler is invoked with -keep-locs.
  2. Analysis server (OCaml)
    • analysis/src/References.ml
      • Added isGhostLoc and fallbackLocFromSource. The latter compiles a regexp based on the tip (only let/type) and scans the runtime source file to synthesize a non-ghost Location.t.
      • Updated resolveModuleDefinition to return the alias site when resolving to GlobalMod, preventing (0, 0) results for Stdlib modules.
      • Adjusted definition to invoke the fallback whenever the best location is ghost or points at an interface.
  3. Analysis integration tests
    • tests/analysis_tests/tests/src/Definition.res now contains a StdlibAlias module aliasing Stdlib.List.
    • tests/analysis_tests/tests/src/expected/Definition.res.txt asserts that StdlibAlias.List.map resolves to the stdlib implementation span instead of (0, 0).
    • The reanalyze snapshot tests/analysis_tests/tests-reanalyze/deadcode/expected/deadcode.txt gained multiple addValueReference … --> Pervasives.res/Js.res/etc. lines, confirming the alias resolution now attaches real stdlib locations.

Proposed follow-up work

  1. Harden the runtime build to keep location metadata (detailed auditing of the lib target and dune rules).
  2. Replace the regex fallback with a location table extracted from the compiler during stdlib compilation.
  3. Add automated tests that inspect the generated .cmi to ensure we never regress on location fidelity again.

Signed-off-by: Codex Bot <codex@example.com>
addValueReference Hooks.res:10:29 --> Hooks.res:4:12
DeadOptionalArgs.addReferences Stdlib.Int.toString called with optional argNames: argNamesMaybe: Hooks.res:10:62
addValueReference Hooks.res:10:75 --> Hooks.res:5:7
addValueReference Hooks.res:10:62 --> Stdlib_Int.resi:205:0
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something in the right direction is happening here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@send
external toString: (int, ~radix: int=?) => string = "toString"

@cristianoc
Copy link
Collaborator Author

@cknitt @zth I hope this is in a good enough state that I can leave it here to be completed

@cknitt
Copy link
Member

cknitt commented Nov 5, 2025

Investigate why the stdlib .cmi produced by runtime/Makefile drops location spans even when -keep-locs is threaded through the CLI.

Hmm, packages/@rescript/runtime/rescript.json does contain -no-keep-locs. Is that not the problem?

@cristianoc
Copy link
Collaborator Author

Investigate why the stdlib .cmi produced by runtime/Makefile drops location spans even when -keep-locs is threaded through the CLI.

Hmm, packages/@rescript/runtime/rescript.json does contain -no-keep-locs. Is that not the problem?

No

@cristianoc
Copy link
Collaborator Author

Investigate why the stdlib .cmi produced by runtime/Makefile drops location spans even when -keep-locs is threaded through the CLI.

Hmm, packages/@rescript/runtime/rescript.json does contain -no-keep-locs. Is that not the problem?

No

Or at least, changing that does not solve the problem.
Once the problem is solved user-side, as in the example in the PR which does not involve stdlib access at all, one can see if that option is an ADDITIONAL problem.

@cknitt
Copy link
Member

cknitt commented Nov 5, 2025

Not sure I get what you mean by "the example in the PR which does not involve stdlib access at all".

But if I remove the -no-keep-locs flag, I get the exact same test output changes as you are getting in this PR here.

@cristianoc
Copy link
Collaborator Author

Let me remove all of this crap.

@cristianoc cristianoc closed this Nov 5, 2025
@cristianoc cristianoc deleted the fix-stdlib-alias-definition branch November 5, 2025 09:27
@zth
Copy link
Member

zth commented Nov 5, 2025

Ehm, is the fix here then to do what @cknitt said?

@cristianoc
Copy link
Collaborator Author

Ehm, is the fix here then to do what @cknitt said?

The question is: what is the problem

@cristianoc
Copy link
Collaborator Author

Screenshot 2025-11-05 at 16 00 50

@cknitt
Copy link
Member

cknitt commented Nov 5, 2025

Just tried to reproduce the original problem from #8003 and couldn't.

In VS Code, with the latest prerelease of the editor extension installed, hover and navigate to work fine for me with the example given in #8003.

@zth
Copy link
Member

zth commented Nov 7, 2025

Just tried to reproduce the original problem from #8003 and couldn't.

In VS Code, with the latest prerelease of the editor extension installed, hover and navigate to work fine for me with the example given in #8003.

@cknitt try this example:

Console.log("Hello, world!")

let ff = Ok(1)

let x = []->Array.at(1)

let y = []->Array.ignore

For me, go to definition works for Console.log, but not for any of the array functions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants