-
Notifications
You must be signed in to change notification settings - Fork 4
eng-1213-adjustments to markdown: use roam links #649
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
This pull request has been ignored for the connected project Preview Branches by Supabase. |
|
Explanations in https://www.loom.com/share/afa48744aa0441628ec182a2147e8321 |
4d19e5b to
483e769
Compare
78b1bfe to
2384603
Compare
|
Note: Matt's request is found here: |
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
📝 WalkthroughWalkthroughThis PR enhances Roam's markdown export by improving regex escaping in node content transformation, extracting internal text references to nodes from page content, and introducing optional flags (blockRefsAsLinks and blockAnchors) to control block reference and anchor rendering in exported markdown. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Pre-merge checks✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
apps/roam/src/utils/exportUtils.tsapps/roam/src/utils/jsonld.tsapps/roam/src/utils/pageToMarkdown.ts
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/main.mdc)
**/*.{ts,tsx}: Use Tailwind CSS for styling where possible
When refactoring inline styles, use tailwind classes
Prefertypeoverinterfacein TypeScript
Use explicit return types for functions
Avoidanytypes when possible
Prefer arrow functions over regular function declarations
Use named parameters (object destructuring) when a function has more than 2 parameters
Use PascalCase for components and types
Use camelCase for variables and functions
Use UPPERCASE for constants
Function names should describe their purpose clearly
Prefer early returns over nested conditionals for better readability
Files:
apps/roam/src/utils/exportUtils.tsapps/roam/src/utils/jsonld.tsapps/roam/src/utils/pageToMarkdown.ts
apps/roam/**/*.{js,ts,tsx,jsx,json}
📄 CodeRabbit inference engine (.cursor/rules/roam.mdc)
Prefer existing dependencies from package.json when working on the Roam Research extension
Files:
apps/roam/src/utils/exportUtils.tsapps/roam/src/utils/jsonld.tsapps/roam/src/utils/pageToMarkdown.ts
apps/roam/**/*.{ts,tsx,jsx,js,css,scss}
📄 CodeRabbit inference engine (.cursor/rules/roam.mdc)
Use BlueprintJS 3 components and Tailwind CSS for platform-native UI in the Roam Research extension
Files:
apps/roam/src/utils/exportUtils.tsapps/roam/src/utils/jsonld.tsapps/roam/src/utils/pageToMarkdown.ts
apps/roam/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.cursor/rules/roam.mdc)
apps/roam/**/*.{ts,tsx,js,jsx}: Use the roamAlphaApi docs from https://roamresearch.com/#/app/developer-documentation/page/tIaOPdXCj when implementing Roam functionality
Use Roam Depot/Extension API docs from https://roamresearch.com/#/app/developer-documentation/page/y31lhjIqU when implementing extension functionality
Files:
apps/roam/src/utils/exportUtils.tsapps/roam/src/utils/jsonld.tsapps/roam/src/utils/pageToMarkdown.ts
apps/roam/**
📄 CodeRabbit inference engine (.cursor/rules/roam.mdc)
Implement the Discourse Graph protocol in the Roam Research extension
Files:
apps/roam/src/utils/exportUtils.tsapps/roam/src/utils/jsonld.tsapps/roam/src/utils/pageToMarkdown.ts
🧠 Learnings (11)
📓 Common learnings
Learnt from: sid597
Repo: DiscourseGraphs/discourse-graph PR: 372
File: apps/roam/src/components/DiscourseNodeMenu.tsx:116-116
Timestamp: 2025-08-25T15:53:21.799Z
Learning: In apps/roam/src/components/DiscourseNodeMenu.tsx, when handling tag insertion, multiple leading hashtags (like ##foo) should be preserved as they represent user intent, not normalized to a single hashtag. The current regex /^#/ is correct as it only removes one leading # before adding one back, maintaining any additional hashtags the user intended.
📚 Learning: 2025-08-25T15:53:21.799Z
Learnt from: sid597
Repo: DiscourseGraphs/discourse-graph PR: 372
File: apps/roam/src/components/DiscourseNodeMenu.tsx:116-116
Timestamp: 2025-08-25T15:53:21.799Z
Learning: In apps/roam/src/components/DiscourseNodeMenu.tsx, when handling tag insertion, multiple leading hashtags (like ##foo) should be preserved as they represent user intent, not normalized to a single hashtag. The current regex /^#/ is correct as it only removes one leading # before adding one back, maintaining any additional hashtags the user intended.
Applied to files:
apps/roam/src/utils/exportUtils.tsapps/roam/src/utils/jsonld.ts
📚 Learning: 2025-11-23T23:53:43.094Z
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 0
File: :0-0
Timestamp: 2025-11-23T23:53:43.094Z
Learning: In `apps/roam/src/utils/supabaseContext.ts`, the Roam URL normalization uses a targeted string replacement `url.replace("/?server-port=3333#/", "/#/")` rather than URL parsing to avoid impacting unforeseen URL patterns. This conservative approach is intentional to handle only the specific known case.
Applied to files:
apps/roam/src/utils/exportUtils.tsapps/roam/src/utils/jsonld.ts
📚 Learning: 2025-11-25T00:52:41.934Z
Learnt from: CR
Repo: DiscourseGraphs/discourse-graph PR: 0
File: .cursor/rules/roam.mdc:0-0
Timestamp: 2025-11-25T00:52:41.934Z
Learning: Applies to apps/roam/**/*.{ts,tsx,js,jsx} : Use the roamAlphaApi docs from https://roamresearch.com/#/app/developer-documentation/page/tIaOPdXCj when implementing Roam functionality
Applied to files:
apps/roam/src/utils/jsonld.ts
📚 Learning: 2025-11-25T00:52:41.934Z
Learnt from: CR
Repo: DiscourseGraphs/discourse-graph PR: 0
File: .cursor/rules/roam.mdc:0-0
Timestamp: 2025-11-25T00:52:41.934Z
Learning: Applies to apps/roam/**/*.{js,ts,tsx,jsx,json} : Prefer existing dependencies from package.json when working on the Roam Research extension
Applied to files:
apps/roam/src/utils/jsonld.ts
📚 Learning: 2025-06-22T10:40:52.752Z
Learnt from: sid597
Repo: DiscourseGraphs/discourse-graph PR: 232
File: apps/roam/src/utils/getAllDiscourseNodesSince.ts:18-31
Timestamp: 2025-06-22T10:40:52.752Z
Learning: In apps/roam/src/utils/getAllDiscourseNodesSince.ts, the user confirmed that querying for `?title` with `:node/title` and mapping it to the `text` field in the DiscourseGraphContent type is the correct implementation for retrieving discourse node content from Roam Research, despite it appearing to query page titles rather than block text content.
Applied to files:
apps/roam/src/utils/jsonld.tsapps/roam/src/utils/pageToMarkdown.ts
📚 Learning: 2025-11-25T00:52:41.934Z
Learnt from: CR
Repo: DiscourseGraphs/discourse-graph PR: 0
File: .cursor/rules/roam.mdc:0-0
Timestamp: 2025-11-25T00:52:41.934Z
Learning: Applies to apps/roam/**/*.{ts,tsx,js,jsx} : Use Roam Depot/Extension API docs from https://roamresearch.com/#/app/developer-documentation/page/y31lhjIqU when implementing extension functionality
Applied to files:
apps/roam/src/utils/jsonld.ts
📚 Learning: 2025-06-17T23:37:45.289Z
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 220
File: apps/roam/src/utils/conceptConversion.ts:42-56
Timestamp: 2025-06-17T23:37:45.289Z
Learning: In the DiscourseNode interface from apps/roam/src/utils/getDiscourseNodes.ts, the field `type` serves as the unique identifier field, not a type classification field. The interface has no `uid` or `id` field, making `node.type` the correct field to use for UID-related operations.
Applied to files:
apps/roam/src/utils/jsonld.ts
📚 Learning: 2025-06-17T23:37:45.289Z
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 220
File: apps/roam/src/utils/conceptConversion.ts:42-56
Timestamp: 2025-06-17T23:37:45.289Z
Learning: In the DiscourseNode interface from apps/roam/src/utils/getDiscourseNodes.ts, the field `node.type` serves as the UID field rather than having a conventional `node.uid` field. This is an unusual naming convention where the type field actually contains the unique identifier.
Applied to files:
apps/roam/src/utils/jsonld.ts
📚 Learning: 2025-11-05T21:57:14.909Z
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 534
File: apps/roam/src/utils/createReifiedBlock.ts:40-48
Timestamp: 2025-11-05T21:57:14.909Z
Learning: In the discourse-graph repository, the function `getPageUidByPageTitle` from `roamjs-components/queries/getPageUidByPageTitle` is a synchronous function that returns a string directly (the page UID or an empty string if not found), not a Promise. It should be called without `await`.
Applied to files:
apps/roam/src/utils/jsonld.tsapps/roam/src/utils/pageToMarkdown.ts
📚 Learning: 2025-12-07T20:54:20.007Z
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 559
File: apps/roam/src/utils/findDiscourseNode.ts:37-39
Timestamp: 2025-12-07T20:54:20.007Z
Learning: In apps/roam/src/utils/findDiscourseNode.ts, the function findDiscourseNodeByTitleAndUid accepts both uid and title parameters where uid is primarily used for cache access (as the cache key) while title is used for the actual matching via matchDiscourseNode. This design reflects the pattern where downstream, the uid is mostly used to fetch the title, so the function caches by uid but matches by title.
Applied to files:
apps/roam/src/utils/jsonld.tsapps/roam/src/utils/pageToMarkdown.ts
🧬 Code graph analysis (2)
apps/roam/src/utils/jsonld.ts (2)
apps/roam/src/utils/getRelationData.ts (1)
getRelationDataUtil(9-67)apps/roam/src/utils/exportUtils.ts (1)
getPageData(19-67)
apps/roam/src/utils/pageToMarkdown.ts (1)
apps/roam/src/utils/exportUtils.ts (1)
toLink(121-128)
🔇 Additional comments (11)
apps/roam/src/utils/jsonld.ts (4)
110-110: LGTM: Regex pattern correctly captures page UIDs from Roam URLs.The pattern properly captures 9-10 character UIDs from Roam page URLs with word boundaries to prevent partial matches.
115-115: LGTM: Efficient use of Set for internal reference lookup.Using a Set provides O(1) lookup time when filtering textRefersToNode in the subsequent logic.
142-143: LGTM: Enables Roam-style linkification for JSON-LD export.These options align with the PR's secondary goal of making Markdown hyperlinks more legible to LLMs.
167-167: LGTM: Conditional inclusion prevents empty arrays in JSON-LD output.Only adding
textRefersToNodewhen references exist keeps the output clean and follows JSON-LD best practices.apps/roam/src/utils/pageToMarkdown.ts (7)
149-161: LGTM: Improved readability of chained replacements.The reformatting makes the embed and embed-children regex replacements easier to follow without changing functionality.
183-207: LGTM: Wiki link resolution supports Roam-style linkification.The enhanced logic resolves wiki-style
[[links]]to page UIDs whenblockRefsAsLinksis enabled, making links more meaningful to LLMs as described in the PR objectives.
230-230: LGTM: Using Pandoc-style anchors for better markdown compatibility.The
{#block-uid}format follows Pandoc/kramdown conventions rather than Roam's native format. This aligns with the PR's goal of making the export more meaningful to LLMs and standard markdown processors.
231-231: LGTM: Block anchor placement enables paragraph-level references.Inserting the anchor after the heading prefix creates the pseudo paragraph anchor pattern mentioned in the PR description, allowing block references to combine page + anchor.
247-248: LGTM: Parameters properly threaded with safe defaults.The
blockRefsAsLinksandblockAnchorsparameters are correctly added withfalsedefaults, ensuring backward compatibility while enabling the new functionality.Also applies to: 260-261, 316-317
3-3: BothgetPageUidByBlockUidandgetPageUidByPageTitleare correctly called withoutawaitat lines 167 and 190. Usage patterns confirm both functions are synchronous: at line 169,pageUidis compared directly to strings; at line 191, the.lengthproperty is accessed directly. No issues found.
165-175: Fix error handling forgetPageUidByBlockUidreturn value.The function
getPageUidByBlockUidreturnsundefinedwhen a block is not found, not an empty string. Line 169 checkspageUid === ""which will never be true, allowingundefinedto be concatenated into the link string at line 174, creating a malformed link like"undefined#block-xxx". Change the condition to checkpageUid === undefinedor use a nullish coalescing operator to handle the failure case correctly.⛔ Skipped due to learnings
Learnt from: maparent Repo: DiscourseGraphs/discourse-graph PR: 534 File: apps/roam/src/utils/createReifiedBlock.ts:40-48 Timestamp: 2025-11-05T21:57:14.909Z Learning: In the discourse-graph repository, the function `getPageUidByPageTitle` from `roamjs-components/queries/getPageUidByPageTitle` is a synchronous function that returns a string directly (the page UID or an empty string if not found), not a Promise. It should be called without `await`.
mdroidian
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets address the coderabbit suggestions (or resolve if unnecessary ), then re-tag for review.
483e769 to
6c772f6
Compare
2384603 to
491aa3e
Compare
6c772f6 to
4aaf3e1
Compare
491aa3e to
ff01286
Compare
4aaf3e1 to
715c55c
Compare
e235bf5 to
03b3755
Compare
mdroidian
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for sharing the Loom — that helps clarify the motivation.
From the video, my understanding is that Matt requested the ability to compute a textRefersToNode field. Could you please link to that original request or discussion, so I can read the context directly?
Separately, the {#block-<uid>} / #block-<uid> anchor pattern appears to be a new convention introduced to make block references linkable and scannable in markdown, even though this is not a Roam pattern. Was this approach discussed with the team beforehand, and what specific problem does it solve compared to using existing Roam block URLs or block UIDs?
Clarifying this will help me understand whether this is intended as a temporary export-only workaround or a longer-term convention we want to support.
c5d3cc0 to
f3e9ded
Compare
95ab75f to
91929e9
Compare
|
Matt's original request was here
Now, that one probably deserves a round of shared design. |
If we keep all of the important context (background, scope, Loom walkthrough, etc.) consolidated in the PR description, it makes reviews significantly easier—for reviewers and for us when we come back to this later. PR comment threads tend to accumulate automated messages and incidental discussion over time, which makes key context harder to find. Treating the PR body as the single source of truth helps keep the review focused and reduces friction. |
|
To clarify, single source of truth was just referring to GitHub comments vs PR body. Stated differently, if you are adding any additional context to a GitHub PR, add it to the body and not as a separate comment. |

The primary goal is to note inter-node references in text. We extract this as an extra json-ld attribute.
The secondary goal is to make markdown hyperlinks more legible to LLMs.
For that purpose, we use roam-url linkification, so the links match the node identities.
We also introduce a (pseudo) paragraph anchor, so that block references can refer to the combination of the page and the anchor. This does not fit roam patterns, but should again be more meaningful to LLMs.
These changes are currently applied only to the json-ld export.
There is an unrelated change to the regexp escaping that was suggested to coderabbit.
Summary by CodeRabbit
Bug Fixes
New Features
✏️ Tip: You can customize this high-level summary in your review settings.