-
Notifications
You must be signed in to change notification settings - Fork 4
ENG-1230 Backlinks in JSON-LD export #659
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
This pull request has been ignored for the connected project Preview Branches by Supabase. |
|
@CodeRabbit review |
✅ Actions performedReview triggered.
|
📝 WalkthroughWalkthroughThe changes introduce a backlink mechanism to the discourse graph export system. The JSON-LD export now queries Roam for backlinks, constructs nodes asynchronously with backlink data, and integrates the new dg_core ontology schema. The dg_base schema removes the textRefersToNode property, while dg_core adds containment and backlink vocabulary terms. Changes
Sequence Diagram(s)sequenceDiagram
participant Exp as Export Process
participant Roam as Roam Database
participant Node as Node Constructor
participant Graph as JSON-LD Graph
Exp->>Roam: Query pages for export
Roam-->>Exp: Return page data
loop For each page
Exp->>Exp: Convert page to Markdown (pageToMarkdown)
Exp->>Roam: Query backlinks for page UID
Roam-->>Exp: Return backlink page UIDs
Exp->>Node: Construct node with metadata
Node->>Node: Resolve node type via nodeSchemaUriByName
Note over Node: Add fields: `@id`, `@type`, title,<br/>content, modified, created, creator
alt Has backlinks
Node->>Node: Format backlinks as pages:uid array
end
Node-->>Exp: Return constructed node
Exp->>Exp: Update export progress
end
Exp->>Exp: Query relation data (getRelationData)
Exp->>Graph: Assemble final JSON-LD with nodes & relations
Graph-->>Exp: Complete graph
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Pre-merge checks✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
apps/roam/src/utils/jsonld.ts (1)
109-111: Fix type mismatch: Return type doesn't allow string arrays.The return type on line 110 specifies
Record<string, string>[]for the@graphfield, but thebacklinksproperty (line 179) is typed asstring[]. This causes the compilation error reported by the pipeline.🔎 Proposed fix for return type
}): Promise< - Record<string, string | Record<string, string> | Record<string, string>[]> + Record<string, string | string[] | Record<string, string> | Record<string, string | string[]>[]> > => {This allows
@graphentries to have bothstringandstring[]values, which aligns with the backlinks implementation.Also applies to: 169-169, 203-203
apps/website/public/schema/dg_core.ttl (1)
1-11: Add missing dct: prefix declaration.Lines 103 and 106 reference
dct:hasPartanddct:references(Dublin Core Terms), but thedct:prefix is not declared. This will cause RDF parsing errors.🔎 Proposed fix
Add the Dublin Core Terms prefix declaration after line 4:
@prefix dc: <http://purl.org/dc/elements/1.1/> . +@prefix dct: <http://purl.org/dc/terms/> . @prefix owl: <http://www.w3.org/2002/07/owl#> .Also applies to: 103-103, 106-106
🧹 Nitpick comments (1)
apps/roam/src/utils/jsonld.ts (1)
150-154: Remove unreachable fallback operator.The nullish coalescing operator
?? "nodeSchema"on line 171 is unreachable becauseinternalErrorthrows an exception whennodeTypeis falsy (lines 150-154). The fallback value will never be used.🔎 Proposed fix
"@type": nodeType ?? "nodeSchema", // eslint-disable-line @typescript-eslint/naming-convention + "@type": nodeType, // eslint-disable-line @typescript-eslint/naming-conventionAlso applies to: 171-171
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
apps/roam/src/utils/jsonld.tsapps/website/public/schema/dg_base.ttlapps/website/public/schema/dg_core.ttl
💤 Files with no reviewable changes (1)
- apps/website/public/schema/dg_base.ttl
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/main.mdc)
**/*.{ts,tsx}: Use Tailwind CSS for styling where possible
When refactoring inline styles, use tailwind classes
Prefertypeoverinterfacein TypeScript
Use explicit return types for functions
Avoidanytypes when possible
Prefer arrow functions over regular function declarations
Use named parameters (object destructuring) when a function has more than 2 parameters
Use PascalCase for components and types
Use camelCase for variables and functions
Use UPPERCASE for constants
Function names should describe their purpose clearly
Prefer early returns over nested conditionals for better readability
Files:
apps/roam/src/utils/jsonld.ts
apps/roam/**/*.{js,ts,tsx,jsx,json}
📄 CodeRabbit inference engine (.cursor/rules/roam.mdc)
Prefer existing dependencies from package.json when working on the Roam Research extension
Files:
apps/roam/src/utils/jsonld.ts
apps/roam/**/*.{ts,tsx,jsx,js,css,scss}
📄 CodeRabbit inference engine (.cursor/rules/roam.mdc)
Use BlueprintJS 3 components and Tailwind CSS for platform-native UI in the Roam Research extension
Files:
apps/roam/src/utils/jsonld.ts
apps/roam/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.cursor/rules/roam.mdc)
apps/roam/**/*.{ts,tsx,js,jsx}: Use the roamAlphaApi docs from https://roamresearch.com/#/app/developer-documentation/page/tIaOPdXCj when implementing Roam functionality
Use Roam Depot/Extension API docs from https://roamresearch.com/#/app/developer-documentation/page/y31lhjIqU when implementing extension functionality
Files:
apps/roam/src/utils/jsonld.ts
apps/roam/**
📄 CodeRabbit inference engine (.cursor/rules/roam.mdc)
Implement the Discourse Graph protocol in the Roam Research extension
Files:
apps/roam/src/utils/jsonld.ts
apps/website/**
📄 CodeRabbit inference engine (.cursor/rules/website.mdc)
Prefer existing dependencies from apps/website/package.json when adding dependencies to the website application
Files:
apps/website/public/schema/dg_core.ttl
🧠 Learnings (3)
📓 Common learnings
Learnt from: CR
Repo: DiscourseGraphs/discourse-graph PR: 0
File: .cursor/rules/roam.mdc:0-0
Timestamp: 2025-11-25T00:52:41.934Z
Learning: Applies to apps/roam/** : Implement the Discourse Graph protocol in the Roam Research extension
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 189
File: packages/database/supabase/migrations/20250603144146_account_centric.sql:50-63
Timestamp: 2025-06-04T11:41:34.951Z
Learning: In the discourse-graph database, all accounts currently stored are Roam platform accounts, making platform-specific migration logic safe for global operations.
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 220
File: apps/roam/src/utils/conceptConversion.ts:11-40
Timestamp: 2025-06-23T11:49:45.457Z
Learning: In the DiscourseGraphs/discourse-graph codebase, direct access to `window.roamAlphaAPI` is the established pattern throughout the codebase. The team prefers to maintain this pattern consistently rather than making piecemeal changes, and plans to address dependency injection as a global refactor when scheduled.
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 165
File: packages/database/schema.yaml:116-121
Timestamp: 2025-05-20T03:06:16.600Z
Learning: In the discourse-graph project's LinkML schema (packages/database/schema.yaml), attributes and slots are equivalent constructs. Items can be defined either as slots or attributes without needing to duplicate them in both sections.
📚 Learning: 2025-06-22T10:40:52.752Z
Learnt from: sid597
Repo: DiscourseGraphs/discourse-graph PR: 232
File: apps/roam/src/utils/getAllDiscourseNodesSince.ts:18-31
Timestamp: 2025-06-22T10:40:52.752Z
Learning: In apps/roam/src/utils/getAllDiscourseNodesSince.ts, the user confirmed that querying for `?title` with `:node/title` and mapping it to the `text` field in the DiscourseGraphContent type is the correct implementation for retrieving discourse node content from Roam Research, despite it appearing to query page titles rather than block text content.
Applied to files:
apps/roam/src/utils/jsonld.ts
📚 Learning: 2025-11-05T21:57:14.909Z
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 534
File: apps/roam/src/utils/createReifiedBlock.ts:40-48
Timestamp: 2025-11-05T21:57:14.909Z
Learning: In the discourse-graph repository, the function `getPageUidByPageTitle` from `roamjs-components/queries/getPageUidByPageTitle` is a synchronous function that returns a string directly (the page UID or an empty string if not found), not a Promise. It should be called without `await`.
Applied to files:
apps/roam/src/utils/jsonld.ts
🧬 Code graph analysis (1)
apps/roam/src/utils/jsonld.ts (3)
apps/roam/src/utils/types.ts (1)
Result(42-46)apps/roam/src/utils/pageToMarkdown.ts (1)
pageToMarkdown(212-347)apps/roam/src/utils/getExportTypes.ts (1)
updateExportProgress(25-33)
🪛 GitHub Actions: CI
apps/roam/src/utils/jsonld.ts
[error] 203-203: Type '(Record<string, string | string[]> | { "@type": string; predicate: string; source: string; destination: string; })[]' is not assignable to type 'string | Record<string, string> | Record<string, string>[]'.
🔇 Additional comments (5)
apps/roam/src/utils/jsonld.ts (3)
20-20: LGTM: Context extensions for backlinks.The new CURIE mappings for
dg(core schema namespace) andbacklinkproperty are correctly structured and align with the ontology additions indg_core.ttl.Also applies to: 36-36
138-185: LGTM: Async node construction with backlinks.The refactor to async node construction is well-structured:
- Proper use of
Promise.allfor concurrent processing- Correct integration with
pageToMarkdownper the codebase pattern- Conditional inclusion of backlinks only when present
- Progress tracking maintained throughout
155-168: The backlinks query logic is correct.The Datalog query properly finds all pages that reference the current page:
- It locates the target page and retrieves all blocks on it (including nested blocks)
- It finds blocks with
:block/refsrelationships to either those blocks or the page itself- It extracts the UID of the pages containing those references
- It filters results to only pages in the export
This implementation handles both page-level and block-level references correctly and is consistent with similar query patterns elsewhere in the codebase. No changes needed.
apps/website/public/schema/dg_core.ttl (2)
12-17: LGTM: Ontology metadata.The ontology header properly declares the core vocabulary with appropriate metadata (date, version, labels). The tentative version marker correctly signals the experimental status.
101-101: LGTM: Backlink property semantics (pending fixes).The ontology design is sound:
dg:containsRecas a transitive property correctly models hierarchical containment- The property chain axiom
(dg:containsRec dct:references)logically derives that if A transitively contains B and B references C, then A contains a reference to Cdg:backlinkas the inverse ofdg:containsRefcorrectly models bidirectional reference relationshipsThis aligns well with the JSON-LD implementation in
jsonld.ts.Note: This approval is contingent on fixing the critical issues identified in previous comments (missing
dct:prefix andrdfs:subPropertyOf).Also applies to: 105-111
mdroidian
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to be more stringent with adherence to the PR workflows going forward. Please take a minute to re-read the handbook if necessary.
Please add the loom video to the body of the PR then re-tag me for review.
https://linear.app/discourse-graphs/issue/ENG-1230/backlinks-in-json-ld-export
Add backlink information in json-ld export
Create a backlink relationship based on common elements (dcterms) and also reference to the document structure ontology (which allows describing roam block containment)
Accessorily: Added ontology info to dg-core, consolidated per-page calculations in a single function for better progress tracking.
Loom: https://www.loom.com/share/b9fc610d5d9349e0aff6af16dc801636
Summary by CodeRabbit
Release Notes
✏️ Tip: You can customize this high-level summary in your review settings.