@mongodben mongodben commented Jun 24, 2025

Jira: https://jira.mongodb.org/browse/EAI-956

Changes

  • Search content route
  • List content sources route
  • Documentation on new routes

Notes

Ben Perlmutter and others added 9 commits June 24, 2025 17:01
* Create MongoDbSearchResultsStore, add limit to DefaultFindContent and add test for limit

* Implement saveSearchResult, create MongoDbSearchResultsStore.test.ts

* lint, format

* Check entire returned document in MongoDbSearchResultsStore.test.ts

* Create ResultChunk type and zod check

* Correct usage of limit in makeDefaultFindContent

* PR feedback: cast badSearchResultRecord as any

* Use unknown instead of any for ResultChunk additional metadata

* PR feedback: Combine describe blocks in MongoDbSearchResultsStore.test.ts, remove zod checks where unnecessary in MongoDbSearchResultsStore

* Combine describes
* Create MongoDbSearchResultsStore, add limit to DefaultFindContent and add test for limit

* Implement saveSearchResult, create MongoDbSearchResultsStore.test.ts

* lint, format

* Check entire returned document in MongoDbSearchResultsStore.test.ts

* Create ResultChunk type and zod check

* Correct usage of limit in makeDefaultFindContent

* PR feedback: cast badSearchResultRecord as any

* Starting structure of searchContent

* Use unknown instead of any for ResultChunk additional metadata

* Add searchContent test file, broaden QueryFilter & MongoDbAtlasVectorSearchFilter types

* Work on clarity of comments in contentRouter

* PR feedback: Combine describe blocks in MongoDbSearchResultsStore.test.ts, remove zod checks where unnecessary in MongoDbSearchResultsStore

* Use generics on middleware: requireRequestOrigin, requireValidIpAddress

* Structure out contentRouter test file

* Combine describes

* Clean

* makeFindContentWithMongoDbMetadata

* config.test.ts

* Clean

* PR feedback

* Correct types

* Correct test

* Fix test return type

* lint

* Revert move of classifyMongoDbProgrammingLanguageAndProduct, jest needs function outside of file to mock

* Remove unnecessary tests and comments
* Add searchContent pieces to openapi spec

* PR feedback
* Create MongoDbSearchResultsStore, add limit to DefaultFindContent and add test for limit

* Implement saveSearchResult, create MongoDbSearchResultsStore.test.ts

* lint, format

* Check entire returned document in MongoDbSearchResultsStore.test.ts

* Create ResultChunk type and zod check

* Correct usage of limit in makeDefaultFindContent

* PR feedback: cast badSearchResultRecord as any

* Starting structure of searchContent

* Use unknown instead of any for ResultChunk additional metadata

* Add searchContent test file, broaden QueryFilter & MongoDbAtlasVectorSearchFilter types

* Work on clarity of comments in contentRouter

* PR feedback: Combine describe blocks in MongoDbSearchResultsStore.test.ts, remove zod checks where unnecessary in MongoDbSearchResultsStore

* Use generics on middleware: requireRequestOrigin, requireValidIpAddress

* Structure out contentRouter test file

* Combine describes

* Clean

* makeFindContentWithMongoDbMetadata

* config.test.ts

* Clean

* PR feedback

* Correct types

* Correct test

* Fix test return type

* lint

* Revert move of classifyMongoDbProgrammingLanguageAndProduct, jest needs function outside of file to mock

* Created addCustomData.ts, generics, use in both contentRouter and conversationRouter

* Clean

* Remove unnecessary tests and comments

* Added custom middleware to contentRouter, used in searchContent route, added to tests

* Add customData to db...

* Clean: allow undefined customData value

* Alter types for createConversationsMiddlewareReq

* Rerun tests

* PR feedback

* Add Locals types to middleware invocations

* Lint, fix trace name, remove unnecessary import

* (EAI-972) Add extra braintrust tracing to searchContent route (#822)

Add extra braintrust tracing to searchContent route

* Use safely parsed req.body, handle possibly undefined dataSources
* Create listDataSources route, plug into searchContent, blank file created for listDatasources test file

* Clean sources endpoint and data types

* Add openapi spec documentation for new listDataSources endpoint

* listDataSources test file

* PR feedback and add isCurrent

* Clean yaml

* Rename version -> versions

* PR feedback

* ci

* Change embedding search index

* Use plain memory server

* Fix tests

* Make unused parameter obvious

* Use cache in listDataSources

* Add tests for cache
@mongodben mongodben changed the title from "[feature branch] Search Content API" to "[feature branch] Search Content API (Epic EAI-956)" Jul 28, 2025
@mmeigs mmeigs marked this pull request as ready for review July 28, 2025 15:18
@mmeigs mmeigs requested a review from nlarew July 28, 2025 15:43

nlarew commented Jul 30, 2025

Generally looks good - I tested both endpoints locally and they seem to work in the base case.

Notably I'm not getting results for this example that searches on a specific version:

curl -X POST 'http://localhost:5183/api/v1/content/search' \
  --header 'Content-Type: application/json' \
  --header 'X-Request-Origin: NicksLaptop' \
  --data-raw '{ "query": "how to use index?", "limit": 3, "dataSources": [{ "name": "docs", "versionLabel": "6.0" }] }'
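
For anyone who prefers to reproduce this from a script, here is a minimal TypeScript sketch of the same request. The route, headers, and body fields are copied from the curl command above. The helper name `buildSearchContentRequest`, the interface names, and which fields are optional are assumptions made for illustration, not the server's actual types.

```typescript
// Request shape inferred only from the curl example above.
// Marking `limit`, `dataSources`, and `versionLabel` optional is an assumption.
interface DataSourceFilter {
  name: string;
  versionLabel?: string;
}

interface SearchContentRequestBody {
  query: string;
  limit?: number;
  dataSources?: DataSourceFilter[];
}

interface BuiltRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds the arguments one would pass to fetch().
function buildSearchContentRequest(
  baseUrl: string,
  requestOrigin: string,
  body: SearchContentRequestBody
): BuiltRequest {
  return {
    url: `${baseUrl}/api/v1/content/search`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "X-Request-Origin": requestOrigin,
      },
      body: JSON.stringify(body),
    },
  };
}

const request = buildSearchContentRequest("http://localhost:5183", "NicksLaptop", {
  query: "how to use index?",
  limit: 3,
  dataSources: [{ name: "docs", versionLabel: "6.0" }],
});

// To send it against a locally running server:
//   const response = await fetch(request.url, request.init);
```

Keeping the request construction separate from the `fetch` call makes it easy to log or diff the exact payload when a search unexpectedly comes back empty.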

mmeigs and others added 2 commits July 31, 2025 09:53

mmeigs commented Jul 31, 2025

> Generally looks good - I tested both endpoints locally and they seem to work in the base case.
>
> Notably I'm not getting results for this example that searches on a specific version:
>
> curl -X POST 'http://localhost:5183/api/v1/content/search' \
>   --header 'Content-Type: application/json' \
>   --header 'X-Request-Origin: NicksLaptop' \
>   --data-raw '{ "query": "how to use index?", "limit": 3, "dataSources": [{ "name": "docs", "versionLabel": "6.0" }] }'

Howdy! Sooo, took a while trying to debug this, and then Ben helped me realize it wasn't actually a bug! The dev db has only a few documents that match the filter, and yet none of them are actually relevant to the query. That's why the result is empty. Pointing to the staging db will return results. Ben just merged in some stronger typing though, and we're going to merge to main in a bit.
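
To illustrate why a small dataset can produce an empty result, here is a toy TypeScript sketch. It assumes the search first narrows candidates by the data source filter and then drops anything below a relevance cutoff. The `Chunk` shape, the `minScore` parameter, and the precomputed `score` values are all hypothetical and stand in for whatever the real vector search does internally.

```typescript
// Toy model only: `score` stands in for a similarity score against the query.
interface Chunk {
  sourceName: string;
  versionLabel: string;
  score: number;
  text: string;
}

interface SourceFilter {
  name: string;
  versionLabel: string;
}

// Hypothetical search: filter by source and version, drop low scores,
// then return the top `limit` chunks by score.
function searchWithFilter(
  chunks: Chunk[],
  filter: SourceFilter,
  minScore: number,
  limit: number
): Chunk[] {
  return chunks
    .filter((c) => c.sourceName === filter.name && c.versionLabel === filter.versionLabel)
    .filter((c) => c.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

const filter: SourceFilter = { name: "docs", versionLabel: "6.0" };

// "Dev-like" data: a few chunks match the filter, but none are relevant.
const devChunks: Chunk[] = [
  { sourceName: "docs", versionLabel: "6.0", score: 0.21, text: "release notes" },
  { sourceName: "docs", versionLabel: "6.0", score: 0.18, text: "install guide" },
  { sourceName: "docs", versionLabel: "7.0", score: 0.93, text: "indexes overview" },
];

// "Staging-like" data: the matching version also has relevant chunks.
const stagingChunks: Chunk[] = [
  ...devChunks,
  { sourceName: "docs", versionLabel: "6.0", score: 0.91, text: "create an index" },
  { sourceName: "docs", versionLabel: "6.0", score: 0.88, text: "index types" },
];

const devResults = searchWithFilter(devChunks, filter, 0.75, 3);
const stagingResults = searchWithFilter(stagingChunks, filter, 0.75, 3);

console.log(devResults.length, stagingResults.length);
```

In this toy setup the dev-like data returns nothing even though the filter matches two chunks, while the staging-like data returns the two relevant ones.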

@nlarew nlarew left a comment

LGTM!

@mmeigs mmeigs merged commit b791605 into main Jul 31, 2025
1 check passed
@mmeigs mmeigs deleted the search_content_route branch July 31, 2025 21:11

3 participants