Alex619829

This feature adds the ability to delete a single repository index with zoekt-dynamic-indexserver.
Currently, zoekt-dynamic-indexserver only supports deleting all repositories via the /truncate route.
This pull request adds a new route, /delete, which deletes a single repository by its ID.
This feature is needed to use zoekt as an indexing server for repositories in systems such as Gitea, Forgejo, etc.
I am currently adding zoekt-based code search to Forgejo, and I ran into the need for this functionality.

@Alex619829
Author

@stefanhengl Hello! Can somebody look at this please?

@earl-warren

I'm interested to see this pull request land. Is there something blocking the review?

Member

@keegancsmith left a comment

cc @DylanGriffith this index server was added by you for GitLab. Is GitLab still using it? If not, we should actually open a discussion on removing this from the repo. This code isn't used by the maintainers and doesn't actually import any zoekt APIs. So is more appropriate to live somewhere else.

return
}

err := deleteRepository(s.opts, req.RepoID)
Member

Don't you also need to delete the shards for the repo, or will you let it converge by having the indexing server notice that the repo is gone?

@DylanGriffith
Contributor

> cc @DylanGriffith this index server was added by you for GitLab. Is GitLab still using it? If not, we should actually open a discussion on removing this from the repo. This code isn't used by the maintainers and doesn't actually import any zoekt APIs. So is more appropriate to live somewhere else.

@keegancsmith we stopped using this a while ago at GitLab, when we built a specialised indexer in its own repo at https://gitlab.com/gitlab-org/gitlab-zoekt-indexer. I believe we're fine with this being removed from sourcegraph/zoekt, but I guess this PR implies that some people are using it. The specialised indexer we built for GitLab is unlikely to be suitable for anyone trying to index projects outside of GitLab.

@keegancsmith
Member

@earl-warren @Alex619829 what is your use case for zoekt dynamic indexserver? Would you be open to creating your own project (can advertise it) which is a fork of its code?

@earl-warren

@keegancsmith you will find the background and the integration pull request with Forgejo at https://codeberg.org/forgejo/forgejo/pulls/8827

I'm an interested party but it really is @Alex619829 leading this effort so I'll let them provide a longer answer 😁

@Alex619829
Author

Hello! I'm adding code search functionality to Forgejo using zoekt. I'd also like the code indexer to run on a separate server, which relieves the Forgejo server of the indexing load. zoekt-dynamic-indexserver is ideal for this because it uses a push model: when a repository is added, updated, or deleted, you make an API request to zoekt-dynamic-indexserver, and the repository is indexed or its index is deleted. I think this is the most convenient way to integrate code hosting projects like Gitea and Forgejo with a code search project, since they can run on different machines, and whenever a repository on the code host changes, the code is indexed immediately on the dedicated server. Maybe I'm missing something, but why give up such a convenient tool? Thanks for the reply!

@keegancsmith
Member

There are a few reasons to give it up:

  • It is independent. It only calls out to zoekt binaries, but doesn't actually use any zoekt APIs.
  • No one uses it (yet :P).
  • The maintainers don't use it so I don't want to provide guarantees around it.
  • The code is so simple it really isn't robust. I don't want to start maintaining stuff as you make it more robust. E.g., what happens if a git clone on disk becomes corrupt?

Given how simple it is, it really is more appropriate for you to vendor in this code yourself. You can then make it work so it better fits the APIs of Forgejo.

I.e., the API that I feel good about providing is the CLIs you are calling out to. Additionally, I'm happy to do things like make zoekt-git-index take in a flag to ignore some files (e.g. vendor, as mentioned in the linked issue).

Additionally, I assume your project actually has some APIs around git exported tarballs? You can then end up with a "dynamic server" which doesn't need to keep around git clones. You can decide how to lay stuff out/etc. You also end up being a bit more of an expert on the glue between your project and zoekt, which I assume is better for your project. Happy to give out advice here, not so happy to maintain things that are not core to zoekt.
