@inv-jishnu
Contributor

Description

In this PR, I have replaced the usage of DistributedStorage with SingleCrudOperationTransactionManager in the ScalarDB Data Loader when storage mode is selected as the import mode. This PR removes DistributedStorage only from the import path, not from the data loader as a whole. The remaining occurrences will be removed in later PRs.

Related issues and/or PRs

NA

Changes made

  • Replaced usage of DistributedStorage with SingleCrudOperationTransactionManager
    • Made related changes in both core and CLI
    • Updated unit tests to reflect the change

Checklist

The following is a best-effort checklist. If any items in this checklist are not applicable to this PR or are dependent on other, unmerged PRs, please still mark the checkboxes after you have read and understood each item.

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes.
  • I have considered whether similar issues could occur in other products, components, or modules if this PR is for bug fixes.
  • Any remaining open issues linked to this PR are documented and up-to-date (Jira, GitHub, etc.).
  • Tests (unit, integration, etc.) have been added for the changes.
  • My changes generate no new warnings.
  • Any dependent changes in other PRs have been merged and published.

Additional notes (optional)

I will create separate PRs for the related changes to the table metadata service and the export service.

Release notes

Replaced DistributedStorage with SingleCrudOperationTransactionManager in the data loader's import storage mode.

@inv-jishnu inv-jishnu requested a review from ypeckstadt November 6, 2025 12:25
@inv-jishnu inv-jishnu self-assigned this Nov 6, 2025
@inv-jishnu inv-jishnu marked this pull request as draft November 6, 2025 12:25
@gemini-code-assist
Contributor

Summary of Changes

Hello @inv-jishnu, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the data loader's import mechanism to utilize SingleCrudOperationTransactionManager instead of DistributedStorage when operating in storage mode. This enhancement aims to optimize data operations by aligning with a more suitable transaction management approach for single CRUD operations, thereby improving efficiency and consistency within ScalarDB's data handling framework.

Highlights

  • DistributedStorage Replacement: Replaced the usage of DistributedStorage with SingleCrudOperationTransactionManager within the data loader's import storage mode.
  • Scope of Change: This change specifically targets the import functionality; other instances of DistributedStorage will be addressed in future pull requests.
  • Codebase Adjustments: Necessary modifications were applied across the core data loader logic and its command-line interface (CLI), accompanied by updated unit tests.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature | Command | Description
Code Review | /gemini review | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state.
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help | /gemini help | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request successfully replaces DistributedStorage with SingleCrudOperationTransactionManager for the data loader's import storage mode, which is a positive step toward unifying the data access APIs. The changes are applied consistently across the CLI, core logic, and tests. My review includes a couple of suggestions for improvement. Firstly, there's a notable amount of code duplication in ScalarDbDao from adding new overloaded methods; refactoring this could enhance maintainability. Secondly, I've identified an inconsistency in exception handling within a new method in the same class. Addressing these points will further improve the code's quality and robustness.

Comment on lines +73 to +90
public Optional<Result> get(
    String namespace,
    String table,
    Key partitionKey,
    Key clusteringKey,
    SingleCrudOperationTransactionManager manager)
    throws ScalarDbDaoException {

  // Retrieving the key data for logging
  String loggingKey = keysToString(partitionKey, clusteringKey);

  try {
    Get get = createGetWith(namespace, table, partitionKey, clusteringKey);
    return manager.get(get);
  } catch (CrudException e) {
    throw new ScalarDbDaoException("error GET " + loggingKey, e);
  }
}
Contributor

medium

This pull request introduces several new overloaded methods for SingleCrudOperationTransactionManager and DistributedTransactionManager (get, put, scan, createScanner). These new methods are very similar to existing ones, leading to significant code duplication. For instance, this new get method is almost identical to the one for DistributedTransaction.

Since SingleCrudOperationTransactionManager, DistributedTransactionManager, and DistributedTransaction all implement or can be treated as CrudOperable, you could refactor this to use a single generic method that accepts a CrudOperable argument. This would greatly reduce code duplication and improve the maintainability of this class.

A generic method might look something like this:

public Optional<Result> get(
    String namespace,
    String table,
    Key partitionKey,
    Key clusteringKey,
    CrudOperable operable)
    throws ScalarDbDaoException {
  // ...
}

This approach could be applied to put, scan, and createScanner methods as well.
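To make the suggestion concrete, here is a minimal, self-contained sketch of the idea. It does not use ScalarDB's real classes: every type in it (the exceptions, the CrudOperable interface, FakeManager, FakeTransaction, DaoException) is a simplified, hypothetical stand-in with String keys and values, and it assumes the backends share a common base exception type. It only illustrates how one generic method with a single catch block could replace one overload per backend.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class GenericDaoSketch {

  // Hypothetical stand-in for a library's base checked exception.
  static class TransactionException extends Exception {
    TransactionException(String message) { super(message); }
  }

  // Hypothetical stand-in for a narrower exception thrown inside a transaction.
  static class CrudException extends TransactionException {
    CrudException(String message) { super(message); }
  }

  // Hypothetical stand-in for a CrudOperable-style interface:
  // E is the checked exception type a backend may throw.
  interface CrudOperable<E extends TransactionException> {
    Optional<String> get(String key) throws E;
  }

  // Hypothetical stand-in for the DAO's wrapper exception.
  static class DaoException extends Exception {
    DaoException(String message, Throwable cause) { super(message, cause); }
  }

  // Manager-like backend that declares the broad exception type.
  static class FakeManager implements CrudOperable<TransactionException> {
    final Map<String, String> rows = new HashMap<>();

    @Override
    public Optional<String> get(String key) throws TransactionException {
      if (key.isEmpty()) throw new TransactionException("empty key");
      return Optional.ofNullable(rows.get(key));
    }
  }

  // Transaction-like backend that declares the narrower exception type.
  static class FakeTransaction implements CrudOperable<CrudException> {
    final Map<String, String> rows = new HashMap<>();

    @Override
    public Optional<String> get(String key) throws CrudException {
      if (key.isEmpty()) throw new CrudException("empty key");
      return Optional.ofNullable(rows.get(key));
    }
  }

  // One generic method serves every backend. Catching the common base type
  // keeps the error handling identical regardless of which backend is passed.
  static <E extends TransactionException> Optional<String> get(
      String key, CrudOperable<E> operable) throws DaoException {
    try {
      return operable.get(key);
    } catch (TransactionException e) {
      throw new DaoException("error GET " + key, e);
    }
  }

  public static void main(String[] args) throws DaoException {
    FakeManager manager = new FakeManager();
    manager.rows.put("k1", "v1");
    FakeTransaction tx = new FakeTransaction();

    // Both backends go through the same method.
    System.out.println(get("k1", manager));
    System.out.println(get("k1", tx));
  }
}
```

In this sketch, catching the shared base exception in one place also means each backend's failures are wrapped the same way, which is one way to avoid exception handling drifting apart between overloads. Whether the same shape fits ScalarDbDao depends on the actual exception hierarchy of the real interfaces.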

Contributor Author

Some of this code will be removed in another PR. I will also consider this suggestion and make the necessary changes.

@inv-jishnu
Contributor Author

Closing this PR as this change is no longer needed.

@inv-jishnu inv-jishnu closed this Nov 7, 2025