Conversation

@KodaiD (Contributor) commented Nov 6, 2025

Description

This PR adds DML support for the Blob adapter, enabling data manipulation operations on Azure Blob Storage.

Support for S3 and GCS will be addressed in subsequent PRs.

Related issues and/or PRs

Changes made

  • Added a DistributedStorage implementation for Object Storage.
  • Added unit tests.
  • Added integration tests.

Checklist

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes.
  • I have considered whether similar issues could occur in other products, components, or modules if this PR is for bug fixes.
  • Any remaining open issues linked to this PR are documented and up-to-date (Jira, GitHub, etc.).
  • Tests (unit, integration, etc.) have been added for the changes.
  • My changes generate no new warnings.
  • Any dependent changes in other PRs have been merged and published.

Additional notes (optional)

N/A

Release notes

Added support for data manipulation operations over Azure Blob Storage.

@KodaiD self-assigned this Nov 6, 2025
@KodaiD added the "enhancement" (New feature or request) label Nov 6, 2025

@KodaiD (Contributor, Author) commented Nov 6, 2025

/gemini review

@gemini-code-assist (bot) left a comment:
Code Review

This pull request introduces data manipulation language (DML) support for the Blob Storage adapter, which is a significant and well-implemented feature. The changes include the core logic for handling mutations, numerous integration tests to ensure correctness, and necessary updates to error codes and utility classes. My review focuses on improving code clarity, robustness, and performance in the new classes. I've identified opportunities to remove dead code, enhance immutability, improve test coverage for conditional operations, and optimize a hashCode implementation. Overall, this is a solid contribution that greatly enhances the capabilities of the object storage adapter.

@gemini-code-assist (bot) left a comment:
Code Review

This pull request adds significant data manipulation capabilities to the Blob Storage adapter, a major step forward. The overall implementation is solid, with good use of handlers for different statement types and a clear read-modify-write pattern for mutations. I've identified a few areas for improvement, including a bug in data serialization, a suboptimal hash implementation, and opportunities to enhance test coverage and code readability. My detailed comments are below.

@gemini-code-assist (bot) left a comment:
Code Review

This pull request introduces Data Manipulation Language (DML) support for the Azure Blob Storage adapter, which is a significant enhancement. The changes include new classes for handling mutations and selections, as well as comprehensive integration and unit tests. The implementation correctly identifies and enforces several limitations inherent to object storage, such as the lack of index support, restricted conditional operations for BLOB types, and limitations on cross-partition scan ordering. The addition of specific error codes for these scenarios is also a positive step for clarity and debugging. Overall, the changes are well-structured and provide a solid foundation for DML operations in Object Storage.

@KodaiD KodaiD marked this pull request as ready for review November 7, 2025 01:26
Copilot AI review requested due to automatic review settings November 7, 2025 01:26
Copilot AI left a comment:

Copilot reviewed 56 out of 56 changed files in this pull request and generated no comments.



@KodaiD KodaiD requested review from a team, Torch3333, brfrn169, feeblefakie and komamitsu and removed request for a team November 7, 2025 01:44
// Assert
assertThat(keys).isEmpty();
}

@KodaiD (Contributor, Author) commented:

Since Blob Storage returns at most 5,000 keys per LIST operation, we should check whether getKeys can return more than 5,000 keys.

Ref: https://learn.microsoft.com/rest/api/storageservices/list-blobs
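The concern above can be sketched as follows. This is a minimal, self-contained illustration of aggregating paged LIST results past the 5,000-key page limit; PagedKeyLister, listPage, and the in-memory backing list are invented stand-ins for the actual service call, not the PR's real implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: getKeys must keep issuing LIST calls until the service
// stops returning full pages, instead of trusting a single 5,000-key response.
public class PagedKeyLister {
  static final int PAGE_LIMIT = 5000; // Blob Storage's per-response maximum

  private final List<String> backing; // simulated blob store contents

  public PagedKeyLister(List<String> backing) {
    this.backing = backing;
  }

  // Simulates one LIST call: returns at most PAGE_LIMIT keys starting at marker.
  List<String> listPage(int marker) {
    int end = Math.min(marker + PAGE_LIMIT, backing.size());
    return backing.subList(marker, end);
  }

  // Aggregates all pages; a short page signals that no continuation remains.
  public List<String> getKeys() {
    List<String> all = new ArrayList<>();
    int marker = 0;
    while (true) {
      List<String> page = listPage(marker);
      all.addAll(page);
      if (page.size() < PAGE_LIMIT) {
        break; // no more pages
      }
      marker += page.size();
    }
    return all;
  }
}
```

A test along these lines (e.g. 12,345 keys spanning three pages) would catch a getKeys implementation that silently truncates at the first page.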

@KodaiD KodaiD force-pushed the add-dml-support-for-blob-storage branch from 70f2d31 to 0a73daf Compare November 7, 2025 02:34

Column<?> column1 =
ColumnValueMapper.convert(
clusteringKey1.get(columnName), columnName, metadata.getColumnDataType(columnName));
A reviewer (Contributor) commented:

[minor] metadata.getColumnDataType(columnName) can be reused.

@KodaiD (Author) replied:

Fixed in 533f3cd.

Map<String, ObjectStorageRecord> partition =
getPartition(namespaceName, tableName, partitionKey, readVersionMap);
for (Mutation mutation : mutations) {
if (mutation instanceof Put) {
A reviewer (Contributor) commented:

Insert, Update and Upsert aren't passed to this method?

@KodaiD (Author) replied Nov 7, 2025:

DistributedStorage only handles Put. Insert, Update, and Upsert are translated into Put with a condition.
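The translation KodaiD describes can be sketched as below. This is an illustrative sketch only: the enum and helper names are invented here, and ScalarDB's actual condition types may be shaped differently, but the idea is that Insert becomes a Put guarded by "if not exists", Update a Put guarded by "if exists", and Upsert an unconditional Put.

```java
import java.util.Optional;

// Hedged sketch of how higher-level write operations collapse to Put at the
// DistributedStorage layer. Names are illustrative, not ScalarDB's API.
public class PutTranslation {
  enum Condition {
    PUT_IF_NOT_EXISTS, // record must not already exist
    PUT_IF_EXISTS      // record must already exist
  }

  // Returns the condition attached to the resulting Put, if any.
  static Optional<Condition> conditionFor(String operation) {
    switch (operation) {
      case "INSERT":
        return Optional.of(Condition.PUT_IF_NOT_EXISTS);
      case "UPDATE":
        return Optional.of(Condition.PUT_IF_EXISTS);
      case "UPSERT":
        return Optional.empty(); // unconditional Put
      default:
        throw new IllegalArgumentException("unknown operation: " + operation);
    }
  }
}
```

This is why the mutate path only needs `mutation instanceof Put` branches: by the time mutations reach the storage layer, the operation kind is encoded in the Put's condition rather than in the mutation's class.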

if (validationFailed) {
throw new ExecutionException(
String.format(
"A condition failed. ConditionalExpression: %s, Column: %s",
A reviewer (Contributor) commented:

How about including the operator in this error message?

@KodaiD (Author) replied:

Ah, that's a mistake. We should pass the expression, which includes both the column and the operator, instead of expectedColumn. Let me fix it.

@KodaiD (Author) replied:

Fixed in 533f3cd.

throw new IllegalArgumentException(
CoreError.OPERATION_CHECK_ERROR_ORDERING_NOT_PROPERLY_SPECIFIED.buildMessage(scan));
}
boolean rightOrder =
A reviewer (Contributor) commented:

Can you elaborate on why this variable is named rightOrder?

Also, if the first scan order matches the clustering key order but the second differs, will the scan fail?

@KodaiD (Author) replied Nov 7, 2025:

Sorry, maybe isValidOrder is more suitable. The variable indicates whether the specified scan ordering is valid.

@KodaiD (Author) replied Nov 7, 2025:

In ScalarDB, we can only read records sorted by the clustering order defined in the table metadata, in forward or reverse order. For example, if a table has clustering keys first (Order=ASC) and second (Order=DESC), we can specify the following scan order:

| first | second |
|-------|--------|
| ASC   | -      |
| ASC   | DESC   |
| DESC  | -      |
| DESC  | ASC    |

@brfrn169 I think we discussed this previously, but is this still correct?
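The rule behind this table can be captured in a few lines: a requested ordering is valid only if it is a prefix of the table's clustering orders, taken either all-forward or all-reversed. The sketch below is illustrative, not the PR's code; orders are modeled as booleans (true = ASC), and empty orderings are treated as invalid for simplicity.

```java
import java.util.List;

// Hypothetical check for whether a Scan's requested clustering-key ordering
// is the forward or the reverse of the metadata-defined clustering orders.
public class ScanOrderCheck {
  // metadataOrders: clustering orders from table metadata, in definition order.
  // requested: orders specified on the Scan, possibly a prefix of the above.
  static boolean isValidOrder(List<Boolean> metadataOrders, List<Boolean> requested) {
    if (requested.isEmpty() || requested.size() > metadataOrders.size()) {
      return false;
    }
    boolean forward = true; // every requested order matches metadata
    boolean reverse = true; // every requested order is the opposite of metadata
    for (int i = 0; i < requested.size(); i++) {
      if (requested.get(i).equals(metadataOrders.get(i))) {
        reverse = false;
      } else {
        forward = false;
      }
    }
    return forward || reverse;
  }
}
```

With metadata (ASC, DESC), this accepts exactly the four rows of the table above and rejects mixed requests such as (ASC, ASC), which answers the reviewer's second question.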

@KodaiD (Author) replied:

> Also, if the first scan order is the same as the clustering key order and the second scan order is different, it will fail?

So, the answer is "yes."


records.sort(
(o1, o2) ->
new ClusteringKeyComparator(metadata)
A reviewer (Contributor) commented:

I might be missing something, but why don't we simply sort the records based on Scan.ordering (and the metadata's clustering key orders) at this point? If it works, the following Collections.reverse() isn't needed?

@KodaiD (Author) replied:

Due to this specification, we need to sort the records by the clustering order defined in the table metadata. Then, we can decide whether to read the sorted records in forward or reverse order based on the order specified in the scan request.
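The approach KodaiD describes can be sketched like this. Integer keys stand in for clustering-key comparison, so this is a toy model of the design, not the adapter's code: records are always sorted by the metadata-defined clustering order first, and only the read direction varies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch: sort once by the metadata-defined clustering order, then reverse
// the whole result when the scan requests the reverse ordering.
public class SortThenMaybeReverse {
  static List<Integer> order(List<Integer> records, boolean reverseScan) {
    List<Integer> sorted = new ArrayList<>(records);
    Collections.sort(sorted); // metadata-defined clustering order
    if (reverseScan) {
      Collections.reverse(sorted); // read the sorted records backwards
    }
    return sorted;
  }
}
```

Sorting directly by Scan.ordering would work too, but since only the forward and reverse orders are ever valid (per the specification discussed above), a fixed sort plus an optional reverse keeps the comparator independent of the request.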

@KodaiD KodaiD requested a review from komamitsu November 7, 2025 07:25