Skip to content

Conversation

@swamirishi
Copy link
Contributor

What changes were proposed in this pull request?

Instead of performing ByteWise comparison on Java side moving the comparison to native side would more optimal since most of the transactions are going to use direct byte buffers. Here we intend to use the rocksdb Slice to perform bytewise comparisons. RocksDB Bytewise comparators also uses the same comparator.
https://github.com/facebook/rocksdb/blob/c110091d368b8a01b5be36a14198769e60786c05/util/comparator.cc#L36-L38
Unfortunately currently in 7.7.3 there is no direct native comparison that has been implemented. We can look into contributing to Rocksdb by having a JNI implementation for

https://github.com/facebook/rocksdb/blob/0bf9079d44eea91afda7151306d3a3439a39511b/java/src/main/java/org/rocksdb/NativeComparatorWrapper.java#L16

Till then this is a nice workaround to have.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-14238

How was this patch tested?

Update existing unit tests

@swamirishi swamirishi force-pushed the HDDS-14238 branch 2 times, most recently from e4e5016 to 692b5ed Compare December 24, 2025 13:51
@swamirishi swamirishi force-pushed the HDDS-14238 branch 8 times, most recently from a3349d7 to deb7900 Compare December 25, 2025 05:47
…on for optimization

Change-Id: Ia7655ff5148197be488a2c1151ec7fd1d6f9d452
@swamirishi
Copy link
Contributor Author

@szetszwo This is good to be reviewed. I will click on ready for review when I get a +1. I don't want to run the CI multiple times.

Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@swamirishi , thanks for working on this! Please see the comments inlined.

Comment on lines 103 to 111
private void initWithByteArray(byte[] array) {
this.slice = new ManagedSlice(array);
this.hash = ByteBuffer.wrap(array).hashCode();
}

ByteBuffer asReadOnlyByteBuffer() {
return buffer.asReadOnlyByteBuffer();
private void initWithDirectByteBuffer(ByteBuffer byteBuffer) {
this.slice = new ManagedDirectSlice(byteBuffer);
this.hash = byteBuffer.hashCode();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Constructors should not call any functions. Otherwise, the fields cannot be final, which is very useful for reducing bugs and for understanding the code..

    private final AbstractSlice<?> slice;
    /** Cache the hash value. */
    private final int hash;

    static Bytes newBytes(CodecBuffer buffer) {
      Objects.requireNonNull(buffer, "buffer == null");
      return buffer.isDirect() ? new Bytes(buffer.asReadOnlyByteBuffer()) : new Bytes(buffer.getArray());
    }

    Bytes(ByteBuffer buffer) {
      Objects.requireNonNull(buffer, "buffer == null");
      assertTrue(buffer.isDirect(), "buffer is not direct");
      this.slice = new ManagedDirectSlice(buffer);
      this.hash = buffer.hashCode();
    }

    Bytes(byte[] array) {
      Objects.requireNonNull(array, "array == null");
      this.slice = new ManagedSlice(array);
      this.hash = ByteBuffer.wrap(array).hashCode();
    }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

public String toString() {
return array != null ? bytes2String(array)
: bytes2String(asReadOnlyByteBuffer());
return slice.toString();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you checked what will it return? Could you show an example output?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public static void main(String[] args) throws CodecException {
    String value = "test";
    Bytes bytes = new Bytes(value.getBytes(UTF_8));
    System.out.println("To String Heap Value: " + bytes);
    Bytes dbytes = newBytes(CodecBufferCodec.get(true).fromPersistedFormat(value.getBytes(UTF_8)));
    System.out.println("To String Direct Value: " + dbytes);
  }
2025-12-28 18:25:34,786 [main] INFO  db.CodecBuffer (CodecBuffer.java:set(85)) - Successfully set constructor to LeakDetector::newCodecBuffer: org.apache.hadoop.hdds.utils.db.CodecBuffer$$Lambda$4/1694556038@7085bdee
To String Heap Value: test
To String Direct Value: test

Copy link
Contributor

@szetszwo szetszwo Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To String Heap Value: test
To String Direct Value: test

Isn't the data supposed to be binary? This seems not working.

Copy link
Contributor Author

@swamirishi swamirishi Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you say it is not working. Maybe you are not loading the rocksdb libs before this. I forgot there was static codeblock loading the rocksdb lib.
This should work:

public static void main(String[] args) throws CodecException {
    ManagedRocksObjectUtils.loadRocksDBLibrary();
    String value = "test";
    Bytes bytes = new Bytes(value.getBytes(UTF_8));
    System.out.println("To String Heap Value: " + bytes);
    Bytes dbytes = newBytes(CodecBufferCodec.get(true).fromPersistedFormat(value.getBytes(UTF_8)));
    System.out.println("To String Direct Value: " + dbytes);
  }

Copy link
Contributor Author

@swamirishi swamirishi Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The output before the change and after will be the same since StringUtils.byte2String() also decodes the byte value to String value

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want the binary value then we have to print the hex value.
slice.toString(true) would do that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did the above to keep the behaviour same as before.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right -- it is the same as before.

assertEquals(fromArray.hashCode(), fromBuffer.hashCode());
assertEquals(fromArray, fromBuffer);
assertEquals(fromBuffer, fromArray);
for (CodecBuffer.Allocator allocator : ImmutableList.of(CodecBuffer.Allocator.HEAP, CodecBuffer.Allocator.DIRECT)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pass allocator instead of adding a loop.

    runTestBytes(original, codec, CodecBuffer.Allocator.HEAP);
    runTestBytes(original, codec, CodecBuffer.Allocator.DIRECT);
  }

  static <T> void runTestBytes(T object, Codec<T> codec, CodecBuffer.Allocator allocator) throws IOException {
    final byte[] array = codec.toPersistedFormat(object);
    final Bytes fromArray = new Bytes(array);

    try (CodecBuffer buffer = codec.toCodecBuffer(object, allocator)) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@szetszwo
Copy link
Contributor

... I will click on ready for review when I get a +1. I don't want to run the CI multiple times.

@swamirishi , we should pass the CI first before asking for reviewing -- if the change cannot pass CI, it may be fundamentally incorrect. What is the point for reviewing?

⚠️ Also, reviewer's time/effort is much more important than saving the machines' time/effort.

Change-Id: I913be571b87e318abc798c7396ca072de23b01e8
@swamirishi
Copy link
Contributor Author

swamirishi commented Dec 28, 2025

... I will click on ready for review when I get a +1. I don't want to run the CI multiple times.

@swamirishi , we should pass the CI first before asking for reviewing -- if the change cannot pass CI, it may be fundamentally incorrect. What is the point for reviewing?

⚠️ Also, reviewer's time/effort is much more important than saving the machines' time/effort.

One can always check the CI run on the fork. We have been following this model on almost all PRs for a while.
@adoroszlai can also add his opinions on this.

@swamirishi swamirishi requested a review from szetszwo December 28, 2025 23:31
@szetszwo
Copy link
Contributor

One can always check the CI run on the fork. ...

Please provide the link then. I would like to save my time for finding it.

@swamirishi
Copy link
Contributor Author

One can always check the CI run on the fork. ...

Please provide the link then. I would like to save my time for finding it.

https://github.com/swamirishi/ozone/actions/runs/20561058088

@swamirishi swamirishi marked this pull request as ready for review December 29, 2025 09:32
@swamirishi
Copy link
Contributor Author

swamirishi commented Dec 29, 2025

@szetszwo i have addressed all your review comments is this good to be merged?

Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@swamirishi , thanks for the update! Please see the comment inlined.

public String toString() {
return array != null ? bytes2String(array)
: bytes2String(asReadOnlyByteBuffer());
return slice.toString();
Copy link
Contributor

@szetszwo szetszwo Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To String Heap Value: test
To String Direct Value: test

Isn't the data supposed to be binary? This seems not working.

@swamirishi swamirishi requested a review from szetszwo December 29, 2025 17:52
Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 the change looks good.

@swamirishi
Copy link
Contributor Author

Thank you @szetszwo for reviewing the patch

@swamirishi swamirishi merged commit 64bb019 into apache:master Dec 29, 2025
95 of 96 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants