Posted to oak-issues@jackrabbit.apache.org by "Axel Hanikel (Jira)" <ji...@apache.org> on 2023/06/19 09:16:00 UTC

[jira] [Created] (OAK-10311) Optimize SegmentBlob#equals for segment blobs that originate from the same blob store

Axel Hanikel created OAK-10311:
----------------------------------

             Summary: Optimize SegmentBlob#equals for segment blobs that originate from the same blob store
                 Key: OAK-10311
                 URL: https://issues.apache.org/jira/browse/OAK-10311
             Project: Jackrabbit Oak
          Issue Type: Story
          Components: segment-tar
    Affects Versions: 1.52.0
            Reporter: Axel Hanikel
             Fix For: 1.54.0


[SegmentBlob#equals|https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.52.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentBlob.java] can be optimized for blobs originating from the same external blob store. In that case, SegmentBlob#getBlobId() can be used to determine whether the blobs are equal. That change should avoid falling back to a byte-by-byte comparison when the two blobs aren't equal and therefore don't have the same ContentIdentity. Such a comparison can be very expensive for larger blobs.
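
The idea can be illustrated with a minimal sketch. This is not Oak's SegmentBlob: the class SketchBlob, its fields, and the byteComparisons counter are hypothetical names introduced here for illustration only. The sketch assumes that within a single blob store, equal blob IDs imply equal content and different blob IDs imply different content.

```java
import java.util.Arrays;

// Hypothetical stand-in for a blob backed by an external blob store.
// This is NOT Oak's SegmentBlob; all names here are illustrative assumptions.
final class SketchBlob {
    // Counts how often the expensive byte-by-byte fallback was taken.
    static int byteComparisons = 0;

    private final Object blobStore; // identity of the store the blob came from
    private final String blobId;    // ID assigned by that store, may be null
    private final byte[] content;

    SketchBlob(Object blobStore, String blobId, byte[] content) {
        this.blobStore = blobStore;
        this.blobId = blobId;
        this.content = content;
    }

    boolean contentEquals(SketchBlob other) {
        if (this == other) {
            return true;
        }
        // Assumption: blobs from the same store are equal exactly when
        // their blob IDs are equal, so no content needs to be read.
        if (blobStore != null && blobStore == other.blobStore
                && blobId != null && other.blobId != null) {
            return blobId.equals(other.blobId);
        }
        // Different stores or missing IDs: fall back to comparing bytes.
        byteComparisons++;
        return Arrays.equals(content, other.content);
    }
}
```

With two blobs from the same store and different IDs, contentEquals returns false without touching the content. Blobs from different stores still go through the byte comparison.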

The optimization should be placed behind a code/feature toggle or system property and disabled by default. The reason is that the contract of [Blob#getContentIdentity|https://github.com/apache/jackrabbit-oak/blob/cc0521e1cf8dc10ad7a8d41a9f2d3fd2905e5c9b/oak-api/src/main/java/org/apache/jackrabbit/oak/api/Blob.java#L80-L82] does not mention the special case where blobs reside in the same blob store.
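
A system property that is disabled by default could look roughly like the following sketch. The class name FastEqualsToggle and the property name sketch.blob.fastEquals are invented for this example and are not existing Oak names.

```java
// Hypothetical toggle holder; the class and property names are
// assumptions made for this sketch, not part of Oak.
final class FastEqualsToggle {
    static final String PROPERTY = "sketch.blob.fastEquals";

    private FastEqualsToggle() {
    }

    // Boolean.getBoolean returns false when the property is unset,
    // so the optimization stays off unless explicitly enabled.
    static boolean isEnabled() {
        return Boolean.getBoolean(PROPERTY);
    }
}
```

Under these assumptions, the optimization would be switched on by starting the JVM with -Dsketch.blob.fastEquals=true, and the equals implementation would consult FastEqualsToggle.isEnabled() before taking the blob ID shortcut.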

As part of this story, integration and benchmark tests should be created to demonstrate gains in performance and correct behavior. 




--
This message was sent by Atlassian Jira
(v8.20.10#820010)