You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2021/08/07 08:53:52 UTC

[GitHub] [kafka] showuon opened a new pull request #11187: [WIP] KAFKA-5966: let CacheFunction#key return ByteBuffer

showuon opened a new pull request #11187:
URL: https://github.com/apache/kafka/pull/11187


   
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] showuon commented on a change in pull request #11187: [WIP] KAFKA-5966: let CacheFunction#key return ByteBuffer

Posted by GitBox <gi...@apache.org>.
showuon commented on a change in pull request #11187:
URL: https://github.com/apache/kafka/pull/11187#discussion_r684704095



##########
File path: clients/src/main/java/org/apache/kafka/common/utils/Bytes.java
##########
@@ -40,6 +41,26 @@ public static Bytes wrap(byte[] bytes) {
         return new Bytes(bytes);
     }
 
+    /**
+     * Create a Bytes using the byte buffer. If the provided byteBuffer contains the whole content, we can directly
+     * use the backed array. If the byteBuffer has only partial of the content (ex: a sliced byteBuffer), we'll do array copy
+     *
+     * @param byteBuffer    The byteBuffer becomes the backing storage for the object.
+     */
+    public static Bytes wrap(ByteBuffer byteBuffer) {

Review comment:
       Create a `Bytes.wrap` for byteBuffer instance, and see if we can re-use the backed array directly, or we need to do array copy.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] showuon commented on a change in pull request #11187: [WIP] KAFKA-5966: let CacheFunction#key return ByteBuffer

Posted by GitBox <gi...@apache.org>.
showuon commented on a change in pull request #11187:
URL: https://github.com/apache/kafka/pull/11187#discussion_r684704566



##########
File path: streams/src/main/java/org/apache/kafka/streams/state/internals/MergedSortedCacheSessionStoreIterator.java
##########
@@ -45,8 +47,8 @@
 
     @Override
     Windowed<Bytes> deserializeCacheKey(final Bytes cacheKey) {
-        final byte[] binaryKey = cacheFunction.key(cacheKey).get();
-        final byte[] keyBytes = SessionKeySchema.extractKeyBytes(binaryKey);
+        final ByteBuffer binaryKey = cacheFunction.key(cacheKey);
+        final ByteBuffer keyBytes = SessionKeySchema.extractKeyBytes(binaryKey);
         final Window window = SessionKeySchema.extractWindow(binaryKey);

Review comment:
       This is a good example that we avoided unnecessary array copy.
   Before:
   1. `cacheFunction.key(cacheKey)`  -> do an array copy
   2. `SessionKeySchema.extractKeyBytes(binaryKey)` -> do an array copy
   
   After:
   1. `cacheFunction.key(cacheKey)`  -> no array copy, just return sliced byte buffer
   2. `SessionKeySchema.extractKeyBytes(binaryKey)` -> no array copy, just return sliced byte buffer
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] showuon closed pull request #11187: [WIP] KAFKA-5966: let CacheFunction#key return ByteBuffer

Posted by GitBox <gi...@apache.org>.
showuon closed pull request #11187:
URL: https://github.com/apache/kafka/pull/11187


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] showuon commented on pull request #11187: [WIP] KAFKA-5966: let CacheFunction#key return ByteBuffer

Posted by GitBox <gi...@apache.org>.
showuon commented on pull request #11187:
URL: https://github.com/apache/kafka/pull/11187#issuecomment-903380258


   @guozhangwang , thanks for the response. I closed this PR because I think we need a bigger change (maybe a KIP), to address this `byte[]` change to `ByteBuffer` change, to make the whole API can benefit from the `ByteBuffer` change. Otherwise, like in this PR, we have to wrap the `ByteBuffer` back to `byte[]` in many situations, because they still accept the `byte[]` only.
   
   I'll think about it. Thank you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] showuon commented on pull request #11187: [WIP] KAFKA-5966: let CacheFunction#key return ByteBuffer

Posted by GitBox <gi...@apache.org>.
showuon commented on pull request #11187:
URL: https://github.com/apache/kafka/pull/11187#issuecomment-894739236


   @xvrl @guozhangwang , could you help take a look and make sure I'm on the right way? I know in KAFKA-5966, we expected to support for serdes into/from `ByteBuffer` in streams, but that might impact and change a lot. Do you think we should do this change? Or we can just have a small change like this PR, to have small improvement? 
   
   Thank you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] showuon edited a comment on pull request #11187: [WIP] KAFKA-5966: let CacheFunction#key return ByteBuffer

Posted by GitBox <gi...@apache.org>.
showuon edited a comment on pull request #11187:
URL: https://github.com/apache/kafka/pull/11187#issuecomment-903380258


   @guozhangwang , thanks for the response. I closed this PR because I think we need a bigger change (maybe a KIP), to address this `byte[]` to `ByteBuffer` change, to make the whole API can benefit from the `ByteBuffer` change. Otherwise, like in this PR, we have to wrap the `ByteBuffer` back to `byte[]` in many situations, because they still accept the `byte[]` only.
   
   I'll think about it. Thank you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] showuon commented on a change in pull request #11187: [WIP] KAFKA-5966: let CacheFunction#key return ByteBuffer

Posted by GitBox <gi...@apache.org>.
showuon commented on a change in pull request #11187:
URL: https://github.com/apache/kafka/pull/11187#discussion_r684704759



##########
File path: streams/src/main/java/org/apache/kafka/streams/state/internals/MergedSortedCacheWindowStoreIterator.java
##########
@@ -43,7 +45,7 @@
 
     @Override
     Long deserializeCacheKey(final Bytes cacheKey) {
-        final byte[] binaryKey = bytesFromCacheKey(cacheKey);
+        final ByteBuffer binaryKey = bytesFromCacheKey(cacheKey);
         return WindowKeySchema.extractStoreTimestamp(binaryKey);

Review comment:
       Another example that we avoided the unnecessary array copy by returning sliced byte buffer via `bytesFromCacheKey`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] showuon edited a comment on pull request #11187: [WIP] KAFKA-5966: let CacheFunction#key return ByteBuffer

Posted by GitBox <gi...@apache.org>.
showuon edited a comment on pull request #11187:
URL: https://github.com/apache/kafka/pull/11187#issuecomment-894739236


   @xvrl @guozhangwang , could you help take a look and make sure I'm on the right way? I know in KAFKA-5966, we expected to support for serdes into/from `ByteBuffer` in streams, but that might impact and change a lot. Do you think we should do this change? If so, which ser/des you recommended I can start with? 
   Or we just want a small change like this PR, to have small improvement? 
   
   Thank you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] showuon edited a comment on pull request #11187: [WIP] KAFKA-5966: let CacheFunction#key return ByteBuffer

Posted by GitBox <gi...@apache.org>.
showuon edited a comment on pull request #11187:
URL: https://github.com/apache/kafka/pull/11187#issuecomment-903380258


   @guozhangwang , thanks for the response. I closed this PR because I think we need a bigger change (maybe a KIP), to address this `byte[]` to `ByteBuffer` change, to make the whole API can benefit from the `ByteBuffer` change. Otherwise, like in this PR, we have to wrap the `ByteBuffer` back to `byte[]` in many situations, because most existing APIs still accept the `byte[]` only.
   
   I'll think about it. Thank you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org