You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/06/16 04:56:23 UTC

[GitHub] [pinot] icefury71 opened a new pull request, #8897: Enable key value byte stitching in PulsarMessageBatch

icefury71 opened a new pull request, #8897:
URL: https://github.com/apache/pinot/pull/8897

   Add a config flag for Pulsar stream connector to enable key and value byte array stitching support in PulsarMessageBatch. This is important when ingesting from a Pulsar topic with a key value schema where the message data only includes value bytes and key bytes have to be retrieved separately.
   
   By stitching, we allow higher layers (eg: decoder) to access both key and value in the same byte array.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #8897: Enable key value byte stitching in PulsarMessageBatch

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #8897:
URL: https://github.com/apache/pinot/pull/8897#issuecomment-1157250140

   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8897?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#8897](https://codecov.io/gh/apache/pinot/pull/8897?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (7c2861c) into [master](https://codecov.io/gh/apache/pinot/commit/f3bde9ff674bc50ca296e22a0a8df2ded0168fa1?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (f3bde9f) will **decrease** coverage by `3.23%`.
   > The diff coverage is `64.06%`.
   
   > :exclamation: Current head 7c2861c differs from pull request most recent head e128cdc. Consider uploading reports for the commit e128cdc to get more accurate results
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #8897      +/-   ##
   ============================================
   - Coverage     69.60%   66.37%   -3.24%     
   + Complexity     4997     4667     -330     
   ============================================
     Files          1806     1355     -451     
     Lines         94202    68425   -25777     
     Branches      14050    10678    -3372     
   ============================================
   - Hits          65571    45418   -20153     
   + Misses        24072    19769    -4303     
   + Partials       4559     3238    -1321     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `?` | |
   | integration2 | `?` | |
   | unittests1 | `66.37% <64.06%> (-0.01%)` | :arrow_down: |
   | unittests2 | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/8897?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...e/pinot/common/utils/FileUploadDownloadClient.java](https://codecov.io/gh/apache/pinot/pull/8897/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvRmlsZVVwbG9hZERvd25sb2FkQ2xpZW50LmphdmE=) | `18.18% <ø> (-40.91%)` | :arrow_down: |
   | [...che/pinot/core/common/datablock/BaseDataBlock.java](https://codecov.io/gh/apache/pinot/pull/8897/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9jb21tb24vZGF0YWJsb2NrL0Jhc2VEYXRhQmxvY2suamF2YQ==) | `76.09% <ø> (ø)` | |
   | [...pinot/core/common/datablock/ColumnarDataBlock.java](https://codecov.io/gh/apache/pinot/pull/8897/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9jb21tb24vZGF0YWJsb2NrL0NvbHVtbmFyRGF0YUJsb2NrLmphdmE=) | `58.62% <ø> (ø)` | |
   | [...he/pinot/core/common/datablock/DataBlockUtils.java](https://codecov.io/gh/apache/pinot/pull/8897/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9jb21tb24vZGF0YWJsb2NrL0RhdGFCbG9ja1V0aWxzLmphdmE=) | `84.21% <ø> (ø)` | |
   | [...che/pinot/core/common/datablock/MetadataBlock.java](https://codecov.io/gh/apache/pinot/pull/8897/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9jb21tb24vZGF0YWJsb2NrL01ldGFkYXRhQmxvY2suamF2YQ==) | `55.55% <ø> (ø)` | |
   | [...ache/pinot/core/common/datablock/RowDataBlock.java](https://codecov.io/gh/apache/pinot/pull/8897/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9jb21tb24vZGF0YWJsb2NrL1Jvd0RhdGFCbG9jay5qYXZh) | `66.66% <ø> (ø)` | |
   | [...rg/apache/pinot/core/transport/ServerResponse.java](https://codecov.io/gh/apache/pinot/pull/8897/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS90cmFuc3BvcnQvU2VydmVyUmVzcG9uc2UuamF2YQ==) | `60.71% <0.00%> (-39.29%)` | :arrow_down: |
   | [...va/org/apache/pinot/query/runtime/QueryRunner.java](https://codecov.io/gh/apache/pinot/pull/8897/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcnVudGltZS9RdWVyeVJ1bm5lci5qYXZh) | `86.66% <ø> (ø)` | |
   | [.../pinot/query/runtime/blocks/TransferableBlock.java](https://codecov.io/gh/apache/pinot/pull/8897/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcnVudGltZS9ibG9ja3MvVHJhbnNmZXJhYmxlQmxvY2suamF2YQ==) | `58.33% <ø> (ø)` | |
   | [...he/pinot/segment/local/utils/SegmentPushUtils.java](https://codecov.io/gh/apache/pinot/pull/8897/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC91dGlscy9TZWdtZW50UHVzaFV0aWxzLmphdmE=) | `12.50% <0.00%> (-0.21%)` | :arrow_down: |
   | ... and [715 more](https://codecov.io/gh/apache/pinot/pull/8897/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8897?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8897?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [f3bde9f...e128cdc](https://codecov.io/gh/apache/pinot/pull/8897?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] icefury71 commented on pull request #8897: Enable key value byte stitching in PulsarMessageBatch

Posted by GitBox <gi...@apache.org>.
icefury71 commented on PR #8897:
URL: https://github.com/apache/pinot/pull/8897#issuecomment-1163933261

   > this change lgtm, but you might want to callout here and tag for release notes, that existing decoders won't work, along with guidelines on how to use this in that case?
   
   Done


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] icefury71 commented on a diff in pull request #8897: Enable key value byte stitching in PulsarMessageBatch

Posted by GitBox <gi...@apache.org>.
icefury71 commented on code in PR #8897:
URL: https://github.com/apache/pinot/pull/8897#discussion_r904520517


##########
pinot-plugins/pinot-stream-ingestion/pinot-pulsar/src/main/java/org/apache/pinot/plugin/stream/pulsar/PulsarMessageBatch.java:
##########
@@ -48,6 +54,10 @@ public int getMessageCount() {
 
   @Override
   public byte[] getMessageAtIndex(int index) {
+    Message<byte[]> msg = _messageList.get(index);
+    if (_enableKeyValueStitch) {
+      return stitchKeyValue(msg.getKeyBytes(), msg.getData());
+    }
     return _messageList.get(index).getData();

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] npawar commented on a diff in pull request #8897: Enable key value byte stitching in PulsarMessageBatch

Posted by GitBox <gi...@apache.org>.
npawar commented on code in PR #8897:
URL: https://github.com/apache/pinot/pull/8897#discussion_r903047881


##########
pinot-plugins/pinot-stream-ingestion/pinot-pulsar/src/main/java/org/apache/pinot/plugin/stream/pulsar/PulsarMessageBatch.java:
##########
@@ -48,6 +54,10 @@ public int getMessageCount() {
 
   @Override
   public byte[] getMessageAtIndex(int index) {
+    Message<byte[]> msg = _messageList.get(index);
+    if (_enableKeyValueStitch) {
+      return stitchKeyValue(msg.getKeyBytes(), msg.getData());
+    }
     return _messageList.get(index).getData();

Review Comment:
   nit: s/_messageList.get(index)/msg



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] KKcorps merged pull request #8897: Enable key value byte stitching in PulsarMessageBatch

Posted by GitBox <gi...@apache.org>.
KKcorps merged PR #8897:
URL: https://github.com/apache/pinot/pull/8897


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] navina commented on a diff in pull request #8897: Enable key value byte stitching in PulsarMessageBatch

Posted by GitBox <gi...@apache.org>.
navina commented on code in PR #8897:
URL: https://github.com/apache/pinot/pull/8897#discussion_r898716958


##########
pinot-plugins/pinot-stream-ingestion/pinot-pulsar/src/main/java/org/apache/pinot/plugin/stream/pulsar/PulsarMessageBatch.java:
##########
@@ -105,4 +119,26 @@ public StreamPartitionMsgOffset getNextStreamPartitionMsgOffsetAtIndex(int index
   public long getNextStreamMessageOffsetAtIndex(int index) {
     throw new UnsupportedOperationException("Pulsar does not support long stream offsets");
   }
+
+  /**
+   * Stitch key and value bytes together using a simple format:
+   * 4 bytes for key length + key bytes + 4 bytes for value length + value bytes
+   */
+  private byte[] stitchKeyValue(byte[] keyBytes, byte[] valueBytes) {

Review Comment:
   I feel like this is the case for other stream connectors too. I don't think the decoders have access the message key or message headers today. 
   
   A more elegant approach maybe to use `MessageBatch<StreamMessageType>`, where StreamMessageType can contain payload and metadata. But I suspect you want to avoid a more invasive code change? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] KKcorps commented on a diff in pull request #8897: Enable key value byte stitching in PulsarMessageBatch

Posted by GitBox <gi...@apache.org>.
KKcorps commented on code in PR #8897:
URL: https://github.com/apache/pinot/pull/8897#discussion_r899079960


##########
pinot-plugins/pinot-stream-ingestion/pinot-pulsar/src/main/java/org/apache/pinot/plugin/stream/pulsar/PulsarMessageBatch.java:
##########
@@ -105,4 +119,26 @@ public StreamPartitionMsgOffset getNextStreamPartitionMsgOffsetAtIndex(int index
   public long getNextStreamMessageOffsetAtIndex(int index) {
     throw new UnsupportedOperationException("Pulsar does not support long stream offsets");
   }
+
+  /**
+   * Stitch key and value bytes together using a simple format:
+   * 4 bytes for key length + key bytes + 4 bytes for value length + value bytes
+   */
+  private byte[] stitchKeyValue(byte[] keyBytes, byte[] valueBytes) {

Review Comment:
   Yep, that particular change will touch too many classes. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org