You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/12/02 23:03:14 UTC

[GitHub] [pinot] zhtaoxiang opened a new pull request, #9906: [bugfix] handle the case when dataschema does not have columns

zhtaoxiang opened a new pull request, #9906:
URL: https://github.com/apache/pinot/pull/9906

   Previously, the splitBlock logic throws `Runtime failure: java.lang.ArithmeticException: / by zero`. This PR fixes this issue.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #9906: [bugfix] handle the case when dataschema does not have columns

Posted by GitBox <gi...@apache.org>.
walterddr commented on code in PR #9906:
URL: https://github.com/apache/pinot/pull/9906#discussion_r1040002426


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/blocks/TransferableBlockUtils.java:
##########
@@ -76,9 +76,13 @@ public static boolean isNoOpBlock(TransferableBlock transferableBlock) {
   public static Iterator<TransferableBlock> splitBlock(TransferableBlock block, DataBlock.Type type, int maxBlockSize) {
     List<TransferableBlock> blockChunks = new ArrayList<>();
     if (type == DataBlock.Type.ROW) {
-      // Use estimated row size, this estimate is not accurate and is used to estimate numRowsPerChunk only.
-      int estimatedRowSizeInBytes = block.getDataSchema().getColumnNames().length * MEDIAN_COLUMN_SIZE_BYTES;
-      int numRowsPerChunk = maxBlockSize / estimatedRowSizeInBytes;
+      // Use 1 row per chunk when block.getDataSchema().size() is 0, we may want to fine tune it.
+      int numRowsPerChunk = 1;
+      if (block.getDataSchema().size() != 0) {
+        // Use estimated row size, this estimate is not accurate and is used to estimate numRowsPerChunk only.
+        int estimatedRowSizeInBytes = block.getDataSchema().size() * MEDIAN_COLUMN_SIZE_BYTES;
+        numRowsPerChunk = maxBlockSize / estimatedRowSizeInBytes;
+      }
       Preconditions.checkState(numRowsPerChunk > 0, "row size too large for query engine to handle, abort!");

Review Comment:
   yeah dataSchema should not be zero once reach this far down. it looks like a bug somewhere else in the planner 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] jackjlli commented on a diff in pull request #9906: [bugfix] handle the case when dataschema does not have columns

Posted by GitBox <gi...@apache.org>.
jackjlli commented on code in PR #9906:
URL: https://github.com/apache/pinot/pull/9906#discussion_r1039950989


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/blocks/TransferableBlockUtils.java:
##########
@@ -76,9 +76,13 @@ public static boolean isNoOpBlock(TransferableBlock transferableBlock) {
   public static Iterator<TransferableBlock> splitBlock(TransferableBlock block, DataBlock.Type type, int maxBlockSize) {
     List<TransferableBlock> blockChunks = new ArrayList<>();
     if (type == DataBlock.Type.ROW) {
-      // Use estimated row size, this estimate is not accurate and is used to estimate numRowsPerChunk only.
-      int estimatedRowSizeInBytes = block.getDataSchema().getColumnNames().length * MEDIAN_COLUMN_SIZE_BYTES;
-      int numRowsPerChunk = maxBlockSize / estimatedRowSizeInBytes;
+      // Use 1 row per chunk when block.getDataSchema().size() is 0, we may want to fine tune it.
+      int numRowsPerChunk = 1;
+      if (block.getDataSchema().size() != 0) {
+        // Use estimated row size, this estimate is not accurate and is used to estimate numRowsPerChunk only.
+        int estimatedRowSizeInBytes = block.getDataSchema().size() * MEDIAN_COLUMN_SIZE_BYTES;
+        numRowsPerChunk = maxBlockSize / estimatedRowSizeInBytes;
+      }
       Preconditions.checkState(numRowsPerChunk > 0, "row size too large for query engine to handle, abort!");

Review Comment:
   nit: you may also want to adjust this precondition, as it's no longer helpful?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #9906: [bugfix] handle the case when dataschema does not have columns

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #9906:
URL: https://github.com/apache/pinot/pull/9906#issuecomment-1335966281

   # [Codecov](https://codecov.io/gh/apache/pinot/pull/9906?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#9906](https://codecov.io/gh/apache/pinot/pull/9906?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2ee625a) into [master](https://codecov.io/gh/apache/pinot/commit/85e413ebe8c80513a8a676fc090b1af0f50861f0?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (85e413e) will **decrease** coverage by `35.75%`.
   > The diff coverage is `0.00%`.
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #9906       +/-   ##
   =============================================
   - Coverage     60.86%   25.11%   -35.76%     
   + Complexity     5378       44     -5334     
   =============================================
     Files          1969     1969               
     Lines        105939   105941        +2     
     Branches      16073    16074        +1     
   =============================================
   - Hits          64479    26605    -37874     
   - Misses        36689    76641    +39952     
   + Partials       4771     2695     -2076     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `25.11% <0.00%> (?)` | |
   | integration2 | `?` | |
   | unittests1 | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/9906?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...t/query/runtime/blocks/TransferableBlockUtils.java](https://codecov.io/gh/apache/pinot/pull/9906/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcnVudGltZS9ibG9ja3MvVHJhbnNmZXJhYmxlQmxvY2tVdGlscy5qYXZh) | `0.00% <0.00%> (-86.96%)` | :arrow_down: |
   | [...in/java/org/apache/pinot/spi/utils/BytesUtils.java](https://codecov.io/gh/apache/pinot/pull/9906/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvQnl0ZXNVdGlscy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/spi/trace/BaseRecording.java](https://codecov.io/gh/apache/pinot/pull/9906/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdHJhY2UvQmFzZVJlY29yZGluZy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/spi/trace/NoOpRecording.java](https://codecov.io/gh/apache/pinot/pull/9906/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdHJhY2UvTm9PcFJlY29yZGluZy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ava/org/apache/pinot/spi/config/table/FSTType.java](https://codecov.io/gh/apache/pinot/pull/9906/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvY29uZmlnL3RhYmxlL0ZTVFR5cGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ava/org/apache/pinot/spi/config/user/RoleType.java](https://codecov.io/gh/apache/pinot/pull/9906/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvY29uZmlnL3VzZXIvUm9sZVR5cGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ava/org/apache/pinot/spi/data/MetricFieldSpec.java](https://codecov.io/gh/apache/pinot/pull/9906/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9NZXRyaWNGaWVsZFNwZWMuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ava/org/apache/pinot/spi/stream/StreamMessage.java](https://codecov.io/gh/apache/pinot/pull/9906/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvc3RyZWFtL1N0cmVhbU1lc3NhZ2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/common/tier/TierFactory.java](https://codecov.io/gh/apache/pinot/pull/9906/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdGllci9UaWVyRmFjdG9yeS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...a/org/apache/pinot/spi/config/table/TableType.java](https://codecov.io/gh/apache/pinot/pull/9906/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvY29uZmlnL3RhYmxlL1RhYmxlVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [1319 more](https://codecov.io/gh/apache/pinot/pull/9906/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] zhtaoxiang closed pull request #9906: [multistage][bugfix] handle the case when dataschema does not have columns

Posted by "zhtaoxiang (via GitHub)" <gi...@apache.org>.
zhtaoxiang closed pull request #9906: [multistage][bugfix] handle the case when dataschema does not have columns
URL: https://github.com/apache/pinot/pull/9906


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] zhtaoxiang commented on a diff in pull request #9906: [bugfix] handle the case when dataschema does not have columns

Posted by GitBox <gi...@apache.org>.
zhtaoxiang commented on code in PR #9906:
URL: https://github.com/apache/pinot/pull/9906#discussion_r1039997133


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/blocks/TransferableBlockUtils.java:
##########
@@ -76,9 +76,13 @@ public static boolean isNoOpBlock(TransferableBlock transferableBlock) {
   public static Iterator<TransferableBlock> splitBlock(TransferableBlock block, DataBlock.Type type, int maxBlockSize) {
     List<TransferableBlock> blockChunks = new ArrayList<>();
     if (type == DataBlock.Type.ROW) {
-      // Use estimated row size, this estimate is not accurate and is used to estimate numRowsPerChunk only.
-      int estimatedRowSizeInBytes = block.getDataSchema().getColumnNames().length * MEDIAN_COLUMN_SIZE_BYTES;
-      int numRowsPerChunk = maxBlockSize / estimatedRowSizeInBytes;
+      // Use 1 row per chunk when block.getDataSchema().size() is 0, we may want to fine tune it.
+      int numRowsPerChunk = 1;
+      if (block.getDataSchema().size() != 0) {
+        // Use estimated row size, this estimate is not accurate and is used to estimate numRowsPerChunk only.
+        int estimatedRowSizeInBytes = block.getDataSchema().size() * MEDIAN_COLUMN_SIZE_BYTES;
+        numRowsPerChunk = maxBlockSize / estimatedRowSizeInBytes;
+      }
       Preconditions.checkState(numRowsPerChunk > 0, "row size too large for query engine to handle, abort!");

Review Comment:
   I had an offline discussion with @walterddr . He  mentioned that 
   1/ we should throw exception when dataSchema size is 0
   2/ the current built query plan for `select 1 as one from {table} where 1 < 2` is not correct, we should update the query plan building logic.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] 61yao commented on a diff in pull request #9906: [multistage][bugfix] handle the case when dataschema does not have columns

Posted by GitBox <gi...@apache.org>.
61yao commented on code in PR #9906:
URL: https://github.com/apache/pinot/pull/9906#discussion_r1040167803


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/blocks/TransferableBlockUtils.java:
##########
@@ -76,9 +76,13 @@ public static boolean isNoOpBlock(TransferableBlock transferableBlock) {
   public static Iterator<TransferableBlock> splitBlock(TransferableBlock block, DataBlock.Type type, int maxBlockSize) {
     List<TransferableBlock> blockChunks = new ArrayList<>();
     if (type == DataBlock.Type.ROW) {
-      // Use estimated row size, this estimate is not accurate and is used to estimate numRowsPerChunk only.
-      int estimatedRowSizeInBytes = block.getDataSchema().getColumnNames().length * MEDIAN_COLUMN_SIZE_BYTES;
-      int numRowsPerChunk = maxBlockSize / estimatedRowSizeInBytes;
+      // Use 1 row per chunk when block.getDataSchema().size() is 0, we may want to fine tune it.
+      int numRowsPerChunk = 1;
+      if (block.getDataSchema().size() != 0) {
+        // Use estimated row size, this estimate is not accurate and is used to estimate numRowsPerChunk only.
+        int estimatedRowSizeInBytes = block.getDataSchema().size() * MEDIAN_COLUMN_SIZE_BYTES;
+        numRowsPerChunk = maxBlockSize / estimatedRowSizeInBytes;
+      }
       Preconditions.checkState(numRowsPerChunk > 0, "row size too large for query engine to handle, abort!");

Review Comment:
   @walterddr, For select 1 from tbl. the dataSchema should int I guess? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org