Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2019/05/07 01:47:05 UTC

[GitHub] [drill] Ben-Zvi commented on a change in pull request #1783: DRILL-7240: Catch runtime pruning filter-match exceptions and do not prune these rowgroups

Ben-Zvi commented on a change in pull request #1783: DRILL-7240: Catch runtime pruning filter-match exceptions and do not prune these rowgroups
URL: https://github.com/apache/drill/pull/1783#discussion_r281435745
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/AbstractParquetScanBatchCreator.java
 ##########
 @@ -184,18 +185,30 @@ protected ScanBatch getBatch(ExecutorFragmentContext context, AbstractParquetRow
           //
           // Perform the Run-Time Pruning - i.e. Skip this rowgroup if the match fails
           //
-          RowsMatch match = FilterEvaluatorUtils.matches(filterPredicate, columnsStatistics, footerRowCount);
-
-          // collect logging info
-          long timeToRead = pruneTimer.elapsed(TimeUnit.MICROSECONDS);
-          pruneTimer.stop();
-          pruneTimer.reset();
-          totalPruneTime += timeToRead;
-          logger.trace("Run-time pruning: {} row-group {} (RG index: {} row count: {}), took {} usec", // trace each single rowgroup
-            match == RowsMatch.NONE ? "Excluded" : "Included", rowGroup.getPath(), rowGroupIndex, footerRowCount, timeToRead);
+          RowsMatch matchResult = RowsMatch.ALL;
+          try {
+            matchResult = FilterEvaluatorUtils.matches(filterPredicate, columnsStatistics, footerRowCount);
+
+            // collect logging info
+            long timeToRead = pruneTimer.elapsed(TimeUnit.MICROSECONDS);
+            pruneTimer.stop();
+            pruneTimer.reset();
+            totalPruneTime += timeToRead;
+            logger.trace("Run-time pruning: {} row-group {} (RG index: {} row count: {}), took {} usec", // trace each single rowgroup
+              matchResult == RowsMatch.NONE ? "Excluded" : "Included", rowGroup.getPath(), rowGroupIndex, footerRowCount, timeToRead);
+          } catch (ClassCastException cce) {
+            if ( ! matchCastErrorNotified ) {
+              logger.info("Run-time pruning check failed due to type casting. Skipping pruning rowgroups starting from {}. (Error: {})", rowGroup.getPath(), cce.getMessage());
 
 Review comment:
   Done. Here is a sample from the log (tested with 6 rowgroups/files, three of which fail the filter match; of the remaining three, one was pruned):
   ```
   2019-05-06 18:37:19,809 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] TRACE o.a.d.e.s.p.AbstractParquetScanBatchCreator - ParquetTrace,Read Footer,,/tmp/twofoo/sub/0_0_3.parquet,,0,0,0,170059
   2019-05-06 18:37:19,828 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] TRACE o.a.d.e.s.p.AbstractParquetScanBatchCreator - ParquetTrace,Read Footer,,/tmp/twofoo/sub/0_0_2.parquet,,0,0,0,4691
   2019-05-06 18:37:19,829 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] TRACE o.a.d.e.s.p.AbstractParquetScanBatchCreator - Run-time pruning: Included row-group /tmp/twofoo/sub/0_0_2.parquet (RG index: 0 row count: 2), took 466 usec
   2019-05-06 18:37:19,869 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] TRACE o.a.d.e.s.p.AbstractParquetScanBatchCreator - ParquetTrace,Read Footer,,/tmp/twofoo/sub/0_0_0.parquet,,0,0,0,37316
   2019-05-06 18:37:19,870 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] TRACE o.a.d.e.s.p.AbstractParquetScanBatchCreator - Run-time pruning: Excluded row-group /tmp/twofoo/sub/0_0_0.parquet (RG index: 0 row count: 2), took 683 usec
   2019-05-06 18:37:19,909 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] TRACE o.a.d.e.s.p.AbstractParquetScanBatchCreator - ParquetTrace,Read Footer,,/tmp/twofoo/sub/0_0_1.parquet,,0,0,0,38874
   2019-05-06 18:37:19,914 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] TRACE o.a.d.e.s.p.AbstractParquetScanBatchCreator - ParquetTrace,Read Footer,,/tmp/twofoo/sub/0_0_4.parquet,,0,0,0,2417
   2019-05-06 18:37:19,915 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] TRACE o.a.d.e.s.p.AbstractParquetScanBatchCreator - Run-time pruning: Included row-group /tmp/twofoo/sub/0_0_4.parquet (RG index: 0 row count: 2), took 356 usec
   2019-05-06 18:37:19,919 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] TRACE o.a.d.e.s.p.AbstractParquetScanBatchCreator - ParquetTrace,Read Footer,,/tmp/twofoo/sub/0_0_5.parquet,,0,0,0,2749
   2019-05-06 18:37:19,922 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] INFO  o.a.d.e.s.p.AbstractParquetScanBatchCreator - Finished parquet_runtime_pruning in 1505 usec. Out of given 6 rowgroups, 1 were pruned. 
   2019-05-06 18:37:19,922 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] INFO  o.a.d.e.s.p.AbstractParquetScanBatchCreator - Run-time pruning check skipped for 3 out of 6 rowgroups due to: java.lang.Integer cannot be cast to java.lang.Long
   ```
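
The accounting behind the last two INFO lines can be sketched roughly as below. This is a minimal standalone sketch, not the code in the PR: the class `RuntimePruningSketch`, its `Result` holder, the `prune` and `demo` methods and the row-group names are all hypothetical, and the local `RowsMatch` enum only borrows the name that appears in the diff. It assumes that a rowgroup whose filter match throws a `ClassCastException` keeps the default `RowsMatch.ALL`, so it is read rather than pruned, and that only the first error message is retained for reporting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical standalone sketch; none of these names come from Drill itself.
class RuntimePruningSketch {

  // Local stand-in that borrows the enum name seen in the diff.
  enum RowsMatch { ALL, SOME, NONE }

  // Counters corresponding to the two INFO summary lines in the log sample.
  static final class Result {
    final List<String> kept = new ArrayList<>();
    int pruned;
    int skipped;
    String firstError;
  }

  static Result prune(List<String> rowGroups, Function<String, RowsMatch> matcher) {
    Result result = new Result();
    for (String rowGroup : rowGroups) {
      // Default to ALL so that a failed check never prunes the rowgroup.
      RowsMatch match = RowsMatch.ALL;
      try {
        match = matcher.apply(rowGroup);
      } catch (ClassCastException cce) {
        result.skipped++;
        if (result.firstError == null) {
          result.firstError = cce.getMessage(); // remember only the first failure
        }
      }
      if (match == RowsMatch.NONE) {
        result.pruned++;
      } else {
        result.kept.add(rowGroup);
      }
    }
    return result;
  }

  // Simulates the test above: six rowgroups, three fail the match, one is pruned.
  static Result demo() {
    List<String> rowGroups = Arrays.asList("rg0", "rg1", "rg2", "rg3", "rg4", "rg5");
    Function<String, RowsMatch> matcher = rg -> {
      if (rg.equals("rg1") || rg.equals("rg3") || rg.equals("rg5")) {
        // Simulated failure; a real one would come from comparing mismatched types.
        throw new ClassCastException("java.lang.Integer cannot be cast to java.lang.Long");
      }
      return rg.equals("rg0") ? RowsMatch.NONE : RowsMatch.SOME;
    };
    return prune(rowGroups, matcher);
  }

  public static void main(String[] args) {
    Result r = demo();
    System.out.println("pruned=" + r.pruned + " skipped=" + r.skipped + " kept=" + r.kept.size());
  }
}
```

In this sketch the three failing rowgroups are counted as skipped and still end up in `kept`, which matches the intent stated in the PR title (catch the exception and do not prune these rowgroups).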
