Posted to dev@drill.apache.org by Adam Gilmore <dr...@gmail.com> on 2015/05/05 03:27:10 UTC

Review Request 33836: DRILL-1950: Parquet pushdown filtering

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33836/
-----------------------------------------------------------

Review request for drill and Jacques Nadeau.


Repository: drill-git


Description
-------

An implementation of Parquet pushdown filtering for Drill.  More details can be found in the JIRA item (DRILL-1950).


Diffs
-----

  contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseStoragePlugin.java c10b0abd0d79e94524a7f4ce0043cd63ace60a69 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveStoragePlugin.java f4baf3b023f83452eea2fccc5b54feeaa262b53c 
  contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoStoragePlugin.java e46d8ec589785d5db1b655dae0857464fa3c9159 
  exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java bd93206d531699ee4f3fcaff11ce0df143f3032f 
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java ca2a0486d7f18fb27a592c0824dbef1a31cb3b70 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java b98778d2a90eefc297df81739ce945beb1080021 
  exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java 1a8559e54810bd9c678bece13b1f42aad89c00e2 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/AbstractStoragePlugin.java b032fce8763d92e58993b78b48993444cb84e5c3 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java ef5978c8b1b9c91d4d6a873ff798ae096b0bd1e7 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePluginRegistry.java 5d0eed6a8446b02610ccb0e9f2b4a667142b3900 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemPlugin.java c5ca41b160782e6421c6d9f90dbfea555d5d14b8 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FormatPlugin.java 58d5b426937accee0c3ba1752e40bfec8df23e19 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/easy/EasyFormatPlugin.java 6e1e0cc25109beabd119fbb6aac13a3fea670c55 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/ischema/InfoSchemaStoragePlugin.java 77c6b9ad0c53fc29e7ed73d9d184f411e3f21308 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/EmptyRowGroupScan.java PRE-CREATION 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/EmptyScanBatchCreator.java PRE-CREATION 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetCompareFunctionProcessor.java PRE-CREATION 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetFilterBuilder.java PRE-CREATION 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetFormatPlugin.java e204a2c28a39c3a647686f09445af3682185fa5e 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetGroupScan.java acac61f35448b2884c60a03830532ccd24bc638a 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetPushDownFilter.java PRE-CREATION 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRowGroupScan.java fd40f417a65921efccae2deb5226e6adf86364c3 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetScanBatchCreator.java c1f815e68945bd454d0b6fd268a17a33b6eda00c 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet2/DrillParquetReader.java 9f8357bef94d6bc00392d32a61bf35ca10a8feaa 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet2/DrillParquetRecordMaterializer.java 720e8be7f5befb3039dccd062230ca686cb5f32e 
  exec/java-exec/src/main/java/parquet/hadoop/FilterPredicateSerializer.java PRE-CREATION 
  exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestFilterPushdown.java PRE-CREATION 
  exec/java-exec/src/test/java/parquet/hadoop/TestFilterPredicateSerializer.java PRE-CREATION 
  exec/java-exec/src/test/resources/parquet/pushdown/0_0_0.parquet PRE-CREATION 
  exec/java-exec/src/test/resources/parquet/pushdown/0_0_1.parquet PRE-CREATION 
  protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java ac1bcbb8761b69dc944180234399c0badc517e4f 

Diff: https://reviews.apache.org/r/33836/diff/


Testing
-------

I have created a number of test cases to verify that the filter is correctly pushed down in various scenarios.  These tests also confirm that the pushdown filter itself works correctly: the pushed-down plan is only chosen if the row group filtering actually runs and produces a lower estimate of rows scanned, which is what leads the optimizer to pick it as the better plan.
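
To illustrate the reasoning above, here is a minimal sketch of how skipping row groups by their min/max statistics lowers the estimated row count that an optimizer would see.  It is not the code from this patch and does not use the Parquet or Drill APIs; the class RowGroupPruningSketch, the RowGroupStats type and both method names are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not Drill's or Parquet's actual classes.
class RowGroupPruningSketch {

    /** Stand-in for the statistics a row group keeps for one column. */
    static final class RowGroupStats {
        final long rowCount;
        final long min;
        final long max;

        RowGroupStats(long rowCount, long min, long max) {
            this.rowCount = rowCount;
            this.min = min;
            this.max = max;
        }
    }

    /**
     * Keeps only the row groups that could contain a value greater than
     * the threshold, i.e. those whose maximum exceeds it.
     */
    static List<RowGroupStats> pruneGreaterThan(List<RowGroupStats> groups, long threshold) {
        List<RowGroupStats> kept = new ArrayList<>();
        for (RowGroupStats g : groups) {
            if (g.max > threshold) {
                kept.add(g);
            }
        }
        return kept;
    }

    /** Sums the row counts of the given row groups. */
    static long estimatedRows(List<RowGroupStats> groups) {
        long total = 0;
        for (RowGroupStats g : groups) {
            total += g.rowCount;
        }
        return total;
    }

    public static void main(String[] args) {
        List<RowGroupStats> groups = Arrays.asList(
                new RowGroupStats(1000, 0, 99),
                new RowGroupStats(1000, 100, 199),
                new RowGroupStats(1000, 200, 299));

        // Filter "col > 150": the first row group cannot match and is skipped.
        List<RowGroupStats> kept = pruneGreaterThan(groups, 150);

        System.out.println("Estimated rows without pushdown: " + estimatedRows(groups)); // 3000
        System.out.println("Estimated rows with pushdown: " + estimatedRows(kept));      // 2000
    }
}
```

In this sketch the pruned scan reports 2000 rows instead of 3000.  A lower estimate of that kind is what the tests rely on when they check that the pushed-down plan was selected.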


Thanks,

Adam Gilmore


Re: Review Request 33836: DRILL-1950: Parquet pushdown filtering

Posted by Adam Gilmore <dr...@gmail.com>.

(Updated July 15, 2015, 4:41 a.m.)


Review request for drill and Jacques Nadeau.


Repository: drill-git


Description
-------

An implementation of Parquet pushdown filtering for Drill.  More details can be found in the JIRA item (DRILL-1950).


Diffs (updated)
-----

  contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseStoragePlugin.java 7737f69 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveStoragePlugin.java fb827cc 
  contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoStoragePlugin.java 093df57 
  exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 140e9a8 
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java 6bf1280 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java 2d1bac2 
  exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java 2d41740 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/AbstractStoragePlugin.java 58c8622 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java b60c16f 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePluginRegistry.java 80a0876 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemPlugin.java 4ae0cc8 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FormatPlugin.java 14f1441 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/easy/EasyFormatPlugin.java 3c2b806 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/ischema/InfoSchemaStoragePlugin.java 597d24c 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/EmptyRowGroupScan.java PRE-CREATION 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/EmptyScanBatchCreator.java PRE-CREATION 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetCompareFunctionProcessor.java PRE-CREATION 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetFilterBuilder.java PRE-CREATION 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetFormatPlugin.java 56a1f00 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetGroupScan.java 845bce9 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetPushDownFilter.java PRE-CREATION 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRowGroupScan.java 987f792 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetScanBatchCreator.java 441a707 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet2/DrillParquetReader.java 4e7d628 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet2/DrillParquetRecordMaterializer.java a80eb57 
  exec/java-exec/src/main/java/parquet/hadoop/FilterPredicateSerializer.java PRE-CREATION 
  exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestFilterPushdown.java PRE-CREATION 
  exec/java-exec/src/test/java/parquet/hadoop/TestFilterPredicateSerializer.java PRE-CREATION 
  exec/java-exec/src/test/resources/parquet/pushdown/0_0_0.parquet PRE-CREATION 
  exec/java-exec/src/test/resources/parquet/pushdown/0_0_1.parquet PRE-CREATION 
  protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java e76d748 

Diff: https://reviews.apache.org/r/33836/diff/


Testing
-------

I have created a number of test cases to verify that the filter is correctly pushed down in various scenarios.  These tests also confirm that the pushdown filter itself works correctly: the pushed-down plan is only chosen if the row group filtering actually runs and produces a lower estimate of rows scanned, which is what leads the optimizer to pick it as the better plan.


Thanks,

Adam Gilmore