You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by Mehant Baid <ba...@gmail.com> on 2015/07/25 09:35:00 UTC

Review Request 36809: DRILL-3121: Hive partition pruning

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36809/
-----------------------------------------------------------

Review request for drill and Aman Sinha.


Repository: drill-git


Description
-------

Add support for interpreter based partition pruning for hive tables. Also removes the old partition pruning logic.


Diffs
-----

  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionDescriptor.java 8307dff 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionLocation.java PRE-CREATION 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/logical/HivePushPartitionFilterIntoScan.java 6ab1a78 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDataTypeUtility.java PRE-CREATION 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveRecordReader.java 088fb74 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveStoragePlugin.java fb827cc 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveTable.java 99101cc 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DirPathBuilder.java 892e8cb 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushPartitionFilterIntoScan.java b83cedd 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/PartitionPruningUtil.java 05ccfb9 

Diff: https://reviews.apache.org/r/36809/diff/


Testing
-------

Pending unit tests.


Thanks,

Mehant Baid


Re: Review Request 36809: DRILL-3121: Hive partition pruning

Posted by Aman Sinha <as...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36809/#review93701
-----------------------------------------------------------

Ship it!


Ship It!

- Aman Sinha


On July 31, 2015, 3:32 a.m., Mehant Baid wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36809/
> -----------------------------------------------------------
> 
> (Updated July 31, 2015, 3:32 a.m.)
> 
> 
> Review request for drill and Aman Sinha.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> Add support for interpreter based partition pruning for hive tables. Also removes the old partition pruning logic.
> 
> 
> Diffs
> -----
> 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionDescriptor.java 8307dff 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionLocation.java PRE-CREATION 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/logical/HivePushPartitionFilterIntoScan.java 6ab1a78 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDataTypeUtility.java PRE-CREATION 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveRecordReader.java 088fb74 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveStoragePlugin.java fb827cc 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveTable.java 99101cc 
>   contrib/storage-hive/core/src/test/java/org/apache/drill/exec/TestHivePartitionPruning.java c846328 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DirPathBuilder.java 892e8cb 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushPartitionFilterIntoScan.java b83cedd 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/PartitionPruningUtil.java 05ccfb9 
> 
> Diff: https://reviews.apache.org/r/36809/diff/
> 
> 
> Testing
> -------
> 
> Pending unit tests.
> 
> 
> Thanks,
> 
> Mehant Baid
> 
>


Re: Review Request 36809: DRILL-3121: Hive partition pruning

Posted by Mehant Baid <ba...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36809/
-----------------------------------------------------------

(Updated July 31, 2015, 3:32 a.m.)


Review request for drill and Aman Sinha.


Changes
-------

Rebased on latest master, addressed review comments.


Repository: drill-git


Description
-------

Add support for interpreter based partition pruning for hive tables. Also removes the old partition pruning logic.


Diffs (updated)
-----

  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionDescriptor.java 8307dff 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionLocation.java PRE-CREATION 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/logical/HivePushPartitionFilterIntoScan.java 6ab1a78 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDataTypeUtility.java PRE-CREATION 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveRecordReader.java 088fb74 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveStoragePlugin.java fb827cc 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveTable.java 99101cc 
  contrib/storage-hive/core/src/test/java/org/apache/drill/exec/TestHivePartitionPruning.java c846328 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DirPathBuilder.java 892e8cb 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushPartitionFilterIntoScan.java b83cedd 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/PartitionPruningUtil.java 05ccfb9 

Diff: https://reviews.apache.org/r/36809/diff/


Testing
-------

Pending unit tests.


Thanks,

Mehant Baid


Re: Review Request 36809: DRILL-3121: Hive partition pruning

Posted by Aman Sinha <as...@maprtech.com>.

> On July 30, 2015, 11:50 p.m., Aman Sinha wrote:
> > contrib/storage-hive/core/src/test/java/org/apache/drill/exec/TestHivePartitionPruning.java, line 68
> > <https://reviews.apache.org/r/36809/diff/2/?file=1023984#file1023984line68>
> >
> >     Are there any Hive PP unit tests that reference subdirectories ? Would be good to add couple of tests.
> 
> Mehant Baid wrote:
>     In testRangeFilter() and testRangeFilterWithDisjunct() column 'c' is the top level partition and column 'd' is the next level partition. Is that what you were referring to or am I missing something?

Ok, yeah if c and d represent the hierarchy of partitioning levels, it should be good enough.


- Aman


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36809/#review93660
-----------------------------------------------------------


On July 31, 2015, 3:32 a.m., Mehant Baid wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36809/
> -----------------------------------------------------------
> 
> (Updated July 31, 2015, 3:32 a.m.)
> 
> 
> Review request for drill and Aman Sinha.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> Add support for interpreter based partition pruning for hive tables. Also removes the old partition pruning logic.
> 
> 
> Diffs
> -----
> 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionDescriptor.java 8307dff 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionLocation.java PRE-CREATION 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/logical/HivePushPartitionFilterIntoScan.java 6ab1a78 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDataTypeUtility.java PRE-CREATION 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveRecordReader.java 088fb74 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveStoragePlugin.java fb827cc 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveTable.java 99101cc 
>   contrib/storage-hive/core/src/test/java/org/apache/drill/exec/TestHivePartitionPruning.java c846328 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DirPathBuilder.java 892e8cb 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushPartitionFilterIntoScan.java b83cedd 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/PartitionPruningUtil.java 05ccfb9 
> 
> Diff: https://reviews.apache.org/r/36809/diff/
> 
> 
> Testing
> -------
> 
> Pending unit tests.
> 
> 
> Thanks,
> 
> Mehant Baid
> 
>


Re: Review Request 36809: DRILL-3121: Hive partition pruning

Posted by Mehant Baid <ba...@gmail.com>.

> On July 30, 2015, 11:50 p.m., Aman Sinha wrote:
> > contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionDescriptor.java, line 104
> > <https://reviews.apache.org/r/36809/diff/2/?file=1023977#file1023977line104>
> >
> >     Do you want this to be case-sensitive comparison ?

The new files are essentially a subset of the old files so case-sensitive comparison should suffice.


> On July 30, 2015, 11:50 p.m., Aman Sinha wrote:
> > contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionLocation.java, line 38
> > <https://reviews.apache.org/r/36809/diff/2/?file=1023978#file1023978line38>
> >
> >     This will throw IOBE if supplied max nesting level is less than mostDirs.length.

This should not be the case, since in the case of hive, we get max partition hierarchy from the partition metadata itself as opposed to in the case of dfs where we set a maximum hierarchy of 10. I have added an assert.


> On July 30, 2015, 11:50 p.m., Aman Sinha wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/planner/ParquetPartitionDescriptor.java, line 89
> > <https://reviews.apache.org/r/36809/diff/2/?file=1023985#file1023985line89>
> >
> >     It wasn't clear why the Hive changes affected this.. is this based on a different patch from before ?

While creating a patch I missed this minor change, it belongs in 3503. Moved it.


> On July 30, 2015, 11:50 p.m., Aman Sinha wrote:
> > contrib/storage-hive/core/src/test/java/org/apache/drill/exec/TestHivePartitionPruning.java, line 68
> > <https://reviews.apache.org/r/36809/diff/2/?file=1023984#file1023984line68>
> >
> >     Are there any Hive PP unit tests that reference subdirectories ? Would be good to add couple of tests.

In testRangeFilter() and testRangeFilterWithDisjunct() column 'c' is the top level partition and column 'd' is the next level partition. Is that what you were referring to or am I missing something?


- Mehant


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36809/#review93660
-----------------------------------------------------------


On July 31, 2015, 3:32 a.m., Mehant Baid wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36809/
> -----------------------------------------------------------
> 
> (Updated July 31, 2015, 3:32 a.m.)
> 
> 
> Review request for drill and Aman Sinha.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> Add support for interpreter based partition pruning for hive tables. Also removes the old partition pruning logic.
> 
> 
> Diffs
> -----
> 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionDescriptor.java 8307dff 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionLocation.java PRE-CREATION 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/logical/HivePushPartitionFilterIntoScan.java 6ab1a78 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDataTypeUtility.java PRE-CREATION 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveRecordReader.java 088fb74 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveStoragePlugin.java fb827cc 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveTable.java 99101cc 
>   contrib/storage-hive/core/src/test/java/org/apache/drill/exec/TestHivePartitionPruning.java c846328 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DirPathBuilder.java 892e8cb 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushPartitionFilterIntoScan.java b83cedd 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/PartitionPruningUtil.java 05ccfb9 
> 
> Diff: https://reviews.apache.org/r/36809/diff/
> 
> 
> Testing
> -------
> 
> Pending unit tests.
> 
> 
> Thanks,
> 
> Mehant Baid
> 
>


Re: Review Request 36809: DRILL-3121: Hive partition pruning

Posted by Aman Sinha <as...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36809/#review93660
-----------------------------------------------------------



contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionDescriptor.java (line 104)
<https://reviews.apache.org/r/36809/#comment148091>

    Do you want this to be case-sensitive comparison ?



contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionLocation.java (line 24)
<https://reviews.apache.org/r/36809/#comment148071>

    Why not use a String array since that's what you are populating it with when you call mostDirs[i].substring.



contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionLocation.java (line 33)
<https://reviews.apache.org/r/36809/#comment148072>

    This prefix won't work on all supported platforms would it ?  Also, could you add javadoc on the type of partition descriptor expected in Hive - i.e  'year=2015'.



contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionLocation.java (line 38)
<https://reviews.apache.org/r/36809/#comment148073>

    This will throw IOBE if supplied max nesting level is less than mostDirs.length.



contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionLocation.java (line 43)
<https://reviews.apache.org/r/36809/#comment148075>

    Is the caller ensuring index < partitionValue.length ? Otherwise, add assert..



contrib/storage-hive/core/src/test/java/org/apache/drill/exec/TestHivePartitionPruning.java (line 68)
<https://reviews.apache.org/r/36809/#comment148092>

    Are there any Hive PP unit tests that reference subdirectories ? Would be good to add couple of tests.



exec/java-exec/src/main/java/org/apache/drill/exec/planner/ParquetPartitionDescriptor.java (line 89)
<https://reviews.apache.org/r/36809/#comment148093>

    It wasn't clear why the Hive changes affected this.. is this based on a different patch from before ?


- Aman Sinha


On July 28, 2015, 10:38 p.m., Mehant Baid wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36809/
> -----------------------------------------------------------
> 
> (Updated July 28, 2015, 10:38 p.m.)
> 
> 
> Review request for drill and Aman Sinha.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> Add support for interpreter based partition pruning for hive tables. Also removes the old partition pruning logic.
> 
> 
> Diffs
> -----
> 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionDescriptor.java 8307dff 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionLocation.java PRE-CREATION 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/logical/HivePushPartitionFilterIntoScan.java 6ab1a78 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDataTypeUtility.java PRE-CREATION 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveRecordReader.java 088fb74 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveStoragePlugin.java fb827cc 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveTable.java 99101cc 
>   contrib/storage-hive/core/src/test/java/org/apache/drill/exec/TestHivePartitionPruning.java c846328 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/ParquetPartitionDescriptor.java 127e70a 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DirPathBuilder.java 892e8cb 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushPartitionFilterIntoScan.java b83cedd 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/PartitionPruningUtil.java 05ccfb9 
> 
> Diff: https://reviews.apache.org/r/36809/diff/
> 
> 
> Testing
> -------
> 
> Pending unit tests.
> 
> 
> Thanks,
> 
> Mehant Baid
> 
>


Re: Review Request 36809: DRILL-3121: Hive partition pruning

Posted by Mehant Baid <ba...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36809/
-----------------------------------------------------------

(Updated July 28, 2015, 10:38 p.m.)


Review request for drill and Aman Sinha.


Changes
-------

Enabled some of the more complex partition pruning tests that were disabled, added unit tests for range based filter predicates.


Repository: drill-git


Description
-------

Add support for interpreter based partition pruning for hive tables. Also removes the old partition pruning logic.


Diffs (updated)
-----

  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionDescriptor.java 8307dff 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionLocation.java PRE-CREATION 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/logical/HivePushPartitionFilterIntoScan.java 6ab1a78 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDataTypeUtility.java PRE-CREATION 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveRecordReader.java 088fb74 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveStoragePlugin.java fb827cc 
  contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveTable.java 99101cc 
  contrib/storage-hive/core/src/test/java/org/apache/drill/exec/TestHivePartitionPruning.java c846328 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/ParquetPartitionDescriptor.java 127e70a 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DirPathBuilder.java 892e8cb 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushPartitionFilterIntoScan.java b83cedd 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/PartitionPruningUtil.java 05ccfb9 

Diff: https://reviews.apache.org/r/36809/diff/


Testing
-------

Pending unit tests.


Thanks,

Mehant Baid