Posted to dev@hive.apache.org by Syed Albiz <s....@gmail.com> on 2011/06/06 23:37:38 UTC

Review Request: HIVE-2036: Update bitmap indexes for automatic usage

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
-----------------------------------------------------------

Review request for hive and John Sichi.


Summary
-------

Add support for generating index queries to enable automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers so that they accept queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only AND predicates are supported right now, but it should be possible to extend the BitmapQuery interface defined in this patch to support OR predicates as well. Currently benchmarking these changes on a test cluster.
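
As a rough illustration of what "AND predicates" means for bitmap indexes, here is a minimal sketch that intersects two per-value row bitmaps. It uses java.util.BitSet purely for illustration; the class and method names (BitmapAndSketch, and, bitmapOf) are hypothetical and are not taken from this patch or from the BitmapQuery classes it adds.

```java
import java.util.BitSet;

// Hypothetical sketch, not code from the patch: each bitmap marks the rows
// that match one predicate, and AND keeps only rows that match both.
class BitmapAndSketch {
    /** Returns the rows set in both bitmaps, leaving the inputs unmodified. */
    static BitSet and(BitSet left, BitSet right) {
        BitSet result = (BitSet) left.clone();
        result.and(right);
        return result;
    }

    /** Builds a bitmap with the given row offsets set. */
    static BitSet bitmapOf(int... rows) {
        BitSet bits = new BitSet();
        for (int row : rows) {
            bits.set(row);
        }
        return bits;
    }
}
```

An OR predicate would follow the same shape with BitSet.or in place of BitSet.and, which is presumably why extending the query interface is expected to be straightforward.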


This addresses bug HIVE-2036.
    https://issues.apache.org/jira/browse/HIVE-2036


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/HiveBitmapIndexInputFormat.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
  ql/src/test/queries/clientpositive/index_bitmap3.q 508eb94 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
-------

Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.


Thanks,

Syed


Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

Posted by John Sichi <js...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review782
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java
<https://reviews.apache.org/r/857/#comment1680>

    It's preferable to apply the unparsing right at the point of SQL rendering.


- John


On 2011-06-08 00:22:37, Syed Albiz wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/857/
> -----------------------------------------------------------
> 
> (Updated 2011-06-08 00:22:37)
> 
> 
> Review request for hive and John Sichi.
> 
> 
> Summary
> -------
> 
> Add support for generating index queries to enable automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers so that they accept queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only AND predicates are supported right now, but it should be possible to extend the BitmapQuery interface defined in this patch to support OR predicates as well. Currently benchmarking these changes on a test cluster.
> 
> 
> This addresses bug HIVE-2036.
>     https://issues.apache.org/jira/browse/HIVE-2036
> 
> 
> Diffs
> -----
> 
>   ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
>   ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
>   ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
>   ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
>   ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
>   ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/857/diff
> 
> 
> Testing
> -------
> 
> Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.
> 
> 
> Thanks,
> 
> Syed
> 
>


Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

Posted by John Sichi <js...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review785
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java
<https://reviews.apache.org/r/857/#comment1681>

    I think that should be a period instead of a comma in "indexes, if"



ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java
<https://reviews.apache.org/r/857/#comment1682>

    How exactly are they combined?  This Javadoc should be written as a contract between the optimizer and the index plugin author, so that the author knows exactly how to interpret the inputs and also what is going to be done with the output.
    



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
<https://reviews.apache.org/r/857/#comment1683>

    Why do you need to use toArray here?  indexCols.keySet is already a collection.



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
<https://reviews.apache.org/r/857/#comment1684>

    Why are you converting the search conditions back into predicate form here?  Wouldn't it be easier to analyze them as search conditions?
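
To illustrate the toArray comment above: a Map's keySet() is already an iterable collection, so it can be looped over directly without copying it into an array first. The sketch below assumes indexCols is a map keyed by index name; that type, and the names KeySetSketch and indexNames, are assumptions made for illustration rather than details taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the real type of indexCols in BitmapIndexHandler is
// not shown in this thread, so a Map<String, List<String>> is assumed.
class KeySetSketch {
    /** Collects the keys by iterating the key set directly; no toArray copy is needed. */
    static List<String> indexNames(Map<String, List<String>> indexCols) {
        List<String> names = new ArrayList<String>();
        for (String name : indexCols.keySet()) {
            names.add(name);
        }
        return names;
    }
}
```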


- John


On 2011-06-08 00:22:37, Syed Albiz wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/857/
> -----------------------------------------------------------
> 
> (Updated 2011-06-08 00:22:37)
> 
> 
> Review request for hive and John Sichi.
> 
> 
> Summary
> -------
> 
> Add support for generating index queries to enable automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers so that they accept queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only AND predicates are supported right now, but it should be possible to extend the BitmapQuery interface defined in this patch to support OR predicates as well. Currently benchmarking these changes on a test cluster.
> 
> 
> This addresses bug HIVE-2036.
>     https://issues.apache.org/jira/browse/HIVE-2036
> 
> 
> Diffs
> -----
> 
>   ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
>   ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
>   ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
>   ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
>   ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
>   ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/857/diff
> 
> 
> Testing
> -------
> 
> Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.
> 
> 
> Thanks,
> 
> Syed
> 
>


Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

Posted by John Sichi <js...@fb.com>.

> On 2011-06-13 22:57:46, John Sichi wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java, line 114
> > <https://reviews.apache.org/r/857/diff/4/?file=20984#file20984line114>
> >
> >     I don't think this should be necessary.  We just want to propagate the partition column predicate (whatever it is) from the base table query to the index table query; partition pruning on the index table query will do the rest of the work.
> >     
> >     In other words, if the original query had
> >     
> >     part_key=<whatever>
> >     
> >     we want to preserve that on the index table query.  That's what the code is already supposed to be doing before your change; was it not working?
> >
> 
> Syed Albiz wrote:
>     This code is to prevent automatic usage from kicking in if the index has not been built on the partition specified in the partition predicate (e.g. if the index has only been built on partition ds=foo and the query is "select key from src where ds=bar", we do not want to execute an index query). It seems like adding a test for bitmaps specifically, to mirror index_auto_unused.q (which is where this functionality is tested for compact indexes), would be a good idea.

The logic for making sure that the necessary index partitions exist is already present in IndexWhereProcessor.checkPartitionsCoveredByIndex.  If that's not working, we should fix it; it should not be necessary to change the predicate analyzer at all.
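
The coverage rule being discussed can be sketched as a simple subset test: the index is usable only if every partition the query reads also has a built index partition. This is not the implementation of IndexWhereProcessor.checkPartitionsCoveredByIndex; the class name, the method name, and the representation of partitions as plain strings such as "ds=foo" are all assumptions made for illustration.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the coverage rule, with partitions modeled as strings.
class PartitionCoverageSketch {
    /** True only if every partition the query reads also has a built index partition. */
    static boolean coveredByIndex(Collection<String> queriedParts,
                                  Collection<String> indexedParts) {
        Set<String> indexed = new HashSet<String>(indexedParts);
        return indexed.containsAll(queriedParts);
    }
}
```

With the example from this thread, an index built only on ds=foo does not cover a query on ds=bar, so the check returns false and the optimizer would fall back to the unindexed plan.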


- John


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review825
-----------------------------------------------------------


On 2011-06-14 04:05:43, Syed Albiz wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/857/
> -----------------------------------------------------------
> 
> (Updated 2011-06-14 04:05:43)
> 
> 
> Review request for hive and John Sichi.
> 
> 
> Summary
> -------
> 
> Add support for generating index queries to enable automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers so that they accept queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only AND predicates are supported right now, but it should be possible to extend the BitmapQuery interface defined in this patch to support OR predicates as well. Currently benchmarking these changes on a test cluster.
> 
> 
> This addresses bug HIVE-2036.
>     https://issues.apache.org/jira/browse/HIVE-2036
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
>   ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
>   ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
>   ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
>   ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
>   ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
>   ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/857/diff
> 
> 
> Testing
> -------
> 
> Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.
> 
> 
> Thanks,
> 
> Syed
> 
>


Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

Posted by Syed Albiz <s....@gmail.com>.

> On 2011-06-13 22:57:46, John Sichi wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java, line 114
> > <https://reviews.apache.org/r/857/diff/4/?file=20984#file20984line114>
> >
> >     I don't think this should be necessary.  We just want to propagate the partition column predicate (whatever it is) from the base table query to the index table query; partition pruning on the index table query will do the rest of the work.
> >     
> >     In other words, if the original query had
> >     
> >     part_key=<whatever>
> >     
> >     we want to preserve that on the index table query.  That's what the code is already supposed to be doing before your change; was it not working?
> >

This code is to prevent automatic usage from kicking in if the index has not been built on the partition specified in the partition predicate (e.g. if the index has only been built on partition ds=foo and the query is "select key from src where ds=bar", we do not want to execute an index query). It seems like adding a test for bitmaps specifically, to mirror index_auto_unused.q (which is where this functionality is tested for compact indexes), would be a good idea.


- Syed


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review825
-----------------------------------------------------------


On 2011-06-11 19:05:42, Syed Albiz wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/857/
> -----------------------------------------------------------
> 
> (Updated 2011-06-11 19:05:42)
> 
> 
> Review request for hive and John Sichi.
> 
> 
> Summary
> -------
> 
> Add support for generating index queries to enable automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers so that they accept queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only AND predicates are supported right now, but it should be possible to extend the BitmapQuery interface defined in this patch to support OR predicates as well. Currently benchmarking these changes on a test cluster.
> 
> 
> This addresses bug HIVE-2036.
>     https://issues.apache.org/jira/browse/HIVE-2036
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
>   ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
>   ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
>   ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
>   ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
>   ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
>   ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/857/diff
> 
> 
> Testing
> -------
> 
> Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.
> 
> 
> Thanks,
> 
> Syed
> 
>


Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

Posted by John Sichi <js...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review825
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java
<https://reviews.apache.org/r/857/#comment1790>

    I don't think this should be necessary.  We just want to propagate the partition column predicate (whatever it is) from the base table query to the index table query; partition pruning on the index table query will do the rest of the work.
    
    In other words, if the original query had
    
    part_key=<whatever>
    
    we want to preserve that on the index table query.  That's what the code is already supposed to be doing before your change; was it not working?
    


- John


On 2011-06-11 19:05:42, Syed Albiz wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/857/
> -----------------------------------------------------------
> 
> (Updated 2011-06-11 19:05:42)
> 
> 
> Review request for hive and John Sichi.
> 
> 
> Summary
> -------
> 
> Add support for generating index queries to enable automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers so that they accept queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only AND predicates are supported right now, but it should be possible to extend the BitmapQuery interface defined in this patch to support OR predicates as well. Currently benchmarking these changes on a test cluster.
> 
> 
> This addresses bug HIVE-2036.
>     https://issues.apache.org/jira/browse/HIVE-2036
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
>   ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
>   ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
>   ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
>   ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
>   ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
>   ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/857/diff
> 
> 
> Testing
> -------
> 
> Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.
> 
> 
> Thanks,
> 
> Syed
> 
>


Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

Posted by John Sichi <js...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review826
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
<https://reviews.apache.org/r/857/#comment1792>

    Don't bother with empty return statements.


- John


On 2011-06-11 19:05:42, Syed Albiz wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/857/
> -----------------------------------------------------------
> 
> (Updated 2011-06-11 19:05:42)
> 
> 
> Review request for hive and John Sichi.
> 
> 
> Summary
> -------
> 
> Add support for generating index queries to enable automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers so that they accept queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only AND predicates are supported right now, but it should be possible to extend the BitmapQuery interface defined in this patch to support OR predicates as well. Currently benchmarking these changes on a test cluster.
> 
> 
> This addresses bug HIVE-2036.
>     https://issues.apache.org/jira/browse/HIVE-2036
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
>   ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
>   ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
>   ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
>   ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
>   ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
>   ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/857/diff
> 
> 
> Testing
> -------
> 
> Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.
> 
> 
> Thanks,
> 
> Syed
> 
>


Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

Posted by John Sichi <js...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review836
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java
<https://reviews.apache.org/r/857/#comment1806>

    Slight rephrasing suggested:
    
    "If multiple indexes are provided, it is up to the handler to decide whether to use none, one, some, or all of them.  The supplied predicate may reference any of the columns from any of the indexes.  If the handler decides to use more than one index, then it is responsible for generating tasks to combine their search results (e.g. via a JOIN)."



ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java
<https://reviews.apache.org/r/857/#comment1805>

    This should be gone.



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
<https://reviews.apache.org/r/857/#comment1807>

    Delete commented-out code, or convert it into a TODO with a corresponding JIRA issue link.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
<https://reviews.apache.org/r/857/#comment1808>

    Could you explain more about what's going on here?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
<https://reviews.apache.org/r/857/#comment1817>

    Only do indexes.get(0) once.
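
The indexes.get(0) suggestion amounts to hoisting the lookup into a local variable and reusing it. A minimal sketch follows; the element type (String) and the names FirstIndexSketch and describeFirst are assumptions for illustration and do not come from IndexWhereProcessor.

```java
import java.util.List;

// Hypothetical sketch: fetch the first element once and reuse the local.
class FirstIndexSketch {
    /** Describes the first index as "name (length)", reading indexes.get(0) only once. */
    static String describeFirst(List<String> indexes) {
        String first = indexes.get(0); // single lookup, reused below
        return first + " (" + first.length() + ")";
    }
}
```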


- John


On 2011-06-14 04:05:43, Syed Albiz wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/857/
> -----------------------------------------------------------
> 
> (Updated 2011-06-14 04:05:43)
> 
> 
> Review request for hive and John Sichi.
> 
> 
> Summary
> -------
> 
> Add support for generating index queries to enable automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers so that they accept queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only AND predicates are supported right now, but it should be possible to extend the BitmapQuery interface defined in this patch to support OR predicates as well. Currently benchmarking these changes on a test cluster.
> 
> 
> This addresses bug HIVE-2036.
>     https://issues.apache.org/jira/browse/HIVE-2036
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
>   ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
>   ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
>   ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
>   ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
>   ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
>   ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/857/diff
> 
> 
> Testing
> -------
> 
> Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.
> 
> 
> Thanks,
> 
> Syed
> 
>


Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

Posted by Syed Albiz <s....@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
-----------------------------------------------------------

(Updated 2011-06-17 22:34:18.950303)


Review request for hive and John Sichi.


Changes
-------

Added comments; the filter expression is now pushed into the TableScan (TS) operator only when automatic indexing is turned on.


Summary
-------

Add support for generating index queries to enable automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers so that they accept queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only AND predicates are supported right now, but it should be possible to extend the BitmapQuery interface defined in this patch to support OR predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
    https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 95fef73 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java d22654b 
  ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
  ql/src/test/results/clientpositive/index_auto.q.out 713bb40 
  ql/src/test/results/clientpositive/index_auto_file_format.q.out 894a556 
  ql/src/test/results/clientpositive/index_auto_multiple.q.out 27092dc 
  ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
  ql/src/test/results/clientpositive/index_auto_unused.q.out 8a1eda5 
  ql/src/test/results/clientpositive/index_bitmap3.q.out dadfa77 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
-------

Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.


Thanks,

Syed


Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

Posted by John Sichi <js...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review856
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java
<https://reviews.apache.org/r/857/#comment1865>

    Need to update this comment now, explaining why we don't even look for the filter operator any more.


- John


On 2011-06-15 23:46:24, Syed Albiz wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/857/
> -----------------------------------------------------------
> 
> (Updated 2011-06-15 23:46:24)
> 
> 
> Review request for hive and John Sichi.
> 
> 
> Summary
> -------
> 
> Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.
> 
> 
> This addresses bug HIVE-2036.
>     https://issues.apache.org/jira/browse/HIVE-2036
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
>   ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
>   ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 95fef73 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java d22654b 
>   ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
>   ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
>   ql/src/test/results/clientpositive/index_auto.q.out 713bb40 
>   ql/src/test/results/clientpositive/index_auto_file_format.q.out 894a556 
>   ql/src/test/results/clientpositive/index_auto_multiple.q.out 27092dc 
>   ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
>   ql/src/test/results/clientpositive/index_auto_unused.q.out 8a1eda5 
>   ql/src/test/results/clientpositive/index_bitmap3.q.out dadfa77 
>   ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/857/diff
> 
> 
> Testing
> -------
> 
> Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.
> 
> 
> Thanks,
> 
> Syed
> 
>


Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

Posted by Syed Albiz <s....@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
-----------------------------------------------------------

(Updated 2011-06-15 23:46:24.176586)


Review request for hive and John Sichi.


Changes
-------

Used setFilterExpr on the TableScanDesc to propagate the complete original predicate, because the PartitionConditionRemover was removing the partition predicate from the FilterOperator.


Summary
-------

Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
    https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 95fef73 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java d22654b 
  ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
  ql/src/test/results/clientpositive/index_auto.q.out 713bb40 
  ql/src/test/results/clientpositive/index_auto_file_format.q.out 894a556 
  ql/src/test/results/clientpositive/index_auto_multiple.q.out 27092dc 
  ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
  ql/src/test/results/clientpositive/index_auto_unused.q.out 8a1eda5 
  ql/src/test/results/clientpositive/index_bitmap3.q.out dadfa77 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
-------

Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.


Thanks,

Syed


Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

Posted by Syed Albiz <s....@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
-----------------------------------------------------------

(Updated 2011-06-14 21:26:21.276789)


Review request for hive and John Sichi.


Changes
-------

Addressed comments. Added more comments explaining why we use indexes.get(0) in IndexWhereProcessor, as that seemed a bit unclear.


Summary
-------

Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
    https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
  ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
  ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
-------

Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.


Thanks,

Syed


Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

Posted by Syed Albiz <s....@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
-----------------------------------------------------------

(Updated 2011-06-14 04:05:43.158797)


Review request for hive and John Sichi.


Changes
-------

Removed the redundant check on the partition predicate (which is already done in IndexWhereProcessor). The check was causing problems because the query generated to build the index was itself run through the optimizer, and at that stage the optimizer thought the index was already built and had the partition. A simpler solution is to just disable index query optimization when building indexes.


Summary
-------

Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
    https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
  ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
  ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
-------

Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.


Thanks,

Syed


Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

Posted by Syed Albiz <s....@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
-----------------------------------------------------------

(Updated 2011-06-11 19:05:42.241706)


Review request for hive and John Sichi.


Changes
-------

Fix index_auto_unused.q testcase by adding a check for partitions in the index and ensuring that only partitions actually in the index are used to compute index predicates.


Summary
-------

Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
    https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
  ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
  ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
-------

Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.


Thanks,

Syed


Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

Posted by Syed Albiz <s....@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
-----------------------------------------------------------

(Updated 2011-06-10 06:35:32.125295)


Review request for hive and John Sichi.


Changes
-------

Based on a discussion with yongqian, I re-implemented the predicate decomposition in two steps: first computing the overall residual predicate from the union of all columns in the available indexes, and then computing the predicates to apply to each index individually. Additionally, I extended the functionality to pass partition columns in to allowColumnNames, and added/extended the test cases to check that partition predicates are propagated correctly. This required adding a check in IndexWhereProcessor.java that the correct FilterOperator is passed to the process(...) method (apparently a duplicate FilterOperator that does not have the entire predicate gets created).
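
The two-step decomposition described above can be sketched roughly as follows. This is a simplified illustration, not the code in the patch: the class PredicateDecompositionSketch, the Conjunct type, and the index names are hypothetical, and a predicate is modelled as a flat list of AND-ed conditions that each reference a single column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names come from the patch.
public class PredicateDecompositionSketch {

  /** One AND-ed condition, e.g. "key = 86", tagged with the column it uses. */
  public static final class Conjunct {
    final String column;
    final String text;

    public Conjunct(String column, String text) {
      this.column = column;
      this.text = text;
    }
  }

  /** Step 1: conditions that no index (and no partition column) can serve. */
  public static List<String> residual(List<Conjunct> predicate,
      Map<String, Set<String>> indexColumns, Set<String> partitionColumns) {
    Set<String> covered = new HashSet<>(partitionColumns);
    for (Set<String> columns : indexColumns.values()) {
      covered.addAll(columns); // union of all indexed columns
    }
    List<String> result = new ArrayList<>();
    for (Conjunct c : predicate) {
      if (!covered.contains(c.column)) {
        result.add(c.text);
      }
    }
    return result;
  }

  /** Step 2: for each index, its own columns' conditions plus partition conditions. */
  public static Map<String, List<String>> perIndex(List<Conjunct> predicate,
      Map<String, Set<String>> indexColumns, Set<String> partitionColumns) {
    Map<String, List<String>> result = new LinkedHashMap<>();
    for (Map.Entry<String, Set<String>> index : indexColumns.entrySet()) {
      List<String> pushed = new ArrayList<>();
      for (Conjunct c : predicate) {
        if (index.getValue().contains(c.column) || partitionColumns.contains(c.column)) {
          pushed.add(c.text);
        }
      }
      result.put(index.getKey(), pushed);
    }
    return result;
  }

  public static void main(String[] args) {
    List<Conjunct> predicate = Arrays.asList(
        new Conjunct("key", "key = 86"),
        new Conjunct("value", "value = 'val_86'"),
        new Conjunct("ds", "ds = '2008-04-08'"),
        new Conjunct("other", "other > 1"));
    Map<String, Set<String>> indexColumns = new LinkedHashMap<>();
    indexColumns.put("idx_key", new HashSet<>(Arrays.asList("key")));
    indexColumns.put("idx_value", new HashSet<>(Arrays.asList("value")));
    Set<String> partitionColumns = new HashSet<>(Arrays.asList("ds"));

    System.out.println(residual(predicate, indexColumns, partitionColumns));
    System.out.println(perIndex(predicate, indexColumns, partitionColumns));
  }
}
```

In this sketch the partition condition is repeated in every per-index list, while a condition on a column with no index stays behind as the residual to be evaluated against the base table.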


Summary
-------

Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
    https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
  ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
  ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
-------

Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.


Thanks,

Syed


Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

Posted by Syed Albiz <s....@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
-----------------------------------------------------------

(Updated 2011-06-08 00:22:37.292935)


Review request for hive and John Sichi.


Changes
-------

Addressed comments. Still does not propagate partition predicates to every single index sub-query, but it does ensure that predicates are only applied to indexes for which there are matching columns. After looking at the behavior of CompactIndexHandler on partitioned tables (and in testcase index_auto_partitioned.q) I can't quite see how the CompactIndexHandler identifies and propagates partitioning predicates correctly.


Summary
-------

Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
    https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-----

  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
-------

Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.


Thanks,

Syed


Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

Posted by Syed Albiz <s....@gmail.com>.

> On 2011-06-07 18:30:15, John Sichi wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java, line 54
> > <https://reviews.apache.org/r/857/diff/1/?file=20596#file20596line54>
> >
> >     Use HiveUtils.unparseIdentifier

HiveUtils.unparseIdentifier is used on the argument passed in through to the constructor.


> On 2011-06-07 18:30:15, John Sichi wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/HiveBitmapIndexInputFormat.java, line 25
> > <https://reviews.apache.org/r/857/diff/1/?file=20599#file20599line25>
> >
> >     Why do we need this class at all?  The superclass already uses hive.index.blockfilter.file by default.
> >

Removed in the next diff.


- Syed


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review773
-----------------------------------------------------------


On 2011-06-08 00:22:37, Syed Albiz wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/857/
> -----------------------------------------------------------
> 
> (Updated 2011-06-08 00:22:37)
> 
> 
> Review request for hive and John Sichi.
> 
> 
> Summary
> -------
> 
> Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.
> 
> 
> This addresses bug HIVE-2036.
>     https://issues.apache.org/jira/browse/HIVE-2036
> 
> 
> Diffs
> -----
> 
>   ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
>   ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
>   ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
>   ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
>   ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
>   ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/857/diff
> 
> 
> Testing
> -------
> 
> Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.
> 
> 
> Thanks,
> 
> Syed
> 
>


Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

Posted by John Sichi <js...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review773
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java
<https://reviews.apache.org/r/857/#comment1666>

    Update Javadoc and param name, including an explanation of what the handler is supposed to do when multiple indexes are passed in.



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
<https://reviews.apache.org/r/857/#comment1675>

    I'm confused by the logic here.  You are throwing together all of the columns for all of the indexes, but we need to keep them segregated, don't we?  Each subquery should only contain references to the columns relevant to the corresponding index.
    
    (But the partitioning predicates need to be applied to each index.)
    



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java
<https://reviews.apache.org/r/857/#comment1668>

    Why is this public instead of private?



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java
<https://reviews.apache.org/r/857/#comment1667>

    Use HiveUtils.unparseIdentifier



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/HiveBitmapIndexInputFormat.java
<https://reviews.apache.org/r/857/#comment1669>

    Why do we need this class at all?  The superclass already uses hive.index.blockfilter.file by default.
    



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
<https://reviews.apache.org/r/857/#comment1672>

    Seems like we should only be looking at the indexes on the table accessed by this table scan.  (This comment is retroactive to the original version of the file.)
    



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
<https://reviews.apache.org/r/857/#comment1673>

    Seems like the costing comment below applies to this too.



ql/src/test/queries/clientpositive/index_bitmap3.q
<https://reviews.apache.org/r/857/#comment1670>

    Why do we need this setting at all?  (I'm not sure why it was there in the original version of the file.)


- John


On 2011-06-06 21:37:38, Syed Albiz wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/857/
> -----------------------------------------------------------
> 
> (Updated 2011-06-06 21:37:38)
> 
> 
> Review request for hive and John Sichi.
> 
> 
> Summary
> -------
> 
> Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.
> 
> 
> This addresses bug HIVE-2036.
>     https://issues.apache.org/jira/browse/HIVE-2036
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
>   ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/HiveBitmapIndexInputFormat.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
>   ql/src/test/queries/clientpositive/index_bitmap3.q 508eb94 
>   ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
>   ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/857/diff
> 
> 
> Testing
> -------
> 
> Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.
> 
> 
> Thanks,
> 
> Syed
> 
>