Posted to dev@hive.apache.org by "Russell Melick (JIRA)" <ji...@apache.org> on 2011/03/09 07:48:00 UTC

[jira] Created: (HIVE-2036) Update bitmap indexes for automatic usage

Update bitmap indexes for automatic usage
-----------------------------------------

                 Key: HIVE-2036
                 URL: https://issues.apache.org/jira/browse/HIVE-2036
             Project: Hive
          Issue Type: Improvement
          Components: Indexing
    Affects Versions: 0.8.0
            Reporter: Russell Melick
            Assignee: Jeffrey Lym


HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap index support.  The bitmap code will need to be extended after it is committed to enable automatic use of indexing.  Most work will be focused in the BitmapIndexHandler, which needs to generate the re-entrant QL index query.  There may also be significant work in the IndexPredicateAnalyzer to support predicates with OR's, instead of just AND's as it is currently.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045770#comment-13045770 ] 

John Sichi commented on HIVE-2036:
----------------------------------

I'll take a look at the new patch tomorrow.  index_auto_partitioned.q does not actually include a predicate on the partitioning column, so it should be enhanced to do that.

The way it works for the compact index handler is that if we have a predicate like

{noformat}
WHERE part_col = 1 AND index_col = 2 AND some_other_col = 3
{noformat}

then it should generate

{noformat}
WHERE part_col = 1 AND index_col = 2
{noformat}

in the internal query against the index table.  That's the reason that getIndexPredicateAnalyzer walks through all the partitions and adds the predicate columns via allowColumnName.  (The way it does it isn't so great since it repeats it for each partition, when in fact one partition should be good enough.)
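
As a rough illustration of the pruning described above (not the actual IndexPredicateAnalyzer code), the idea is to keep only the conjuncts whose column has been allowed and drop the rest. The class, the Conjunct type, and renderIndexPredicate are hypothetical names invented for this sketch, and it only handles simple "column = literal" terms joined by AND.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; these names are not part of the Hive API.
class IndexPredicateSketch {

    // One "column = literal" term of an AND-only WHERE clause.
    static final class Conjunct {
        final String column;
        final String literal;

        Conjunct(String column, String literal) {
            this.column = column;
            this.literal = literal;
        }
    }

    // Keep only the conjuncts on allowed columns (partition columns and
    // indexed columns) and render them as the index-table WHERE clause.
    static String renderIndexPredicate(List<Conjunct> conjuncts, Set<String> allowedColumns) {
        List<String> kept = new ArrayList<>();
        for (Conjunct c : conjuncts) {
            if (allowedColumns.contains(c.column)) {
                kept.add(c.column + " = " + c.literal);
            }
        }
        return kept.isEmpty() ? "" : "WHERE " + String.join(" AND ", kept);
    }

    public static void main(String[] args) {
        List<Conjunct> original = Arrays.asList(
            new Conjunct("part_col", "1"),
            new Conjunct("index_col", "2"),
            new Conjunct("some_other_col", "3"));
        Set<String> allowed = new HashSet<>(Arrays.asList("part_col", "index_col"));
        // Prints: WHERE part_col = 1 AND index_col = 2
        System.out.println(renderIndexPredicate(original, allowed));
    }
}
```

The dropped conjunct (some_other_col = 3) is still evaluated by the base-table query; it just cannot help narrow the index lookup.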


> Update bitmap indexes for automatic usage
> -----------------------------------------
>
>                 Key: HIVE-2036
>                 URL: https://issues.apache.org/jira/browse/HIVE-2036
>             Project: Hive
>          Issue Type: Improvement
>          Components: Indexing
>    Affects Versions: 0.8.0
>            Reporter: Russell Melick
>            Assignee: Syed S. Albiz
>         Attachments: HIVE-2036.1.patch
>
>
> HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap index support.  The bitmap code will need to be extended after it is committed to enable automatic use of indexing.  Most work will be focused in the BitmapIndexHandler, which needs to generate the re-entrant QL index query.  There may also be significant work in the IndexPredicateAnalyzer to support predicates with OR's, instead of just AND's as it is currently.


[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045717#comment-13045717 ] 

jiraposter@reviews.apache.org commented on HIVE-2036:
-----------------------------------------------------



bq.  On 2011-06-07 18:30:15, John Sichi wrote:
bq.  > ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java, line 54
bq.  > <https://reviews.apache.org/r/857/diff/1/?file=20596#file20596line54>
bq.  >
bq.  >     Use HiveUtils.unparseIdentifier

HiveUtils.unparseIdentifier is used on the argument passed in through to the constructor.


bq.  On 2011-06-07 18:30:15, John Sichi wrote:
bq.  > ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/HiveBitmapIndexInputFormat.java, line 25
bq.  > <https://reviews.apache.org/r/857/diff/1/?file=20599#file20599line25>
bq.  >
bq.  >     Why do we need this class at all?  The superclass already uses hive.index.blockfilter.file by default.
bq.  >

removed in next diff


- Syed


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review773
-----------------------------------------------------------


On 2011-06-08 00:22:37, Syed Albiz wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/857/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-06-08 00:22:37)
bq.  
bq.  
bq.  Review request for hive and John Sichi.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.
bq.  
bq.  
bq.  This addresses bug HIVE-2036.
bq.      https://issues.apache.org/jira/browse/HIVE-2036
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
bq.    ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
bq.    ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
bq.    ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
bq.    ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/857/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Syed
bq.  
bq.




[jira] [Assigned] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "Russell Melick (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Russell Melick reassigned HIVE-2036:
------------------------------------

    Assignee: Syed S. Albiz  (was: Jeffrey Lym)

> Update bitmap indexes for automatic usage
> -----------------------------------------
>
>                 Key: HIVE-2036
>                 URL: https://issues.apache.org/jira/browse/HIVE-2036
>             Project: Hive
>          Issue Type: Improvement
>          Components: Indexing
>    Affects Versions: 0.8.0
>            Reporter: Russell Melick
>            Assignee: Syed S. Albiz
>
> HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap index support.  The bitmap code will need to be extended after it is committed to enable automatic use of indexing.  Most work will be focused in the BitmapIndexHandler, which needs to generate the re-entrant QL index query.  There may also be significant work in the IndexPredicateAnalyzer to support predicates with OR's, instead of just AND's as it is currently.


[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045772#comment-13045772 ] 

jiraposter@reviews.apache.org commented on HIVE-2036:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review782
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java
<https://reviews.apache.org/r/857/#comment1680>

    It's preferable to apply the unparsing right at the point of SQL rendering.
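
A minimal sketch of what "unparsing at the point of SQL rendering" could look like, under stated assumptions: the class name, its fields, and the SELECT shape are invented for illustration and are not taken from BitmapInnerQuery, and quoteIdentifier is a local stand-in for HiveUtils.unparseIdentifier that simply wraps the name in backticks.

```java
// Hypothetical sketch; not the BitmapInnerQuery implementation.
class InnerQuerySketch {
    private final String tableName;  // stored raw, exactly as passed in
    private final String predicate;

    InnerQuerySketch(String tableName, String predicate) {
        this.tableName = tableName;
        this.predicate = predicate;
    }

    // Stand-in for HiveUtils.unparseIdentifier (assumed here to backtick-quote).
    static String quoteIdentifier(String identifier) {
        return "`" + identifier + "`";
    }

    // Quoting happens here, where the SQL text is produced, so callers
    // cannot forget it and cannot accidentally quote the name twice.
    String toSql() {
        return "SELECT * FROM " + quoteIdentifier(tableName) + " WHERE " + predicate;
    }

    public static void main(String[] args) {
        System.out.println(new InnerQuerySketch("idx_table", "key = 0").toSql());
    }
}
```

Keeping the raw name in the object and quoting only in toSql() means the constructor accepts the same string no matter who calls it.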


- John






[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049406#comment-13049406 ] 

jiraposter@reviews.apache.org commented on HIVE-2036:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review836
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java
<https://reviews.apache.org/r/857/#comment1806>

    Slight rephrasing suggested:
    
    "If multiple indexes are provided, it is up to handler to decide whether to use none, one, some, or all of them.  The supplied predicate may reference any of the columns from any of the indexes.  If the handler decides to use more than one index, then it is responsible for generating tasks to combine their search results (e.g. via a JOIN)."



ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java
<https://reviews.apache.org/r/857/#comment1805>

    This should be gone.



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
<https://reviews.apache.org/r/857/#comment1807>

    Delete commented-out code, or convert it into a TODO with a corresponding JIRA issue link.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
<https://reviews.apache.org/r/857/#comment1808>

    Could you explain more about what's going on here?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
<https://reviews.apache.org/r/857/#comment1817>

    Only do indexes.get(0) once.


- John


On 2011-06-14 04:05:43, Syed Albiz wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/857/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-06-14 04:05:43)
bq.  
bq.  
bq.  Review request for hive and John Sichi.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.
bq.  
bq.  
bq.  This addresses bug HIVE-2036.
bq.      https://issues.apache.org/jira/browse/HIVE-2036
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
bq.    ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
bq.    ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
bq.    ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
bq.    ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
bq.    ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
bq.    ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
bq.    ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/857/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Syed
bq.  
bq.



> Update bitmap indexes for automatic usage
> -----------------------------------------
>
>                 Key: HIVE-2036
>                 URL: https://issues.apache.org/jira/browse/HIVE-2036
>             Project: Hive
>          Issue Type: Improvement
>          Components: Indexing
>    Affects Versions: 0.8.0
>            Reporter: Russell Melick
>            Assignee: Syed S. Albiz
>         Attachments: HIVE-2036.1.patch, HIVE-2036.3.patch
>
>
> HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap index support.  The bitmap code will need to be extended after it is committed to enable automatic use of indexing.  Most work will be focused in the BitmapIndexHandler, which needs to generate the re-entrant QL index query.  There may also be significant work in the IndexPredicateAnalyzer to support predicates with OR's, instead of just AND's as it is currently.


[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054058#comment-13054058 ] 

Hudson commented on HIVE-2036:
------------------------------

Integrated in Hive-trunk-h0.21 #790 (See [https://builds.apache.org/job/Hive-trunk-h0.21/790/])
    

> Update bitmap indexes for automatic usage
> -----------------------------------------
>
>                 Key: HIVE-2036
>                 URL: https://issues.apache.org/jira/browse/HIVE-2036
>             Project: Hive
>          Issue Type: Improvement
>          Components: Indexing
>    Affects Versions: 0.8.0
>            Reporter: Russell Melick
>            Assignee: Syed S. Albiz
>             Fix For: 0.8.0
>
>         Attachments: HIVE-2036.1.patch, HIVE-2036.3.patch, HIVE-2036.8.patch
>
>
> HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap index support.  The bitmap code will need to be extended after it is committed to enable automatic use of indexing.  Most work will be focused in the BitmapIndexHandler, which needs to generate the re-entrant QL index query.  There may also be significant work in the IndexPredicateAnalyzer to support predicates with OR's, instead of just AND's as it is currently.


[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048859#comment-13048859 ] 

jiraposter@reviews.apache.org commented on HIVE-2036:
-----------------------------------------------------



bq.  On 2011-06-13 22:57:46, John Sichi wrote:
bq.  > ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java, line 114
bq.  > <https://reviews.apache.org/r/857/diff/4/?file=20984#file20984line114>
bq.  >
bq.  >     I don't think this should be necessary.  We just want to propagate the partition column predicate (whatever it is) from the base table query to the index table query; partition pruning on the index table query will do the rest of the work.
bq.  >     
bq.  >     In other words, if the original query had
bq.  >     
bq.  >     part_key=<whatever>
bq.  >     
bq.  >     we want to preserve that on the index table query.  That's what the code is already supposed to be doing before your change; was it not working?
bq.  >

This code is to prevent automatic usage from kicking in if the index has not been built on the partition specified in the partition predicate (i.e. if the index has only been built on partition ds=foo and the query is select key from src where ds=bar, we do not want to execute an index query). It seems like adding a test for bitmaps specifically to mirror index_auto_unused.q (which is where this functionality is tested for compact indexes) would be a good idea.
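
To make the check concrete, here is a small sketch of the rule described above: use the index only when every partition the query touches has a built index partition. The class and method names are hypothetical, and partitions are modeled as plain strings such as "ds=foo" rather than Hive Partition objects.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; partitions are represented as simple spec strings.
class IndexCoverageSketch {

    // True when every queried base-table partition has a built index partition.
    static boolean canUseIndex(Set<String> queriedPartitions, Set<String> indexedPartitions) {
        return indexedPartitions.containsAll(queriedPartitions);
    }

    public static void main(String[] args) {
        Set<String> indexed = new HashSet<>(Arrays.asList("ds=foo"));
        // Query on ds=foo: the index covers it.
        System.out.println(canUseIndex(new HashSet<>(Arrays.asList("ds=foo")), indexed));
        // Query on ds=bar: no index data exists, so fall back to a normal scan.
        System.out.println(canUseIndex(new HashSet<>(Arrays.asList("ds=bar")), indexed));
    }
}
```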


- Syed


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review825
-----------------------------------------------------------


On 2011-06-11 19:05:42, Syed Albiz wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/857/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-06-11 19:05:42)
bq.  
bq.  
bq.  Review request for hive and John Sichi.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.
bq.  
bq.  
bq.  This addresses bug HIVE-2036.
bq.      https://issues.apache.org/jira/browse/HIVE-2036
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
bq.    ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
bq.    ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
bq.    ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
bq.    ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
bq.    ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
bq.    ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
bq.    ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/857/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Syed
bq.  
bq.




[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036709#comment-13036709 ] 

Marquis Wang commented on HIVE-2036:
------------------------------------

Russell is right. hive.index.compact.file is deprecated and replaced with hive.index.blockfilter.file (I think). I kept the former around for backwards-compatibility reasons, but we should try to avoid using it.
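
One way to honor both keys during the transition is to read the new one first and fall back to the deprecated one. This is only a sketch: it uses java.util.Properties in place of Hive's configuration object, and the lookup order and the example path are assumptions rather than what the committed code does.

```java
import java.util.Properties;

// Hypothetical sketch of a deprecated-key fallback; not Hive's actual lookup code.
class IndexFileConfSketch {
    static final String NEW_KEY = "hive.index.blockfilter.file";
    static final String DEPRECATED_KEY = "hive.index.compact.file";

    // Prefer the new key; use the deprecated key only when the new one is unset.
    static String indexFilePath(Properties conf) {
        String path = conf.getProperty(NEW_KEY);
        return path != null ? path : conf.getProperty(DEPRECATED_KEY);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(DEPRECATED_KEY, "/tmp/index_result");  // example path only
        System.out.println(indexFilePath(conf));
    }
}
```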


[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051368#comment-13051368 ] 

jiraposter@reviews.apache.org commented on HIVE-2036:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
-----------------------------------------------------------

(Updated 2011-06-17 22:34:18.950303)


Review request for hive and John Sichi.


Changes
-------

Added comments. The filter expression is now pushed into the TableScan (TS) operator only when automatic indexing is turned on.


Summary
-------

Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.
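
For readers unfamiliar with why AND and OR map naturally onto bitmap indexes, the sketch below combines two row bitmaps with java.util.BitSet. It is a conceptual illustration only: the class and helper names are hypothetical, and the patch itself generates HiveQL over index tables rather than operating on in-memory BitSet objects.

```java
import java.util.BitSet;

// Conceptual sketch only; each set bit marks a row that matches one predicate.
class BitmapCombineSketch {

    // Build a bitmap with the given row offsets set.
    static BitSet of(int... rows) {
        BitSet bits = new BitSet();
        for (int row : rows) {
            bits.set(row);
        }
        return bits;
    }

    // Rows matching both predicates (p1 AND p2).
    static BitSet and(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.and(b);
        return result;
    }

    // Rows matching either predicate (p1 OR p2).
    static BitSet or(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.or(b);
        return result;
    }

    public static void main(String[] args) {
        BitSet p1 = of(1, 3, 5);
        BitSet p2 = of(3, 5, 7);
        System.out.println(and(p1, p2));  // {3, 5}
        System.out.println(or(p1, p2));   // {1, 3, 5, 7}
    }
}
```

Since both operations take two bitmaps and return one, supporting OR is mostly a matter of choosing a different combining step.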


This addresses bug HIVE-2036.
    https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 95fef73 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java d22654b 
  ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
  ql/src/test/results/clientpositive/index_auto.q.out 713bb40 
  ql/src/test/results/clientpositive/index_auto_file_format.q.out 894a556 
  ql/src/test/results/clientpositive/index_auto_multiple.q.out 27092dc 
  ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
  ql/src/test/results/clientpositive/index_auto_unused.q.out 8a1eda5 
  ql/src/test/results/clientpositive/index_bitmap3.q.out dadfa77 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
-------

Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.


Thanks,

Syed



> Update bitmap indexes for automatic usage
> -----------------------------------------
>
>                 Key: HIVE-2036
>                 URL: https://issues.apache.org/jira/browse/HIVE-2036
>             Project: Hive
>          Issue Type: Improvement
>          Components: Indexing
>    Affects Versions: 0.8.0
>            Reporter: Russell Melick
>            Assignee: Syed S. Albiz
>         Attachments: HIVE-2036.1.patch, HIVE-2036.3.patch
>
>
> HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap index support.  The bitmap code will need to be extended after it is committed to enable automatic use of indexing.  Most work will be focused in the BitmapIndexHandler, which needs to generate the re-entrant QL index query.  There may also be significant work in the IndexPredicateAnalyzer to support predicates with OR's, instead of just AND's as it is currently.


[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046278#comment-13046278 ] 

jiraposter@reviews.apache.org commented on HIVE-2036:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review785
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java
<https://reviews.apache.org/r/857/#comment1681>

    I think that should be a period instead of a comma in "indexes, if"



ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java
<https://reviews.apache.org/r/857/#comment1682>

    How exactly are they combined?  This Javadoc should be written as a contract between the optimizer and the index plugin author, so that the author knows exactly how to interpret the inputs and also what is going to be done with the output.
    



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
<https://reviews.apache.org/r/857/#comment1683>

    Why do you need to use toArray here?  indexCols.keySet is already a collection.



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
<https://reviews.apache.org/r/857/#comment1684>

    Why are you converting the search conditions back into predicate form here?  Wouldn't it be easier to analyze them as search conditions?


- John


On 2011-06-08 00:22:37, Syed Albiz wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/857/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-06-08 00:22:37)
bq.  
bq.  
bq.  Review request for hive and John Sichi.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Add support for generating index queries so that bitmap indexes can be used automatically. This required changing the interface of the index handlers to accept queries on multiple indexes. The compact index handler was modified to use the new interface as well, with no functional changes to how it works. Only AND predicates are supported right now, but it should be possible to extend the BitmapQuery interface defined in this patch to support OR predicates as well. Currently benchmarking these changes on a test cluster.
bq.  
bq.  
bq.  This addresses bug HIVE-2036.
bq.      https://issues.apache.org/jira/browse/HIVE-2036
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
bq.    ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
bq.    ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
bq.    ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
bq.    ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/857/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Syed
bq.  
bq.



> Update bitmap indexes for automatic usage
> -----------------------------------------
>
>                 Key: HIVE-2036
>                 URL: https://issues.apache.org/jira/browse/HIVE-2036
>             Project: Hive
>          Issue Type: Improvement
>          Components: Indexing
>    Affects Versions: 0.8.0
>            Reporter: Russell Melick
>            Assignee: Syed S. Albiz
>         Attachments: HIVE-2036.1.patch
>
>
> HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap index support.  The bitmap code will need to be extended after it is committed to enable automatic use of indexing.  Most work will be focused in the BitmapIndexHandler, which needs to generate the re-entrant QL index query.  There may also be significant work in the IndexPredicateAnalyzer to support predicates with OR's, instead of just AND's as it is currently.


[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045718#comment-13045718 ] 

jiraposter@reviews.apache.org commented on HIVE-2036:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
-----------------------------------------------------------

(Updated 2011-06-08 00:22:37.292935)


Review request for hive and John Sichi.


Changes
-------

Addressed comments. This still does not propagate partition predicates to every single index sub-query, but it does ensure that predicates are applied only to indexes that have matching columns. After looking at the behavior of CompactIndexHandler on partitioned tables (and in the test case index_auto_partitioned.q), I can't quite see how the CompactIndexHandler identifies and propagates partitioning predicates correctly.


Summary
-------

Add support for generating index queries so that bitmap indexes can be used automatically. This required changing the interface of the index handlers to accept queries on multiple indexes. The compact index handler was modified to use the new interface as well, with no functional changes to how it works. Only AND predicates are supported right now, but it should be possible to extend the BitmapQuery interface defined in this patch to support OR predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
    https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-----

  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
-------

Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.


Thanks,

Syed




[jira] [Updated] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "Syed S. Albiz (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Syed S. Albiz updated HIVE-2036:
--------------------------------

    Status: Patch Available  (was: Open)

> Update bitmap indexes for automatic usage
> -----------------------------------------
>
>                 Key: HIVE-2036
>                 URL: https://issues.apache.org/jira/browse/HIVE-2036
>             Project: Hive
>          Issue Type: Improvement
>          Components: Indexing
>    Affects Versions: 0.8.0
>            Reporter: Russell Melick
>            Assignee: Syed S. Albiz
>         Attachments: HIVE-2036.1.patch, HIVE-2036.3.patch, HIVE-2036.8.patch
>
>
> HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap index support.  The bitmap code will need to be extended after it is committed to enable automatic use of indexing.  Most work will be focused in the BitmapIndexHandler, which needs to generate the re-entrant QL index query.  There may also be significant work in the IndexPredicateAnalyzer to support predicates with OR's, instead of just AND's as it is currently.


[jira] [Updated] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "Syed S. Albiz (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Syed S. Albiz updated HIVE-2036:
--------------------------------

    Attachment: HIVE-2036.1.patch


[jira] [Updated] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "Syed S. Albiz (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Syed S. Albiz updated HIVE-2036:
--------------------------------

    Attachment: HIVE-2036.8.patch


[jira] [Updated] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "Syed S. Albiz (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Syed S. Albiz updated HIVE-2036:
--------------------------------

    Attachment: HIVE-2036.3.patch

This patch is still a work in progress; there are a couple of issues I know still need correcting. In particular, the index_auto_unused.q test case fails: now that I have updated the partition predicates to propagate properly, it turns out there was no check to make sure that the index was built on the partition being queried. (The test case used to pass anyway because partition predicates weren't propagated.)

I probably also want to refactor the logic in IndexWhereProcessor before this is ready.


[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047029#comment-13047029 ] 

jiraposter@reviews.apache.org commented on HIVE-2036:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
-----------------------------------------------------------

(Updated 2011-06-10 06:35:32.125295)


Review request for hive and John Sichi.


Changes
-------

Based on a discussion with yongqian, I re-implemented the predicate decomposition in two steps: first computing the overall residual predicate from the union of all columns in the available indexes, and then computing the predicates to apply to each index individually. I have also extended the functionality to pass partition columns to allowColumnNames, and added/extended the test cases to check that partition predicates are propagated correctly. This required adding a check in IndexWhereProcessor.java that the correct FilterOperator was passed to the process(...) method (apparently a duplicate FilterOperator that does not have the entire predicate gets created).
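
As a rough illustration of the two-step decomposition described above (this is not the code in the patch): the sketch below models a WHERE clause as a flat list of AND-ed terms, each tagged with the single column it references. The names decompose_predicate and to_where, the (column, sql) tuple shape, and the index name example_index are all assumptions made for illustration; the real IndexPredicateAnalyzer works on Hive expression trees rather than strings.

```python
def decompose_predicate(conjuncts, index_columns, partition_columns=()):
    """Split AND-ed terms into per-index predicates and a residual.

    conjuncts:          list of (column_name, sql_text) pairs, implicitly AND-ed
    index_columns:      dict mapping a (hypothetical) index name to its columns
    partition_columns:  partition columns of the base table
    """
    partition_columns = set(partition_columns)

    # Step 1: anything on a column outside the union of all index columns
    # (plus the partition columns) cannot be answered by any index, so it
    # becomes the residual predicate.
    allowed = set(partition_columns)
    for columns in index_columns.values():
        allowed.update(columns)
    pushed = [term for term in conjuncts if term[0] in allowed]
    residual = [term for term in conjuncts if term[0] not in allowed]

    # Step 2: give each index only the terms on its own columns, plus the
    # partition terms. An index with no term on its own columns is skipped.
    per_index = {}
    for name, columns in index_columns.items():
        columns = set(columns)
        terms = [t for t in pushed
                 if t[0] in columns or t[0] in partition_columns]
        if any(t[0] in columns for t in terms):
            per_index[name] = terms
    return per_index, residual


def to_where(terms):
    """Render a list of (column, sql_text) terms as an AND-ed string."""
    return " AND ".join(sql for _, sql in terms)


# The example predicate from earlier in this thread.
example_terms = [
    ("part_col", "part_col = 1"),
    ("index_col", "index_col = 2"),
    ("some_other_col", "some_other_col = 3"),
]
example_per_index, example_residual = decompose_predicate(
    example_terms, {"example_index": ["index_col"]}, ["part_col"])
print(to_where(example_per_index["example_index"]))
print(to_where(example_residual))
```

With the predicate from the earlier comment, the index gets the partition and index-column terms, and the term on some_other_col is left as the residual for the base table scan.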


Summary
-------

Add support for generating index queries so that bitmap indexes can be used automatically. This required changing the interface of the index handlers to accept queries on multiple indexes. The compact index handler was modified to use the new interface as well, with no functional changes to how it works. Only AND predicates are supported right now, but it should be possible to extend the BitmapQuery interface defined in this patch to support OR predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
    https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
  ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
  ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
-------

Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.


Thanks,

Syed




[jira] [Updated] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi updated HIVE-2036:
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.8.0
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

Committed.  Thanks Syed!


> Update bitmap indexes for automatic usage
> -----------------------------------------
>
>                 Key: HIVE-2036
>                 URL: https://issues.apache.org/jira/browse/HIVE-2036
>             Project: Hive
>          Issue Type: Improvement
>          Components: Indexing
>    Affects Versions: 0.8.0
>            Reporter: Russell Melick
>            Assignee: Syed S. Albiz
>             Fix For: 0.8.0
>
>         Attachments: HIVE-2036.1.patch, HIVE-2036.3.patch, HIVE-2036.8.patch
>
>
> HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap index support.  The bitmap code will need to be extended after it is committed to enable automatic use of indexing.  Most work will be focused in the BitmapIndexHandler, which needs to generate the re-entrant QL index query.  There may also be significant work in the IndexPredicateAnalyzer to support predicates with OR's, instead of just AND's as it is currently.


[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052197#comment-13052197 ] 

John Sichi commented on HIVE-2036:
----------------------------------

I mean, once the latest patch gets uploaded.


[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048977#comment-13048977 ] 

jiraposter@reviews.apache.org commented on HIVE-2036:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
-----------------------------------------------------------

(Updated 2011-06-14 04:05:43.158797)


Review request for hive and John Sichi.


Changes
-------

Removed the redundant check on the partition predicate (which is done in IndexWhereProcessor). This was causing problems because the query generated to build the index was itself run through the optimizer, and at that stage the optimizer thought that the index was already built and had the partition. A simpler solution is to just disable index query optimization when building indexes.


Summary
-------

Add support for generating index queries so that bitmap indexes can be used automatically. This required changing the interface of the index handlers to accept queries on multiple indexes. The compact index handler was modified to use the new interface as well, with no functional changes to how it works. Only AND predicates are supported right now, but it should be possible to extend the BitmapQuery interface defined in this patch to support OR predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
    https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
  ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
  ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
-------

Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.


Thanks,

Syed




[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036449#comment-13036449 ] 

Marquis Wang commented on HIVE-2036:
------------------------------------

Making notes on how to do this:

One of the difficult/different parts about using bitmap indexes is that the only time they become useful is when multiple indexes are combined. Thus, you need a query that joins the various bitmap index tables and returns the blocks that contain the rows we want.

Thus the two parts to writing the automatic use index handler for bitmap indexes are:

1. Figuring out what indexes to use:

As mentioned above, you may need to extend the IndexPredicateAnalyzer to support ORs and possibly to return a tree of predicates (I don't think it already does this).

2. Building a query that accesses the index tables:

Below is an example: an original query, followed by a query against the index tables that I know works for it.

{noformat}
SELECT * FROM lineitem WHERE  L_QUANTITY = 50.0 AND L_DISCOUNT = 0.08 AND L_TAX = 0.01;
{noformat}

{noformat}
SELECT bucketname AS `_bucketname`, COLLECT_SET(offset) as `_offsets`
FROM (SELECT
        `_bucketname` AS bucketname, `_offset` AS offset
      FROM
        (SELECT ab.`_bucketname`, ab.`_offset`, EWAH_BITMAP_AND(ab.bitmap, c.`_bitmaps`) as bitmap FROM
          (SELECT a.`_bucketname`, b.`_offset`, EWAH_BITMAP_AND(a.`_bitmaps`, b.`_bitmaps`) as bitmap FROM 
            (SELECT * FROM default__lineitem_quantity__ WHERE L_QUANTITY = 50.0) a JOIN 
            (SELECT * FROM default__lineitem_discount__ WHERE L_DISCOUNT = 0.08) b 
                ON a.`_bucketname` = b.`_bucketname` AND a.`_offset` = b.`_offset`) ab JOIN
              (SELECT * FROM default__lineitem_tax__ WHERE L_TAX = 0.01) c
                ON ab.`_bucketname` = c.`_bucketname` AND ab.`_offset` = c.`_offset`) abc 
      WHERE 
        NOT EWAH_BITMAP_EMPTY(abc.bitmap)
) t
GROUP BY bucketname;
{noformat}

This format works for joining any number of AND predicates. I'm pretty sure you can figure out how to expand it to include OR predicates and different groupings of predicates as well. If you make any changes/extensions to the format, be sure to test them to make sure they have the performance characteristics you want.
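
To show how the nested-join format above generalizes to any number of AND predicates, here is a minimal sketch of a generator for the index query text. It is only an illustration, not what BitmapIndexHandler does: the function name build_bitmap_index_query and the aliases (l1, r1, combined) are made up, the predicate strings are assumed to be trusted HiveQL fragments, and the generated text has not been run against Hive. The column names (`_bucketname`, `_offset`, `_bitmaps`) and the EWAH_BITMAP_AND / EWAH_BITMAP_EMPTY calls are taken from the example query.

```python
def build_bitmap_index_query(index_predicates):
    """Build the index query text for AND-ed predicates.

    index_predicates: list of (index_table, predicate_sql) pairs, one per
    bitmap index to be combined. Predicates are inserted verbatim.
    """
    if not index_predicates:
        raise ValueError("at least one (index_table, predicate) pair is required")

    # Start from a plain scan of the first index table.
    table, predicate = index_predicates[0]
    current = f"(SELECT * FROM {table} WHERE {predicate})"
    bitmap_col = "`_bitmaps`"  # raw index tables expose `_bitmaps`

    # Fold in each further index: join on block identity, AND the bitmaps.
    for i, (table, predicate) in enumerate(index_predicates[1:], start=1):
        left, right = f"l{i}", f"r{i}"
        current = (
            f"(SELECT {left}.`_bucketname`, {left}.`_offset`, "
            f"EWAH_BITMAP_AND({left}.{bitmap_col}, {right}.`_bitmaps`) AS bitmap "
            f"FROM {current} {left} "
            f"JOIN (SELECT * FROM {table} WHERE {predicate}) {right} "
            f"ON {left}.`_bucketname` = {right}.`_bucketname` "
            f"AND {left}.`_offset` = {right}.`_offset`)"
        )
        bitmap_col = "bitmap"  # joined levels expose the combined bitmap

    # Keep only blocks whose combined bitmap is non-empty, grouped by file.
    return (
        "SELECT bucketname AS `_bucketname`, COLLECT_SET(offset) AS `_offsets` "
        "FROM (SELECT `_bucketname` AS bucketname, `_offset` AS offset "
        f"FROM {current} combined "
        f"WHERE NOT EWAH_BITMAP_EMPTY(combined.{bitmap_col})) t "
        "GROUP BY bucketname"
    )


# The three predicates from the lineitem example above.
print(build_bitmap_index_query([
    ("default__lineitem_quantity__", "L_QUANTITY = 50.0"),
    ("default__lineitem_discount__", "L_DISCOUNT = 0.08"),
    ("default__lineitem_tax__", "L_TAX = 0.01"),
]))
```

Each extra predicate adds one join level, so three predicates produce two joins and two EWAH_BITMAP_AND calls, as in the hand-written query. Supporting OR would need a different combining function and join type at the relevant level, which this sketch does not attempt.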


[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048854#comment-13048854 ] 

jiraposter@reviews.apache.org commented on HIVE-2036:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review825
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java
<https://reviews.apache.org/r/857/#comment1790>

    I don't think this should be necessary.  We just want to propagate the partition column predicate (whatever it is) from the base table query to the index table query; partition pruning on the index table query will do the rest of the work.
    
    In other words, if the original query had
    
    part_key=<whatever>
    
    we want to preserve that on the index table query.  That's what the code is already supposed to be doing before your change; was it not working?
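
    To make the intended behavior concrete, here is a small Python sketch (not Hive code). It assumes an AND-only WHERE clause that has already been split into (column, expression) pairs; the function name and that encoding are invented for illustration.

```python
def index_table_where(conjuncts, partition_cols, index_cols):
    """Return the WHERE clause to use on the index table query.

    conjuncts: list of (column, expression) pairs from the base table query.
    Conjuncts on partition columns are carried over unchanged, whatever
    their form, so partition pruning can apply to the index table too.
    """
    keep = set(partition_cols) | set(index_cols)
    return " AND ".join(expr for col, expr in conjuncts if col in keep)
```

    With conjuncts for `part_col = 1 AND index_col = 2 AND some_other_col = 3`, partition column `part_col` and indexed column `index_col`, this keeps `part_col = 1 AND index_col = 2` and drops the rest.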
    


- John


On 2011-06-11 19:05:42, Syed Albiz wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/857/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-06-11 19:05:42)
bq.  
bq.  
bq.  Review request for hive and John Sichi.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.
bq.  
bq.  
bq.  This addresses bug HIVE-2036.
bq.      https://issues.apache.org/jira/browse/HIVE-2036
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
bq.    ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
bq.    ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
bq.    ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
bq.    ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
bq.    ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
bq.    ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
bq.    ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/857/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Passes unit tests. An additional testcase for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Syed
bq.  
bq.



> Update bitmap indexes for automatic usage
> -----------------------------------------
>
>                 Key: HIVE-2036
>                 URL: https://issues.apache.org/jira/browse/HIVE-2036
>             Project: Hive
>          Issue Type: Improvement
>          Components: Indexing
>    Affects Versions: 0.8.0
>            Reporter: Russell Melick
>            Assignee: Syed S. Albiz
>         Attachments: HIVE-2036.1.patch, HIVE-2036.3.patch
>
>
> HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap index support.  The bitmap code will need to be extended after it is committed to enable automatic use of indexing.  Most work will be focused in the BitmapIndexHandler, which needs to generate the re-entrant QL index query.  There may also be significant work in the IndexPredicateAnalyzer to support predicates with OR's, instead of just AND's as it is currently.


[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045577#comment-13045577 ] 

John Sichi commented on HIVE-2036:
----------------------------------

I've added some review board comments; I'll probably have some more once we've resolved these.

> Update bitmap indexes for automatic usage
> -----------------------------------------
>
>                 Key: HIVE-2036
>                 URL: https://issues.apache.org/jira/browse/HIVE-2036
>             Project: Hive
>          Issue Type: Improvement
>          Components: Indexing
>    Affects Versions: 0.8.0
>            Reporter: Russell Melick
>            Assignee: Syed S. Albiz
>         Attachments: HIVE-2036.1.patch
>
>
> HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap index support.  The bitmap code will need to be extended after it is committed to enable automatic use of indexing.  Most work will be focused in the BitmapIndexHandler, which needs to generate the re-entrant QL index query.  There may also be significant work in the IndexPredicateAnalyzer to support predicates with OR's, instead of just AND's as it is currently.


[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049436#comment-13049436 ] 

jiraposter@reviews.apache.org commented on HIVE-2036:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
-----------------------------------------------------------

(Updated 2011-06-14 21:26:21.276789)


Review request for hive and John Sichi.


Changes
-------

Addressed comments. Also added more comments explaining why we use indexes.get(0) in IndexWhereProcessor, as that seemed a bit unclear.


Summary
-------

Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
    https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
  ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
  ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
-------

Passes unit tests. An additional testcase for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.


Thanks,

Syed




[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045578#comment-13045578 ] 

jiraposter@reviews.apache.org commented on HIVE-2036:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review773
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java
<https://reviews.apache.org/r/857/#comment1666>

    Update Javadoc and param name, including an explanation of what the handler is supposed to do when multiple indexes are passed in.



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
<https://reviews.apache.org/r/857/#comment1675>

    I'm confused by the logic here.  You are throwing together all of the columns for all of the indexes, but we need to keep them segregated, don't we?  Each subquery should only contain references to the columns relevant to the corresponding index.
    
    (But the partitioning predicates need to be applied to each index.)
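
    A rough Python sketch of the segregation described here, under the assumption that the predicate is a flat list of AND-ed (column, expression) pairs. The function name and the dict of index table to covered columns are hypothetical, not the patch's data structures.

```python
def per_index_where(conjuncts, partition_cols, indexes):
    """Build one WHERE clause per index table.

    conjuncts: list of (column, expression) pairs from the base query.
    indexes: dict mapping index table name -> set of columns it covers.
    Each clause gets every partition predicate plus only the predicates
    on that index's own columns.
    """
    part = [expr for col, expr in conjuncts if col in partition_cols]
    clauses = {}
    for table, cols in indexes.items():
        own = [expr for col, expr in conjuncts if col in cols]
        if own:  # an index no predicate touches is left out entirely
            clauses[table] = " AND ".join(part + own)
    return clauses
```

    Each resulting clause would then become the WHERE of the subquery over that one index table.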
    



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java
<https://reviews.apache.org/r/857/#comment1668>

    Why is this public instead of private?



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java
<https://reviews.apache.org/r/857/#comment1667>

    Use HiveUtils.unparseIdentifier



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/HiveBitmapIndexInputFormat.java
<https://reviews.apache.org/r/857/#comment1669>

    Why do we need this class at all?  The superclass already uses hive.index.blockfilter.file by default.
    



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
<https://reviews.apache.org/r/857/#comment1672>

    Seems like we should only be looking at the indexes on the table accessed by this table scan.  (This comment is retroactive to the original version of the file.)
    



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
<https://reviews.apache.org/r/857/#comment1673>

    Seems like the costing comment below applies to this too.



ql/src/test/queries/clientpositive/index_bitmap3.q
<https://reviews.apache.org/r/857/#comment1670>

    Why do we need this setting at all?  (I'm not sure why it was there in the original version of the file.)


- John


On 2011-06-06 21:37:38, Syed Albiz wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/857/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-06-06 21:37:38)
bq.  
bq.  
bq.  Review request for hive and John Sichi.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.
bq.  
bq.  
bq.  This addresses bug HIVE-2036.
bq.      https://issues.apache.org/jira/browse/HIVE-2036
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/HiveBitmapIndexInputFormat.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
bq.    ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
bq.    ql/src/test/queries/clientpositive/index_bitmap3.q 508eb94 
bq.    ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
bq.    ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/857/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Passes unit tests. An additional testcase for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Syed
bq.  
bq.




[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037081#comment-13037081 ] 

John Sichi commented on HIVE-2036:
----------------------------------

Yeah, starting off with just AND is probably already a good-sized chunk of work.


> Update bitmap indexes for automatic usage
> -----------------------------------------
>
>                 Key: HIVE-2036
>                 URL: https://issues.apache.org/jira/browse/HIVE-2036
>             Project: Hive
>          Issue Type: Improvement
>          Components: Indexing
>    Affects Versions: 0.8.0
>            Reporter: Russell Melick
>            Assignee: Syed S. Albiz
>
> HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap index support.  The bitmap code will need to be extended after it is committed to enable automatic use of indexing.  Most work will be focused in the BitmapIndexHandler, which needs to generate the re-entrant QL index query.  There may also be significant work in the IndexPredicateAnalyzer to support predicates with OR's, instead of just AND's as it is currently.


[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052196#comment-13052196 ] 

John Sichi commented on HIVE-2036:
----------------------------------

+1.  Will commit when tests pass.


[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "Russell Melick (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036685#comment-13036685 ] 

Russell Melick commented on HIVE-2036:
--------------------------------------

To expand a bit on Marquis' comments.

In CompactIndexHandler.getIndexPredicateAnalyzer(), we instantiate a predicate analyzer.  My theory is that you're going to want a whole new PredicateAnalyzer class to deal with bitmaps, and then you'll instantiate it in a very similar way inside BitmapIndexHandler.  You can also see here how we only search for columns on which we have indexes.  This is going to need to be modified, since it currently only allows columns from a single index.

You may also want to rewrite some of the logic in IndexWhereProcessor.process():110.  It currently loops through every index available and asks it to do a rewrite.  Perhaps it should loop through every index type and try to find the rewrites possible only using indexes of that type.
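
One way to read the "loop through every index type" suggestion, sketched in Python with hypothetical names: group the table's indexes by handler class first, then ask each handler once with all of its indexes rather than once per index. This is only an illustration of the control flow, not the IndexWhereProcessor code.

```python
from collections import defaultdict


def group_indexes_by_handler(indexes):
    """Group (index_name, handler_name) pairs by handler name.

    Index names keep their input order within each group, so a handler
    that can only use one index can still take the first one.
    """
    grouped = defaultdict(list)
    for index_name, handler_name in indexes:
        grouped[handler_name].append(index_name)
    return dict(grouped)
```

A bitmap handler would then receive all bitmap indexes on the table in one call and could combine them in a single index query.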

If you look at IndexPredicateAnalyzer:123, you can see where it's making sure that all the parent operators are AND operations.  It should be easy to modify this to allow OR operations, but I'm not sure that simply allowing them and using the current system will maintain logical correctness.  It's probably better to start off with just AND's.

The pushedPredicate is the important thing returned by the predicate analyzer.  The pushed predicate is what it was able to recognize/process.  That's the tree you'll want to use to generate the bitmap query.  The residual predicate is what it couldn't process. There's a separate JIRA open (HIVE-2115) to use the residual to cut down on remaining work.
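
A toy Python sketch of the pushed/residual split with the AND-only rule. The tuple encoding of the predicate tree and the function name are invented for this example; the real analyzer works on Hive's own expression tree classes and is not reproduced here.

```python
def decompose(node, indexed_cols):
    """Split a predicate tree into (pushed, residual) lists of conjuncts.

    node is ("AND", left, right), ("OR", left, right) or
    ("=", column, literal). Only AND nodes are descended into, so every
    pushed comparison has nothing but ANDs above it.
    """
    op = node[0]
    if op == "AND":
        left_pushed, left_residual = decompose(node[1], indexed_cols)
        right_pushed, right_residual = decompose(node[2], indexed_cols)
        return left_pushed + right_pushed, left_residual + right_residual
    if op == "=" and node[1] in indexed_cols:
        return [node], []
    # ORs and comparisons on non-indexed columns stay with the base query.
    return [], [node]
```

An OR subtree ends up in the residual as a whole, even when both of its sides are on indexed columns, which matches starting off with just ANDs.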

The query generation lives in the IndexHandlers.generateIndexQuery(...).  You'll definitely need more logic than the simple call to decomposedPredicate.pushedPredicate.getExprString() that is in the CompactIndexHandler.

There are a few spots where hive.index.compact.file is used.  These may need to be generalized.  However, Marquis may have already taken care of this with the bitmap stuff.  I don't remember what the new name for it was (I think it's hive.index.blockfilter.file), but it's probably easiest to look in one of his unit tests for it.

The last thing I can think of is that having multiple index types on a single table, or queries that use multiple tables, may become an issue.  I created HIVE-2128 to deal with the multiple tables.

Good luck!


[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047980#comment-13047980 ] 

jiraposter@reviews.apache.org commented on HIVE-2036:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
-----------------------------------------------------------

(Updated 2011-06-11 19:05:42.241706)


Review request for hive and John Sichi.


Changes
-------

Fix index_auto_unused.q testcase by adding a check for partitions in the index and ensuring that only partitions actually in the index are used to compute index predicates.


Summary
-------

Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
    https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
  ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
  ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
-------

Passes unit tests. An additional testcase for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.


Thanks,

Syed




[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046277#comment-13046277 ] 

John Sichi commented on HIVE-2036:
----------------------------------

Added a few new comments on review board.


[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048857#comment-13048857 ] 

jiraposter@reviews.apache.org commented on HIVE-2036:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review826
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
<https://reviews.apache.org/r/857/#comment1792>

    Don't bother with empty return statements.


- John


On 2011-06-11 19:05:42, Syed Albiz wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/857/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-06-11 19:05:42)
bq.  
bq.  
bq.  Review request for hive and John Sichi.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possibly to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.
bq.  
bq.  
bq.  This addresses bug HIVE-2036.
bq.      https://issues.apache.org/jira/browse/HIVE-2036
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
bq.    ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
bq.    ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
bq.    ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
bq.    ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
bq.    ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
bq.    ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
bq.    ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/857/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Passes unit tests. An additional test case, index_bitmap_auto.q, was also added to the TestCliDriver suite to test automatic bitmap indexing. Currently benchmarking changes on a test cluster.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Syed
bq.  
bq.



> Update bitmap indexes for automatic usage
> -----------------------------------------
>
>                 Key: HIVE-2036
>                 URL: https://issues.apache.org/jira/browse/HIVE-2036
>             Project: Hive
>          Issue Type: Improvement
>          Components: Indexing
>    Affects Versions: 0.8.0
>            Reporter: Russell Melick
>            Assignee: Syed S. Albiz
>         Attachments: HIVE-2036.1.patch, HIVE-2036.3.patch
>
>
> HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap index support.  The bitmap code will need to be extended after it is committed to enable automatic use of indexing.  Most work will be focused in the BitmapIndexHandler, which needs to generate the re-entrant QL index query.  There may also be significant work in the IndexPredicateAnalyzer to support predicates with OR's, instead of just AND's as it is currently.


[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050610#comment-13050610 ] 

jiraposter@reviews.apache.org commented on HIVE-2036:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review856
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java
<https://reviews.apache.org/r/857/#comment1865>

    Need to update this comment now, explaining why we don't even look for the filter operator any more.


- John






[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045140#comment-13045140 ] 

jiraposter@reviews.apache.org commented on HIVE-2036:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
-----------------------------------------------------------

Review request for hive and John Sichi.


Summary
-------

Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only AND predicates are supported right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.
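
As a rough illustration of why AND predicates are the easy case for bitmap indexes, and what OR support would involve, the sketch below combines two per-predicate row bitmaps. It uses java.util.BitSet as a stand-in; the class and method names are hypothetical and do not reflect the BitmapQuery interface or the bitmap representation in the patch.

```java
import java.util.BitSet;

// Hypothetical sketch: not the BitmapQuery interface from the patch.
final class BitmapCombineSketch {

    // Builds a bitmap with the given row offsets set.
    static BitSet of(int... rows) {
        BitSet bits = new BitSet();
        for (int row : rows) {
            bits.set(row);
        }
        return bits;
    }

    // Rows matching both predicates: intersect the two bitmaps.
    static BitSet and(BitSet left, BitSet right) {
        BitSet result = (BitSet) left.clone(); // BitSet.and mutates in place
        result.and(right);
        return result;
    }

    // Rows matching either predicate: union the two bitmaps.
    static BitSet or(BitSet left, BitSet right) {
        BitSet result = (BitSet) left.clone(); // BitSet.or mutates in place
        result.or(right);
        return result;
    }
}
```

In this simplified model an OR predicate only changes the combining operation, which is the kind of extension the summary anticipates.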


This addresses bug HIVE-2036.
    https://issues.apache.org/jira/browse/HIVE-2036


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/HiveBitmapIndexInputFormat.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
  ql/src/test/queries/clientpositive/index_bitmap3.q 508eb94 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
-------

Passes unit tests. An additional test case, index_bitmap_auto.q, was also added to the TestCliDriver suite to test automatic bitmap indexing. Currently benchmarking changes on a test cluster.


Thanks,

Syed





[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050150#comment-13050150 ] 

jiraposter@reviews.apache.org commented on HIVE-2036:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
-----------------------------------------------------------

(Updated 2011-06-15 23:46:24.176586)


Review request for hive and John Sichi.


Changes
-------

Used setFilterExpr on the TableScanDesc to propagate the complete original predicate, because the PartitionConditionRemover was removing the partition predicate from the FilterOperator.
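
To illustrate why the complete predicate matters, here is a minimal sketch of deriving the index-query predicate from an AND-only WHERE clause by keeping only the conjuncts on partition or indexed columns, as described earlier in the thread. All class and method names are hypothetical; this is not Hive's IndexPredicateAnalyzer or the TableScanDesc API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: names are illustrative, not Hive's API.
final class IndexPredicateSketch {

    /** One "column = value" term of an AND-only WHERE clause. */
    static final class Conjunct {
        final String column;
        final String value;

        Conjunct(String column, String value) {
            this.column = column;
            this.value = value;
        }

        @Override
        public String toString() {
            return column + " = " + value;
        }
    }

    // Keeps only the conjuncts that the index table query can use:
    // those on partition columns and those on indexed columns.
    static String indexQueryPredicate(List<Conjunct> original,
                                      Set<String> partitionCols,
                                      Set<String> indexedCols) {
        List<String> kept = new ArrayList<>();
        for (Conjunct c : original) {
            if (partitionCols.contains(c.column) || indexedCols.contains(c.column)) {
                kept.add(c.toString());
            }
        }
        return String.join(" AND ", kept);
    }
}
```

If the input is the filter after the partition conjunct has already been stripped, the derived predicate loses part_col and the index query can no longer be pruned to the right partition, which is why the original predicate has to be carried along.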


Summary
-------

Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only AND predicates are supported right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
    https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 95fef73 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java d22654b 
  ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
  ql/src/test/results/clientpositive/index_auto.q.out 713bb40 
  ql/src/test/results/clientpositive/index_auto_file_format.q.out 894a556 
  ql/src/test/results/clientpositive/index_auto_multiple.q.out 27092dc 
  ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
  ql/src/test/results/clientpositive/index_auto_unused.q.out 8a1eda5 
  ql/src/test/results/clientpositive/index_bitmap3.q.out dadfa77 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
-------

Passes unit tests. An additional test case, index_bitmap_auto.q, was also added to the TestCliDriver suite to test automatic bitmap indexing. Currently benchmarking changes on a test cluster.


Thanks,

Syed




[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049324#comment-13049324 ] 

jiraposter@reviews.apache.org commented on HIVE-2036:
-----------------------------------------------------



bq.  On 2011-06-13 22:57:46, John Sichi wrote:
bq.  > ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java, line 114
bq.  > <https://reviews.apache.org/r/857/diff/4/?file=20984#file20984line114>
bq.  >
bq.  >     I don't think this should be necessary.  We just want to propagate the partition column predicate (whatever it is) from the base table query to the index table query; partition pruning on the index table query will do the rest of the work.
bq.  >     
bq.  >     In other words, if the original query had
bq.  >     
bq.  >     part_key=<whatever>
bq.  >     
bq.  >     we want to preserve that on the index table query.  That's what the code is already supposed to be doing before your change; was it not working?
bq.  >
bq.  
bq.  Syed Albiz wrote:
bq.      This code is to prevent automatic usage from kicking in if the index has not been built on the partition specified in the partition predicate. For example, if the index has only been built on partition ds=foo and the query is select key from src where ds=bar, we do not want to execute an index query. It seems like adding a test for bitmaps specifically to mirror index_auto_unused.q (which is where this functionality is tested for compact indices) would be a good idea.

The logic for making sure that the necessary index partitions exist is already present in IndexWhereProcessor.checkPartitionsCoveredByIndex.  If that's not working, we should fix it; it should not be necessary to change the predicate analyzer at all.
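
For readers following along, the kind of coverage check being discussed can be sketched as below. This is only an illustration under assumed names and types; it is not the actual body of IndexWhereProcessor.checkPartitionsCoveredByIndex, which is not shown in this thread.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: partitions are modeled as plain strings such as "ds=foo".
final class PartitionCoverageSketch {

    // Returns true only if every partition the query will scan
    // also has a built index partition.
    static boolean partitionsCoveredByIndex(List<String> queriedPartitions,
                                            Set<String> builtIndexPartitions) {
        for (String partition : queriedPartitions) {
            if (!builtIndexPartitions.contains(partition)) {
                return false; // index unusable; fall back to a normal scan
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> built = new HashSet<>(Arrays.asList("ds=foo"));
        System.out.println(partitionsCoveredByIndex(Arrays.asList("ds=foo"), built));
        System.out.println(partitionsCoveredByIndex(Arrays.asList("ds=bar"), built));
    }
}
```

With the index built only on ds=foo, a query on ds=bar fails the check, so the index query is skipped without any change to the predicate analyzer.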


- John


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review825
-----------------------------------------------------------


On 2011-06-14 04:05:43, Syed Albiz wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/857/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-06-14 04:05:43)
bq.  
bq.  
bq.  Review request for hive and John Sichi.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only AND predicates are supported right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.
bq.  
bq.  
bq.  This addresses bug HIVE-2036.
bq.      https://issues.apache.org/jira/browse/HIVE-2036
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
bq.    ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
bq.    ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
bq.    ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
bq.    ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
bq.    ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
bq.    ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
bq.    ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/857/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Passes unit tests. An additional test case, index_bitmap_auto.q, was also added to the TestCliDriver suite to test automatic bitmap indexing. Currently benchmarking changes on a test cluster.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Syed
bq.  
bq.


