Posted to dev@hive.apache.org by "Marquis Wang (JIRA)" <ji...@apache.org> on 2010/11/20 03:21:13 UTC

[jira] Created: (HIVE-1803) Implement bitmap indexing in Hive

Implement bitmap indexing in Hive
---------------------------------

                 Key: HIVE-1803
                 URL: https://issues.apache.org/jira/browse/HIVE-1803
             Project: Hive
          Issue Type: New Feature
            Reporter: Marquis Wang
            Assignee: Marquis Wang


Implement bitmap index handler to complement compact indexing.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066147#comment-13066147 ] 

John Sichi commented on HIVE-1803:
----------------------------------

@Siddharth:  please open a new JIRA issue, including exact steps to reproduce your problem.


> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>             Fix For: 0.8.0
>
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.12.patch, HIVE-1803.13.patch, HIVE-1803.14.patch, HIVE-1803.14.patch, HIVE-1803.15.patch, HIVE-1803.15.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] Updated: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Skye Berghel (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Skye Berghel updated HIVE-1803:
-------------------------------

    Attachment: HIVE-1803.4.patch

A patch that uses a compressed bitmap to save space without losing much computing time for AND and OR operations.
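As a rough illustration of why AND and OR can stay cheap on a compressed bitmap, here is a minimal sketch that combines two run-length-encoded bitmaps run by run, without expanding them. This is a simplified run-length scheme, not the word-aligned format that javaewah uses, and every function name here is hypothetical.

```python
from itertools import groupby


def compress(bits):
    """Run-length encode a list of 0/1 values as [(bit, run_length), ...]."""
    return [(bit, len(list(group))) for bit, group in groupby(bits)]


def decompress(runs):
    """Expand [(bit, run_length), ...] back into a flat list of 0/1 values."""
    out = []
    for bit, length in runs:
        out.extend([bit] * length)
    return out


def combine(a, b, op):
    """Apply a bitwise op to two run-length bitmaps of equal total length.

    The loop advances one run boundary at a time, so the cost depends on
    the number of runs rather than on the number of bits.
    """
    result = []
    i = j = 0
    rem_a = a[0][1] if a else 0
    rem_b = b[0][1] if b else 0
    while i < len(a) and j < len(b):
        step = min(rem_a, rem_b)
        bit = op(a[i][0], b[j][0])
        if result and result[-1][0] == bit:
            # Extend the previous run instead of starting a new one.
            result[-1] = (bit, result[-1][1] + step)
        else:
            result.append((bit, step))
        rem_a -= step
        rem_b -= step
        if rem_a == 0:
            i += 1
            rem_a = a[i][1] if i < len(a) else 0
        if rem_b == 0:
            j += 1
            rem_b = b[j][1] if j < len(b) else 0
    return result


def bitmap_and(a, b):
    return combine(a, b, lambda x, y: x & y)


def bitmap_or(a, b):
    return combine(a, b, lambda x, y: x | y)
```

A long run of identical bits collapses to a single pair, which is where the space saving comes from; sparse index bitmaps tend to have few runs.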


[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018970#comment-13018970 ] 

John Sichi commented on HIVE-1803:
----------------------------------

OK, let's see if this one can pass tests before the next conflict gets committed :)



[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018988#comment-13018988 ] 

John Sichi commented on HIVE-1803:
----------------------------------

I'm already seeing at least one test failure:  auto_join28.q, which recently got its log updated (Friday) so you probably rebased before that.  It's really difficult to get something like this through without locking down svn for a few days...

You have a point about the unit tests; we would have to special-case the new column and leave the existing ones as is.  Ugly any way you slice it.



[jira] Commented: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Jeff Hammerbacher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12999656#comment-12999656 ] 

Jeff Hammerbacher commented on HIVE-1803:
-----------------------------------------

Hey,

I came across a Daniel Lemire project recently that may be of use here: http://code.google.com/p/javaewah.

Later,
Jeff


[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment: unit-tests.patch
                HIVE-1803.11.patch

New patch that addresses the minor Javadoc comments on patch 10.

A unit-tests patch that updates all the unit tests that were affected by the virtual column change.


[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019069#comment-13019069 ] 

John Sichi commented on HIVE-1803:
----------------------------------

OK, here's the simplest approach I can come up with given the way internal column names currently work:

* introduce yet another config parameter hive.exec.rowoffset=true/false (default false)
* only support the new virtual column for queries where this is true (get rid of static field VirtualColumn.registry and replace it with a getRegistry method which takes in a conf)
* set this in the conf used for compiling the internal statement (but don't modify the top-level conf)

(Ideally, we wouldn't need the config param; we would do this automatically based on the presence of a reference to the VC in the query.  Also ideally, at execution time we would avoid any counter-increment overhead when this is false.)
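The three steps above can be sketched roughly as follows. This is only an illustration of the shape of the idea, not the actual Hive code: the function names, the column names, and the use of a plain dictionary as the conf are assumptions; only the hive.exec.rowoffset key is taken from the proposal.

```python
# Illustrative column names; treat them as placeholders for this sketch.
BASE_VIRTUAL_COLUMNS = ["INPUT__FILE__NAME", "BLOCK__OFFSET__INSIDE__FILE"]
ROW_OFFSET_COLUMN = "ROW__OFFSET__INSIDE__BLOCK"


def get_registry(conf):
    """Return the virtual columns visible under this conf.

    Replaces a static registry: the row-offset column only appears when
    hive.exec.rowoffset is true, so ordinary queries keep their existing
    internal column numbering.
    """
    columns = list(BASE_VIRTUAL_COLUMNS)
    if conf.get("hive.exec.rowoffset", "false") == "true":
        columns.append(ROW_OFFSET_COLUMN)
    return columns


def compile_internal_statement(top_level_conf):
    """Compile the internal statement with the flag forced on.

    The flag is set on a copy, so the top-level conf is left untouched.
    """
    internal_conf = dict(top_level_conf)
    internal_conf["hive.exec.rowoffset"] = "true"
    return get_registry(internal_conf)
```

With the default conf, get_registry returns only the existing columns, which is what would keep the existing test logs stable.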



[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Status: Patch Available  (was: Open)


[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023095#comment-13023095 ] 

John Sichi commented on HIVE-1803:
----------------------------------

I noticed that in one of my attempts with your previous patches.  I think it's the exact same problem you were hitting very early on in the clinic in the Hudson setup, having to do with the stats temp database not getting deleted?



[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023097#comment-13023097 ] 

Marquis Wang commented on HIVE-1803:
------------------------------------

I don't see anything that needs to be deleted in my checkout. Where is the stats temp database? Also, if you think it might just be something on our side, can you just run the tests and see if it passes for you? When I ran them I didn't see any other issues besides those, I don't think.


[jira] Updated: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment: HIVE-1803.2.patch

We fixed the problem in BitmapCollectSet by looking at the PercentileApprox UDAF to figure out how to use an array as an input to a UDAF.

This new patch is a working implementation of bitmap indexing. The new test index_bitmap.q shows how to use the index. However, I am unable to add the test itself, and get errors when I run 

ant test -Dtestcase=TestCliDriver -Dqfile=index_bitmap.q -Doverwrite=true -Dtest.silent=false

It says 


Exception: java.lang.RuntimeException: The table default__srcpart_srcpart_index_proj__ is an index table. Please do drop index instead.

with respect to the ALTER INDEX REBUILD line in the test.

We're pretty confused about whether we're doing the new test incorrectly and would appreciate any help.

While we're working to get around that we're also going to go ahead and work on a compressed bitmap, since this implementation does no compression.


[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment: unit-tests.2.patch

New unit tests patch that should fix some more tests.

John, I didn't see any failures in TestMTQueries even before adding this new patch. I'm not sure why that would be, but I definitely fixed some things in the other two tests.

Also this patch only includes the unit tests, so you will need to include patch 11 as well.


[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi updated HIVE-1803:
-----------------------------

    Status: Open  (was: Patch Available)

When I run ant test, I get a ton of failures with internal column names being different due to the introduction of the new ROWOFFSET virtual column.  The same thing happened to Yongqiang with HIVE-417.

So either you'll need to figure out a way to hide the number adjustment, or you'll have to submit one more (giant) patch with all of the testlog updates (which you can obtain by running ant test -Doverwrite=true and then verifying that all of the changes are expected).

Example from auto_join1.q:

{noformat}
    [junit] 67c67
    [junit] <               outputColumnNames: _col0, _col6
    [junit] ---
    [junit] >               outputColumnNames: _col0, _col5
    [junit] 73c73
    [junit] <                       expr: _col6
    [junit] ---
    [junit] >                       expr: _col5
    [junit] 143c143
    [junit] <               outputColumnNames: _col0, _col6
    [junit] ---
    [junit] >               outputColumnNames: _col0, _col5
{noformat}

Everything else in the patch was fine, except for a few nits:

* Javadoc for GenericUDFEWAHBitmapOr/And has the old class name.
* For BitmapAnd Javadoc, some spaces are missing in front of the stars.



[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment: HIVE-1803.12.patch

New patch that implements John's suggestions about adding the hive.exec.rowoffset configuration variable.

This patch fixes the issues with column numbers in explains. John, I'm still seeing some failures in tests such as combine2.q, bucketmapjoin1.q, and bucketmapjoin4.q. In each of these tests, it looks like one of the numRows outputs in an explain now says zero rows where it previously said some non-zero number. I'm not really sure what could be causing this and don't see anything in this patch that could affect these tests. Do you have any ideas?


[jira] Commented: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988914#comment-12988914 ] 

John Sichi commented on HIVE-1803:
----------------------------------

Could you describe the problem you're seeing and provide an example of how to reproduce it?



[jira] Updated: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Status: Patch Available  (was: Open)


[jira] Commented: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12999657#comment-12999657 ] 

Marquis Wang commented on HIVE-1803:
------------------------------------

Thanks Jeff. We've actually seen this and have a patch in the works (next couple days) that uses it.


[jira] Updated: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi updated HIVE-1803:
-----------------------------

    Component/s: Indexing


[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment: HIVE-1803.14.patch
                HIVE-1803.14.patch

The issue with the last patch was the order in which VirtualColumn.getRegistry().iterator() was returning the columns. The old code stored the virtual column registry as a HashMap, so I added the columns to the registry in the order the HashMap would have returned them.

This patch fixes that. I'm still seeing errors in groupby1.q through groupby6.q. It looks like various numbers are coming back wrong, but it doesn't appear to be related to the virtual columns. I can't really tell whether there is a pattern to it.

Can you take a look?

{noformat}
    [junit] >          <string>CNTR_NAME_GBY_28_NUM_INPUT_ROWS</string> 
    [junit] 1345c1341
    [junit] <          <string>CNTR_NAME_GBY_4_NUM_OUTPUT_ROWS</string>
    [junit] ---
    [junit] >          <string>CNTR_NAME_GBY_28_NUM_OUTPUT_ROWS</string> 
    [junit] 1348c1344
    [junit] <          <string>CNTR_NAME_GBY_4_TIME_TAKEN</string>
    [junit] ---
    [junit] >          <string>CNTR_NAME_GBY_28_TIME_TAKEN</string> 
    [junit] 1351c1347
    [junit] <          <string>CNTR_NAME_GBY_4_FATAL_ERROR</string>
    [junit] ---
    [junit] >          <string>CNTR_NAME_GBY_28_FATAL_ERROR</string> 
{noformat}


[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010482#comment-13010482 ] 

He Yongqiang commented on HIVE-1803:
------------------------------------

Had an offline discussion with Namit on this JIRA.

The basic question is how to use this bitmap index. Given that there are millions of rows in one block, the block will contain all the distinct values this column has, so the bitmap index will not be very useful. A possible use case may be a bitmap AND/OR. E.g., we need to find all records about Male in Japan, where Male and Japan are both bitmap indexed. What we can do today is first do a JOIN and a BITMAP AND operation on the two index tables and then find all the matching blocks, which is OK, but it requires a join operation. If we can support a bitmap index with more than one index column, it will help in this case. I mean each index column in the index table has its own bitmap, e.g., FILE_NAME, BLK_OFFSET, GENDER, bitmapForGENDER, COUNTRY, bitmapForCountry. bitmapForGENDER will have two bitmaps internally, one for Male and one for Female, and bitmapForCountry will have a bitmap for each country.

And if Hive can support skipping rows, the bitmap index will be very useful. I mean that with bitmap indexing, block pruning may not be good enough. For example, if in a block we find that only row1, row3, and lastRow satisfy the predicate, we can just skip row2 and row4 through lastRow-1.


What do you think?
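To make the Male-in-Japan case concrete, here is a minimal sketch of per-value bitmaps for two columns within one block, combined with a bitmap AND to get the rows worth reading. Python integers stand in for the bitmaps, the sample data is made up, and all names are hypothetical; it does not reflect how the index table is actually laid out.

```python
def build_bitmaps(values):
    """Map each distinct value to an int whose bit i is set when row i holds it."""
    bitmaps = {}
    for row, value in enumerate(values):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps


def matching_rows(bitmap):
    """List the row numbers whose bits are set; every other row can be skipped."""
    rows = []
    row = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows


# One block of made-up rows, with one bitmap per distinct value per column.
gender = ["Male", "Female", "Male", "Male", "Female", "Male"]
country = ["Japan", "Japan", "US", "Japan", "US", "Japan"]
gender_bitmaps = build_bitmaps(gender)
country_bitmaps = build_bitmaps(country)

# A single AND replaces the join between two separate index tables.
hits = gender_bitmaps["Male"] & country_bitmaps["Japan"]
rows_to_read = matching_rows(hits)
```

Here rows_to_read holds only the rows that satisfy both predicates, so a reader that supports row skipping would pass over the rest of the block.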


[jira] Updated: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment: bitmap_index_1.png

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: bitmap_index_1.png, bitmap_index_2.png
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Status: Patch Available  (was: Open)

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment: HIVE-1803.15.patch
                HIVE-1803.15.patch

New patch that updates the groupby tests in TestParse.

The number from the operator ID was not consistent: it differs depending on whether I run one test at a time or all the tests at once, which is why I thought the expected outputs needed to be updated. The results as they were before still work for those tests.

One other thing did need to change for me, though, for the groupby tests:

{noformat}
@@ -521,7 +521,8 @@
                        <string>sum</string> 
                       </void> 
                       <void property="mode"> 
-                       <object class="org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator$Mode" method="valueOf"> 
+                       <object class="java.lang.Enum" method="valueOf"> 
+                        <class>org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator$Mode</class> 
                         <string>PARTIAL1</string> 
                        </object> 
                       </void>
{noformat}

The new patch updates those tests.


> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.12.patch, HIVE-1803.13.patch, HIVE-1803.14.patch, HIVE-1803.14.patch, HIVE-1803.15.patch, HIVE-1803.15.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] Updated: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Status: Patch Available  (was: Open)

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Updated: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Skye Berghel (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Skye Berghel updated HIVE-1803:
-------------------------------

    Attachment: HIVE-1803.1.patch

Marquis and I have prepared a preliminary patch. BitmapCollectSet is currently not working---does anyone have any pointers?

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, bitmap_index_1.png, bitmap_index_2.png
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] Updated: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi updated HIVE-1803:
-----------------------------

    Status: Open  (was: Patch Available)

Review comments in

https://reviews.apache.org/r/466/


> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023634#comment-13023634 ] 

John Sichi commented on HIVE-1803:
----------------------------------

The number comes from the operator ID; I'm not sure why those would have shifted, but if it's consistent, then give me a patch that just updates the .q.out files and if that passes for me, we'll call it a day.


> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.12.patch, HIVE-1803.13.patch, HIVE-1803.14.patch, HIVE-1803.14.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13024638#comment-13024638 ] 

John Sichi commented on HIVE-1803:
----------------------------------

With patch 15, everything passed for me *except* the groupby* tests, so I think it's something about the Java version you're using (unrelated to your change).  When I revert those .q.xml files, TestParse passes.  I'm going to re-run the full ant test with those reverted, and assuming that passes, +1.


> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.12.patch, HIVE-1803.13.patch, HIVE-1803.14.patch, HIVE-1803.14.patch, HIVE-1803.15.patch, HIVE-1803.15.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment: HIVE-1803.10.patch

Updated patch that adds more of the missing javadocs.

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017088#comment-13017088 ] 

John Sichi commented on HIVE-1803:
----------------------------------

That's what I did, and the conflicts match files which were in very recent commits.  

Are you sure you did svn update?  If you're using git, there may be some lag in the replica.


> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] Updated: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment: bitmap_index_2.png

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: bitmap_index_1.png, bitmap_index_2.png
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi updated HIVE-1803:
-----------------------------

    Status: Open  (was: Patch Available)

Latest review comments are in:

https://reviews.apache.org/r/530/


> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023121#comment-13023121 ] 

John Sichi commented on HIVE-1803:
----------------------------------

Yeah, I can give it a try and see if it passes.

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.12.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Status: Patch Available  (was: Open)

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.12.patch, HIVE-1803.13.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment: HIVE-1803.8.patch

New patch with minimal changes (got rid of some unused imports)

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi updated HIVE-1803:
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.8.0
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

Committed.  Thanks Marquis!


> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>             Fix For: 0.8.0
>
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.12.patch, HIVE-1803.13.patch, HIVE-1803.14.patch, HIVE-1803.14.patch, HIVE-1803.15.patch, HIVE-1803.15.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment: unit-tests.3.patch

New patch for the unit tests that hopefully won't conflict this time.

I looked into changing the code so that the outputColumnNames in EXPLAIN output are not affected by virtual columns, but didn't really get anywhere. Besides, wouldn't I have had the same problem with commits anyway, since the unit tests were changed when the first two virtual columns were added?

I figured I'd go ahead and submit this patch again; if you think I should keep looking into that, you can simply not accept it. :-)

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Status: Patch Available  (was: Open)

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] Updated: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment:     (was: bitmap_index_2.png)

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: bitmap_index_1.png, bitmap_index_2.png
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] Updated: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Skye Berghel (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Skye Berghel updated HIVE-1803:
-------------------------------

    Attachment: javaewah.jar

Attaching the javaewah.jar file that will go in lib.

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017069#comment-13017069 ] 

Marquis Wang commented on HIVE-1803:
------------------------------------

I re-pulled from trunk and made a new patch, and there was no difference between the two. If you already have the original unit-tests.patch applied, then this patch will fail. Can you try applying HIVE-1803.11.patch followed by unit-tests.2.patch on a clean checkout?

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] Updated: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment: HIVE-1803.6.patch

Another patch where the bitmap_and and bitmap_or tests use two columns instead of faking it with two indexes on one column.

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Status: Patch Available  (was: Open)

John, I'm resubmitting the patch for inclusion and opened a new ticket for creating row-level indexing.

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.

[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010471#comment-13010471 ] 

He Yongqiang commented on HIVE-1803:
------------------------------------

I have not looked in detail, but overall the patch looks good. It follows the approach of the existing compact index.

But can you run some tests on the javaewah code to see how it performs? And which javaewah features does this JIRA use? For compression, if we just compressed the bitmap using simple RLE, how much difference would there be? What I mean, though I am not sure about it, is that javaewah may have many features that we do not need but that add some overhead to the code.
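
For reference, the "simple RLE" alternative might look roughly like the sketch below. It is written in Python for brevity and is not taken from the patch or from javaewah; the function names and the (bit, run_length) representation are assumptions chosen for illustration. It encodes a bitmap as runs of identical bits and ANDs two encoded bitmaps without expanding them.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # (bit value, run length)


def rle_encode(bits: List[int]) -> List[Run]:
    """Collapse consecutive equal bits into (bit, run_length) pairs."""
    runs: List[Run] = []
    for bit in bits:
        if runs and runs[-1][0] == bit:
            runs[-1] = (bit, runs[-1][1] + 1)
        else:
            runs.append((bit, 1))
    return runs


def rle_decode(runs: List[Run]) -> List[int]:
    """Expand (bit, run_length) pairs back into a flat list of bits."""
    bits: List[int] = []
    for bit, length in runs:
        bits.extend([bit] * length)
    return bits


def rle_and(left: List[Run], right: List[Run]) -> List[Run]:
    """AND two run-length encoded bitmaps of equal length, run by run."""
    out: List[Run] = []
    i = j = 0
    left_remaining = left[0][1] if left else 0
    right_remaining = right[0][1] if right else 0
    while i < len(left) and j < len(right):
        step = min(left_remaining, right_remaining)
        bit = left[i][0] & right[j][0]
        if out and out[-1][0] == bit:
            out[-1] = (bit, out[-1][1] + step)
        else:
            out.append((bit, step))
        left_remaining -= step
        right_remaining -= step
        if left_remaining == 0:
            i += 1
            if i < len(left):
                left_remaining = left[i][1]
        if right_remaining == 0:
            j += 1
            if j < len(right):
                right_remaining = right[j][1]
    return out


sparse = [0] * 1000 + [1] * 3 + [0] * 1000
prefix = [1] * 1002 + [0] * 1001
anded = rle_and(rle_encode(sparse), rle_encode(prefix))
```

A sparse 2003-bit bitmap collapses to three runs here. EWAH is itself a word-aligned variant of run-length compression, so a performance comparison would mostly be measuring the cost of the extra machinery rather than a different compression idea.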

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Status: Patch Available  (was: Open)

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment: HIVE-1803.7.patch

New patch which I believe takes care of all the issues in the review for patch 6.

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017066#comment-13017066 ] 

John Sichi commented on HIVE-1803:
----------------------------------

OK, maybe the TestMTQueries failure was a side-effect of the other failures...I'll retry with your latest patch.


> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] Updated: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment: HIVE-1803.3.patch

New patch that includes tests and fixes problems John mentioned.

The bitmap UDFs are "hidden" by giving them leading underscores, since for now we still need to expose them in order to use the index. When we get automatic usage working, we can hide them more effectively.



> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, bitmap_index_1.png, bitmap_index_2.png
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment: HIVE-1803.13.patch

New patch that updates HADOOP_CLASSPATH and doesn't change tests except adding new tests and show_functions.q. Fingers crossed for this one passing. I'm optimistic.

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.12.patch, HIVE-1803.13.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] Commented: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003735#comment-13003735 ] 

John Sichi commented on HIVE-1803:
----------------------------------

Table src has two columns (key and value).  The value is equivalent to the key.  For srcbucket, the value is one plus the key, so at least they're not exactly the same.


> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] Commented: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934088#action_12934088 ] 

Marquis Wang commented on HIVE-1803:
------------------------------------

Added a proposed design document on Hive wiki at http://wiki.apache.org/hadoop/Hive/IndexDev/Bitmap

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: bitmap_index_1.png, bitmap_index_2.png
>
>
> Implement bitmap index handler to complement compact indexing.



[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023519#comment-13023519 ] 

John Sichi commented on HIVE-1803:
----------------------------------

Everything passed...eeexcept TestParse, where I got 44 failures.  Example diff below; somehow the output seems to have gotten reordered, and it's in the area of virtual columns.  The best option is to find out why and fix it; otherwise, make sure the output is deterministic and then update the .q.out.

{noformat}
    [junit] diff -b -I'\(\(<java version=".*" class="java.beans.XMLDecoder">\)\|\(<string>.*/tmp/.*</string>\)\|\(<string>file:.*</string>\)\|\(<string>pfile:.*</string>\)\|\(<string>[0-9]\{10\}</string>\)\|\(<string>/.*/warehouse/.*</string>\)\)' /data/users/jsichi/open/test-trunk/build/ql/test/logs/positive/case_sensitivity.q.xml /data/users/jsichi/open/test-trunk/ql/src/test/results/compiler/plan/case_sensitivity.q.xml
    [junit] 1209c1209
    [junit] <                    <string>INPUT__FILE__NAME</string> 
    [junit] ---
    [junit] >                    <string>BLOCK__OFFSET__INSIDE__FILE</string> 
    [junit] 1215c1215,1219
    [junit] <                    <object idref="PrimitiveTypeInfo0"/> 
    [junit] ---
    [junit] >                    <object class="org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo"> 
    [junit] >                     <void property="typeName"> 
    [junit] >                      <string>bigint</string> 
    [junit] >                     </void> 
    [junit] >                    </object> 
    [junit] 1225c1229
    [junit] <                    <string>BLOCK__OFFSET__INSIDE__FILE</string> 
    [junit] ---
    [junit] >                    <string>INPUT__FILE__NAME</string> 
    [junit] 1231,1235c1235
    [junit] <                    <object class="org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo"> 
    [junit] <                     <void property="typeName"> 
    [junit] <                      <string>bigint</string> 
{noformat}
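
The cause of the reordering is not identified in this comment. As a purely illustrative sketch of what "make sure it's deterministic" can mean, the following assumes (hypothetically) that the column names are held in an unordered collection and shows how emitting them in sorted order makes the serialized output stable. The function name and the output shape are invented for the example and are not taken from Hive.

```python
def serialize_columns(columns):
    """Emit column names in sorted order so the output does not depend on iteration order."""
    return "\n".join(f"<string>{name}</string>" for name in sorted(columns))


# A set has no guaranteed iteration order; sorting removes that dependency.
virtual_columns = {"INPUT__FILE__NAME", "BLOCK__OFFSET__INSIDE__FILE"}
plan = serialize_columns(virtual_columns)
```

With a fixed order like this, the expected output file only needs to be regenerated once.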


> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.12.patch, HIVE-1803.13.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] Updated: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Skye Berghel (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Skye Berghel updated HIVE-1803:
-------------------------------

    Attachment: javaewah.jar

javaewah.jar file (a dependency for the compressed bitmap)

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi updated HIVE-1803:
-----------------------------

    Status: Open  (was: Patch Available)

Some other stuff got committed in between which is causing conflicts when I try patch -p0 < unit-tests.2.patch

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi updated HIVE-1803:
-----------------------------

    Status: Open  (was: Patch Available)

I'm getting test failures still.

* TestMinimrCliDriver:join1
* TestMTQueries:testMTQueries1
* TestParse:  44/45 tests failed

These all need fixes before commit.


> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "siddharth ramanan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066011#comment-13066011 ] 

siddharth ramanan commented on HIVE-1803:
-----------------------------------------

Hi, I am trying to create an index on a string column in a table, but I get a NumberFormatException related to _offsets in the index table when I run the query. Can you suggest how to do this?

Thanks,
Siddharth

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>             Fix For: 0.8.0
>
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.12.patch, HIVE-1803.13.patch, HIVE-1803.14.patch, HIVE-1803.14.patch, HIVE-1803.15.patch, HIVE-1803.15.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] Commented: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991563#comment-12991563 ] 

John Sichi commented on HIVE-1803:
----------------------------------

I tried it, and I did get that error, but not in the REBUILD, which worked fine.  So you may be misreading the output.

The error I'm getting is 

{noformat}
FAILED: Error in semantic analysis: line 2:65 Expression Not In Group By Key `_bucketname`
{noformat}

This error is correct since the statement below has `_bucketname` in the SELECT list but not in the GROUP BY.

{noformat}
INSERT OVERWRITE DIRECTORY "/tmp/index_test_index_result" SELECT `_bucketname`, COLLECT_BITMAP_SET(`_offset`, `_bitmaps`) as `_offsets` FROM default__srcpart_srcpart_index_proj__ x WHERE x.key=100 AND x.ds = '2008-04-08' GROUP BY x.key, x.ds;
{noformat}
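
The rule behind this error can be sketched as follows: every non-aggregated column in the SELECT list must also appear in the GROUP BY keys. The helper below is hypothetical and only models that check; it is not Hive's semantic analyzer. One way to satisfy the rule would be to add `_bucketname` to the GROUP BY, though whether that matches the intended query is for the patch author to decide.

```python
def columns_missing_from_group_by(select_columns, group_by_keys, aggregated=()):
    """Return non-aggregated SELECT columns that are absent from the GROUP BY keys."""
    keys = set(group_by_keys)
    skip = set(aggregated)
    return [c for c in select_columns if c not in keys and c not in skip]


# Shape of the failing statement: _bucketname is selected but not grouped.
missing = columns_missing_from_group_by(
    select_columns=["_bucketname", "_offsets"],
    group_by_keys=["key", "ds"],
    aggregated=["_offsets"],  # produced by COLLECT_BITMAP_SET(...)
)

# Same statement with _bucketname added to the GROUP BY keys.
missing_after_fix = columns_missing_from_group_by(
    select_columns=["_bucketname", "_offsets"],
    group_by_keys=["key", "ds", "_bucketname"],
    aggregated=["_offsets"],
)
```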

As a result, the test bails out in the middle (without doing any of its DROP INDEX cleanup).  After that, what happens is that the test framework tries to clean up by dropping all existing tables, but it isn't smart enough to know that it shouldn't try to drop index tables directly.  (We could use a separate patch for that.)

{noformat}
    [junit] org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: The table default__srcpart_srcpart_index_proj__ is an index table. Please do drop index instead.
    [junit] 	at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:739)
    [junit] 	at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:716)
    [junit] 	at org.apache.hadoop.hive.ql.QTestUtil.clearTestSideEffects(QTestUtil.java:333)
    [junit] 	at org.apache.hadoop.hive.cli.TestCliDriver.setUp(TestCliDriver.java:55)
    [junit] 	at junit.framework.TestCase.runBare(TestCase.java:125)
...
{noformat}
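
A cleanup step that is aware of index tables could be sketched like this. It is a hypothetical illustration, not the QTestUtil code: it assumes the framework can look up which leftover tables are index tables, and it emits DROP INDEX for those instead of DROP TABLE. The index name `srcpart_index_proj` and the table `tmp_results` are assumed for the example.

```python
def plan_cleanup(tables, index_tables):
    """Build cleanup statements, using DROP INDEX for index tables.

    tables: names of all leftover tables.
    index_tables: {index_table_name: (index_name, base_table_name)}.
    """
    statements = []
    for _, (index_name, base_table) in sorted(index_tables.items()):
        statements.append(f"DROP INDEX {index_name} ON {base_table}")
    for table in sorted(tables):
        if table not in index_tables:  # never drop an index table directly
            statements.append(f"DROP TABLE {table}")
    return statements


leftover = ["tmp_results", "default__srcpart_srcpart_index_proj__"]
indexes = {"default__srcpart_srcpart_index_proj__": ("srcpart_index_proj", "srcpart")}
cleanup = plan_cleanup(leftover, indexes)
```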


> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, bitmap_index_1.png, bitmap_index_2.png
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi updated HIVE-1803:
-----------------------------

    Status: Open  (was: Patch Available)

OK, I dug into it and found out that it was a problem with HADOOP_CLASSPATH preventing derby.jar from getting loaded (so stats couldn't be written from Hadoop tasks, hence numRows=0).

The existing HADOOP_CLASSPATH was already incorrect, but the problem was only exposed by the addition of the javaewah-0.2.jar.  It was using commas for separators instead of colons (and it should not have been using file: at all!).

Here's the correct format with which I was able to pass a few failing tests I tried individually:

{noformat}
      <env key="HADOOP_CLASSPATH" value="${test.src.data.dir}/conf:${build.dir.hive}/dist/lib/derby.jar:${build.dir.hive}/dist/lib/javaewah-0.2.jar"/>
{noformat}
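
To make the two mistakes concrete (comma separators and a `file:` prefix), here is a small sketch that rewrites such a value into the colon-separated form shown above. The broken value is hypothetical, since the original setting is not quoted in this comment, and the helper assumes entries of the form `file:/path`.

```python
def normalize_classpath(raw):
    """Turn a comma-separated list of file: entries into a colon-separated classpath."""
    entries = []
    for entry in raw.split(","):
        entry = entry.strip()
        if entry.startswith("file:"):
            entry = entry[len("file:"):]  # a classpath takes plain paths, not URLs
        if entry:
            entries.append(entry)
    return ":".join(entries)


# Hypothetical paths, for illustration only.
broken = "file:/build/dist/lib/derby.jar,file:/build/dist/lib/javaewah-0.2.jar"
fixed = normalize_classpath(broken)
```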

Can you give me another patch which fixes this and omits all .q.out updates for existing tests unless they need it?  Fingers crossed that will be the last one.


> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.12.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Status: Patch Available  (was: Open)

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi updated HIVE-1803:
-----------------------------

    Status: Open  (was: Patch Available)

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] Updated: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi updated HIVE-1803:
-----------------------------

    Status: Open  (was: Patch Available)

New review board entry (I failed trying to update the old one with the new patch):

https://reviews.apache.org/r/481/


> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023355#comment-13023355 ] 

John Sichi commented on HIVE-1803:
----------------------------------

Meh, I'm still getting numRows failures myself.  I noticed that your patch includes some changes to existing test outputs (e.g. bucketmapjoin1.q.out) where it is setting the expected numRows to 0; you should have reverted those before generating the patch.  But the failure I got was in another existing test (filter_join_breaktask).  I'm trying again after reverting the ones you changed (in case the failure I saw was a side effect), but I'm pessimistic; I'm wondering if something innocuous about the change is somehow exposing some existing non-determinism.


> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.12.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-1803:
-----------------------------

    Status: Open  (was: Patch Available)

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] Updated: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment: bitmap_index_2.png

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: bitmap_index_1.png, bitmap_index_2.png, bitmap_index_2.png
>
>
> Implement bitmap index handler to complement compact indexing.



[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011380#comment-13011380 ] 

John Sichi commented on HIVE-1803:
----------------------------------

Right, without row-level skipping, the main use case is AND/OR for block filtering.

I'd suggest we get this committed without row-level skipping, and then create a followup for that.  Besides AND/OR, having the bitmap index build/access code committed will be useful for others working on related issues such as automatic usage in the WHERE clause.
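
As a rough sketch of the AND/OR block-filtering use case: assume that, for each block, the index stores one bitmap per predicate marking the rows in that block that match. Combining the bitmaps with AND or OR then tells us whether the block can contain any matching row; without row-level skipping, only that yes/no answer is used. The code below uses Python integers as bitmaps and hypothetical function names; it is not the UDF implementation from the patch.

```python
def bitmap_from_positions(positions):
    """Build an integer bitmap with bit i set for each row position i."""
    bits = 0
    for p in positions:
        bits |= 1 << p
    return bits


def keep_block(bitmap_a, bitmap_b, op):
    """Return True if the block may contain a row matching `A op B`."""
    combined = bitmap_a & bitmap_b if op == "AND" else bitmap_a | bitmap_b
    return combined != 0


# Hypothetical block: predicate A matches rows 0, 2, 5; predicate B matches rows 1, 3.
a = bitmap_from_positions([0, 2, 5])
b = bitmap_from_positions([1, 3])
skip_for_and = not keep_block(a, b, "AND")  # no row satisfies both, so skip the block
read_for_or = keep_block(a, b, "OR")        # some row satisfies either, so read it
```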


> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] Updated: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment: HIVE-1803.5.patch

New patch that fixes issues raised. I added some comments/questions to the ReviewBoard.

Are there one or more tables that are already used for testing that have two columns that I can use for testing bitmap_and and bitmap_or?

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] Updated: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi updated HIVE-1803:
-----------------------------

    Attachment: JavaEWAH_20110304.zip

Uploading a .zip of the source for reference.

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] Commented: (HIVE-1803) Implement bitmap indexing in Hive

Posted by "John Sichi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992822#comment-12992822 ] 

John Sichi commented on HIVE-1803:
----------------------------------

Some review comments:

* Factor out the code duplicated from the compact index handler and share it in package org.apache.hadoop.hive.ql.index.  Use abstract classes where behavior needs to be overridden; otherwise, just share concrete classes there.
* If we're going to publish the new UDFs as visible out of the box (not just internal to the index implementation), then they need unit tests of their own, as well as some documentation about the representation on which they operate (it may be best to wait and see how compression shakes out first).  Also, any that turn out not to be generally applicable need more specific names.
* For dense bitmaps, I think you can probably use java.util.BitSet instead of rolling so much of your own (at least for the ones where you have control over the bit array representation).
* The name attribute in the annotation for GenericUDAFCollectBitmapSet is incorrect.
* In HiveIndex.java, the symbol should be just BITMAP_TABLE (not BITMAP_SUMMARY_TABLE) since the bitmap is actually quite detailed.
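
To make the java.util.BitSet suggestion in the review comments above concrete, here is a minimal sketch of dense bitmap operations built on it. Everything here is illustrative rather than taken from the patch: the class name DenseBitmapSketch, the method names, and the choice of long[] as the stored form are assumptions for the example. BitSet.valueOf(long[]), toLongArray(), and(), or(), and stream() are standard JDK methods.

```java
import java.util.BitSet;

// Hypothetical sketch: dense bitmaps stored as long[] words, with
// java.util.BitSet doing the bit manipulation instead of hand-rolled code.
final class DenseBitmapSketch {

    // Builds a bitmap with one set bit per matching row number.
    static long[] fromRows(int... rows) {
        BitSet bits = new BitSet();
        for (int row : rows) {
            bits.set(row);
        }
        return bits.toLongArray();
    }

    // Rows present in both bitmaps.
    static long[] and(long[] a, long[] b) {
        BitSet result = BitSet.valueOf(a); // valueOf copies, so inputs stay unchanged
        result.and(BitSet.valueOf(b));
        return result.toLongArray();
    }

    // Rows present in either bitmap.
    static long[] or(long[] a, long[] b) {
        BitSet result = BitSet.valueOf(a);
        result.or(BitSet.valueOf(b));
        return result.toLongArray();
    }

    // Decodes a bitmap back into ascending row numbers.
    static int[] rows(long[] words) {
        return BitSet.valueOf(words).stream().toArray();
    }
}
```

Note that toLongArray() drops trailing zero words, so an empty result comes back as a zero-length array. A stored format that needs fixed-width bitmaps would have to pad, which is one reason this approach only fits where the bit array representation is under your control.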


> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, bitmap_index_1.png, bitmap_index_2.png
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment: HIVE-1803.9.patch

Uploaded new patch that addresses John's comments on patch 8.

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.


[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

Posted by "Marquis Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Status: Patch Available  (was: Open)

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.patch
>
>
> Implement bitmap index handler to complement compact indexing.
