Posted to dev@hbase.apache.org by "Clint Morgan (JIRA)" <ji...@apache.org> on 2008/09/13 00:09:44 UTC

[jira] Created: (HBASE-883) Secondary Indexes

Secondary Indexes
-----------------

                 Key: HBASE-883
                 URL: https://issues.apache.org/jira/browse/HBASE-883
             Project: Hadoop HBase
          Issue Type: New Feature
          Components: client, regionserver
            Reporter: Clint Morgan
            Assignee: Clint Morgan


I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Re: [jira] Updated: (HBASE-883) Secondary Indexes

Posted by stack <st...@duboce.net>.
Ding, Hui wrote:
> So can this be checked out through trunk now? 
>   
Yes.  Let us know if it's not working for you.
St.Ack



RE: [jira] Updated: (HBASE-883) Secondary Indexes

Posted by Andrew Purtell <ap...@yahoo.com>.
> From: Ding, Hui <hu...@sap.com>
> Subject: RE: [jira] Updated: (HBASE-883) Secondary Indexes
> To: hbase-dev@hadoop.apache.org
> Date: Monday, November 17, 2008, 11:17 AM
>
> So can this be checked out through trunk now? 

Yes.

   - Andy



      

RE: [jira] Updated: (HBASE-883) Secondary Indexes

Posted by "Ding, Hui" <hu...@sap.com>.
So can this be checked out through trunk now? 

-----Original Message-----
From: Andrew Purtell (JIRA) [mailto:jira@apache.org] 
Sent: Monday, November 17, 2008 10:50 AM
To: hbase-dev@hadoop.apache.org
Subject: [jira] Updated: (HBASE-883) Secondary Indexes


     [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-883:
---------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed to TRUNK. Passed all local tests.

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Andrew Purtell
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



RE: secondary index using lucene WAS -> Re: [jira] Commented: (HBASE-883) Secondary Indexes

Posted by "Ding, Hui" <hu...@sap.com>.
Thank you Stack! 

-----Original Message-----
From: stack [mailto:stack@duboce.net] 
Sent: Friday, October 10, 2008 8:54 PM
To: hbase-dev@hadoop.apache.org
Subject: secondary index using lucene WAS -> Re: [jira] Commented:
(HBASE-883) Secondary Indexes

(I changed the subject so the folks who might be working on a secondary 
index using lucene might notice Ding Hui's question -- St.Ack)

Ding, Hui wrote:
> I am wondering what the status is of doing secondary indexing with
> Lucene.
> Is there any way to follow the progress?
>
> Thx! 
>
> -----Original Message-----
> From: stack (JIRA) [mailto:jira@apache.org] 
> Sent: Thursday, September 25, 2008 2:58 PM
> To: hbase-dev@hadoop.apache.org
> Subject: [jira] Commented: (HBASE-883) Secondary Indexes
>
>
>     [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634648#action_12634648 ]
>
> stack commented on HBASE-883:
> -----------------------------
>
> Hey Clint:
>
> At a high level, talk on the list has made mention of secondary indices done
> using Lucene rather than keeping a second table.  Because of this, one
> suggestion to consider -- non-binding, just a suggestion -- is that you
> might include the type of your secondary index -- i.e. table index --
> when naming classes, etc.  For example, IndexedRegionServer should
> perhaps become TableIndexedRegionServer (bit of a mouthful -- maybe you
> have a better name) or maybe better, just change your package name --
> use tableindexed or tindexed instead of indexed -- so it's clear how your
> secondary index is implemented.
>
> In HTD:
>
> + Is this inclusion intentional: '+  public static final String
> ROW_KEY_COMPARATOR = "ROW_KEY_COMPARATOR";'?  Is this a leak from another
> patch?  Same for HSK.java.... and setRowKeyComparator in HTD, etc.
> + You have to copy '+  static private byte[] format(final int number) {'
> from PerformanceEvaluation because it's inaccessible?  I'd suggest changing
> access on PE so you don't have to duplicate.
>
> ... more to follow
>
>
>   
>> Secondary Indexes
>> -----------------
>>
>>                 Key: HBASE-883
>>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>>             Project: Hadoop HBase
>>          Issue Type: New Feature
>>          Components: client, regionserver
>>            Reporter: Clint Morgan
>>            Assignee: Clint Morgan
>>         Attachments: hbase-883.patch
>>
>>
>> I'm working on a secondary index impl. The basic idea is to maintain a
>> separate table per index.
>
>   


secondary index using lucene WAS -> Re: [jira] Commented: (HBASE-883) Secondary Indexes

Posted by stack <st...@duboce.net>.
(I changed the subject so the folks who might be working on a secondary 
index using lucene might notice Ding Hui's question -- St.Ack)

Ding, Hui wrote:
> I am wondering what the status is of doing secondary indexing with
> Lucene.
> Is there any way to follow the progress?
>
> Thx! 
>
> -----Original Message-----
> From: stack (JIRA) [mailto:jira@apache.org] 
> Sent: Thursday, September 25, 2008 2:58 PM
> To: hbase-dev@hadoop.apache.org
> Subject: [jira] Commented: (HBASE-883) Secondary Indexes
>
>
>     [
> https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634648#action_12634648 ]
>
> stack commented on HBASE-883:
> -----------------------------
>
> Hey Clint:
>
> At a high level, talk on the list has made mention of secondary indices done
> using Lucene rather than keeping a second table.  Because of this, one
> suggestion to consider -- non-binding, just a suggestion -- is that you
> might include the type of your secondary index -- i.e. table index --
> when naming classes, etc.  For example, IndexedRegionServer should
> perhaps become TableIndexedRegionServer (bit of a mouthful -- maybe you
> have a better name) or maybe better, just change your package name --
> use tableindexed or tindexed instead of indexed -- so it's clear how your
> secondary index is implemented.
>
> In HTD:
>
> + Is this inclusion intentional: '+  public static final String
> ROW_KEY_COMPARATOR = "ROW_KEY_COMPARATOR";'?  Is this a leak from another
> patch?  Same for HSK.java.... and setRowKeyComparator in HTD, etc.
> + You have to copy '+  static private byte[] format(final int number) {'
> from PerformanceEvaluation because it's inaccessible?  I'd suggest changing
> access on PE so you don't have to duplicate.
>
> ... more to follow
>
>
>   
>> Secondary Indexes
>> -----------------
>>
>>                 Key: HBASE-883
>>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>>             Project: Hadoop HBase
>>          Issue Type: New Feature
>>          Components: client, regionserver
>>            Reporter: Clint Morgan
>>            Assignee: Clint Morgan
>>         Attachments: hbase-883.patch
>>
>>
>> I'm working on a secondary index impl. The basic idea is to maintain a
>> separate table per index.
>
>   


RE: [jira] Commented: (HBASE-883) Secondary Indexes

Posted by "Ding, Hui" <hu...@sap.com>.
I am wondering what the status is of doing secondary indexing with
Lucene.
Is there any way to follow the progress?

Thx! 

-----Original Message-----
From: stack (JIRA) [mailto:jira@apache.org] 
Sent: Thursday, September 25, 2008 2:58 PM
To: hbase-dev@hadoop.apache.org
Subject: [jira] Commented: (HBASE-883) Secondary Indexes


    [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634648#action_12634648 ]

stack commented on HBASE-883:
-----------------------------

Hey Clint:

At a high level, talk on the list has made mention of secondary indices done
using Lucene rather than keeping a second table.  Because of this, one
suggestion to consider -- non-binding, just a suggestion -- is that you
might include the type of your secondary index -- i.e. table index --
when naming classes, etc.  For example, IndexedRegionServer should
perhaps become TableIndexedRegionServer (bit of a mouthful -- maybe you
have a better name) or maybe better, just change your package name --
use tableindexed or tindexed instead of indexed -- so it's clear how your
secondary index is implemented.

In HTD:

+ Is this inclusion intentional: '+  public static final String
ROW_KEY_COMPARATOR = "ROW_KEY_COMPARATOR";'?  Is this a leak from another
patch?  Same for HSK.java.... and setRowKeyComparator in HTD, etc.
+ You have to copy '+  static private byte[] format(final int number) {'
from PerformanceEvaluation because it's inaccessible?  I'd suggest changing
access on PE so you don't have to duplicate.

... more to follow


> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>         Attachments: hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Commented: (HBASE-883) Secondary Indexes

Posted by "Clint Morgan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12643372#action_12643372 ] 

Clint Morgan commented on HBASE-883:
------------------------------------

>> Secondary indexes run on top of a transactional hbase? That's needed because insert into secondary is transactional with primary table?

Yeah, that's correct. I want transactional behavior to happen first; then indexes get updated iff a transaction is committed.
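
The ordering described here can be pictured with a toy, in-memory model. This is only a sketch under stated assumptions: CommitThenIndexSketch, its two maps, and the put/commit/abort methods are hypothetical names for illustration, not the classes in the patch. Writes are buffered, and the index table is touched only when commit() runs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model (hypothetical): index entries are written iff the transaction commits.
public class CommitThenIndexSketch {
  // Stand-in for the base table: row key -> column value.
  final Map<String, String> baseTable = new HashMap<>();
  // Stand-in for the index table: column value -> row key.
  final Map<String, String> indexTable = new HashMap<>();
  // Writes buffered by the open transaction as {row, value} pairs.
  private final List<String[]> pending = new ArrayList<>();

  // Buffer a write; neither table changes yet.
  void put(String row, String value) {
    pending.add(new String[] {row, value});
  }

  // Apply buffered writes to the base table and, together with them, to the index.
  void commit() {
    for (String[] write : pending) {
      baseTable.put(write[0], write[1]);
      indexTable.put(write[1], write[0]);
    }
    pending.clear();
  }

  // Drop buffered writes; the index never sees them.
  void abort() {
    pending.clear();
  }
}
```

An aborted transaction leaves both maps untouched, which is the "iff committed" property in miniature.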

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-883) Secondary Indexes

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12643744#action_12643744 ] 

stack commented on HBASE-883:
-----------------------------

Clint:

This won't work, right:

{code}
+    } 
+    if (regionInfo != null && regionInfo.getTableDesc().getRowKeyComparator() != null) {
+      return regionInfo.getTableDesc().getRowKeyComparator().compare(rowA, rowB);
     }
{code}

i.e. a table-specific comparator. Since .META. and -ROOT- use a different comparator, there is going to be a disagreement someday.

Remove it from above and from HTableDescriptor?

Minor: The below should be wrapped in an if (LOG.isDebugEnabled()) check:

{code}
+    LOG.debug("Index [" + indexSpec.getIndexId() + "] adding new entry ["
+        + Bytes.toString(indexUpdate.getRow()) + "] for row ["
+        + Bytes.toString(row) + "]");
{code}

Otherwise you're paying for string creation and then just throwing it all away when running at INFO level.
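
To make the cost concrete, here is a minimal sketch of the guard. It assumes java.util.logging as a stand-in for the logging API used in the patch (so isLoggable(Level.FINE) plays the role of isDebugEnabled()); the class name, the describe helper, and the messagesBuilt counter are hypothetical and exist only to show when the string gets built.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch (hypothetical names): guarded vs. unguarded debug logging.
public class GuardedDebugLogging {
  static final Logger LOG = Logger.getLogger(GuardedDebugLogging.class.getName());

  // Counts how often the message string was actually built.
  static int messagesBuilt = 0;

  // Builds the debug message; the concatenation is the cost we want to avoid.
  static String describe(String indexId, String indexRow, String row) {
    messagesBuilt++;
    return "Index [" + indexId + "] adding new entry [" + indexRow
        + "] for row [" + row + "]";
  }

  // Unguarded: the argument is evaluated before the logger checks its level.
  static void logUnguarded(String indexId, String indexRow, String row) {
    LOG.fine(describe(indexId, indexRow, row));
  }

  // Guarded: the message is built only if debug-level output is enabled.
  static void logGuarded(String indexId, String indexRow, String row) {
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine(describe(indexId, indexRow, row));
    }
  }

  public static void main(String[] args) {
    LOG.setLevel(Level.INFO);
    logUnguarded("idx", "key", "row");
    logGuarded("idx", "key", "row");
    System.out.println("messages built: " + messagesBuilt); // prints "messages built: 1"
  }
}
```

At INFO level only the unguarded call pays for the concatenation; neither call emits a log line.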

Minor: Can you not do putAll below?

{code}
+    for(IndexSpecification index : indexes) {
+      this.indexes.put(index.getIndexId(), index);
+    }
{code}
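
Whether putAll applies depends on the type of `indexes`, which the snippet does not show. A rough sketch of the two cases, with IndexSpec as a hypothetical stand-in for IndexSpecification: if the source is a plain collection, each entry has to be keyed by its id, so some loop remains; if the source is already a map keyed by id, putAll copies it in one call.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch (hypothetical names): when putAll can replace the explicit loop.
public class IndexMapSketch {
  // Minimal stand-in for the patch's IndexSpecification.
  static final class IndexSpec {
    private final String indexId;

    IndexSpec(String indexId) {
      this.indexId = indexId;
    }

    String getIndexId() {
      return indexId;
    }
  }

  // Source is a collection: each spec must be keyed explicitly.
  static Map<String, IndexSpec> fromCollection(Iterable<IndexSpec> specs) {
    Map<String, IndexSpec> byId = new HashMap<>();
    for (IndexSpec spec : specs) {
      byId.put(spec.getIndexId(), spec);
    }
    return byId;
  }

  // Source is already keyed by id: putAll does the copy in one call.
  static Map<String, IndexSpec> fromMap(Map<String, IndexSpec> specsById) {
    Map<String, IndexSpec> byId = new HashMap<>();
    byId.putAll(specsById);
    return byId;
  }
}
```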

Minor: Missing javadoc on this class
{code}
+public class ReverseByteArrayComparator implements WritableComparator<byte[]> {

{code}

Minor: All data members in IndexSpecification look like they should be final; IndexSpecification looks immutable.

Minor: When in hbase-land, it's better to use HbaseObjectWritable rather than ObjectWritable.  You'll get a little performance boost.  You'll likely have to add new codes for your new classes; see head of HbaseObjectWritable.

On the above, the minors are not important.  Just FYI.  I do think that removing the comparator is important because it could propagate the wrong impression -- that a table-specific comparator is a possibility (unless you know something I don't).

Thanks Clint.


> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Updated: (HBASE-883) Secondary Indexes

Posted by "Clint Morgan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clint Morgan updated HBASE-883:
-------------------------------

    Attachment: hbase-883.patch

Latest version, fixed bug related to index maintenance, added a minimal package.html doc

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Commented: (HBASE-883) Secondary Indexes

Posted by "Clint Morgan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630694#action_12630694 ] 

Clint Morgan commented on HBASE-883:
------------------------------------

I'm on vacation next week, but plan to wrap this up the week after.

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>         Attachments: hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Updated: (HBASE-883) Secondary Indexes

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-883:
------------------------

    Attachment: secondary.patch

Andrew, I want to back out the stuff around comparators.  I made a comment above but the committed patch still had them in there.  They don't work, is my understanding, and as is, they are responsible for about 18% of the allocations on jgray's HRS.

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Andrew Purtell
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, secondary.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Resolved: (HBASE-883) Secondary Indexes

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-883.
-------------------------

    Resolution: Fixed

Backed out custom comparators; resolving issue again.

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Andrew Purtell
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, secondary.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Commented: (HBASE-883) Secondary Indexes

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634648#action_12634648 ] 

stack commented on HBASE-883:
-----------------------------

Hey Clint:

At a high level, talk on the list has made mention of secondary indices done using Lucene rather than keeping a second table.  Because of this, one suggestion to consider -- non-binding, just a suggestion -- is that you might include the type of your secondary index -- i.e. table index -- when naming classes, etc.  For example, IndexedRegionServer should perhaps become TableIndexedRegionServer (bit of a mouthful -- maybe you have a better name) or maybe better, just change your package name -- use tableindexed or tindexed instead of indexed -- so it's clear how your secondary index is implemented.

In HTD:

+ Is this inclusion intentional: '+  public static final String ROW_KEY_COMPARATOR = "ROW_KEY_COMPARATOR";'?  Is this a leak from another patch?  Same for HSK.java.... and setRowKeyComparator in HTD, etc.
+ You have to copy '+  static private byte[] format(final int number) {' from PerformanceEvaluation because it's inaccessible?  I'd suggest changing access on PE so you don't have to duplicate.

... more to follow


> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>         Attachments: hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Commented: (HBASE-883) Secondary Indexes

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644413#action_12644413 ] 

Andrew Purtell commented on HBASE-883:
--------------------------------------

We have a maintenance window this afternoon. I'll take the opportunity to set this up for indefinite use on our "big" 25 node cluster. There's one thing we do in particular where I'd like to change the result ordering of a scanner using a secondary index. Will report back if there are any problems.


> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Commented: (HBASE-883) Secondary Indexes

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638787#action_12638787 ] 

Andrew Purtell commented on HBASE-883:
--------------------------------------

Hi Clint. Does it have to be that transaction classes inherit from index classes? Is it possible to have transactions without indexes? Wouldn't having the index classes inherit from the transactional ones accomplish the same thing? Or am I missing something?

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Updated: (HBASE-883) Secondary Indexes

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-883:
---------------------------------

    Fix Version/s:     (was: 0.20.0)
                   0.19.0
           Status: Patch Available  (was: Open)

I can walk this through. 

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Andrew Purtell
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Commented: (HBASE-883) Secondary Indexes

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648849#action_12648849 ] 

stack commented on HBASE-883:
-----------------------------

Thanks for checking in Clint.  Thanks for fixing Andrew.

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Andrew Purtell
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, secondary.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Updated: (HBASE-883) Secondary Indexes

Posted by "Clint Morgan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clint Morgan updated HBASE-883:
-------------------------------

    Attachment: hbase-883.patch

In-progress, incomplete version with a working test. The test should be enough to communicate the API and semantics. Feedback welcome.

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>         Attachments: hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Updated: (HBASE-883) Secondary Indexes

Posted by "Clint Morgan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clint Morgan updated HBASE-883:
-------------------------------

    Attachment: hbase-883.patch


Thanks for the review, responded to most of it:

The stuff dealing with RowKeyComparators is my half-arse attempt at
661. It seems to work, but I have not tried it when regions start to
split, which I think is where the issues will arise.

I need RowKeyComparators because some indexes (those that go in
reverse order, or that deserialize the index keys (column values from
the base table) to order by, e.g., long values) need to sort
non-lexicographically.
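
A small sketch of why plain byte ordering is not enough for these cases. Everything here is illustrative and assumed rather than taken from the patch: the class name, the three comparators, and the choice of 8-byte big-endian encoding for long values. Unsigned lexicographic comparison sorts the bytes of -1 after the bytes of 1, and it cannot produce a descending scan, so a different comparator is needed for each.

```java
import java.nio.ByteBuffer;
import java.util.Comparator;

// Sketch (hypothetical names): three orderings over byte[] keys.
public class RowKeyOrderingSketch {
  // Encode a long as 8 big-endian bytes (an assumed key encoding).
  static byte[] toBytes(long value) {
    return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
  }

  // Default ordering: unsigned, byte-by-byte, shorter key first on a tie.
  static final Comparator<byte[]> LEXICOGRAPHIC = (a, b) -> {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int cmp = (a[i] & 0xff) - (b[i] & 0xff);
      if (cmp != 0) {
        return cmp;
      }
    }
    return a.length - b.length;
  };

  // Reverse ordering, for indexes scanned in descending order.
  static final Comparator<byte[]> REVERSE = LEXICOGRAPHIC.reversed();

  // Value ordering: deserialize each key to a long, then compare numerically.
  static final Comparator<byte[]> AS_LONG = (a, b) ->
      Long.compare(ByteBuffer.wrap(a).getLong(), ByteBuffer.wrap(b).getLong());
}
```

With these definitions, LEXICOGRAPHIC places toBytes(-1) after toBytes(1), while AS_LONG places it before.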

This WritableComparator is different from the one in Hadoop. Mine just brings the
two interfaces together, while the Hadoop version is a general class
for comparing byte[]s.

Made PerfEval.format public.

Changed package names to tableindexed

Regarding including the index specs in HTD, it made sense to me. I see
the pollution concern, but indexes seem like a fundamental part of the
table's metadata. I can rework this in a subsequent patch if need be.

This version has transactional tables/regions inherit from the
indexed tables/regions. This will allow the index updates to have
proper transactional behavior (when updated as part of a transaction).

I've tested this in my object-datastore layer above hbase, and it passes some basic
tests. There is a fair amount of logic that goes into the
RowKeyGenerator, as it must carefully build the index keys to get the
desired ordering. I'm building my index keys prefixed with the same
prefix used in the original keys to maintain "sharding". This
way I can do a quick index scan for data from a given domain (e.g., for
a specific customer).
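
One way to picture the prefixed index keys, as a sketch only. The key layout (prefix, separator, indexed value, separator, base row), the zero separator byte, and all names below are assumptions for illustration; the actual RowKeyGenerator in the patch may build keys differently.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Sketch (hypothetical layout): index keys that keep the base table's shard prefix.
public class PrefixedIndexKeySketch {
  // Assumed separator between key parts.
  private static final byte SEP = 0;

  // Build <shardPrefix> SEP <indexedValue> SEP <baseRow>.
  static byte[] buildIndexKey(byte[] shardPrefix, byte[] indexedValue, byte[] baseRow) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    out.write(shardPrefix, 0, shardPrefix.length);
    out.write(SEP);
    out.write(indexedValue, 0, indexedValue.length);
    out.write(SEP);
    out.write(baseRow, 0, baseRow.length);
    return out.toByteArray();
  }

  // True if the key falls inside the scan range for the given prefix.
  static boolean hasPrefix(byte[] key, byte[] prefix) {
    return key.length >= prefix.length
        && Arrays.equals(Arrays.copyOf(key, prefix.length), prefix);
  }

  // Convenience for string inputs.
  static byte[] utf8(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }
}
```

Because every index key for a customer starts with that customer's prefix, an index scan bounded by the prefix stays within that customer's data and, within it, is ordered by the indexed value.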

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>         Attachments: hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Updated: (HBASE-883) Secondary Indexes

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-883:
---------------------------------

    Attachment: hbase-883.patch

Attached updated patch that addresses all of my comments. I've been running this privately for some time without problems.

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Commented: (HBASE-883) Secondary Indexes

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648217#action_12648217 ] 

stack commented on HBASE-883:
-----------------------------

Want to commit it to TRUNK then Andrew?
St.Ack

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Andrew Purtell
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Updated: (HBASE-883) Secondary Indexes

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-883:
---------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed to TRUNK. Passed all local tests.

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Andrew Purtell
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Reopened: (HBASE-883) Secondary Indexes

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reopened HBASE-883:
-------------------------


Reopening until I back out comparators.

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Andrew Purtell
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, secondary.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Commented: (HBASE-883) Secondary Indexes

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648820#action_12648820 ] 

Andrew Purtell commented on HBASE-883:
--------------------------------------

+1


> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Andrew Purtell
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, secondary.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Updated: (HBASE-883) Secondary Indexes

Posted by "Clint Morgan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clint Morgan updated HBASE-883:
-------------------------------

    Attachment: hbase-883.patch

There was an index maintenance bug introduced in the last patch; this fixes it.

(IndexedRegion:123) oldEntry.getValue() -> oldEntry.getKey()

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Reopened: (HBASE-883) Secondary Indexes

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell reopened HBASE-883:
----------------------------------


Third time will be the charm. We took out the custom comparators because we can see them acting as a tax on the rest of HRS: 5-10% CPU, 20% of all object allocation (on a jgray server). But the custom comparators are used to reverse a test to provide descending sort orders. For now we'll degrade the secondary index capability somewhat by fixing it up to work without the custom comparators, hence no descending sort ordering. 

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Andrew Purtell
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, secondary.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Issue Comment Edited: (HBASE-883) Secondary Indexes

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644456#action_12644456 ] 

apurtell edited comment on HBASE-883 at 10/31/08 4:28 PM:
----------------------------------------------------------------

o.a.h.h.client.tableindexed.IndexScanner.ScannerWrapper needs next(int).

A suggestion:

{code}

    /** {@inheritDoc} */
    public RowResult next() throws IOException {
      RowResult[] result = next(1);
      if (result == null || result.length < 1) {
        return null;
      }
      return result[0];
    }

    /** {@inheritDoc} */
    public RowResult[] next(int nbRows) throws IOException {
      RowResult[] indexResult = indexScanner.next(nbRows);
      if (indexResult == null) {
        return null;
      }
      RowResult[] result = new RowResult[indexResult.length];
      for (int i = 0; i < indexResult.length; i++) {
        RowResult row = indexResult[i];
        byte[] baseRow = row.get(INDEX_BASE_ROW_COLUMN).getValue();
        LOG.debug("next index row [" + Bytes.toString(row.getRow())
            + "] -> base row [" + Bytes.toString(baseRow) + "]");
        HbaseMapWritable<byte[], Cell> colValues =
          new HbaseMapWritable<byte[], Cell>();
        if (columns != null && columns.length > 0) {
          LOG.debug("Going to base table for remaining columns");
          RowResult baseResult = IndexedTable.this.getRow(baseRow, columns);
          colValues.putAll(baseResult);
        }
        for (Entry<byte[], Cell> entry : row.entrySet()) {
          byte[] col = entry.getKey();
          if (HStoreKey.matchingFamily(INDEX_COL_FAMILY_NAME, col)) {
            continue;
          }
          colValues.put(col, entry.getValue());
        }
        result[i] = new RowResult(baseRow, colValues);
      }
      return result;
    }

{code}
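
For readers who want the lookup pattern from the suggestion above in isolation (scan the index table in index-key order, follow each index row's pointer back to the base row, then fetch the columns from the base table), here is a minimal in-memory sketch. It does not use the HBase API; `IndexLookupSketch`, `indexKey`, and `scanByIndex` are hypothetical names, and plain sorted maps stand in for the base table and the per-index table.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** In-memory stand-in for a base table plus one index table (hypothetical names). */
final class IndexLookupSketch {
    // Base table: row key -> (column -> value).
    final Map<String, Map<String, String>> base = new TreeMap<>();
    // Index table: (indexed value + base row key) -> base row key.
    final TreeMap<String, String> index = new TreeMap<>();
    final String indexedColumn;

    IndexLookupSketch(String indexedColumn) {
        this.indexedColumn = indexedColumn;
    }

    /** Index row key; the base row key is appended so equal values do not collide. */
    static String indexKey(String value, String row) {
        return value + "\u0000" + row;
    }

    /** Writes to the base table and keeps the index table in step. */
    void put(String row, String column, String value) {
        Map<String, String> cols = base.computeIfAbsent(row, r -> new TreeMap<>());
        String old = cols.put(column, value);
        if (column.equals(indexedColumn)) {
            if (old != null) {
                index.remove(indexKey(old, row)); // drop the stale index row
            }
            index.put(indexKey(value, row), row);
        }
    }

    /** Scans the index in key order and resolves each entry to its base row. */
    Map<String, Map<String, String>> scanByIndex() {
        Map<String, Map<String, String>> out = new LinkedHashMap<>();
        for (String baseRow : index.values()) {
            out.put(baseRow, base.get(baseRow));
        }
        return out;
    }
}
```

The sketch sorts by the string form of the value, so it only illustrates the two-step lookup, not how the patch encodes index keys.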

      was (Author: apurtell):
    o.a.h.h.client.tableindexed.IndexScanner.ScannerWrapper needs next(int).

A suggestion:

{code}

    /** {@inheritDoc} */
    public RowResult next() throws IOException {
        RowResult[] result = next(1);
        if (result == null)
          return null;
        return result[0];
    }

    /** {@inheritDoc} */
    public RowResult[] next(int nbRows) throws IOException {
      RowResult[] indexResult = indexScanner.next(nbRows);
      if (indexResult == null) {
        return null;
      }
      RowResult[] result = new RowResult[indexResult.length];
      for (int i = 0; i < indexResult.length; i++) {
        RowResult row = indexResult[i];
        byte[] baseRow = row.get(INDEX_BASE_ROW_COLUMN).getValue();
        LOG.debug("next index row [" + Bytes.toString(row.getRow())
            + "] -> base row [" + Bytes.toString(baseRow) + "]");
        HbaseMapWritable<byte[], Cell> colValues =
          new HbaseMapWritable<byte[], Cell>();
        if (columns != null && columns.length > 0) {
          LOG.debug("Going to base table for remaining columns");
          RowResult baseResult = IndexedTable.this.getRow(baseRow, columns);
          colValues.putAll(baseResult);
        }
        for (Entry<byte[], Cell> entry : row.entrySet()) {
          byte[] col = entry.getKey();
          if (HStoreKey.matchingFamily(INDEX_COL_FAMILY_NAME, col)) {
            continue;
          }
          colValues.put(col, entry.getValue());
        }
        result[i] = new RowResult(baseRow, colValues);
      }
      return result;
    }

{code}
  
> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Resolved: (HBASE-883) Secondary Indexes

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell resolved HBASE-883.
----------------------------------

    Resolution: Fixed

Committed fixes as described in previous comment to trunk. Passes all local tests.

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Andrew Purtell
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, secondary.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Commented: (HBASE-883) Secondary Indexes

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644462#action_12644462 ] 

Andrew Purtell commented on HBASE-883:
--------------------------------------

Another suggestion is to remove the @inheritDoc tag at o.a.h.h.client.tableindexed.IndexedTable.java:56 to fix a javadoc warning. 

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Issue Comment Edited: (HBASE-883) Secondary Indexes

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648828#action_12648828 ] 

apurtell edited comment on HBASE-883 at 11/18/08 4:16 PM:
----------------------------------------------------------------

Third time will be the charm. We took out the custom comparators because we can see them acting as a tax on the rest of HRS: 5-10% CPU, 20% of all object allocation (on a jgray server). But the custom comparators are used to reverse a test to provide descending sort orders. For now we'll degrade the secondary index capability somewhat by fixing it up to work without the custom comparators, hence no descending sort ordering. 
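
To illustrate what "reverse a test to provide descending sort orders" means, here is a minimal sketch of a reversed byte-array comparator. It is not the comparator from the patch; `DescendingKeys`, `compareBytes`, and `firstKeyDescending` are hypothetical names, and the sketch assumes keys compare as unsigned bytes in lexicographic order.

```java
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.TreeMap;

/** Hypothetical helper, not a class from the patch. */
final class DescendingKeys {
    /** Ascending test: unsigned lexicographic comparison of two byte arrays. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    /** Descending order: the same test with its arguments swapped. */
    static final Comparator<byte[]> DESCENDING = (a, b) -> compareBytes(b, a);

    /** Returns the first key of a map sorted with the reversed comparator. */
    static String firstKeyDescending(String... keys) {
        TreeMap<byte[], Boolean> sorted = new TreeMap<>(DESCENDING);
        for (String k : keys) {
            sorted.put(k.getBytes(StandardCharsets.UTF_8), Boolean.TRUE);
        }
        return new String(sorted.firstKey(), StandardCharsets.UTF_8);
    }
}
```

Without a pluggable comparator, rows can only come back in the ascending order of the first method, which is the capability being given up here.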

      was (Author: apurtell):
    Third time will be the charm. We took out the custom comparators because we can see them acting as a tax on the rest of HRS: 5-10% CPU, 20% of all object allocation (on a jgray server). But the custom comparators are used to reverse a test to provide descending sort orders. For now we'll degrade the secondary index capability somewhat by fixing it up to work with the custom comparators hence no descending sort ordering. 
  
> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Andrew Purtell
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, secondary.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Commented: (HBASE-883) Secondary Indexes

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12643131#action_12643131 ] 

stack commented on HBASE-883:
-----------------------------

Clint: Secondary indexes run on top of a transactional hbase?  That's needed because insert into secondary is transactional with primary table?  This patch makes some changes to a few core classes, but only a few, and they look like they do not require migration scripts because the classes migrate themselves.  I like the package.html documentation.

Has anyone else tried Clint's patch?  Does it work for them?

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Updated: (HBASE-883) Secondary Indexes

Posted by "Clint Morgan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clint Morgan updated HBASE-883:
-------------------------------

    Attachment: hbase-883.patch

Latest version. Completes the support for deletes, fixes a spelling mistake.

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Commented: (HBASE-883) Secondary Indexes

Posted by "Clint Morgan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648837#action_12648837 ] 

Clint Morgan commented on HBASE-883:
------------------------------------

Thanks for pushing this through guys, I've been working on some other things. Sorry for lack of response...

WRT the custom comparator "tax": was this with the latest version of the patch? Do you pay the tax even when no custom comparator is set? I noticed a similar issue in an older version of the patch and fixed it (in the 17/Oct/08 patch).

But anyway, that impl of custom comparators does not work due to META issues that stack mentioned above...

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Andrew Purtell
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, secondary.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Commented: (HBASE-883) Secondary Indexes

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644456#action_12644456 ] 

Andrew Purtell commented on HBASE-883:
--------------------------------------

o.a.h.h.client.tableindexed.IndexScanner.ScannerWrapper needs next(int).

A suggestion:

{code}

    /** {@inheritDoc} */
    public RowResult next() throws IOException {
        RowResult[] result = next(1);
        if (result == null)
          return null;
        return result[0];
    }

    /** {@inheritDoc} */
    public RowResult[] next(int nbRows) throws IOException {
      RowResult[] indexResult = indexScanner.next(nbRows);
      if (indexResult == null) {
        return null;
      }
      RowResult[] result = new RowResult[indexResult.length];
      for (int i = 0; i < indexResult.length; i++) {
        RowResult row = indexResult[i];
        byte[] baseRow = row.get(INDEX_BASE_ROW_COLUMN).getValue();
        LOG.debug("next index row [" + Bytes.toString(row.getRow())
            + "] -> base row [" + Bytes.toString(baseRow) + "]");
        HbaseMapWritable<byte[], Cell> colValues =
          new HbaseMapWritable<byte[], Cell>();
        if (columns != null && columns.length > 0) {
          LOG.debug("Going to base table for remaining columns");
          RowResult baseResult = IndexedTable.this.getRow(baseRow, columns);
          colValues.putAll(baseResult);
        }
        for (Entry<byte[], Cell> entry : row.entrySet()) {
          byte[] col = entry.getKey();
          if (HStoreKey.matchingFamily(INDEX_COL_FAMILY_NAME, col)) {
            continue;
          }
          colValues.put(col, entry.getValue());
        }
        result[i] = new RowResult(baseRow, colValues);
      }
      return result;
    }

{code}

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Commented: (HBASE-883) Secondary Indexes

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634700#action_12634700 ] 

stack commented on HBASE-883:
-----------------------------

+ This looks like it could soon become annoying '+        LOG.info("Index [" + indexSpec.getIndexId()'... should be DEBUG-level at least.
+ How can we make it so you don't have to do this to HTD:

{code}
+  private final Map<String, IndexSpecification> indexes =
+    new HashMap<String, IndexSpecification>();
{code}

It pollutes HTD with indices.  Should we add a map that subclasses such as this Indexer can stuff things like indexes into?  Rather than have each specialization add to HTD?
+ More pollution from another patch: WritableComparator.java... and besides, ain't this up in hadoop?  And ReverseByteArrayComparator.  (These new classes are lacking licenses).
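
One way to read the "map that subclasses can stuff things into" idea is a generic key/value map on the descriptor, with the index-specific code living outside it. The sketch below is only an illustration under that assumption; `TableDescriptorSketch`, `IndexMetadata`, and the `index.` key prefix are hypothetical and are not HTD's actual API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a table descriptor with a generic metadata map. */
final class TableDescriptorSketch {
    private final Map<String, String> values = new TreeMap<>();

    void setValue(String key, String value) {
        values.put(key, value);
    }

    String getValue(String key) {
        return values.get(key);
    }

    Map<String, String> getValues() {
        return Collections.unmodifiableMap(values);
    }
}

/** Index-specific code stays out of the descriptor and only uses the generic map. */
final class IndexMetadata {
    static final String PREFIX = "index."; // assumed key prefix

    static void addIndex(TableDescriptorSketch htd, String indexId, String indexedColumn) {
        htd.setValue(PREFIX + indexId, indexedColumn);
    }

    /** Lists the ids of all indexes recorded on the descriptor, in key order. */
    static List<String> indexIds(TableDescriptorSketch htd) {
        List<String> ids = new ArrayList<>();
        for (String key : htd.getValues().keySet()) {
            if (key.startsWith(PREFIX)) {
                ids.add(key.substring(PREFIX.length()));
            }
        }
        return ids;
    }
}
```

With this shape, another specialization could keep its own settings under a different prefix without adding fields to the descriptor class.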

I didn't try it but patch looks generally good to me. 

One thought is that this fancy feature really should be an option on default hbase.  Creating your table, it'd be an option spec'ing secondary indices.  But that can wait.  Let's do it this way, as a subclass, first.  If there's demand, we can move it back into core hbase.




> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>         Attachments: hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Updated: (HBASE-883) Secondary Indexes

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-883:
------------------------

    Fix Version/s: 0.19.0

Let's add this to 0.19.0.

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Updated: (HBASE-883) Secondary Indexes

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-883:
------------------------

    Fix Version/s:     (was: 0.19.0)
                   0.20.0

Moving out of 0.19 unless Clint shows up soon addressing Andrew's feedback.

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>             Fix For: 0.20.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Assigned: (HBASE-883) Secondary Indexes

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell reassigned HBASE-883:
------------------------------------

    Assignee: Andrew Purtell  (was: Clint Morgan)

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Andrew Purtell
>             Fix For: 0.19.0
>
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.



[jira] Updated: (HBASE-883) Secondary Indexes

Posted by "Clint Morgan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clint Morgan updated HBASE-883:
-------------------------------

    Attachment: hbase-883.patch

Andrew: you're right that I don't need to have the transactional stuff inherit from the indexed stuff. My intuition to do this came from the need to have the transactional logic happen first, and then have the indexing stuff happen only once a transaction is successfully committed. However, I can make this work with the normal polymorphic mechanism...

This patch does that, and also addresses a couple of performance issues I found with profiling.
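
As a rough illustration of the ordering described above (transactional logic first, index maintenance only after a successful commit, wired up through an ordinary method override), here is a minimal in-memory sketch. The class names `TransactionalStoreSketch` and `IndexedTransactionalStoreSketch` are hypothetical and are not the classes in the patch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical transactional store: writes are buffered until commit. */
class TransactionalStoreSketch {
    protected final Map<String, String> committed = new TreeMap<>();
    private final Map<Long, Map<String, String>> pending = new HashMap<>();

    void put(long txId, String row, String value) {
        pending.computeIfAbsent(txId, t -> new LinkedHashMap<>()).put(row, value);
    }

    void abort(long txId) {
        pending.remove(txId);
    }

    /** Applies the buffered writes; returns them, or an empty map for an unknown transaction. */
    Map<String, String> commit(long txId) {
        Map<String, String> writes = pending.remove(txId);
        if (writes == null) {
            return Collections.emptyMap();
        }
        committed.putAll(writes);
        return writes;
    }
}

/** The indexed variant extends the transactional one, not the other way round. */
class IndexedTransactionalStoreSketch extends TransactionalStoreSketch {
    // Index: value -> row. Stale entries for overwritten rows are not removed in this sketch.
    final TreeMap<String, String> index = new TreeMap<>();

    @Override
    Map<String, String> commit(long txId) {
        Map<String, String> writes = super.commit(txId); // transactional logic first
        for (Map.Entry<String, String> e : writes.entrySet()) {
            index.put(e.getValue(), e.getKey()); // index only what actually committed
        }
        return writes;
    }
}
```

Aborted or unknown transactions return no writes from `super.commit`, so the index is never touched for them.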

> Secondary Indexes
> -----------------
>
>                 Key: HBASE-883
>                 URL: https://issues.apache.org/jira/browse/HBASE-883
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Clint Morgan
>            Assignee: Clint Morgan
>         Attachments: hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch, hbase-883.patch
>
>
> I'm working on a secondary index impl. The basic idea is to maintain a separate table per index.
