Posted to dev@hbase.apache.org by "Michael Bieniosek (JIRA)" <ji...@apache.org> on 2008/08/27 22:37:44 UTC

[jira] Created: (HBASE-847) new API: HTable.getRow with numVersion specified

new API: HTable.getRow with numVersion specified
------------------------------------------------

                 Key: HBASE-847
                 URL: https://issues.apache.org/jira/browse/HBASE-847
             Project: Hadoop HBase
          Issue Type: New Feature
          Components: client
    Affects Versions: 0.2.0
            Reporter: Michael Bieniosek


I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HBASE-847) new API: HTable.getRow with numVersion specified

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman reassigned HBASE-847:
-----------------------------------

    Assignee: Doğacan Güney  (was: Jim Kellerman)

> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
>                 Key: HBASE-847
>                 URL: https://issues.apache.org/jira/browse/HBASE-847
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 0.2.0
>            Reporter: Michael Bieniosek
>            Assignee: Doğacan Güney
>             Fix For: 0.19.0
>
>         Attachments: HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.



[jira] Commented: (HBASE-847) new API: HTable.getRow with numVersion specified

Posted by "Doğacan Güney (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652558#action_12652558 ] 

Doğacan Güney commented on HBASE-847:
-------------------------------------

Thanks for the comments, stack.

I am OK with this issue (or HBASE-44 etc.) being fixed in 0.19 or 0.20.

> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
>                 Key: HBASE-847
>                 URL: https://issues.apache.org/jira/browse/HBASE-847
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 0.2.0
>            Reporter: Michael Bieniosek
>            Assignee: Doğacan Güney
>            Priority: Critical
>             Fix For: 0.19.0
>
>         Attachments: HBASE-847_v2.patch, HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.



[jira] Commented: (HBASE-847) new API: HTable.getRow with numVersion specified

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652514#action_12652514 ] 

stack commented on HBASE-847:
-----------------------------

Patch looks good. Let me study it more, try it locally, and try to get it into 0.19.0. Good stuff.

> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
>                 Key: HBASE-847
>                 URL: https://issues.apache.org/jira/browse/HBASE-847
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 0.2.0
>            Reporter: Michael Bieniosek
>            Assignee: Doğacan Güney
>            Priority: Critical
>             Fix For: 0.19.0
>
>         Attachments: HBASE-847_v2.patch, HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.



[jira] Assigned: (HBASE-847) new API: HTable.getRow with numVersion specified

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman reassigned HBASE-847:
-----------------------------------

    Assignee: Jim Kellerman

> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
>                 Key: HBASE-847
>                 URL: https://issues.apache.org/jira/browse/HBASE-847
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 0.2.0
>            Reporter: Michael Bieniosek
>            Assignee: Jim Kellerman
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.



[jira] Updated: (HBASE-847) new API: HTable.getRow with numVersion specified

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman updated HBASE-847:
--------------------------------

    Priority: Critical  (was: Major)

> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
>                 Key: HBASE-847
>                 URL: https://issues.apache.org/jira/browse/HBASE-847
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 0.2.0
>            Reporter: Michael Bieniosek
>            Assignee: Doğacan Güney
>            Priority: Critical
>             Fix For: 0.19.0
>
>         Attachments: HBASE-847_v2.patch, HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.



[jira] Commented: (HBASE-847) new API: HTable.getRow with numVersion specified

Posted by "Doğacan Güney (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633495#action_12633495 ] 

Doğacan Güney commented on HBASE-847:
-------------------------------------

Again, thanks for the comments. I will post a new patch updated as you suggested.

Btw, a question: Do you think it is a good idea to change Cell so that if it stores multiple <timestamp, value> pairs, those pairs are sorted? I mean, the value with the latest timestamp will be returned first during an iteration?

> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
>                 Key: HBASE-847
>                 URL: https://issues.apache.org/jira/browse/HBASE-847
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 0.2.0
>            Reporter: Michael Bieniosek
>            Assignee: Jim Kellerman
>             Fix For: 0.19.0
>
>         Attachments: HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.



[jira] Issue Comment Edited: (HBASE-847) new API: HTable.getRow with numVersion specified

Posted by "Doğacan Güney (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633004#action_12633004 ] 

dogacan edited comment on HBASE-847 at 9/20/08 12:40 PM:
---------------------------------------------------------------

Patch for the issue.

OK, this is my first big(-ish) patch, so I am sure I am missing something :)

Anyway, it updates HBase as Jim Kellerman suggested. RowResult#getRow-s don't have any documentation yet. I will add it in a later patch.

I also want to update scanners so that you can ask for multiple versions from them too (not done yet).

 (Also includes patch from HBASE-892.)

> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
>                 Key: HBASE-847
>                 URL: https://issues.apache.org/jira/browse/HBASE-847
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 0.2.0
>            Reporter: Michael Bieniosek
>            Assignee: Jim Kellerman
>             Fix For: 0.19.0
>
>         Attachments: HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.



[jira] Commented: (HBASE-847) new API: HTable.getRow with numVersion specified

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633522#action_12633522 ] 

Jim Kellerman commented on HBASE-847:
-------------------------------------

> Doğacan Güney - 22/Sep/08 01:58 PM
> 
> Btw, a question: Do you think it is a good idea to change Cell so that if it stores
> multiple <timestamp, value> pairs, those pairs are sorted? I mean, the value
> with the latest timestamp will be returned first during an iteration?

That would be nice, but it will require substantial changes to HStore.{getFull,getFullFromMapFile} and Memcache.getFull.

At first glance, however, changes are required there just to be able to get multiple versions in the first place.

> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
>                 Key: HBASE-847
>                 URL: https://issues.apache.org/jira/browse/HBASE-847
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 0.2.0
>            Reporter: Michael Bieniosek
>            Assignee: Jim Kellerman
>             Fix For: 0.19.0
>
>         Attachments: HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.



[jira] Commented: (HBASE-847) new API: HTable.getRow with numVersion specified

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627539#action_12627539 ] 

Jim Kellerman commented on HBASE-847:
-------------------------------------

What needs to be done:

o.a.h.h.ipc.HRegionInterface:
- bump versionID
- change:
{code}
public RowResult getRow(final byte[] regionName, final byte[] row, final byte[][] columns, final long ts, final long lockId)

// to:

public RowResult getRow(final byte[] regionName, final byte[] row, final byte[][] columns, final long ts, final int numVersions, final long lockId)
{code}

o.a.h.h.client.HTable:
- add overloads to getRow:
{code}
public RowResult getRow(String row, int numVersions)
public RowResult getRow(String row, long timestamp, int numVersions)
public RowResult getRow(String row, String[] columns, int numVersions)
public RowResult getRow(String row, String[] columns, long timestamp, int numVersions)
public RowResult getRow(String row, String[] columns, long timestamp, int numVersions, RowLock rowLock)

public RowResult getRow(byte[] row, int numVersions)
public RowResult getRow(byte[] row, long timestamp, int numVersions)
public RowResult getRow(byte[] row, byte[][] columns, int numVersions)
public RowResult getRow(byte[] row, byte[][] columns, long timestamp, int numVersions)
{code}
- replace:
{code}
public RowResult getRow(byte[] row, byte[][] columns, long timestamp, RowLock rowLock)

// with:

public RowResult getRow(byte[] row, byte[][] columns, long timestamp, int numVersions, RowLock rowLock)
{code}

All getRow(String...) methods should call:

{code}
public RowResult getRow(String row, String[] columns, long timestamp, int numVersions, RowLock rowLock)

// which calls:

public RowResult getRow(byte[] row, byte[][] columns, long timestamp, int numVersions, RowLock rowLock)
{code}

Similarly all getRow(byte[]...) methods should call:

{code}
public RowResult getRow(byte[] row, byte[][] columns, long timestamp, int numVersions, RowLock rowLock)
{code}

which will use the new getRow API in HRegionInterface described above.
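
The overload funnel described above could be sketched roughly as follows. This is an illustration only, not the actual HTable code: the class name GetRowOverloadSketch, the LATEST_TIMESTAMP value, and the String return type (standing in for RowResult) are assumptions, the RowLock parameter is left out for brevity, and the final method only records its arguments where the real one would call HRegionInterface.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for HTable; it only illustrates how the overloads delegate. */
final class GetRowOverloadSketch {
    /** Assumed sentinel meaning "newest version"; not taken from the HBase source. */
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    /** Records every call that reaches the single byte[]-based method. */
    final List<String> calls = new ArrayList<>();

    // String convenience overload: fills in defaults, then delegates.
    String getRow(String row, int numVersions) {
        return getRow(row, null, LATEST_TIMESTAMP, numVersions);
    }

    // Most specific String overload: converts to bytes, then delegates.
    String getRow(String row, String[] columns, long timestamp, int numVersions) {
        byte[][] cols = null;
        if (columns != null) {
            cols = new byte[columns.length][];
            for (int i = 0; i < columns.length; i++) {
                cols[i] = columns[i].getBytes(StandardCharsets.UTF_8);
            }
        }
        return getRow(row.getBytes(StandardCharsets.UTF_8), cols, timestamp, numVersions);
    }

    // byte[] convenience overload: fills in defaults, then delegates.
    String getRow(byte[] row, int numVersions) {
        return getRow(row, null, LATEST_TIMESTAMP, numVersions);
    }

    // The one method that would issue the remote call in a real client.
    String getRow(byte[] row, byte[][] columns, long timestamp, int numVersions) {
        String call = new String(row, StandardCharsets.UTF_8)
                + " cols=" + (columns == null ? "all" : String.valueOf(columns.length))
                + " ts=" + timestamp
                + " n=" + numVersions;
        calls.add(call);
        return call;
    }
}
```

With this shape, only the last method needs to change when the RPC signature changes; every other overload is a one-line default-filling wrapper.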

Modify HRegionServer.getRow to match the change in HRegionInterface. This will require corresponding changes to HRegion.getFull, HStore.{getFull,getFullFromMapFile} and Memcache.{getFull,internalGetFull}.

Multiple values and timestamps for the same column:family can be stored in a single Cell using either of the constructors:

{code}
Cell(String[] vals, long[] ts)
Cell(byte[][] vals, long[] ts)
{code}


> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
>                 Key: HBASE-847
>                 URL: https://issues.apache.org/jira/browse/HBASE-847
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 0.2.0
>            Reporter: Michael Bieniosek
>            Assignee: Jim Kellerman
>             Fix For: 0.19.0
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.



[jira] Updated: (HBASE-847) new API: HTable.getRow with numVersion specified

Posted by "Doğacan Güney (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doğacan Güney updated HBASE-847:
--------------------------------

    Attachment: HBASE_847.patch

Patch for the issue.

OK, this is my first big(-ish) patch, so I am sure I am missing something :)

Anyway, it updates HBase as Jim Kellerman suggested. RowResult#getRow-s don't have any documentation yet. I will add it in a later patch.

I also want to update scanners so that you can ask for multiple versions from them too (not done yet).

 

> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
>                 Key: HBASE-847
>                 URL: https://issues.apache.org/jira/browse/HBASE-847
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 0.2.0
>            Reporter: Michael Bieniosek
>            Assignee: Jim Kellerman
>             Fix For: 0.19.0
>
>         Attachments: HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.



[jira] Updated: (HBASE-847) new API: HTable.getRow with numVersion specified

Posted by "Doğacan Güney (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doğacan Güney updated HBASE-847:
--------------------------------

    Attachment: HBASE-847_v2.patch

New version of the patch. Same as the last one, except:

- Added a new test case (TestGetMultipleVersions)
- Changed Cell to keep a reverse-sorted map of timestamp->value. This way, a cell is guaranteed to return the latest timestamp first.
- Also changed iteration. Cell now iterates over Entry<Long, byte[]> items. Nothing in the hbase code uses cell iteration anyway (and it didn't work until just a while back :). Still, I am open to suggestions.
- Added javadoc for new overloads
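
For readers without the patch at hand, the reverse-sorted Cell idea could look roughly like this. It is a sketch, not the code from the patch: the class name VersionedCellSketch and its accessor names are made up for illustration, and only the byte[][]/long[] constructor shape is borrowed from the Cell constructors mentioned earlier in this issue.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical cell holding several versions, newest timestamp first. */
final class VersionedCellSketch implements Iterable<Map.Entry<Long, byte[]>> {
    // Reverse natural ordering on the timestamp key puts the newest entry first.
    private final TreeMap<Long, byte[]> versions = new TreeMap<>(Collections.reverseOrder());

    VersionedCellSketch(byte[][] vals, long[] ts) {
        if (vals.length != ts.length) {
            throw new IllegalArgumentException("vals and ts must have the same length");
        }
        for (int i = 0; i < vals.length; i++) {
            versions.put(ts[i], vals[i]);
        }
    }

    /** Timestamp of the newest version. */
    long latestTimestamp() {
        return versions.firstKey();
    }

    /** Value of the newest version. */
    byte[] latestValue() {
        return versions.firstEntry().getValue();
    }

    /** Iterates over (timestamp, value) entries from newest to oldest. */
    @Override
    public Iterator<Map.Entry<Long, byte[]>> iterator() {
        return Collections.unmodifiableMap(versions).entrySet().iterator();
    }
}
```

Because the ordering lives in the map itself, callers can pass versions in any order and still iterate newest first.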

There is a small bug. Say your table is configured to keep the last 3 versions and you have just made 5 updates to a row/column (with timestamps t1, t2, t3, t4, t5). If you ask for 5 versions, you will get only t5, t4 and t3. But if you ask for 5 versions starting from t4, you will get t4, t3 and t2 (at least until the table is compacted). I don't know if this will be too much of a problem. I should also note that HTable#get behaves the same way.
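
The behaviour described above can be reproduced with a toy model. This is only a guess at the mechanism, not HBase code: it assumes that all writes stay readable until a compaction and that a read caps its result at the configured version limit, counted from the requested start timestamp. The class VersionLimitSketch and all of its methods are hypothetical, and timestamps t1..t5 are modelled as 1..5.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

/** Toy model of one row/column whose store is trimmed only on compaction. */
final class VersionLimitSketch {
    private final int maxVersions;
    // All written timestamps, newest first; nothing is dropped at write time.
    private final TreeSet<Long> stored = new TreeSet<>(Collections.reverseOrder());

    VersionLimitSketch(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    void put(long timestamp) {
        stored.add(timestamp);
    }

    /** Returns up to min(numVersions, maxVersions) timestamps at or before startTs. */
    List<Long> get(long startTs, int numVersions) {
        int limit = Math.min(numVersions, maxVersions);
        List<Long> out = new ArrayList<>();
        for (long ts : stored) {
            if (ts > startTs) {
                continue; // newer than the requested starting point
            }
            if (out.size() == limit) {
                break;
            }
            out.add(ts);
        }
        return out;
    }

    /** Drops everything except the newest maxVersions timestamps. */
    void compact() {
        List<Long> keep = get(Long.MAX_VALUE, maxVersions);
        stored.retainAll(keep);
    }
}
```

In this model, five puts with a limit of 3 give [5, 4, 3] for a read from the latest timestamp and [4, 3, 2] for a read starting at 4; after compact(), the second read shrinks to [4, 3].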

About subtasks: I think HBASE-857 and HBASE-44 are covered. I am not sure about HBASE-31. Is it useful to get just timestamps and not values?


> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
>                 Key: HBASE-847
>                 URL: https://issues.apache.org/jira/browse/HBASE-847
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 0.2.0
>            Reporter: Michael Bieniosek
>            Assignee: Doğacan Güney
>             Fix For: 0.19.0
>
>         Attachments: HBASE-847_v2.patch, HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.



[jira] Updated: (HBASE-847) new API: HTable.getRow with numVersion specified

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman updated HBASE-847:
--------------------------------

    Fix Version/s: 0.19.0

> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
>                 Key: HBASE-847
>                 URL: https://issues.apache.org/jira/browse/HBASE-847
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 0.2.0
>            Reporter: Michael Bieniosek
>            Assignee: Jim Kellerman
>             Fix For: 0.19.0
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.



[jira] Commented: (HBASE-847) new API: HTable.getRow with numVersion specified

Posted by "Doğacan Güney (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634559#action_12634559 ] 

Doğacan Güney commented on HBASE-847:
-------------------------------------

Do people think this should wait until after HBASE-880, since that issue will change all the APIs anyway, or shall I work on a new patch now?

> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
>                 Key: HBASE-847
>                 URL: https://issues.apache.org/jira/browse/HBASE-847
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 0.2.0
>            Reporter: Michael Bieniosek
>            Assignee: Jim Kellerman
>             Fix For: 0.19.0
>
>         Attachments: HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.



[jira] Commented: (HBASE-847) new API: HTable.getRow with numVersion specified

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633470#action_12633470 ] 

Jim Kellerman commented on HBASE-847:
-------------------------------------

Patch does not apply. Patches must be in svn diff format to be accepted.

Please add a test case to demonstrate that getting multiple versions works (it should also cover multiple versions with a timestamp specified).

Please do not include a patch for HBASE-52 and HBASE-33 in this patch. Even though they are similar, changes to scanners are more difficult. We try to limit the scope of a single patch in general.

Ensure that the sub-issues of this Jira, HBASE-857, HBASE-31 and HBASE-44, are addressed.

Thanks.

> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
>                 Key: HBASE-847
>                 URL: https://issues.apache.org/jira/browse/HBASE-847
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 0.2.0
>            Reporter: Michael Bieniosek
>            Assignee: Jim Kellerman
>             Fix For: 0.19.0
>
>         Attachments: HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.



[jira] Resolved: (HBASE-847) new API: HTable.getRow with numVersion specified

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-847.
-------------------------

    Resolution: Fixed

Committed. Thanks for the patch, Doğacan.

> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
>                 Key: HBASE-847
>                 URL: https://issues.apache.org/jira/browse/HBASE-847
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 0.2.0
>            Reporter: Michael Bieniosek
>            Assignee: Doğacan Güney
>            Priority: Critical
>             Fix For: 0.19.0
>
>         Attachments: HBASE-847_v2.patch, HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.
