Posted to dev@hbase.apache.org by "Michael Bieniosek (JIRA)" <ji...@apache.org> on 2008/08/27 22:37:44 UTC
[jira] Created: (HBASE-847) new API: HTable.getRow with numVersion specified
new API: HTable.getRow with numVersion specified
------------------------------------------------
Key: HBASE-847
URL: https://issues.apache.org/jira/browse/HBASE-847
Project: Hadoop HBase
Issue Type: New Feature
Components: client
Affects Versions: 0.2.0
Reporter: Michael Bieniosek
I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HBASE-847) new API: HTable.getRow with numVersion specified
Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Kellerman reassigned HBASE-847:
-----------------------------------
Assignee: Doğacan Güney (was: Jim Kellerman)
> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
> Key: HBASE-847
> URL: https://issues.apache.org/jira/browse/HBASE-847
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: client
> Affects Versions: 0.2.0
> Reporter: Michael Bieniosek
> Assignee: Doğacan Güney
> Fix For: 0.19.0
>
> Attachments: HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.
[jira] Commented: (HBASE-847) new API: HTable.getRow with numVersion specified
Posted by "Doğacan Güney (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652558#action_12652558 ]
Doğacan Güney commented on HBASE-847:
-------------------------------------
Thanks for comments, stack.
I am OK with this issue (or HBASE-44 etc) being fixed in 0.19 or 0.20.
> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
> Key: HBASE-847
> URL: https://issues.apache.org/jira/browse/HBASE-847
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: client
> Affects Versions: 0.2.0
> Reporter: Michael Bieniosek
> Assignee: Doğacan Güney
> Priority: Critical
> Fix For: 0.19.0
>
> Attachments: HBASE-847_v2.patch, HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.
[jira] Commented: (HBASE-847) new API: HTable.getRow with numVersion specified
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652514#action_12652514 ]
stack commented on HBASE-847:
-----------------------------
Patch looks good. Let me study it more and try it locally and try and get it into 0.19.0. Good stuff.
> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
> Key: HBASE-847
> URL: https://issues.apache.org/jira/browse/HBASE-847
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: client
> Affects Versions: 0.2.0
> Reporter: Michael Bieniosek
> Assignee: Doğacan Güney
> Priority: Critical
> Fix For: 0.19.0
>
> Attachments: HBASE-847_v2.patch, HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.
[jira] Assigned: (HBASE-847) new API: HTable.getRow with numVersion specified
Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Kellerman reassigned HBASE-847:
-----------------------------------
Assignee: Jim Kellerman
> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
> Key: HBASE-847
> URL: https://issues.apache.org/jira/browse/HBASE-847
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: client
> Affects Versions: 0.2.0
> Reporter: Michael Bieniosek
> Assignee: Jim Kellerman
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.
[jira] Updated: (HBASE-847) new API: HTable.getRow with numVersion specified
Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Kellerman updated HBASE-847:
--------------------------------
Priority: Critical (was: Major)
> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
> Key: HBASE-847
> URL: https://issues.apache.org/jira/browse/HBASE-847
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: client
> Affects Versions: 0.2.0
> Reporter: Michael Bieniosek
> Assignee: Doğacan Güney
> Priority: Critical
> Fix For: 0.19.0
>
> Attachments: HBASE-847_v2.patch, HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.
[jira] Commented: (HBASE-847) new API: HTable.getRow with numVersion specified
Posted by "Doğacan Güney (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633495#action_12633495 ]
Doğacan Güney commented on HBASE-847:
-------------------------------------
Again, thanks for comments. I will update as you suggested with a new patch.
Btw, a question: Do you think it is a good idea to change Cell so that if it stores multiple <timestamp, value> pairs, those pairs are sorted? I mean, the value with the latest timestamp will be returned first during an iteration?
> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
> Key: HBASE-847
> URL: https://issues.apache.org/jira/browse/HBASE-847
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: client
> Affects Versions: 0.2.0
> Reporter: Michael Bieniosek
> Assignee: Jim Kellerman
> Fix For: 0.19.0
>
> Attachments: HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.
[jira] Issue Comment Edited: (HBASE-847) new API: HTable.getRow with numVersion specified
Posted by "Doğacan Güney (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633004#action_12633004 ]
dogacan edited comment on HBASE-847 at 9/20/08 12:40 PM:
---------------------------------------------------------------
Patch for the issue.
OK, this is my first big(-ish) patch, so I am sure I am missing something :)
Anyway, updates hbase as Jim Kellerman suggested. RowResult#getRow-s don't have any documentation yet. I will update them with a later patch.
I also want to update scanners so that you can ask for multiple versions from them too (not done yet).
(Also includes patch from HBASE-892.)
was (Author: dogacan):
Patch for the issue.
OK, this is my first big(-ish) patch, so I am sure I am missing something :)
Anyway, updates hbase as Jim Kellerman suggested. RowResult#getRow-s don't have any documentation yet. I will update them with a later patch.
I also want to update scanners so that you can ask for multiple versions from them too (not done yet).
> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
> Key: HBASE-847
> URL: https://issues.apache.org/jira/browse/HBASE-847
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: client
> Affects Versions: 0.2.0
> Reporter: Michael Bieniosek
> Assignee: Jim Kellerman
> Fix For: 0.19.0
>
> Attachments: HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.
[jira] Commented: (HBASE-847) new API: HTable.getRow with numVersion specified
Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633522#action_12633522 ]
Jim Kellerman commented on HBASE-847:
-------------------------------------
> Doğacan Güney - 22/Sep/08 01:58 PM
>
> Btw, a question: Do you think it is a good idea to change Cell so that if it stores
> multiple <timestamp, value> pairs, those pairs are sorted? I mean, the value
> with the latest timestamp will be returned first during an iteration?
That would be nice, but it will require substantial changes to HStore.{getFull,getFullFromMapFile} and Memcache.getFull.
At first glance, however, changes are required there just to be able to get multiple versions in the first place.
> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
> Key: HBASE-847
> URL: https://issues.apache.org/jira/browse/HBASE-847
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: client
> Affects Versions: 0.2.0
> Reporter: Michael Bieniosek
> Assignee: Jim Kellerman
> Fix For: 0.19.0
>
> Attachments: HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.
[jira] Commented: (HBASE-847) new API: HTable.getRow with numVersion specified
Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627539#action_12627539 ]
Jim Kellerman commented on HBASE-847:
-------------------------------------
What needs to be done:
o.a.h.h.ipc.HRegionInterface:
- bump versionID
- change:
{code}
public RowResult getRow(final byte[] regionName, final byte[] row, final byte[][] columns, final long ts, final long lockId)
// to:
public RowResult getRow(final byte[] regionName, final byte[] row, final byte[][] columns, final long ts, final int numVersions, final long lockId)
{code}
o.a.h.h.client.HTable:
- add overloads to getRow:
{code}
public RowResult getRow(String row, int numVersions)
public RowResult getRow(String row, long timestamp, int numVersions)
public RowResult getRow(String row, String[] columns, int numVersions)
public RowResult getRow(String row, String[] columns, long timestamp, int numVersions)
public RowResult getRow(String row, String[] columns, long timestamp, int numVersions, RowLock rowLock)
public RowResult getRow(byte[] row, int numVersions)
public RowResult getRow(byte[] row, long timestamp, int numVersions)
public RowResult getRow(byte[] row, byte[][] columns, int numVersions)
public RowResult getRow(byte[] row, byte[][] columns, long timestamp, int numVersions)
{code}
- replace:
{code}
public RowResult getRow(byte[] row, byte[][] columns, long timestamp, RowLock rowLock)
// with:
public RowResult getRow(byte[] row, byte[][] columns, long timestamp, int numVersions, RowLock rowLock)
{code}
All getRow(String...) methods should call:
{code}
public RowResult getRow(String row, String[] columns, long timestamp, int numVersions, RowLock rowLock)
// which calls:
public RowResult getRow(byte[] row, byte[][] columns, long timestamp, int numVersions, RowLock rowLock)
{code}
Similarly all getRow(byte[]...) methods should call:
{code}
public RowResult getRow(byte[] row, byte[][] columns, long timestamp, int numVersions, RowLock rowLock)
{code}
which will use the new getRow api in HRegionInterface described above.
Modify HRegionServer.getRow to match the change in HRegionInterface. This will require corresponding changes to HRegion.getFull, HStore.{getFull,getFullFromMapFile} and Memcache.{getFull,internalGetFull}.
Multiple values and timestamps for the same column:family can be stored in a single Cell using either of the constructors:
{code}
Cell(String[] vals, long[] ts)
Cell(byte[][] vals, long[] ts)
{code}
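The delegation chain described above can be sketched roughly as follows. This is only an illustration, not the committed code: GetRowOverloadsSketch is a hypothetical stand-in for HTable, the String return value stands in for RowResult, the LATEST_TIMESTAMP sentinel is an assumed placeholder, and the RowLock parameter is left out to keep the example short.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for HTable, used only to show the delegation chain. */
class GetRowOverloadsSketch {
    /** Assumed sentinel for "no timestamp given"; not taken from HBase. */
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    // String convenience overload: fills in defaults and delegates.
    String getRow(String row, int numVersions) {
        return getRow(row, (String[]) null, LATEST_TIMESTAMP, numVersions);
    }

    // Fullest String overload: converts to bytes, then calls the byte[] form.
    String getRow(String row, String[] columns, long timestamp, int numVersions) {
        byte[][] cols = null;
        if (columns != null) {
            cols = new byte[columns.length][];
            for (int i = 0; i < columns.length; i++) {
                cols[i] = columns[i].getBytes(StandardCharsets.UTF_8);
            }
        }
        return getRow(row.getBytes(StandardCharsets.UTF_8), cols, timestamp, numVersions);
    }

    // byte[] convenience overload: delegates straight to the canonical method.
    String getRow(byte[] row, int numVersions) {
        return getRow(row, (byte[][]) null, LATEST_TIMESTAMP, numVersions);
    }

    /** Canonical overload: the single place a real client would issue the RPC. */
    String getRow(byte[] row, byte[][] columns, long timestamp, int numVersions) {
        StringBuilder sb = new StringBuilder("row=");
        sb.append(new String(row, StandardCharsets.UTF_8)).append(" columns=");
        if (columns == null) {
            sb.append('*'); // null means "all columns" in this sketch
        } else {
            for (int i = 0; i < columns.length; i++) {
                if (i > 0) sb.append(',');
                sb.append(new String(columns[i], StandardCharsets.UTF_8));
            }
        }
        sb.append(" ts=").append(timestamp == LATEST_TIMESTAMP ? "latest" : Long.toString(timestamp));
        sb.append(" versions=").append(numVersions);
        return sb.toString();
    }

    public static void main(String[] args) {
        GetRowOverloadsSketch table = new GetRowOverloadsSketch();
        System.out.println(table.getRow("r1", 3));
        System.out.println(table.getRow("r1", new String[] {"info:a"}, 42L, 2));
    }
}
```

Funnelling every overload into one byte[] method means the numVersions argument only has to be threaded through to the region server in a single place.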
> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
> Key: HBASE-847
> URL: https://issues.apache.org/jira/browse/HBASE-847
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: client
> Affects Versions: 0.2.0
> Reporter: Michael Bieniosek
> Assignee: Jim Kellerman
> Fix For: 0.19.0
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.
[jira] Updated: (HBASE-847) new API: HTable.getRow with numVersion specified
Posted by "Doğacan Güney (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Doğacan Güney updated HBASE-847:
--------------------------------
Attachment: HBASE_847.patch
Patch for the issue.
OK, this is my first big(-ish) patch, so I am sure I am missing something :)
Anyway, updates hbase as Jim Kellerman suggested. RowResult#getRow-s don't have any documentation yet. I will update them with a later patch.
I also want to update scanners so that you can ask for multiple versions from them too (not done yet).
> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
> Key: HBASE-847
> URL: https://issues.apache.org/jira/browse/HBASE-847
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: client
> Affects Versions: 0.2.0
> Reporter: Michael Bieniosek
> Assignee: Jim Kellerman
> Fix For: 0.19.0
>
> Attachments: HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.
[jira] Updated: (HBASE-847) new API: HTable.getRow with numVersion specified
Posted by "Doğacan Güney (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Doğacan Güney updated HBASE-847:
--------------------------------
Attachment: HBASE-847_v2.patch
New version of patch. Same as the last one except
- Added a new test case (TestGetMultipleVersions)
- Changed Cell to keep a reverse sorted map of timestamp->value. This way, a cell is guaranteed to return latest timestamp at the top.
- Also changed iteration. Cell now iterates over Entry<Long, byte[]>'s. Nothing in hbase code uses cell iteration anyway (and it didn't work just a while back:). Still, I am open to suggestions.
- Added javadoc for new overloads
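For readers without the patch at hand, the reverse sorted timestamp->value map mentioned above could look something like this. It is a minimal sketch under assumptions: VersionedCellSketch is a hypothetical class, not the Cell from the patch, and the accessor names are made up for illustration.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for Cell; holds several versions of one column value. */
class VersionedCellSketch implements Iterable<Map.Entry<Long, byte[]>> {
    // reverseOrder() sorts keys descending, so the newest timestamp comes first.
    private final TreeMap<Long, byte[]> versions =
        new TreeMap<Long, byte[]>(Collections.<Long>reverseOrder());

    VersionedCellSketch(byte[][] vals, long[] ts) {
        if (vals.length != ts.length || vals.length == 0) {
            throw new IllegalArgumentException("need one timestamp per value, and at least one value");
        }
        for (int i = 0; i < vals.length; i++) {
            versions.put(ts[i], vals[i]);
        }
    }

    /** Value with the latest timestamp. */
    byte[] getValue() {
        return versions.firstEntry().getValue();
    }

    /** Latest timestamp held by this cell. */
    long getTimestamp() {
        return versions.firstKey();
    }

    /** Iterates over (timestamp, value) entries, newest first, read-only. */
    @Override
    public Iterator<Map.Entry<Long, byte[]>> iterator() {
        return Collections.unmodifiableMap(versions).entrySet().iterator();
    }
}
```

Because the map orders itself on insert, callers may pass values in any order and still see the newest entry first during iteration.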
There is a small bug. Say your table is configured to keep the last 3 versions and you have just written code that makes 5 updates to a row/column (with timestamps t1, t2, t3, t4, t5). If you ask for 5 versions, you will only get t5, t4 and t3. But if you ask for 5 versions starting from t4, you will get t4, t3, -t2- (at least until the table is compacted). I don't know if this will be too much of a problem. I should also note that HTable#get behaves like this as well.
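One way such behavior could arise is sketched below. This is a guess at the mechanism, not the actual HStore/Memcache code: it assumes that old versions stay on disk until compaction and that a read simply stops after min(numVersions, maxVersions) entries at or below the starting timestamp. VersionLimitSketch and its read method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a read that caps results at the table's max versions. */
class VersionLimitSketch {
    /**
     * Returns up to min(numVersions, maxVersions) timestamps that are
     * less than or equal to startTs, newest first.
     */
    static List<Long> read(long[] storedNewestFirst, long startTs, int numVersions, int maxVersions) {
        int limit = Math.min(numVersions, maxVersions);
        List<Long> out = new ArrayList<Long>();
        for (long ts : storedNewestFirst) {
            if (ts > startTs) {
                continue; // newer than the requested starting point
            }
            if (out.size() == limit) {
                break; // cap reached
            }
            out.add(ts);
        }
        return out;
    }

    public static void main(String[] args) {
        long[] beforeCompaction = {5L, 4L, 3L, 2L, 1L}; // all five updates still stored
        long[] afterCompaction = {5L, 4L, 3L};          // only the last 3 kept
        System.out.println(read(beforeCompaction, Long.MAX_VALUE, 5, 3));
        System.out.println(read(beforeCompaction, 4L, 5, 3));
        System.out.println(read(afterCompaction, 4L, 5, 3));
    }
}
```

In this model the extra t2 shows up only because the cap is applied while counting from the starting timestamp, rather than against the newest versions overall.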
About subtasks: I think HBASE-857 and HBASE-44 are covered. I am not sure about HBASE-31. Is it useful to get just timestamps and not values?
> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
> Key: HBASE-847
> URL: https://issues.apache.org/jira/browse/HBASE-847
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: client
> Affects Versions: 0.2.0
> Reporter: Michael Bieniosek
> Assignee: Doğacan Güney
> Fix For: 0.19.0
>
> Attachments: HBASE-847_v2.patch, HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.
[jira] Updated: (HBASE-847) new API: HTable.getRow with numVersion specified
Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Kellerman updated HBASE-847:
--------------------------------
Fix Version/s: 0.19.0
> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
> Key: HBASE-847
> URL: https://issues.apache.org/jira/browse/HBASE-847
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: client
> Affects Versions: 0.2.0
> Reporter: Michael Bieniosek
> Assignee: Jim Kellerman
> Fix For: 0.19.0
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.
[jira] Commented: (HBASE-847) new API: HTable.getRow with numVersion specified
Posted by "Doğacan Güney (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634559#action_12634559 ]
Doğacan Güney commented on HBASE-847:
-------------------------------------
Do people think this should wait until after HBASE-880, since that issue will change all APIs anyway, or shall I work on a new patch now?
> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
> Key: HBASE-847
> URL: https://issues.apache.org/jira/browse/HBASE-847
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: client
> Affects Versions: 0.2.0
> Reporter: Michael Bieniosek
> Assignee: Jim Kellerman
> Fix For: 0.19.0
>
> Attachments: HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.
[jira] Commented: (HBASE-847) new API: HTable.getRow with numVersion specified
Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633470#action_12633470 ]
Jim Kellerman commented on HBASE-847:
-------------------------------------
Patch does not apply. Patches must be in svn diff format to be accepted.
Please add a test case to demonstrate that getting multiple versions works (it should also cover multiple versions with a timestamp specified).
Please do not include a patch for HBASE-52 and HBASE-33 in this patch. Even though they are similar, changes to scanners are more difficult. We try to limit the scope of a single patch in general.
Ensure that the sub-issues of this Jira (HBASE-857, HBASE-31 and HBASE-44) are addressed.
Thanks.
> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
> Key: HBASE-847
> URL: https://issues.apache.org/jira/browse/HBASE-847
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: client
> Affects Versions: 0.2.0
> Reporter: Michael Bieniosek
> Assignee: Jim Kellerman
> Fix For: 0.19.0
>
> Attachments: HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.
[jira] Resolved: (HBASE-847) new API: HTable.getRow with numVersion specified
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack resolved HBASE-847.
-------------------------
Resolution: Fixed
Committed. Thanks for the patch Doğacan.
> new API: HTable.getRow with numVersion specified
> ------------------------------------------------
>
> Key: HBASE-847
> URL: https://issues.apache.org/jira/browse/HBASE-847
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: client
> Affects Versions: 0.2.0
> Reporter: Michael Bieniosek
> Assignee: Doğacan Güney
> Priority: Critical
> Fix For: 0.19.0
>
> Attachments: HBASE-847_v2.patch, HBASE_847.patch
>
>
> I'd like to be able to call HTable.getRow with numVersions, and get multiple versions for each column.