Posted to issues@hbase.apache.org by "Lars Hofhansl (Issue Comment Edited) (JIRA)" <ji...@apache.org> on 2012/01/22 23:00:40 UTC

[jira] [Issue Comment Edited] (HBASE-5229) Explore building blocks for "multi-row" local transactions.

    [ https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190796#comment-13190796 ] 

Lars Hofhansl edited comment on HBASE-5229 at 1/22/12 9:59 PM:
---------------------------------------------------------------

Here's a patch that works. Regions are correctly addressed using the row key. Intra-row scans are limited to a single row.
The region scanner will correctly skip all stores for families before the seekTo KV.
The first matching store will use the seekTo KV, the others will seek to the beginning of the row (since their family sorts after the seekTo key).
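
As a rough illustration of the per-store decision described above, here is a minimal sketch. It is not the code from the attached patch: the class, enum, and method names are hypothetical, families are compared as plain strings rather than raw bytes, and "matching store" is assumed to mean the store whose family equals the family of the seekTo KV.

```java
import java.util.ArrayList;
import java.util.List;

public class SeekPlanSketch {
    /** What the region scanner would do with one store (one column family). */
    enum Action { SKIP, SEEK_TO_KV, SEEK_TO_ROW_START }

    /**
     * Decide an action for each family, given the family named in the seekTo KV.
     * Hypothetical helper: string comparison stands in for HBase's byte comparison.
     */
    static List<Action> plan(List<String> families, String seekFamily) {
        List<Action> actions = new ArrayList<>();
        for (String family : families) {
            int cmp = family.compareTo(seekFamily);
            if (cmp < 0) {
                // Family sorts before the seekTo KV: the store is skipped.
                actions.add(Action.SKIP);
            } else if (cmp == 0) {
                // Matching family: this store seeks to the seekTo KV itself.
                actions.add(Action.SEEK_TO_KV);
            } else {
                // Family sorts after the seekTo KV: start at the beginning of the row.
                actions.add(Action.SEEK_TO_ROW_START);
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        // Three families, seeking into family "b".
        System.out.println(plan(List.of("a", "b", "c"), "b"));
    }
}
```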

The model this enables is that a "transactional" table becomes a single row in HBase (i.e. the row-key is the table name): a prefix of each column defines a "row" inside that table, and the rest of the column defines the actual "column" of that row.
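
To make the model concrete, here is a small sketch of one possible column encoding. The separator byte, the helper names, and the sample "orders" data are assumptions for illustration only. The patch does not prescribe an encoding, and a fixed-length prefix would work as well.

```java
import java.util.Map;
import java.util.TreeMap;

public class TableAsRowSketch {
    // Hypothetical separator between the logical row id and the logical column name.
    static final char SEP = '\u0000';

    /** Build the HBase column qualifier for one logical cell. */
    static String qualifier(String logicalRow, String logicalColumn) {
        return logicalRow + SEP + logicalColumn;
    }

    /** Recover the logical row id (the prefix) from a qualifier. */
    static String logicalRow(String qualifier) {
        return qualifier.substring(0, qualifier.indexOf(SEP));
    }

    /** Recover the logical column name (the remainder) from a qualifier. */
    static String logicalColumn(String qualifier) {
        return qualifier.substring(qualifier.indexOf(SEP) + 1);
    }

    public static void main(String[] args) {
        // One HBase row (row-key "orders") holds the whole logical table.
        // A sorted map stands in for the sorted columns within that row.
        TreeMap<String, String> hbaseRow = new TreeMap<>();
        hbaseRow.put(qualifier("order-2", "total"), "7.50");
        hbaseRow.put(qualifier("order-1", "total"), "12.00");
        hbaseRow.put(qualifier("order-1", "customer"), "alice");

        // Cells of the same logical row come out adjacent because they share a prefix.
        for (Map.Entry<String, String> e : hbaseRow.entrySet()) {
            System.out.println(logicalRow(e.getKey()) + " / "
                    + logicalColumn(e.getKey()) + " = " + e.getValue());
        }
    }
}
```

Because all cells live under one row-key, they live in one region, which is what makes the "local" transaction possible.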

With this patch that is possible (set seekTo in Scan.java to seek inside a row, and enable batching).

This patch provides no way to set a stop point inside a row. The client would need to set batching to a reasonable amount (to avoid too many round trips while not returning too many unnecessary KVs).
Also, with a stop point we could prune all stores whose family is past the stop point (just as this patch prunes all stores with families before the seekTo point).
Because RegionScannerImpl and StoreScanner are inherently row-based, that refactoring would be non-trivial and risky.
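
The batching trade-off can be seen in a small simulation. This is only a sketch under stated assumptions: it does not use the HBase client API, a sorted in-memory map stands in for the columns of one row, and the names (scan, Result, demoRow) are hypothetical. The client fetches fixed-size batches starting at seekTo and stops once it sees a key at or past its own stop point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class ClientSideStopSketch {
    /** Outcome of one simulated intra-row scan. */
    static final class Result {
        final List<String> kept = new ArrayList<>();
        int roundTrips = 0;
        int wasted = 0; // KVs fetched at or past the stop point
    }

    /** Fetch batches from seekTo onward; stop after the batch that crosses the stop point. */
    static Result scan(NavigableMap<String, String> row, String seekTo, String stop, int batch) {
        Result r = new Result();
        List<String> keys = new ArrayList<>(row.tailMap(seekTo, true).keySet());
        int pos = 0;
        boolean done = false;
        while (!done && pos < keys.size()) {
            int end = Math.min(pos + batch, keys.size());
            r.roundTrips++; // one simulated client/server round trip per batch
            for (String key : keys.subList(pos, end)) {
                if (key.compareTo(stop) >= 0) {
                    r.wasted++; // already transferred, but not needed
                    done = true;
                } else {
                    r.kept.add(key);
                }
            }
            pos = end;
        }
        return r;
    }

    /** Sample row with three logical rows "a", "b" and "c". */
    static NavigableMap<String, String> demoRow() {
        NavigableMap<String, String> row = new TreeMap<>();
        for (String k : new String[] {"a1", "a2", "b1", "b2", "b3", "c1", "c2", "c3"}) {
            row.put(k, "v");
        }
        return row;
    }

    public static void main(String[] args) {
        for (int batch : new int[] {2, 5}) {
            Result r = scan(demoRow(), "b", "c", batch);
            System.out.println("batch=" + batch + " kept=" + r.kept
                    + " roundTrips=" + r.roundTrips + " wasted=" + r.wasted);
        }
    }
}
```

A small batch costs more round trips, a large one returns more KVs past the stop point. A server-side stop point would remove the need to tune this.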

I will now explore what it would take to define a grouping prefix in HTableDescriptor.
                
      was (Author: lhofhansl):
    Here's a patch that works. Regions are correctly addressed using the row key.
The region scanner will correctly skip all stores for families before the seekTo KV.
The first matching store will use the seekTo KV, the others will seek to the beginning of the row (since their family sorts after the seekTo key).

The model possible here would be that a "transactional" table ends up being a row in HBase (i.e. the row-key is the tablename), and a prefix of the columns defines "rows" inside that table, the rest of the columns defines the actual "columns" of that row.

With this patch that is possible (set seekTo in Scan.java to seek inside a row, and enable batching).

This patch provides no way to set a stop point inside a row. The client would need to set batching to a reasonable amount (to avoid too many round trips while not returning too many unnecessary KVs).
Also, with a stop point we could prune all stores whose family is past the stop point (just as this patch prunes all stores with families before the seekTo point).
Because RegionScannerImpl and StoreScanner are inherently row-based, that refactoring would be non-trivial and risky.

I will now explore what it would take to define a grouping prefix in HTableDescriptor.
                  
> Explore building blocks for "multi-row" local transactions.
> -----------------------------------------------------------
>
>                 Key: HBASE-5229
>                 URL: https://issues.apache.org/jira/browse/HBASE-5229
>             Project: HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5229-seekto-v2.txt, 5229-seekto.txt, 5229.txt
>
>
> HBase should provide basic building blocks for multi-row local transactions. Local means that we do this by co-locating the data. Global (cross region) transactions are not discussed here.
> After a bit of discussion two solutions have emerged:
> 1. Keep the row-key for determining grouping and location and allow efficient intra-row scanning. A client application would then model tables as HBase-rows.
> 2. Define a prefix-length in HTableDescriptor that defines a grouping of rows. Regions will then never be split inside a grouping prefix.
> #1 is true to the current storage paradigm of HBase.
> #2 is true to the current client side API.
> I will explore these two with sample patches here.
> --------------------
> Was:
> As discussed (at length) on the dev mailing list, with HBASE-3584 and HBASE-5203 committed, supporting atomic cross-row transactions within a region becomes simple.
> I am aware of the hesitation about the usefulness of this feature, but we have to start somewhere.
> Let's use this jira for discussion, I'll attach a patch (with tests) momentarily to make this concrete.
