Posted to dev@hbase.apache.org by "Erik Holstad (JIRA)" <ji...@apache.org> on 2009/03/12 18:20:50 UTC

[jira] Commented: (HBASE-880) Improve the current client API by creating new container classes

    [ https://issues.apache.org/jira/browse/HBASE-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681402#action_12681402 ] 

Erik Holstad commented on HBASE-880:
------------------------------------

The changes proposed for 880 might seem big or even radical to some people :)
But there are basically 2 big things that we want to change with this new API.

1. The introduction of a new Family class. This will help the user understand
that there is a difference between asking for more columns in the same family
and introducing a new family to search. This is important to help the user
realize that a family's columns are stored together and that it is therefore
more efficient to ask for columns from the same family than from different ones.

2. Dividing the get calls into different categories depending on how they will
be executed. This change has 2 benefits. The first is the same as in the first
change: helping the user understand which queries are treated the same way,
which queries are efficient and which ones are not. The second benefit is for
the people who will write and maintain the code. Since the queries are split
into groups that are implemented in very similar ways, it is going to be
easier to make optimizations and reason about the code for each query group,
which will lead to better and faster code.

Right now we propose 4 different get groups. They are grouped together depending
on their early out possibilities, so that each group can early out in the same
way. This is very convenient when adding new get calls: they can be put into
any of the existing groups if they match its early out pattern, or a new group
can be created.

The current groups are:
GetColumns: can early out as soon as all the columns are found.

GetFamilies: can never early out, since you don't know how many columns
there are in a family.

GetRange: can early out as soon as a store file with a ts lower than the one
asked for has been finished.

GetTop: can early out as soon as maxNr is reached.
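
The four early out rules above can be sketched as one small interface with an
implementation per group. This is only an illustration under stated
assumptions, not the code from the patches: the EarlyOut interface, its method
names and the four class names are hypothetical, and the GetRange rule is my
reading of "a store file with a ts lower than the one asked for has been
finished".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical hook a get implementation would call while reading a row.
interface EarlyOut {
    void onValue(String column);                 // a matching value was found
    void onStoreFileFinished(long storeFileTs);  // a store file was fully read
    boolean canStop();                           // true when reading can stop
}

// GetColumns: stop as soon as every requested column has been found.
final class GetColumnsEarlyOut implements EarlyOut {
    private final Set<String> missing;

    GetColumnsEarlyOut(String... columns) {
        missing = new HashSet<>(Arrays.asList(columns));
    }

    public void onValue(String column) { missing.remove(column); }
    public void onStoreFileFinished(long storeFileTs) { }
    public boolean canStop() { return missing.isEmpty(); }
}

// GetFamilies: the number of columns in a family is unknown, so never stop early.
final class GetFamiliesEarlyOut implements EarlyOut {
    public void onValue(String column) { }
    public void onStoreFileFinished(long storeFileTs) { }
    public boolean canStop() { return false; }
}

// GetRange: stop once a finished store file is older than the requested ts.
final class GetRangeEarlyOut implements EarlyOut {
    private final long requestedTs;
    private boolean olderFileFinished;

    GetRangeEarlyOut(long requestedTs) { this.requestedTs = requestedTs; }

    public void onValue(String column) { }
    public void onStoreFileFinished(long storeFileTs) {
        if (storeFileTs < requestedTs) {
            olderFileFinished = true;
        }
    }
    public boolean canStop() { return olderFileFinished; }
}

// GetTop: stop once maxNr values have been collected.
final class GetTopEarlyOut implements EarlyOut {
    private final int maxNr;
    private int found;

    GetTopEarlyOut(int maxNr) { this.maxNr = maxNr; }

    public void onValue(String column) { found++; }
    public void onStoreFileFinished(long storeFileTs) { }
    public boolean canStop() { return found >= maxNr; }
}
```

In a layout like this, a new get call either reuses one of the four
implementations or adds a fifth, and the read loop only ever has to check
canStop().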

> Improve the current client API by creating new container classes
> ----------------------------------------------------------------
>
>                 Key: HBASE-880
>                 URL: https://issues.apache.org/jira/browse/HBASE-880
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.20.0
>
>         Attachments: 880.patch, 880proposal4plus-v2.patch, 880proposal4plus.patch, 880proposal5-v2.patch, 880proposal5-v2.png, 880proposal5.patch, 880proposal5.png, hbase-880-patch.jpg, hbase-880-proposal4.patch, HBASE-880-proposal6-v2.txt, HBASE-880-proposal6-v3.txt, hbase-880-v1.patch, hbase-880-v2.patch, hbase_client_classes.png, NewCilentAPIProposoal4.gif, proposal2.jpg, proposed.jpg
>
>   Original Estimate: 240h
>  Remaining Estimate: 240h
>
> The current API does not scale very well. For each new feature, we have to add many methods to take care of all the overloads. Also, the need to batch row operations (gets, inserts, deletes) implies that we have to manage some "entities" like we are able to do with BatchUpdate but not with the other operations. The RowLock should be an attribute of such an entity.
> The scope of this jira is only to replace current API with another feature-compatible one, other methods will be added in other issues.
