Posted to issues@hbase.apache.org by "Jesse Yates (Commented) (JIRA)" <ji...@apache.org> on 2011/12/13 02:51:30 UTC

[jira] [Commented] (HBASE-4999) Constraints - Enhance checkAndPut to do atomic arbitrary constraint checks

    [ https://issues.apache.org/jira/browse/HBASE-4999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168050#comment-13168050 ] 

Jesse Yates commented on HBASE-4999:
------------------------------------

Looking through the architecture for checkAndPut (as well as following up on the discussion here: http://search-hadoop.com/m/SgP0l1gb0TD), I think we can support this fairly easily.

My thought would be that we essentially have a CheckCurrentConstraint that looks something like:
{code}
public abstract class CheckCurrentConstraint {
	// p is the Put being attempted; r is the current row read from the table for the check
	public abstract void check(Put p, Result r) throws ConstraintException;
}
{code}

So you have the current Put we want to make and then the actual row that we pull from the table in doing the check.
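
As a rough illustration of what one such constraint might look like, here is a minimal, self-contained sketch of the threshold rule from the issue description (the put goes through only if current value - new value > threshold). To keep it runnable without HBase on the classpath, plain qualifier-to-value maps stand in for Put and Result, and ConstraintException is a local stand-in; ThresholdConstraint and the map-based signature are assumptions made for the example, not part of any existing API.

```java
import java.util.Map;

/** Hypothetical stand-in for the ConstraintException named in the proposal. */
class ConstraintException extends Exception {
    ConstraintException(String message) { super(message); }
}

/** Simplified form of the proposed hook: maps of qualifier -> value replace Put and Result. */
interface CheckCurrentConstraint {
    void check(Map<String, Long> put, Map<String, Long> current) throws ConstraintException;
}

/** Allows the put only when (current value - new value) > threshold for one qualifier. */
final class ThresholdConstraint implements CheckCurrentConstraint {
    private final String qualifier;
    private final long threshold;

    ThresholdConstraint(String qualifier, long threshold) {
        this.qualifier = qualifier;
        this.threshold = threshold;
    }

    @Override
    public void check(Map<String, Long> put, Map<String, Long> current) throws ConstraintException {
        Long newValue = put.get(qualifier);
        if (newValue == null) {
            return; // this put does not touch the constrained qualifier
        }
        Long oldValue = current.get(qualifier);
        if (oldValue == null) {
            // Assumption for this sketch: a missing current value fails the check.
            throw new ConstraintException("no current value for " + qualifier);
        }
        if (oldValue - newValue <= threshold) {
            throw new ConstraintException(
                qualifier + ": " + oldValue + " - " + newValue + " is not > " + threshold);
        }
    }
}
```

Signalling failure with an exception rather than a boolean follows the check(...) signature above; a real version would read the values out of the Put and Result instead of maps.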

This would be run in preCheckAndPut (or just preCheck). There may need to be a little jiggering in the HRegion around when this is actually run, to ensure that we actually obtain the row lock.

However, since the row lock would be taken for the row we are checking, no other puts are going to interfere, and since we can use MVCC to get concurrent reads out of the DB, the most up-to-date version should be retrievable without a problem.
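
To make the locking argument concrete, here is a small self-contained sketch of the idea: the read of the current value, the check, and the write all happen while holding the same per-row lock, so no other put to that row can slip in between. This is not HRegion code; RowLockedCheckAndPutSketch, its in-memory map, and the BiPredicate-based check are all assumptions made for the example, and MVCC is not modeled.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.BiPredicate;

/** Toy single-column store: one long value per row, guarded by one lock per row. */
final class RowLockedCheckAndPutSketch {
    private final Map<String, Long> rows = new ConcurrentHashMap<>();
    private final Map<String, ReentrantLock> rowLocks = new ConcurrentHashMap<>();

    /**
     * Writes newValue to the row only if check(current, newValue) is true.
     * The current value passed to the check is null when the row does not exist.
     */
    boolean checkAndPut(String row, long newValue, BiPredicate<Long, Long> check) {
        ReentrantLock lock = rowLocks.computeIfAbsent(row, k -> new ReentrantLock());
        lock.lock(); // taken before the read, held through the write
        try {
            Long current = rows.get(row);
            if (!check.test(current, newValue)) {
                return false; // constraint failed, nothing is written
            }
            rows.put(row, newValue);
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Returns the current value for the row, or null if it was never written. */
    Long get(String row) {
        return rows.get(row);
    }
}
```

For example, if several threads call checkAndPut("r", 1L, (cur, n) -> cur == null) at the same time, only one of them sees the row as absent and gets its put applied; the rest fail the check.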

I'm not a big fan of passing in the constraint on the client side and then running it on the server - that seems to break a lot of the intended functionality of constraints, which should essentially act as a safeguard on your table. They should be something that is always run to make sure bad things aren't put into your table. Right now, they are able to use the configuration to make them highly malleable to running on different CFs and CQs (or not), but these should be things that hold over the lifetime of the table. I can see a use case where client-side specification might occasionally be useful, but IMHO the far more common case is to set up the constraints once on the table according to organization policy and then modify them as necessary.

Common constraints should be added later, when we actually figure out what the common use cases for constraints are - as this is a new feature, we want to make sure we don't cowboy in and start adding excessive code willy-nilly. It tends to be a lot harder to remove code once it's in than to add it later.
 
                
> Constraints - Enhance checkAndPut to do atomic arbitrary constraint checks
> --------------------------------------------------------------------------
>
>                 Key: HBASE-4999
>                 URL: https://issues.apache.org/jira/browse/HBASE-4999
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, coprocessors
>    Affects Versions: 0.92.0
>            Reporter: Suraj Varma
>              Labels: CAS, checkAndPut, constraints
>             Fix For: 0.94.0
>
>
> Related work: HBASE-4605
> It would be great if checkAndPut (CAS) can be enhanced to not just use a value comparison as a gating factor for the put, but rather have the capability of doing arbitrary constraint checks on the column value (where the current comparator approach is a subset of possible constraints that can be checked). Commonly used constraints (like comparisons) can be provided out of the box and we should have the ability to accept custom constraints set by the client for the checkAndPut call. 
> One use-case would be the ability to implement something like the below in HBase.
> Pseudo sql: 
> update table-name
> set column-name = new-value
> where (column-value - new-value) > threshold-value
> ... where the mutation would go through only if the specified constraint in the where clause is true.
> Current options include using a co-processor to do preCheckAndPut/postCheckAndPut constraint checks - but this is not atomic. i.e. the row lock needs to be released by the co-processor before the real checkAndPut call, thus not meeting the atomic requirement. 
> Everything above is still meant to be at row level (so, no cross-row constraint checking is implied here).
> An ideal end result would be that an HBase client would be able to specify a set of constraints on multiple column qualifiers as part of the checkAndPut call. The call goes through if all the constraints are satisfied or doesn't if any of the constraints fail. And the above checkAndPut should be atomically executed (just like current checkAndPut semantics).
