Posted to issues@hbase.apache.org by "Gary Helmling (Commented) (JIRA)" <ji...@apache.org> on 2011/10/20 21:38:11 UTC
[jira] [Commented] (HBASE-4605) Constraints

    [ https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131937#comment-13131937 ] 

Gary Helmling commented on HBASE-4605:
--------------------------------------

Summarizing some comments from a related thread in the dev@hbase.apache.org mailing list:

http://search-hadoop.com/m/oGvg82YEzA82

Adding special configuration hooks to {{HTableDescriptor}}, say {{HTableDescriptor.addConstraint(Class c)}}, is likely to lead to long-term clutter if we follow this strategy for each special-purpose coprocessor implementation that comes up.

So instead, I'd suggest we invert this and make the coprocessor implementation responsible for setting its configuration as attributes on {{HTableDescriptor}}.  This would follow a pattern something like the following:
{code}
HTableDescriptor htd = new HTableDescriptor(...);
Constraints.add(htd, MyConstraintImpl.class);
HBaseAdmin admin = new HBaseAdmin();
admin.createTable(htd);
{code}

The important part is that the {{Constraints}} class handles marshalling its required configuration as attributes on {{HTableDescriptor}}, and, in addition, is free to define those configuration helper methods however best makes sense for its own needs.  For the example above, I'm guessing this would be something like:
{code}
public static void add(HTableDescriptor htd, Class<? extends Constraint>... clz)
{code}

This keeps {{HTableDescriptor}} handling just the common needs (through the arbitrary key/value attributes), while keeping the specific configuration logic with each coprocessor implementation.
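
To make the marshalling idea concrete, here is a minimal sketch of what such a helper could look like. It is not taken from the attached patch: {{StubTableDescriptor}} is a hypothetical stand-in for {{HTableDescriptor}} that only exposes string key/value attributes, the {{Constraint}} interface shape is assumed, and the {{constraint$N}} attribute key scheme is invented for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for HTableDescriptor: only the arbitrary
// key/value attributes matter for this sketch.
class StubTableDescriptor {
    private final Map<String, String> values = new LinkedHashMap<>();

    void setValue(String key, String value) { values.put(key, value); }
    String getValue(String key) { return values.get(key); }
}

// Assumed shape of the constraint interface; the real one may differ.
interface Constraint {
    // Implementations throw an exception when the put is invalid.
    void check(byte[] row, byte[] value);
}

// Helper owned by the coprocessor implementation, not by the descriptor.
final class ConstraintsSketch {
    // The attribute key prefix is an assumption made for this example.
    private static final String PREFIX = "constraint$";

    private ConstraintsSketch() {}

    // Stores each constraint class name under the next free numbered key.
    @SafeVarargs
    static void add(StubTableDescriptor htd, Class<? extends Constraint>... clz) {
        int next = list(htd).size();
        for (Class<? extends Constraint> c : clz) {
            htd.setValue(PREFIX + next++, c.getName());
        }
    }

    // Reads the class names back in insertion order.
    static List<String> list(StubTableDescriptor htd) {
        List<String> names = new ArrayList<>();
        for (int i = 0; htd.getValue(PREFIX + i) != null; i++) {
            names.add(htd.getValue(PREFIX + i));
        }
        return names;
    }
}

// Example constraint used only to exercise the helper.
class NonEmptyValue implements Constraint {
    public void check(byte[] row, byte[] value) {
        if (value == null || value.length == 0) {
            throw new IllegalArgumentException("empty value");
        }
    }
}
```

The descriptor never learns what a constraint is; it only sees opaque string attributes, and the same helper class would be the natural place to read them back on the region server side.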

There is an additional challenge, however, in how we can cleanly incorporate access to set the configurations into the HBase shell.  Ideally, you'd be able to do something like (exact syntax aside):
{noformat}
hbase> alter 'mytable', METHOD => 'Constraints.add', VALUE => com.my.MyConstraint
{noformat}

or maybe
{noformat}
hbase> alter 'mytable', TYPE => 'Constraints', METHOD => 'add', VALUE => com.my.MyConstraint
{noformat}

I think the key problem is figuring out how the shell actually knows about the {{Constraints}} class.  Some options are:
# use the full class name, e.g. "org.apache.hadoop.hbase.coprocessor.Constraints".  This is clunky but probably simplest to do.
# allow the {{Constraints}} class to register itself via some SPI interface that the shell code can examine for possible providers.  This would probably still need the class to be statically initialized somehow.
# provide an extensions dir for the shell:
#* extensions drop a simple jruby scriptlet in a file in the dir (say lib/ruby/hbase/ext)
#* the scriptlet does some simple registration of the available methods/commands
#* the shell code loads all files
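
Whichever option is chosen, the shell side probably ends up with some lookup from a short name to a handler. As a rough sketch only, assuming a registry that extensions populate when they are loaded (the {{ShellExtensionRegistry}} class and its {{register}}/{{dispatch}} methods are hypothetical names, not existing shell code):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical registry the shell could consult; all names are illustrative.
final class ShellExtensionRegistry {
    // Maps "TYPE.METHOD" to a handler taking (table name, value).
    private final Map<String, BiConsumer<String, String>> handlers = new HashMap<>();

    // Called by an extension (SPI provider or loaded scriptlet) to announce a command.
    void register(String type, String method, BiConsumer<String, String> handler) {
        handlers.put(type + "." + method, handler);
    }

    // Called by the shell when it sees TYPE => ..., METHOD => ..., VALUE => ...
    void dispatch(String table, String type, String method, String value) {
        BiConsumer<String, String> handler = handlers.get(type + "." + method);
        if (handler == null) {
            throw new IllegalArgumentException(
                "Unknown shell extension: " + type + "." + method);
        }
        handler.accept(table, value);
    }
}
```

With something like this, the second {{alter}} form above would translate to {{dispatch("mytable", "Constraints", "add", "com.my.MyConstraint")}}, and the three options differ only in how {{register}} ends up being called.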

There may be better options as well, but hopefully this can serve as the basis for some discussion.
                
> Constraints
> -----------
>
>                 Key: HBASE-4605
>                 URL: https://issues.apache.org/jira/browse/HBASE-4605
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, coprocessors
>    Affects Versions: 0.94.0
>            Reporter: Jesse Yates
>            Assignee: Jesse Yates
>         Attachments: java_Constraint_v2.patch
>
>
> From Jesse's comment on dev:
> {quote}
> What I would like to propose is a simple interface that people can use to implement a 'constraint' (matching the classic database definition). This would help ease of adoption by helping HBase more easily check that box, help minimize code duplication across organizations, and lead to easier adoption.
> Essentially, people would implement a 'Constraint' interface for checking keys before they are put into a table. Puts that are valid get written to the table, but if not, people can throw an exception that gets propagated back to the client explaining why the put was invalid.
> Constraints would be set on a per-table basis and the user would be expected to ensure the jars containing the constraint are present on the machines serving that table.
> Yes, people could roll their own mechanism for doing this via coprocessors each time, but this would make it easier to do so, so you only have to implement a very minimal interface and not worry about the specifics.
> {quote}
