Posted to issues@hbase.apache.org by "Anoop Sam John (JIRA)" <ji...@apache.org> on 2013/08/23 06:44:52 UTC

[jira] [Commented] (HBASE-9291) Enable client to setAttribute that is sent once to each region server

    [ https://issues.apache.org/jira/browse/HBASE-9291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748281#comment-13748281 ] 

Anoop Sam John commented on HBASE-9291:
---------------------------------------

For a Phoenix scan, many attributes are passed via Scan#setAttribute, so they are sent with every RPC call (openScanner, next, ...).

James, is passing once at the region level fine? The use case is very much Scan-only. So how about having openAttrs in Scan in addition to attrs? The attrs, as now, will be passed with every RPC, but the open attrs will be passed only when the client request is to open a scanner. These attrs can be cached along with the RegionScanner against the scannerId. (In trunk we already have a RegionScannerHolder object in which we can save other info, such as attrs.)

So this is not passing once per RS, but passing once per region.
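
To make the proposal concrete, here is a minimal sketch of the idea in plain Java. It is not HBase code: OpenAttrScannerRegistry, Holder, openScanner, attrForNext and closeScanner are hypothetical names used only for illustration, and the real RegionScannerHolder in trunk may look quite different. It only shows open attrs being stored once against a scanner id and read back on later calls that carry nothing but the id.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for region-side scanner bookkeeping; not the HBase API.
class OpenAttrScannerRegistry {

    // Per-scanner state: the attrs sent once on open, kept until close.
    static final class Holder {
        final Map<String, byte[]> openAttrs;

        Holder(Map<String, byte[]> openAttrs) {
            this.openAttrs = openAttrs;
        }
    }

    private final Map<Long, Holder> scanners = new ConcurrentHashMap<>();
    private final AtomicLong nextScannerId = new AtomicLong(1);

    // Open-scanner request: the only call that carries the open attrs.
    long openScanner(Map<String, byte[]> openAttrs) {
        long scannerId = nextScannerId.getAndIncrement();
        // Copy defensively so later changes by the caller are not visible here.
        scanners.put(scannerId, new Holder(new HashMap<>(openAttrs)));
        return scannerId;
    }

    // next() request: carries only the scanner id; attrs come from the cache.
    byte[] attrForNext(long scannerId, String name) {
        Holder holder = scanners.get(scannerId);
        if (holder == null) {
            throw new IllegalStateException("Unknown scanner id " + scannerId);
        }
        return holder.openAttrs.get(name);
    }

    // Close request: drops the cached attrs together with the scanner.
    void closeScanner(long scannerId) {
        scanners.remove(scannerId);
    }

    int openCount() {
        return scanners.size();
    }
}
```

Since a scanner is opened per region, the cached attrs in this sketch live and die with one region's scanner, which is why the cost is once per region rather than once per RS.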

If this looks fine to you, I can take this up and provide a patch. (If you have already started working on this, please go ahead.) What do you say, James?
                
> Enable client to setAttribute that is sent once to each region server
> ---------------------------------------------------------------------
>
>                 Key: HBASE-9291
>                 URL: https://issues.apache.org/jira/browse/HBASE-9291
>             Project: HBase
>          Issue Type: New Feature
>          Components: IPC/RPC
>            Reporter: James Taylor
>
> Currently a Scan and Mutation allow the client to set its own attributes that get passed through the RPC layer and are accessible from a coprocessor. This is very handy, but breaks down if the amount of information is large, since this information ends up being sent again and again to every region. Clients can work around this with an endpoint "pre" and "post" coprocessor invocation that:
> 1) sends the information and caches it on the region server in the "pre" invocation
> 2) invokes the Scan or sends the batch of Mutations, and then
> 3) removes it in the "post" invocation.
> In this case, the client is forced to identify all region servers (ideally, all region servers that will be involved in the Scan/Mutation), make extra RPC calls, manage the caching of the information on the region server, age-out the information (in case the client dies before step (3) that clears the cached information), and must deal with the possibility of a split occurring while this operation is in-progress.
> Instead, it'd be much better if an attribute could be identified as a "region server" attribute in OperationWithAttributes and the HBase RPC layer would take care of doing the above.
> The use cases where the above is necessary in Phoenix include:
> 1) Hash joins, where the results of the smaller side of a join scan are packaged up and sent to each region server, and
> 2) Secondary indexing, where the metadata describing a) which column family/column qualifier pairs and b) which part of the row key contribute to which indexes is sent to each region server that will process a batched put.
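
The server-side half of the "pre"/"post" workaround described in the quoted issue can be sketched roughly as below. This is an illustration under assumptions, not Phoenix or HBase code: TtlAttributeCache and all of its methods are hypothetical names, and the clock is passed in explicitly to keep the example deterministic. It shows the three pieces the client has to manage today: caching in "pre", removal in "post", and age-out in case the client dies before "post".

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical region-server-side cache for the "pre"/"post" workaround.
class TtlAttributeCache {

    private static final class Entry {
        final byte[] value;
        final long expiresAtMillis;

        Entry(byte[] value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;

    TtlAttributeCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Step 1 ("pre" invocation): cache the payload under a client-chosen id.
    void put(String id, byte[] value, long nowMillis) {
        entries.put(id, new Entry(value, nowMillis + ttlMillis));
    }

    // Step 2 (during the Scan or Mutation batch): look the payload up by id.
    // Returns null if the id is unknown or the entry has already expired.
    byte[] get(String id, long nowMillis) {
        Entry entry = entries.get(id);
        if (entry == null || entry.expiresAtMillis <= nowMillis) {
            return null;
        }
        return entry.value;
    }

    // Step 3 ("post" invocation): the client removes the payload explicitly.
    void remove(String id) {
        entries.remove(id);
    }

    // Age-out: drops entries whose client never reached step 3.
    int evictExpired(long nowMillis) {
        int removed = 0;
        for (Iterator<Entry> it = entries.values().iterator(); it.hasNext();) {
            if (it.next().expiresAtMillis <= nowMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return entries.size();
    }
}
```

The sketch does not address region splits or identifying the region servers involved, which the issue lists as further burdens on the client.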
