Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2011/03/15 06:12:29 UTC

[jira] Commented: (HBASE-3607) Cursor functionality for results generated by Coprocessors

    [ https://issues.apache.org/jira/browse/HBASE-3607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006796#comment-13006796 ] 

stack commented on HBASE-3607:
------------------------------

So, what happens if the region moves mid-cursor-scan?

What is CursorCallable adding over and above Scanner?  It's not clear to me (pardon me).

Your formatting is inconsistent (spacing after keywords and around operators, braces on only some of the if blocks):

{code}
+    if(cache.size() ==0 && this.closed)
+      return null;
+    if(cache.size() ==0){//do a rpc and fetch results
+      Result[] res = this.htable.getConnection().getRegi
{code}
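
For reference, the cache-refill logic that snippet appears to implement could be written with consistent spacing and braces roughly as in the sketch below. BatchFetcher, CachingCursor and rpcCount are hypothetical names used only for illustration; the actual patch goes through this.htable.getConnection() and is truncated above, so this is an assumption about its shape and not the patch's code.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for the RPC that fetches the next batch of results.
interface BatchFetcher<T> {
    // Returns up to max items; an empty list means the result set is exhausted.
    List<T> fetch(int max) throws IOException;
}

// Hypothetical client-side cursor: serves results from a local cache and
// only goes back to the server when the cache is empty.
class CachingCursor<T> {
    private final Queue<T> cache = new ArrayDeque<>();
    private final BatchFetcher<T> fetcher;
    private final int caching;
    private boolean closed = false;
    int rpcCount = 0; // exposed only so the sketch can show the round trips

    CachingCursor(BatchFetcher<T> fetcher, int caching) {
        this.fetcher = fetcher;
        this.caching = caching;
    }

    // Returns the next result, or null once the result set is exhausted.
    T next() throws IOException {
        if (cache.isEmpty() && closed) {
            return null;
        }
        if (cache.isEmpty()) {
            // Cache drained: do one round trip and refill.
            List<T> batch = fetcher.fetch(caching);
            rpcCount++;
            if (batch == null || batch.isEmpty()) {
                closed = true;
                return null;
            }
            cache.addAll(batch);
        }
        return cache.poll();
    }

    public static void main(String[] args) throws IOException {
        Iterator<String> source = Arrays.asList("a", "b", "c", "d", "e").iterator();
        // Fake fetcher that hands out the source list in batches of at most max.
        CachingCursor<String> cursor = new CachingCursor<>(max -> {
            List<String> out = new ArrayList<>();
            while (out.size() < max && source.hasNext()) {
                out.add(source.next());
            }
            return out;
        }, 2);
        String item;
        while ((item = cursor.next()) != null) {
            System.out.println(item);
        }
        System.out.println("round trips: " + cursor.rpcCount);
    }
}
```

With a caching value of 2, five results cost four round trips in this sketch (three that return data and one that finds the set exhausted), which is the same trade-off Scanner caching already offers.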

These are pretty radical additions:

{code}
+  /*
+   * get result from cp cursor
+   */
+  public Result[] nextCp(long cursorId, int cache) throws IOException;
+  /**
+   * closing the associated cursor object and release its region level resources
+   * @param cursorId
+   * @throws IOException
+   */
+  public void closeCp(long cursorId) throws IOException;
{code} 

Are they necessary?  Why do we have to modify HRegion when we have CPs now?

Yeah, same for these additions to HRegionServer.

I do not see the direct benefit of all these big changes, Himanshu.  Help me understand.



> Cursor functionality for results generated by Coprocessors
> ----------------------------------------------------------
>
>                 Key: HBASE-3607
>                 URL: https://issues.apache.org/jira/browse/HBASE-3607
>             Project: HBase
>          Issue Type: New Feature
>          Components: coprocessors
>            Reporter: Himanshu Vashishtha
>         Attachments: patch-2.txt
>
>
> I tried to come up with scanner-like functionality for results generated by coprocessors at the region level.
> This is just a PoC, and it will be good to have your comments on it.
> It has support for both incremental and in-memory result sets. Attached is a patch that has a test case for an incremental result (i.e., the client receives a cursorId from the CP core method, instantiates a cursor object, and iterates over the result set). The client can set a cache limit on the CursorCallable object to reduce the number of RPCs, just like scanners.
> In its current state, it has some limitations too :)). For example, it is region-specific, i.e., one can instantiate and use a cursor at one region only (and that region is determined by the input row while instantiating the cursor). I will try to expand it so that it can have at least sequential access to other regions, but as I said, I want the opinion of experts to know whether this approach really makes sense or not.
> I have tested it with the inbuilt testing framework on my laptop only.
> It will be good if I copy the use case here in the description too:
> Test table has rows like:
>  /**
>    * The scenario is that I have these rows keys in the test table:
>   'aaa-123'
>   'aaa-456'
>   'abc-111'
>   'abd-111'
>   'abd-222'
>   & I want to return:
>   ('aaa', 2)
>   ('abc', 1)
>   ('abd', 2)
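
The use case in the quoted description (group row keys by the prefix before '-', and return a count per prefix) can be sketched in plain Java as below. This is only an illustration of the expected output: PrefixCounter and countByPrefix are hypothetical names, and the code does not use the HBase or coprocessor APIs, nor is it taken from the attached patch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the patch: counts row keys per prefix,
// where the prefix is everything before the first '-'.
class PrefixCounter {
    static Map<String, Integer> countByPrefix(List<String> rowKeys) {
        // LinkedHashMap keeps prefixes in first-seen order, which is also
        // sorted order when the keys arrive sorted, as they do within a region.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String key : rowKeys) {
            int dash = key.indexOf('-');
            String prefix = dash < 0 ? key : key.substring(0, dash);
            counts.merge(prefix, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
                "aaa-123", "aaa-456", "abc-111", "abd-111", "abd-222");
        System.out.println(countByPrefix(keys)); // {aaa=2, abc=1, abd=2}
    }
}
```

In the proposed design, each (prefix, count) pair would be one entry of the result set that the cursor iterates over.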
