Posted to dev@accumulo.apache.org by "Keith Turner (JIRA)" <ji...@apache.org> on 2012/05/08 16:49:53 UTC

[jira] [Created] (ACCUMULO-580) Make size of batch scanner client size buffer configurable

Keith Turner created ACCUMULO-580:
-------------------------------------

             Summary: Make size of batch scanner client size buffer configurable
                 Key: ACCUMULO-580
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-580
             Project: Accumulo
          Issue Type: Improvement
          Components: client
            Reporter: Keith Turner
            Assignee: Billie Rinaldi


The batch scanner has a buffer on the client side where results read from tservers are stored.  This buffer holds 1000 entries and is not configurable by the client.  When it fills up, all threads reading from tservers block.
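
To illustrate the blocking behavior described above, here is a minimal Java sketch built on a bounded java.util.concurrent.ArrayBlockingQueue.  Only the 1000-entry capacity comes from the issue; the class and method names are hypothetical, and the choice of ArrayBlockingQueue is an assumption about how such a buffer could be built, not a statement about the actual client code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo class; not part of the Accumulo client.
class BoundedBufferDemo {
    // Capacity taken from the issue description.
    static final int CAPACITY = 1000;

    /** Returns how many entries a producer can add before the buffer is full. */
    static int fillUntilFull(BlockingQueue<String> buffer) {
        int accepted = 0;
        // offer() returns false instead of blocking when the queue is full.
        // A reader thread calling put() would block at this point until
        // the consumer removes an entry.
        while (buffer.offer("entry-" + accepted)) {
            accepted++;
        }
        return accepted;
    }

    /** Convenience factory for a buffer with the fixed capacity. */
    static BlockingQueue<String> newBuffer() {
        return new ArrayBlockingQueue<>(CAPACITY);
    }
}
```

Because the capacity counts entries rather than bytes, the point at which producers stall does not depend on how large each entry is.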

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (ACCUMULO-580) Make size of batch scanner client size buffer configurable

Posted by "Billie Rinaldi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278803#comment-13278803 ] 

Billie Rinaldi commented on ACCUMULO-580:
-----------------------------------------

Is the suggestion to add a new method to the BatchScanner interface to set the batch size?
                

[jira] [Assigned] (ACCUMULO-580) Make size of batch scanner client size buffer configurable

Posted by "Billie Rinaldi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ACCUMULO-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Billie Rinaldi reassigned ACCUMULO-580:
---------------------------------------

    Assignee: Keith Turner  (was: Billie Rinaldi)
    

[jira] [Commented] (ACCUMULO-580) Make size of batch scanner client size buffer configurable

Posted by "Billie Rinaldi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280261#comment-13280261 ] 

Billie Rinaldi commented on ACCUMULO-580:
-----------------------------------------

After looking into this further, I propose moving setBatchSize and getBatchSize from the Scanner interface to the ScannerBase interface, which is shared by Scanner and BatchScanner implementations.
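
A rough Java sketch of what that move could look like follows.  The interfaces here are simplified stand-ins with a "Sketch" suffix, not the real Accumulo interfaces; the implementing class, its default value, and the argument check are all assumptions made for illustration.

```java
// Simplified stand-in for the shared base interface (hypothetical).
interface ScannerBaseSketch {
    void setBatchSize(int size);

    int getBatchSize();
}

// Both scanner flavors inherit the batch size accessors from the base.
interface ScannerSketch extends ScannerBaseSketch {
}

interface BatchScannerSketch extends ScannerBaseSketch {
}

// Hypothetical implementation showing that a batch scanner gains the setter.
class SimpleBatchScannerSketch implements BatchScannerSketch {
    // The default value is an assumption for this sketch.
    private int batchSize = 1000;

    @Override
    public void setBatchSize(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("batch size must be positive");
        }
        this.batchSize = size;
    }

    @Override
    public int getBatchSize() {
        return batchSize;
    }
}
```

With this shape, code written against the base interface can tune the batch size without knowing which kind of scanner it holds.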
                

[jira] [Commented] (ACCUMULO-580) Make size of batch scanner client size buffer configurable

Posted by "Keith Turner (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280556#comment-13280556 ] 

Keith Turner commented on ACCUMULO-580:
---------------------------------------

The intent of this fixed-size buffer is to limit memory usage, but I do not think it does a very good job of that.  I am thinking this queue should be of unlimited size, with each batch scanner thread limited to two active batches.  A batch scanner thread would fetch a batch and put the entire batch on the queue.  It could then go fetch another batch.  After that, the thread would block until at least one of its previous two batches was consumed.  This would limit memory usage to 2*BatchSizeBytes*NumThreads.  The current default batch size is 1MB; this could be set to 512K to approximate current memory usage.

Allowing two batches per thread should allow for more parallelism and limit memory usage.

The current fixed-size buffer of 1000 entries could lead to excessive memory usage if large key/value pairs are returned.  This is the reason I think it's not a suitable solution for managing memory.
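
The two-batches-per-thread idea above could be sketched in Java roughly as follows.  This is only an illustration under stated assumptions: every class and method name is hypothetical, entries are plain strings, and a java.util.concurrent.Semaphore per reader thread stands in for whatever mechanism the real change uses.  A reader takes a permit before fetching, so it never holds more than two unconsumed batches, and the consumer returns the permit once it has finished with a batch.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical sketch; not the actual Accumulo implementation.
class TwoBatchLimiter {
    static final int MAX_ACTIVE_BATCHES = 2;

    /** A fetched batch that remembers which reader's permit it holds. */
    static final class Batch {
        final List<String> entries;
        private final Semaphore ownerPermits;

        Batch(List<String> entries, Semaphore ownerPermits) {
            this.entries = entries;
            this.ownerPermits = ownerPermits;
        }

        /** Called by the consumer once the whole batch has been read. */
        void consumed() {
            ownerPermits.release();
        }
    }

    // Unbounded queue shared by all reader threads.
    final BlockingQueue<Batch> results = new LinkedBlockingQueue<>();

    /** Per-thread handle; each reader thread would own one. */
    final class Reader {
        private final Semaphore permits = new Semaphore(MAX_ACTIVE_BATCHES);

        /** Blocks while this reader already has two unconsumed batches. */
        void fetchAndPublish(Supplier<List<String>> fetch) throws InterruptedException {
            permits.acquire();
            results.add(new Batch(fetch.get(), permits));
        }

        /** Non-blocking variant: returns false if the limit is reached. */
        boolean tryFetchAndPublish(Supplier<List<String>> fetch) {
            if (!permits.tryAcquire()) {
                return false;
            }
            results.add(new Batch(fetch.get(), permits));
            return true;
        }
    }

    /** Upper bound from the comment: 2 * batch size in bytes * thread count. */
    static long memoryBoundBytes(long batchSizeBytes, int numThreads) {
        return 2L * batchSizeBytes * numThreads;
    }
}
```

Under this sketch the bound depends on the byte size of a batch rather than on an entry count.  For example, with 512K batches and a hypothetical 10 reader threads, memoryBoundBytes(512L * 1024, 10) works out to 10MB.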


                

[jira] [Resolved] (ACCUMULO-580) Make size of batch scanner client size buffer configurable

Posted by "Keith Turner (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ACCUMULO-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keith Turner resolved ACCUMULO-580.
-----------------------------------

       Resolution: Fixed
    Fix Version/s: 1.5.0
    