Posted to notifications@accumulo.apache.org by "Keith Turner (JIRA)" <ji...@apache.org> on 2019/04/23 15:21:00 UTC

[jira] [Resolved] (ACCUMULO-4154) Improve batch writer

     [ https://issues.apache.org/jira/browse/ACCUMULO-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keith Turner resolved ACCUMULO-4154.
------------------------------------
    Resolution: Duplicate

https://github.com/apache/accumulo/issues/1120

> Improve batch writer
> --------------------
>
>                 Key: ACCUMULO-4154
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4154
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Keith Turner
>            Priority: Major
>
> The batch writer currently has two drawbacks:
>  * It waits for its memory to be half full and then bins mutations for send threads.  I don't think this is optimal; it would be better to keep the send threads busy.  As soon as there are mutations, start working on them.  If the send threads cannot keep up, then work will naturally build up (without waiting for memory to be half full).
>  * The flush method blocks threads trying to add anything to the batch writer.
> Thinking of implementing the following model for the batch writer, which is similar to how the conditional writer works.
>   * Have a queue that all incoming mutations are added to.
>   * Have a queue per tablet server.
>   * Have a single thread that is constantly taking batches of mutations off the incoming queue, binning them, and placing them on tablet server queues.
>   * When a send thread becomes idle, have it select and reserve the tablet server queue with the most work on it.
>   * When mutations fail, send threads can add them back to the incoming queue.
> To get better flushing behavior, as each mutation is added to the batch writer it can be assigned a one-up counter.  We can keep track of the minimum in-progress mutation.  Flush can inspect this counter and wait for the minimum active mutation to reach a certain count.
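
The queue model in the quoted description could be sketched roughly as follows. This is only an illustration under stated assumptions: QueueModelSketch, its method names, and the locator function are hypothetical and are not part of the Accumulo API. Mutations are reduced to row-key strings, and the binner and send threads are represented by plain method calls so the flow is easy to follow.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

// Hypothetical sketch; none of these names are Accumulo classes.
class QueueModelSketch {
  // Single queue that all incoming mutations (here: row keys) are added to.
  private final BlockingQueue<String> incoming = new LinkedBlockingQueue<>();
  // One queue per tablet server, keyed by a server name.
  private final Map<String, Queue<String>> serverQueues = new HashMap<>();
  // Servers whose queue is currently reserved by a send thread.
  private final Set<String> reserved = new HashSet<>();
  // Assumed stand-in for tablet location lookup: row key -> server name.
  private final Function<String, String> locator;

  QueueModelSketch(Function<String, String> locator) {
    this.locator = locator;
  }

  // Called by client threads; never waits for memory to fill up.
  void addMutation(String row) {
    incoming.add(row);
  }

  // One pass of the binner thread: take a batch off the incoming queue
  // and place each mutation on its tablet server queue.
  synchronized int binBatch(int batchSize) {
    List<String> batch = new ArrayList<>();
    incoming.drainTo(batch, batchSize);
    for (String row : batch) {
      serverQueues.computeIfAbsent(locator.apply(row), s -> new ArrayDeque<>()).add(row);
    }
    return batch.size();
  }

  // Called by an idle send thread: reserve the unreserved queue with the
  // most work. Returns null when no unreserved queue has work.
  synchronized String reserveBusiestServer() {
    String best = null;
    int bestSize = 0;
    for (Map.Entry<String, Queue<String>> e : serverQueues.entrySet()) {
      int size = e.getValue().size();
      if (size > bestSize && !reserved.contains(e.getKey())) {
        best = e.getKey();
        bestSize = size;
      }
    }
    if (best != null) {
      reserved.add(best);
    }
    return best;
  }

  // Remove and return everything queued for a server reserved by the caller.
  synchronized List<String> takeWork(String server) {
    Queue<String> q = serverQueues.get(server);
    List<String> work = new ArrayList<>();
    if (q != null) {
      work.addAll(q);
      q.clear();
    }
    return work;
  }

  // Release the reservation; failed mutations go back on the incoming queue.
  synchronized void release(String server, List<String> failed) {
    incoming.addAll(failed);
    reserved.remove(server);
  }

  int incomingSize() {
    return incoming.size();
  }
}
```

In a real writer the binner would presumably loop on binBatch in its own thread, and each send thread would loop on reserveBusiestServer, takeWork, send, release. Failed mutations are simply re-binned on the next pass.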

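The flush idea from the quoted description (a one-up counter per mutation plus tracking of the minimum in-progress mutation) might look something like the sketch below. FlushTrackerSketch and all of its method names are hypothetical, chosen only for illustration. The point it tries to show is that flush waits on a captured counter value, so threads adding new mutations are not held up while a flush is pending.

```java
import java.util.TreeSet;

// Hypothetical sketch; not an Accumulo class.
class FlushTrackerSketch {
  // Next one-up counter value to hand out.
  private long nextSeq = 0;
  // Counter values of mutations that were added but not yet written.
  private final TreeSet<Long> inProgress = new TreeSet<>();

  // Assign a one-up counter to a newly added mutation.
  synchronized long add() {
    long seq = nextSeq++;
    inProgress.add(seq);
    return seq;
  }

  // Mark a mutation as successfully written and wake any waiting flush.
  synchronized void complete(long seq) {
    inProgress.remove(seq);
    notifyAll();
  }

  // Minimum in-progress counter; nextSeq when nothing is in progress.
  synchronized long minInProgress() {
    return inProgress.isEmpty() ? nextSeq : inProgress.first();
  }

  // The count a flush starting now has to wait for.
  synchronized long flushTarget() {
    return nextSeq;
  }

  // True once every mutation with a counter below target has completed.
  synchronized boolean isFlushed(long target) {
    return minInProgress() >= target;
  }

  // Wait until all mutations added before this call are written.
  // wait() releases the monitor, so add() can proceed while this blocks.
  synchronized void flush() throws InterruptedException {
    long target = nextSeq;
    while (!isFlushed(target)) {
      wait();
    }
  }
}
```

Mutations added after the flush target is captured get higher counter values, so they do not delay that flush.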


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)