Posted to notifications@accumulo.apache.org by "Josh Elser (JIRA)" <ji...@apache.org> on 2016/03/02 20:22:18 UTC

[jira] [Commented] (ACCUMULO-4156) Tunable replication frequency

    [ https://issues.apache.org/jira/browse/ACCUMULO-4156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15176288#comment-15176288 ] 

Josh Elser commented on ACCUMULO-4156:
--------------------------------------

{quote}
I don't have a specific implementation in mind, but I'd like to see a solution that involves isolating the work down to specific table events such as time-since-last-replication and data-added-since-last-replication.

Josh Elser has had some ideas about doing things incrementally within WAL files (ie, replicating between two sync points) that can also help with this. 
{quote}

I wish I remembered more of the model for "doing this safely" (replicating from some offsetA to offsetB in a WAL), but my brain has evicted what little I once had figured out. The original design was meant to work like this: proactively replicate the data once it was synced to the WAL, since that is the point at which we are guaranteed that the data is "written". There was something I ran into along the way, though, and I can't recall exactly what it was. It would be great to remove the little flag that ignores replication until a WAL is "closed" (i.e., can no longer be used by any tserver). Maybe it was related to the lack of implicit entries in a WAL? We don't explicitly track how many entries are in a WAL now (just an "infinite length" equating to reading the entire WAL for replication), which would make this very difficult to track. If we could keep a simple one-level index somewhere (byte offset to WAL entry record offset), that might be enough.
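
To make the one-level index idea a bit more concrete, here is a minimal sketch. Nothing like this exists in Accumulo as far as this comment goes; the class name WalOffsetIndex and all of its methods are hypothetical, and it assumes we could record, at each sync point, the byte offset and the number of entries written so far.

```java
import java.util.TreeMap;

/**
 * Hypothetical sparse index for a single WAL: maps the byte offset of a
 * sync point to the number of entries written before that offset.
 * This is an illustration only, not an existing Accumulo class.
 */
class WalOffsetIndex {
    // Sorted by byte offset so we can find the nearest earlier sync point.
    private final TreeMap<Long, Long> entriesAtOffset = new TreeMap<>();

    /** Record that entryCount entries precede the sync point at byteOffset. */
    void recordSync(long byteOffset, long entryCount) {
        entriesAtOffset.put(byteOffset, entryCount);
    }

    /** Largest recorded sync offset that is <= byteOffset, or -1 if there is none. */
    long lastSyncAtOrBefore(long byteOffset) {
        Long key = entriesAtOffset.floorKey(byteOffset);
        return key == null ? -1L : key;
    }

    /** Number of entries between two recorded sync points (fromOffset up to toOffset). */
    long entriesBetween(long fromOffset, long toOffset) {
        Long from = entriesAtOffset.get(fromOffset);
        Long to = entriesAtOffset.get(toOffset);
        if (from == null || to == null || toOffset < fromOffset) {
            throw new IllegalArgumentException("both offsets must be recorded sync points, in order");
        }
        return to - from;
    }
}
```

With something like this, replicating "from offsetA to offsetB" would mean picking two recorded sync points and shipping the entries between them, rather than treating the WAL as having infinite length.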

It might be easy to force a roll of WALs from some client admin API, but that also has local write performance implications. I think we'd need to think about it from both sides: operational use and developer enablement/ease-of-use.

> Tunable replication frequency
> -----------------------------
>
>                 Key: ACCUMULO-4156
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4156
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.7.1
>            Reporter: William Slacum
>             Fix For: 1.8.0
>
>
> Currently, replication happens when a write-ahead log file is closed. The only parameter that controls when this event occurs is the write-ahead log size, and it applies only to the tablet servers themselves.
> By default this means that when replication happens isn't tied only to the table it is configured on, but also to exogenous factors such as total write load and failures. If a system receives ~100MB/day/TServer, and the WAL size is its default 1GB, it will take roughly 10 days for any replication event to occur. Another possibility is that an unreplicated table is receiving many writes, which will cause more frequent replication events, but proportionally the work will involve less data for the table being replicated.
> I don't have a specific implementation in mind, but I'd like to see a solution that involves isolating the work down to specific table events such as time-since-last-replication and data-added-since-last-replication.
> [~elserj] has had some ideas about doing things incrementally within WAL files (ie, replicating between two sync points) that can also help with this. 
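
For illustration, the two table-level events named in the description (time-since-last-replication and data-added-since-last-replication) could be combined into a simple trigger like the one below. This is only a sketch under assumptions: ReplicationTrigger, its thresholds, and daysToFillWal are hypothetical names, not existing Accumulo properties or classes, and how the per-table age and byte counts would be tracked is left open.

```java
import java.time.Duration;

/**
 * Hypothetical per-table replication trigger: fire when either enough time
 * has passed or enough data has accumulated since the last replication.
 * Illustration only; not an existing Accumulo class or configuration.
 */
class ReplicationTrigger {
    private final Duration maxAge;       // assumed per-table time threshold
    private final long maxPendingBytes;  // assumed per-table data threshold

    ReplicationTrigger(Duration maxAge, long maxPendingBytes) {
        this.maxAge = maxAge;
        this.maxPendingBytes = maxPendingBytes;
    }

    /** True if either threshold has been reached for the table. */
    boolean shouldReplicate(Duration sinceLastReplication, long bytesSinceLastReplication) {
        return sinceLastReplication.compareTo(maxAge) >= 0
                || bytesSinceLastReplication >= maxPendingBytes;
    }

    /** Days for a WAL of the given size to fill at a steady ingest rate. */
    static double daysToFillWal(long walSizeBytes, long bytesPerDay) {
        return (double) walSizeBytes / bytesPerDay;
    }
}
```

The helper daysToFillWal just restates the arithmetic from the description: at about 100MB/day against a 1GB WAL, size-based closing alone takes on the order of 10 days, whereas a time threshold of, say, one hour would fire regardless of write volume.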



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)