Posted to notifications@accumulo.apache.org by "Eric Newton (JIRA)" <ji...@apache.org> on 2014/12/18 18:14:13 UTC
[jira] [Comment Edited] (ACCUMULO-3423) speed up WAL roll-overs
[ https://issues.apache.org/jira/browse/ACCUMULO-3423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14250982#comment-14250982 ]
Eric Newton edited comment on ACCUMULO-3423 at 12/18/14 5:13 PM:
-----------------------------------------------------------------
I have prototyped the approach.
* eliminated the lazy update of log entries for every tablet
* opened the next WAL file asynchronously
* pre-wrote the meta entries for the new log
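A minimal sketch of the "open the next WAL asynchronously, then swap writers" idea, written as standalone Java rather than Accumulo code. Every name here (WalRollSketch, LogWriter, the "wal-N" file names) is hypothetical, the writer only counts bytes in memory, and the metadata pre-write and recovery steps are left out. It is meant to illustrate the shape of the change, not the prototype itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class WalRollSketch {

    /** Hypothetical stand-in for a WAL writer; it only counts bytes in memory. */
    static final class LogWriter {
        final String name;
        long bytesWritten = 0;
        volatile boolean closed = false;

        LogWriter(String name) { this.name = name; }

        void append(byte[] entry) { bytesWritten += entry.length; }

        void close() { closed = true; }
    }

    private final long rollSize;
    private final AtomicInteger nextId = new AtomicInteger();
    // Single daemon thread that opens and closes logs off the write path.
    private final ExecutorService background = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "wal-background");
        t.setDaemon(true);
        return t;
    });
    private final List<CompletableFuture<Void>> pendingCloses = new ArrayList<>();
    private LogWriter current;
    private CompletableFuture<LogWriter> next;

    public WalRollSketch(long rollSize) {
        this.rollSize = rollSize;
        this.current = open();     // the first log is opened synchronously
        this.next = openAsync();   // the second log is opened in the background
    }

    private LogWriter open() {
        return new LogWriter("wal-" + nextId.getAndIncrement());
    }

    private CompletableFuture<LogWriter> openAsync() {
        return CompletableFuture.supplyAsync(this::open, background);
    }

    /** Appends to the current log and rolls once it reaches the roll-over size. */
    public synchronized void append(byte[] entry) {
        current.append(entry);
        if (current.bytesWritten >= rollSize) {
            roll();
        }
    }

    /** The roll is only a writer swap; the old log is closed in the background. */
    private void roll() {
        LogWriter old = current;
        current = next.join();     // normally already open, so this does not block
        next = openAsync();        // start opening the log after this one
        pendingCloses.add(CompletableFuture.runAsync(old::close, background));
    }

    public synchronized String currentLogName() { return current.name; }

    /** Waits for outstanding background closes, then stops the worker thread. */
    public void shutdown() {
        List<CompletableFuture<Void>> closes;
        synchronized (this) {
            closes = new ArrayList<>(pendingCloses);
        }
        closes.forEach(CompletableFuture::join);
        background.shutdown();
    }

    public static void main(String[] args) {
        WalRollSketch wal = new WalRollSketch(10);   // tiny roll size to force a roll
        for (int i = 0; i < 5; i++) {
            wal.append(new byte[4]);
        }
        System.out.println("current log: " + wal.currentLogName());
        wal.shutdown();
    }
}
```

The point of the design is that the writer thread pays only for the swap: the open of the replacement log and the close of the old one both happen on the background thread.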
I wrote a test that compares the performance of continuous ingest with a very small WAL roll-over size against a run where the WAL does not roll over at all.
The methodology of the test:
#) Set the WAL size to 10M
#) Create a table with 50 splits per tablet server
#) Wait for balance
#) Time the ingest of 2M continuous ingest entries
#) Drop the table
#) Take the average of three attempts
#) Reset the WAL size to 1G, restart the tablet servers
#) Perform the same ingest tests
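The averaging and comparison in the steps above can be sketched as a small timing harness. This is an illustration only: IngestTimingSketch, meanMillis and percentOf are hypothetical names, and the workload passed in is a placeholder for "create table, wait for balance, ingest, drop table", not real continuous ingest.

```java
public class IngestTimingSketch {

    /** Runs the workload the given number of times and returns the mean elapsed milliseconds. */
    public static double meanMillis(Runnable workload, int attempts) {
        long totalNanos = 0;
        for (int i = 0; i < attempts; i++) {
            long start = System.nanoTime();
            workload.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1e6 / attempts;
    }

    /** Expresses the small-WAL mean time as a percentage of the large-WAL mean time. */
    public static double percentOf(double smallWalMean, double largeWalMean) {
        return 100.0 * smallWalMean / largeWalMean;
    }

    public static void main(String[] args) {
        // Placeholder workload: a short sleep standing in for one timed ingest attempt.
        Runnable placeholderIngest = () -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        double smallWal = meanMillis(placeholderIngest, 3);   // e.g. WAL size 10M
        double largeWal = meanMillis(placeholderIngest, 3);   // e.g. WAL size 1G
        System.out.printf("small WAL runs at %.0f%% of large WAL%n",
                percentOf(smallWal, largeWal));
    }
}
```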
There are some minor configuration adjustments for the test (otherwise, just standard MAC):
{noformat}
tserver.wal.replication=1
table.minc.logs.max=100
gc.file.archive=false
{noformat}
Before the changes, WAL roll-over made the small-WAL test take 130% of the time of the large-WAL test.
Afterward, it takes 108 - 117%.
I haven't yet written the recovery code or the file GC modifications, or dealt with backwards compatibility.
The extremely small WAL ensures lots of rollovers, but the number of tablets per tserver is reasonable.
was (Author: ecn):
I have prototyped the approach.
* eliminated the lazy update of log entries to every tablet server
* open the next WAL file asynchronously
* pre-wrote the Meta entries for the new log
I wrote a test that compares the performance of continuous ingest using a very small WAL rollover, and one that will not rollover at all.
The methodology of the test:
#) Set the WAL size to 10M
#) Create a table with 50 splits per tablet server
#) Wait for balance
#) Time a run a continuous ingest of 2 million entries
#) Drop the table
#) Take the average of three attempts
#) Reset the WAL size to 1G, restart the tablet servers
#) Perform the same ingest tests
There are some minor configuration adjustments for the test (otherwise, just standard MAC):
{noformat}
tserver.wal.replication=1
table.minc.logs.max=100
gc.file.archive=false
{noformat}
Before the changes, the generation of the WAL entries caused the small WAL test to run at 130% of the large WAL test.
Afterward, they are 108 - 117%.
I haven't written the recovery code, file GC modifications or dealt with backwards compatibility.
The extremely small WAL ensures lots of rollovers, but the #tablets/tserver is reasonable.
> speed up WAL roll-overs
> -----------------------
>
> Key: ACCUMULO-3423
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3423
> Project: Accumulo
> Issue Type: Bug
> Components: master, tserver
> Reporter: Eric Newton
>
> After reading the proposal on HBASE-10278, I realized there are many ways to make the Accumulo WAL roll-over faster.
> # Open two WALogs, but use only one until it reaches the WALog roll-over size
> # Rollover consists only of swapping the writers
> # WALog roll consists of the final close, which can happen in parallel
> # Don't mark the tablets with log entries: they are already marked with the tserver
> # The tserver can make notes about the logs-in-use in the metadata table(s) as part of opening the log.
> # The master can copy the log entries to tablets while unassigning them, piggybacking on the unassignment mutation.
> # Tablet servers can remove their current log entries from the metadata tables when they have no tablets using them.
> There are two issues:
> # tablets will have an empty file in recovery, nearly all the time, but the recovery code already handles that case.
> # presently, a tablet doesn't have a marker for a log it did not use. Many more tablets will attempt to recover when it is unnecessary.
> This would also address ACCUMULO-2889.