Posted to issues@hbase.apache.org by "Yu Li (JIRA)" <ji...@apache.org> on 2017/02/22 06:15:44 UTC

[jira] [Comment Edited] (HBASE-17471) Region Seqid will be out of order in WAL if using mvccPreAssign

    [ https://issues.apache.org/jira/browse/HBASE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877550#comment-15877550 ] 

Yu Li edited comment on HBASE-17471 at 2/22/17 6:15 AM:
--------------------------------------------------------

Here is the performance data before/after the change, measured on our customized 1.1.2 with PCIe-SSD (with HBASE-17676 applied as well), which shows no regression:

||Case||Throughput(ops/sec)||AverageLatency(us)||
|before|127708|4983|
|after|127608|4987|

Test environment:
{noformat}
Hardware:
4 physical client nodes, 1 single RS, 3 DataNodes
1 PCIe-SSD, 10 SATA disks

YCSB configurations:
8 YCSB processes on each client node
operationcount=20000000
threadcount=20 (overall 4*8*20=640 threads against the single RS)
insertproportion=1

HBase configurations:
hbase.hregion.memstore.flush.size => 268435456
hbase.regionserver.handler.count => 192
hbase.wal.storage.policy => ALL_SSD

table schema:
{NAME => 'cf', DATA_BLOCK_ENCODING => 'DIFF', VERSIONS=> '1', 
COMPRESSION => 'SNAPPY', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
{SPLITS => (1..9).map {|i| "user#{1000+i*(9999-1000)/9}"}, DURABILITY=>'SYNC_WAL',
METADATA => {'hbase.hstore.block.storage.policy' => 'ALL_SSD'}}
{noformat}
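As an aside, the SPLITS expression in the schema above can be evaluated in plain Ruby (no HBase shell needed) to see the resulting region boundaries; a minimal sketch:

```ruby
# Evaluate the SPLITS expression from the table schema above: it yields
# nine evenly spaced row keys across the YCSB key range user1000..user9999
# (integer division), pre-splitting the table into ten regions.
splits = (1..9).map { |i| "user#{1000 + i * (9999 - 1000) / 9}" }
puts splits.inspect
# first split key is "user1999", last is "user9999"
```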



> Region Seqid will be out of order in WAL if using mvccPreAssign
> ---------------------------------------------------------------
>
>                 Key: HBASE-17471
>                 URL: https://issues.apache.org/jira/browse/HBASE-17471
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>    Affects Versions: 2.0.0, 1.4.0
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Critical
>         Attachments: HBASE-17471-duo.patch, HBASE-17471-duo-v1.patch, HBASE-17471-duo-v2.patch, HBASE-17471.patch, HBASE-17471.tmp, HBASE-17471.v2.patch, HBASE-17471.v3.patch, HBASE-17471.v4.patch, HBASE-17471.v5.patch, HBASE-17471.v6.patch
>
>
>  mvccPreAssign was brought in by HBASE-16698, which truly improved write performance, especially in the ASYNC_WAL scenario. But mvccPreAssign is only used in {{doMiniBatchMutate}}, not in the Increment/Append path. If Increment/Append and batch put are used against the same region in parallel, the seqids of that region may not be monotonically increasing in the WAL, since one write path acquires the mvcc/seqid before appending, while the other acquires it in the append/sync consumer thread.
> The out-of-order situation can easily be reproduced by a simple UT, which is attached. I modified the code to assert on the disorder:
> {code}
>     if(this.highestSequenceIds.containsKey(encodedRegionName)) {
>       assert highestSequenceIds.get(encodedRegionName) < sequenceid;
>     }
> {code}
> I'd like to say, if we allowed disorder in WALs, then this would not be an issue.
> But as far as I know, if {{highestSequenceIds}} is not properly set, some WALs may not be archived to oldWALs correctly.
> What I haven't figured out yet is whether disorder in the WAL will cause data loss when recovering from a disaster. If so, then it is a big problem that needs to be fixed.
> I have fixed this problem in our custom 1.1.x branch; my solution is to use mvccPreAssign everywhere, making it non-configurable, since mvccPreAssign is indeed a better way than assigning the seqid in the ringbuffer thread while keeping handlers waiting for it.
> If anyone thinks it is doable, I will port it to branch-1 and the master branch and upload it.
>  
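The assert in the quoted description enforces per-region seqid monotonicity in the WAL. A minimal, self-contained sketch of that invariant in plain Java (not HBase code; the class and method names here are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the per-region monotonicity check from the quoted assert:
// each sequence id appended to the WAL for a region must be strictly
// greater than the highest one previously seen for that region.
public class SeqIdMonotonicityCheck {
    private final Map<String, Long> highestSequenceIds = new HashMap<>();

    // Records the seqid and returns false if it is out of order
    // for this region (i.e. not strictly increasing).
    public boolean append(String encodedRegionName, long sequenceId) {
        Long highest = highestSequenceIds.get(encodedRegionName);
        if (highest != null && highest >= sequenceId) {
            return false; // disorder: would break WAL archiving bookkeeping
        }
        highestSequenceIds.put(encodedRegionName, sequenceId);
        return true;
    }

    public static void main(String[] args) {
        SeqIdMonotonicityCheck wal = new SeqIdMonotonicityCheck();
        // Interleaving described in the report: a batch put pre-assigned
        // seqid 5, but an Increment that acquired its seqid later (6)
        // reached the WAL append first.
        System.out.println(wal.append("region-1", 6)); // true
        System.out.println(wal.append("region-1", 5)); // false: out of order
    }
}
```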



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)