Posted to dev@hbase.apache.org by "Tomu Tsuruhara (JIRA)" <ji...@apache.org> on 2017/04/04 02:03:41 UTC

[jira] [Created] (HBASE-17871) scan#setBatch(int) call leads wrong result of VerifyReplication

Tomu Tsuruhara created HBASE-17871:
--------------------------------------

             Summary: scan#setBatch(int) call leads wrong result of VerifyReplication
                 Key: HBASE-17871
                 URL: https://issues.apache.org/jira/browse/HBASE-17871
             Project: HBase
          Issue Type: Bug
    Affects Versions: 2.0.0, 1.4.0
            Reporter: Tomu Tsuruhara
            Assignee: Tomu Tsuruhara
            Priority: Minor


The VerifyReplication tool printed unexpected log output:

{noformat}
2017-04-03 23:30:50,252 ERROR [main] org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: CONTENT_DIFFERENT_ROWS, rowkey=a00001001930000
2017-04-03 23:30:50,280 ERROR [main] org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: ONLY_IN_PEER_TABLE_ROWS, rowkey=a00001001930000
2017-04-03 23:30:50,387 ERROR [main] org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: CONTENT_DIFFERENT_ROWS, rowkey=a00001003850000
2017-04-03 23:30:50,414 ERROR [main] org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: ONLY_IN_PEER_TABLE_ROWS, rowkey=a00001003850000
2017-04-03 23:30:50,480 ERROR [main] org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: CONTENT_DIFFERENT_ROWS, rowkey=a00001005320000
2017-04-03 23:30:50,508 ERROR [main] org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: ONLY_IN_PEER_TABLE_ROWS, rowkey=a00001005320000
{noformat}

Here, each bad row was reported as both {{CONTENT_DIFFERENT_ROWS}} and {{ONLY_IN_PEER_TABLE_ROWS}}.
This should never happen, so I looked at the code and found a {{scan.setBatch}} call.

{code}
    @Override
    public void map(ImmutableBytesWritable row, final Result value,
                    Context context)
        throws IOException {
      if (replicatedScanner == null) {
        ...
        final Scan scan = new Scan();
        scan.setBatch(batch);
{code}

As stated in HBASE-16376, calling {{scan#setBatch(int)}} implicitly allows scan results to be partial.

Since {{VerifyReplication}} assumes that each {{scanner.next()}} call returns an entire row,
partial results break the comparison logic.
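
To illustrate why a partial result can produce both messages for one row key, here is a minimal, self-contained sketch. It does not use the HBase API and is not the actual {{VerifyReplication}} code: {{Row}}, {{batched}} and {{compare}} are hypothetical stand-ins, and the comparison is assumed to be a merge-style walk over two sorted scanners that treats every peer result as a complete row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class PartialRowCompareSketch {

  // Hypothetical stand-in for a scanner result: a row key plus some of its cells.
  static final class Row {
    final String key;
    final List<String> cells;

    Row(String key, String... cells) {
      this.key = key;
      this.cells = Arrays.asList(cells);
    }
  }

  // Models a batched scan: each row is split into chunks of at most `batch` cells,
  // so one row may come back as several partial results with the same key.
  static List<Row> batched(List<Row> rows, int batch) {
    List<Row> out = new ArrayList<>();
    for (Row r : rows) {
      for (int i = 0; i < r.cells.size(); i += batch) {
        List<String> chunk = r.cells.subList(i, Math.min(i + batch, r.cells.size()));
        out.add(new Row(r.key, chunk.toArray(new String[0])));
      }
    }
    return out;
  }

  // Merge-style comparison that assumes every peer element is a whole row.
  // Labels other than the two seen in the logs above are illustrative.
  static List<String> compare(List<Row> source, List<Row> peer) {
    List<String> report = new ArrayList<>();
    Iterator<Row> it = peer.iterator();
    Row current = it.hasNext() ? it.next() : null;
    for (Row s : source) {
      while (true) {
        if (current == null) {
          report.add("ONLY_IN_SOURCE_TABLE_ROWS " + s.key);
          break;
        }
        int cmp = s.key.compareTo(current.key);
        if (cmp == 0) {
          // Same key: compare content, then advance the peer side by one result.
          boolean same = s.cells.equals(current.cells);
          report.add((same ? "GOODROWS " : "CONTENT_DIFFERENT_ROWS ") + s.key);
          current = it.hasNext() ? it.next() : null;
          break;
        } else if (cmp < 0) {
          report.add("ONLY_IN_SOURCE_TABLE_ROWS " + s.key);
          break;
        } else {
          // Peer key is smaller than the source key: report it and keep advancing.
          report.add("ONLY_IN_PEER_TABLE_ROWS " + current.key);
          current = it.hasNext() ? it.next() : null;
        }
      }
    }
    return report;
  }

  public static void main(String[] args) {
    List<Row> table = Arrays.asList(new Row("a", "c1", "c2", "c3"), new Row("b", "c1"));
    // Identical tables, whole rows on both sides.
    System.out.println(compare(table, table));
    // Identical tables, but the peer side is read with a batch of 2 cells.
    System.out.println(compare(table, batched(table, 2)));
  }
}
```

In this model the first partial result for row {{a}} fails the content check, and the leftover partial result for the same key is then seen as a peer-only row when the source side has already moved on to {{b}}, which matches the pattern in the logs above.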

We should avoid the {{setBatch}} call here.
Thanks to RPC chunking (explained in this blog post: https://blogs.apache.org/hbase/entry/scan_improvements_in_hbase_1),
I think removing it is safe and acceptable.



