Posted to issues@hbase.apache.org by "Maddineni Sukumar (JIRA)" <ji...@apache.org> on 2017/04/27 06:41:04 UTC

[jira] [Comment Edited] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

    [ https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15986079#comment-15986079 ] 

Maddineni Sukumar edited comment on HBASE-16466 at 4/27/17 6:40 AM:
--------------------------------------------------------------------

Ted, here are the perf numbers I have so far. I will follow up with numbers on load impact.

I ran the tests below on an 8-node cluster using a Phoenix table with SALT_BUCKETS=16.

Rows           Normal approach    Snapshot approach
---------------------------------------------------
1 million      1 min 16 sec       36 sec
10 million     6 min 15 sec       1 min 13 sec
500 million    5 h 20 min         8 min 40 sec

With snapshots, the VerifyReplication job on the 500-million-row table completes in under 9 minutes, compared with more than 5 hours using the normal table-scan approach.
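
To make the comparison easier to read, the speedup for each row count can be worked out from the timings above. This is only a rough sketch: the names `timings` and `speedup` are illustrative and not part of the patch, and the durations are simply the table values converted to seconds.

```python
# Durations copied from the table above, converted to seconds.
# Each entry is (normal table-scan approach, snapshot approach).
timings = {
    "1 million": (1 * 60 + 16, 36),
    "10 million": (6 * 60 + 15, 1 * 60 + 13),
    "500 million": (5 * 3600 + 20 * 60, 8 * 60 + 40),
}


def speedup(normal_s, snapshot_s):
    """Return how many times faster the snapshot run was."""
    return normal_s / snapshot_s


for rows, (normal_s, snapshot_s) in timings.items():
    ratio = speedup(normal_s, snapshot_s)
    print(f"{rows} rows: {ratio:.1f}x faster with snapshots")
```

On these numbers the ratio works out to roughly 2.1x, 5.1x and 36.9x, so the benefit grows with table size in this test.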


was (Author: sukunaidu@gmail.com):
Ted, I have below perf numbers as of now. Will get numbers on load impact. 

I did below tests on a 8 node cluster using a Phoenix table with SALT_BUCKETS 16. 

Rows      NORMAL        WITH_SNAPSHOTS
-------------------------------------------------------
1m           1m16s            36s
10m         6m15s            1m13s
500m        5h20m30s      8m40s   

With snapshots I am able to complete VerifyReplication job in 8 minutes instead of 5 hours using normal table scan approach. 

> HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16466
>                 URL: https://issues.apache.org/jira/browse/HBASE-16466
>             Project: HBase
>          Issue Type: Improvement
>          Components: hbase
>    Affects Versions: 0.98.21
>            Reporter: Sukumar Maddineni
>            Assignee: Maddineni Sukumar
>             Fix For: 1.3.1
>
>         Attachments: HBASE-16466.branch-1.3.001.patch
>
>
> Currently the VerifyReplication tool runs using normal HBase scanners. Running VerifyReplication multiple times on a live production cluster with large tables puts extra load on the HBase layer. With snapshot-based support, both the source and target clusters could read data from snapshots instead, which reduces the load on HBase.


