Posted to issues@hbase.apache.org by "Maddineni Sukumar (JIRA)" <ji...@apache.org> on 2017/04/27 06:41:04 UTC
[jira] [Comment Edited] (HBASE-16466) HBase snapshots support in
VerifyReplication tool to reduce load on live HBase cluster with large
tables
[ https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15986079#comment-15986079 ]
Maddineni Sukumar edited comment on HBASE-16466 at 4/27/17 6:40 AM:
--------------------------------------------------------------------
Ted, here are the perf numbers I have so far. I will follow up with numbers on load impact.
I ran the tests below on an 8-node cluster using a Phoenix table with SALT_BUCKETS=16.
Rows           Normal approach    Snapshot approach
---------------------------------------------------------------------
1 million      1min 16sec         36sec
10 million     6min 15sec         1min 13sec
500 million    5hours 20mins      8mins 40secs
With snapshots, I am able to complete the VerifyReplication job on 500 million rows in about 8 minutes 40 seconds, instead of about 5 hours 20 minutes with the normal table scan approach.
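For reference, the speedup implied by each row of the table above can be worked out with a small sketch like the one below. The class and method names are hypothetical and only for illustration; the timings are copied from the table, and nothing here is part of the patch itself.

```java
import java.util.Locale;

// Hypothetical helper, not part of HBASE-16466: computes the speedup
// ratios implied by the timings reported in the table above.
final class SnapshotSpeedup {

    // Converts an hours/minutes/seconds duration to total seconds.
    static long seconds(long hours, long minutes, long secs) {
        return hours * 3600 + minutes * 60 + secs;
    }

    // Ratio of normal-scan time to snapshot-based time.
    static double speedup(long normalSeconds, long snapshotSeconds) {
        if (snapshotSeconds <= 0) {
            throw new IllegalArgumentException("snapshotSeconds must be positive");
        }
        return (double) normalSeconds / snapshotSeconds;
    }

    public static void main(String[] args) {
        String[] rows = {"1 million", "10 million", "500 million"};
        // Timings as reported: normal approach vs. snapshot approach.
        long[] normal = {seconds(0, 1, 16), seconds(0, 6, 15), seconds(5, 20, 0)};
        long[] snapshot = {seconds(0, 0, 36), seconds(0, 1, 13), seconds(0, 8, 40)};

        for (int i = 0; i < rows.length; i++) {
            System.out.println(String.format(Locale.ROOT, "%-12s %6ds vs %4ds -> %.1fx",
                    rows[i], normal[i], snapshot[i], speedup(normal[i], snapshot[i])));
        }
    }
}
```

On these numbers the ratio grows with table size: roughly 2.1x at 1 million rows, 5.1x at 10 million, and 36.9x at 500 million.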
was (Author: sukunaidu@gmail.com):
Ted, here are the perf numbers I have so far. I will follow up with numbers on load impact.
I ran the tests below on an 8-node cluster using a Phoenix table with SALT_BUCKETS=16.
Rows    NORMAL      WITH_SNAPSHOTS
-------------------------------------------------------
1m      1m16s       36s
10m     6m15s       1m13s
500m    5h20m30s    8m40s
With snapshots I am able to complete VerifyReplication job in 8 minutes instead of 5 hours using normal table scan approach.
> HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables
> --------------------------------------------------------------------------------------------------------
>
> Key: HBASE-16466
> URL: https://issues.apache.org/jira/browse/HBASE-16466
> Project: HBase
> Issue Type: Improvement
> Components: hbase
> Affects Versions: 0.98.21
> Reporter: Sukumar Maddineni
> Assignee: Maddineni Sukumar
> Fix For: 1.3.1
>
> Attachments: HBASE-16466.branch-1.3.001.patch
>
>
> As of now, the VerifyReplication tool runs using normal HBase scanners. Running VerifyReplication multiple times on a live production cluster with large tables creates extra load on the HBase layer. If we implement snapshot-based support, we can read data from snapshots on both the source and the target, which reduces the load on HBase.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)