Posted to issues@hbase.apache.org by "Nick Dimiduk (Jira)" <ji...@apache.org> on 2020/07/14 21:32:00 UTC

[jira] [Updated] (HBASE-24302) Add an "ignoreTimestamps" option (defaulted to false) to HashTable/SyncTable tool

     [ https://issues.apache.org/jira/browse/HBASE-24302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Dimiduk updated HBASE-24302:
---------------------------------
    Fix Version/s:     (was: 2.4.0)

> Add an "ignoreTimestamps" option (defaulted to false) to HashTable/SyncTable tool
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-24302
>                 URL: https://issues.apache.org/jira/browse/HBASE-24302
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 3.0.0-alpha-1, 2.3.0, 2.2.5
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.3.0, 2.2.5
>
>
> Currently, when hashing and comparing values between a source and a target table, HashTable/SyncTable always considers cell timestamp values. However, cell timestamps are not always relevant to client applications, so these use cases could benefit from a more flexible comparison logic in which timestamps can be ignored.
> In such scenarios, HashTable/SyncTable could also perform better, since cells that differ only in their timestamps would not be copied.
> Another case that would benefit from this option is when bulk deletes are mistakenly applied at the target. At the moment, HashTable/SyncTable on its own is not capable of bringing the clusters back in sync, as the source Puts would have an older timestamp than the delete markers in the target and would remain masked by them. This would require the target to complete a major compaction of the whole table before HashTable/SyncTable could be run.
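
The comparison described above can be illustrated with a minimal sketch. It does not use the HBase API and is not the tool's actual implementation: SimpleCell, cellsMatch and the ignoreTimestamps parameter are hypothetical names chosen for illustration, and the sketch compares cells one by one rather than hashing batches of cells as HashTable does.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class IgnoreTimestampsSketch {

    // Hypothetical, simplified stand-in for an HBase cell (not the real Cell interface).
    static final class SimpleCell {
        final byte[] row;
        final byte[] family;
        final byte[] qualifier;
        final long timestamp;
        final byte[] value;

        SimpleCell(String row, String family, String qualifier, long timestamp, String value) {
            this.row = row.getBytes(StandardCharsets.UTF_8);
            this.family = family.getBytes(StandardCharsets.UTF_8);
            this.qualifier = qualifier.getBytes(StandardCharsets.UTF_8);
            this.timestamp = timestamp;
            this.value = value.getBytes(StandardCharsets.UTF_8);
        }
    }

    // Returns true when the two cells are considered identical.
    // With ignoreTimestamps=false the timestamps must also match (the behaviour
    // the issue describes as current); with ignoreTimestamps=true they are skipped.
    static boolean cellsMatch(SimpleCell source, SimpleCell target, boolean ignoreTimestamps) {
        boolean sameCoordinatesAndValue =
                Arrays.equals(source.row, target.row)
                && Arrays.equals(source.family, target.family)
                && Arrays.equals(source.qualifier, target.qualifier)
                && Arrays.equals(source.value, target.value);
        return sameCoordinatesAndValue
                && (ignoreTimestamps || source.timestamp == target.timestamp);
    }

    public static void main(String[] args) {
        // Same row, column and value; only the timestamps diverge.
        SimpleCell source = new SimpleCell("row1", "cf", "q", 1000L, "v");
        SimpleCell target = new SimpleCell("row1", "cf", "q", 2000L, "v");

        System.out.println(cellsMatch(source, target, false)); // prints false
        System.out.println(cellsMatch(source, target, true));  // prints true
    }
}
```

With the flag off, the two cells are reported as different and would be candidates for copying; with it on, they are treated as already in sync, which is the saving the description refers to.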



--
This message was sent by Atlassian Jira
(v8.3.4#803005)