Posted to dev@hbase.apache.org by "Andrew Kyle Purtell (Jira)" <ji...@apache.org> on 2020/11/23 19:38:00 UTC

[jira] [Resolved] (HBASE-24877) Add option to avoid aborting RS process upon uncaught exceptions happen on replication source

     [ https://issues.apache.org/jira/browse/HBASE-24877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Kyle Purtell resolved HBASE-24877.
-----------------------------------------
    Fix Version/s: 2.4.0
                   3.0.0-alpha-1
       Resolution: Fixed

PRs were merged to master and branch-2. Resolving. File new issues for any further backports.

> Add option to avoid aborting RS process upon uncaught exceptions happen on replication source
> ---------------------------------------------------------------------------------------------
>
>                 Key: HBASE-24877
>                 URL: https://issues.apache.org/jira/browse/HBASE-24877
>             Project: HBase
>          Issue Type: Improvement
>          Components: Replication
>    Affects Versions: 3.0.0-alpha-1, 2.4.0
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.4.0
>
>
> Currently, we abort the entire RS process if any uncaught exception occurs during ReplicationSource initialization. This may be too extreme for certain deployments, where custom replication endpoint implementations may choose to throw when remote peers are unavailable, but the source cluster shouldn't be brought down entirely. Similarly, the source reader and shipper threads cause the RS to abort on any runtime exception while running.
> This patch adds a configuration option (false by default, to keep the original behaviour) to avoid aborting the entire RS process under these conditions. Instead, if ReplicationSource initialization fails with a RuntimeException, it keeps retrying the source startup. For runtime errors in readers/shippers, it refreshes the replication source, terminating the current source and its readers/shippers and creating new ones.
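
For readers skimming the archive, here is a rough, self-contained sketch of the "retry instead of abort" idea described above. It is not the code from the patch and uses no HBase classes: the class name, the abortOnError flag, the AbortException type and the maxAttempts bound are all hypothetical, introduced only for illustration. The description says startup keeps retrying; the bound here is an assumption so that the example terminates.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ReplicationSourceSketch {

    /** Hypothetical stand-in for "abort the whole process". */
    static class AbortException extends RuntimeException {
        AbortException(String msg, Throwable cause) {
            super(msg, cause);
        }
    }

    /**
     * Runs the initializer. If it throws a RuntimeException:
     *   abortOnError == true  -> give up immediately (the original behaviour);
     *   abortOnError == false -> retry, up to maxAttempts (bound is an assumption).
     * Returns the number of attempts a successful start needed.
     */
    static int startWithRetry(Runnable initializer, boolean abortOnError, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            try {
                initializer.run();
                return attempt;
            } catch (RuntimeException e) {
                if (abortOnError || attempt >= maxAttempts) {
                    throw new AbortException(
                        "source startup failed after " + attempt + " attempt(s)", e);
                }
                // Otherwise fall through and retry; real code would sleep or back off here.
            }
        }
    }

    public static void main(String[] args) {
        // Simulated endpoint that fails twice (e.g. remote peer unavailable), then succeeds.
        AtomicInteger calls = new AtomicInteger();
        Runnable flaky = () -> {
            if (calls.incrementAndGet() < 3) {
                throw new IllegalStateException("peer unavailable");
            }
        };
        System.out.println(startWithRetry(flaky, false, 5)); // prints 3
    }
}
```

With abortOnError set to true, the same flaky initializer would raise AbortException on the first failure, which mirrors the pre-patch behaviour in this simplified model.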



--
This message was sent by Atlassian Jira
(v8.3.4#803005)