Posted to dev@bookkeeper.apache.org by "Rakesh R (JIRA)" <ji...@apache.org> on 2014/03/07 07:59:43 UTC

[jira] [Commented] (BOOKKEEPER-733) Improve ReplicationWorker to handle the urLedgers which already have same ledger replica in hand

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923636#comment-13923636 ] 

Rakesh R commented on BOOKKEEPER-733:
-------------------------------------

Hi All,

The following are a few cases where the target bookie is not able to proceed with the re-replication procedure.

I'm trying to put together all such cases we come across. My intention is simply to make everyone aware of these cases, and I hope this will help us reach a common conclusion.

+Case-1)+ Already has a replica (the target bookie is part of all the ledger fragments)
+Case-2)+ BKException - BKReadException, BKBookieHandleNotAvailableException
	- quorum lost (thanks a lot, Ivan, for bringing up this scenario, where the ledger loses its quorum and hangs around waiting for re-replication)
	- slow bookies that do not return enough responses, etc.

+Case-3)+ Other BKExceptions (if anything requires special attention).

Please see the initial draft proposal below, where I'm trying to address the cases very specifically by introducing different return codes. One reason for specific handling is that these exceptions are known to the AutoRecovery module, so it can easily build intelligence around them. For example, it can use the ZK watch notification mechanism or wait for a configured retry interval before retriggering. I agree that specific handling should not leave any loopholes.

I'd like to hear your feedback, and if everyone agrees I will explore this approach further.

*Proposal:*
Introduce a return code when releasing the lock, e.g. LedgerUnderreplicationManager#releaseUnderreplicatedLedger(ledgerId, rc)
ReturnCode:REPLICA_EXISTS
ReturnCode:READ_FAILURE
ReturnCode:FAILED
ReturnCode:OK
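
To make the proposal more concrete, here is a rough Java sketch of what the return code and the overloaded release method might look like. This is illustrative only: the enum, the overload, and the exception types are assumptions that simply follow the names above, not the existing LedgerUnderreplicationManager API.
{code}
// Hypothetical sketch only -- this enum and overload do not exist today;
// the names follow the proposal above.
public enum ReplicationReturnCode {
    OK,              // fragments re-replicated successfully
    REPLICA_EXISTS,  // Case-1: the target bookie already holds all the fragments
    READ_FAILURE,    // Case-2: BKReadException / BKBookieHandleNotAvailableException
    FAILED           // Case-3: any other BKException
}

public interface LedgerUnderreplicationManager {
    // existing style of release (exact signature/exceptions may differ in the code base)
    void releaseUnderreplicatedLedger(long ledgerId) throws Exception;

    // proposed overload: the ReplicationWorker reports why it is giving up the lock
    void releaseUnderreplicatedLedger(long ledgerId, ReplicationReturnCode rc) throws Exception;
}
{code}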

Based on the rc, ZkLedgerUnderreplicationManager can build intelligence to handle specific cases, for example:

Say ZkLedgerUnderreplicationManager maintains a map, say visitedLedgers - <ReturnCode vs ListOfLedgers>; a rough sketch of this bookkeeping follows the case list below.
# *Case-1)* Already has a replica (the target bookie is part of all the ledger fragments)
Add/update a collection representing 'existingLedgers' in ZkLedgerUnderreplicationManager and put the entry into 'visitedLedgers'.
_RW Thread:_
	step-1) On receiving the rc, the RW adds the ledger to this list.
	step-2) Adds a watcher on this ledger for further cleanup.
	step-3) In getLedgerToRereplicate(), it consults 'existingLedgers' and skips this ledger for now, so unnecessary looping over this ledger is avoided.
_ZK Watcher Thread:_ 
On any NodeDeleted/NodeDataChanged event, it removes the ledger from the list (assuming the ledger still exists as under-replicated), so the RW can recheck whether any fragments can be re-replicated to it.
# *Case-2)* BKException - BKReadException, BKBookieHandleNotAvailableException
Add/update a collection representing 'errLedgers' in ZkLedgerUnderreplicationManager and put the entry into 'visitedLedgers'.
_RW Thread:_
	step-1) On receiving the rc, the RW adds the ledger to this list.
	step-2) Adds a watcher on this ledger for further cleanup.
	step-3) The idea here is to postpone replication of the ledger for some interval. Define the next time at which this ledger should be considered again for re-replication; once that interval has elapsed, the ledger is simply removed from 'errLedgers' so that it becomes available for the re-replication phase.
	step-4) In getLedgerToRereplicate(), it consults 'errLedgers' and skips this ledger for now, so unnecessary looping over this ledger is avoided.
_ZK Watcher Thread:_
On any NodeDeleted/NodeDataChanged event, it removes the ledger from the list so the ledger can be rechecked. This occurs when the ledger is re-replicated by other workers, or when the Auditor reports a few more bookie failures for this ledger, etc.
# *Case-3)* Other BKExceptions (if anything requires special attention).
As of now, I don't see any extra handling needed for this; it can follow the same approach as Case-2.
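
To show how the pieces could fit together, here is a rough, non-authoritative sketch of the bookkeeping described above (visitedLedgers/existingLedgers/errLedgers, the back-off interval, and the watcher cleanup). The class, field, and method names are assumptions for illustration only and refer to the ReplicationReturnCode sketched earlier; none of this is existing ZkLedgerUnderreplicationManager code.
{code}
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only -- nothing here exists in ZkLedgerUnderreplicationManager today.
public class UnderreplicationTracker {
    // rc -> ledgers last released with that rc ("visitedLedgers" in the proposal)
    private final Map<ReplicationReturnCode, Set<Long>> visitedLedgers = new ConcurrentHashMap<>();
    // Case-2 ledgers mapped to the time at which they may be retried ("errLedgers")
    private final Map<Long, Long> errLedgerRetryAtMs = new ConcurrentHashMap<>();
    private final long retryIntervalMs;

    public UnderreplicationTracker(long retryIntervalMs) {
        this.retryIntervalMs = retryIntervalMs;
    }

    // called by the RW thread when releasing the lock with a non-OK rc
    public void markVisited(long ledgerId, ReplicationReturnCode rc) {
        visitedLedgers.computeIfAbsent(rc, k -> ConcurrentHashMap.newKeySet()).add(ledgerId);
        if (rc == ReplicationReturnCode.READ_FAILURE) {
            errLedgerRetryAtMs.put(ledgerId, System.currentTimeMillis() + retryIntervalMs);
        }
        // a ZK watcher on the ledger's under-replication znode would also be registered here
    }

    // consulted inside getLedgerToRereplicate(): skip ledgers we should not retry yet
    public boolean shouldSkip(long ledgerId) {
        Set<Long> existing = visitedLedgers.get(ReplicationReturnCode.REPLICA_EXISTS);
        if (existing != null && existing.contains(ledgerId)) {
            return true; // Case-1: we already hold all fragments; wait for a znode change
        }
        Long retryAt = errLedgerRetryAtMs.get(ledgerId);
        if (retryAt != null) {
            if (System.currentTimeMillis() < retryAt) {
                return true; // Case-2: still inside the back-off interval
            }
            clear(ledgerId); // interval elapsed: make the ledger eligible again
        }
        return false;
    }

    // invoked from the ZK watcher thread on NodeDeleted/NodeDataChanged for the ledger znode
    public void clear(long ledgerId) {
        visitedLedgers.values().forEach(s -> s.remove(ledgerId));
        errLedgerRetryAtMs.remove(ledgerId);
    }
}
{code}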

Thanks,
Rakesh

> Improve ReplicationWorker to handle the urLedgers which already have same ledger replica in hand
> -----------------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-733
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-733
>             Project: Bookkeeper
>          Issue Type: Bug
>          Components: bookkeeper-auto-recovery
>    Affects Versions: 4.2.2, 4.3.0
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>
> +Scenario:+
> Step1 : Have three bookies BK1, BK2, BK3
> Step2 : Have written ledgers with quorum 2
> Step3 : Unfortunately, BK2 and BK3 both went down for a few moments.
> The following logs flood BK1's autorecovery logs. The RW is trying to replicate the ledgers, but it simply skips the fragment and moves to the next cycle when it sees that it already holds a replica. IMO, we should have a mechanism in place to avoid these unnecessary cycles.
> {code}
> 2014-02-18 21:47:55,140 - ERROR - [New I/O client boss #2-1:PerChannelBookieClient$1@230] - Could not connect to bookie: [id: 0x00ba679e]/10.18.170.130:15002, current state CONNECTING : 
> java.net.ConnectException: Connection refused: no further information
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
> 	at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:401)
> 	at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:370)
> 	at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:292)
> 	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
> 	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
> 	at java.lang.Thread.run(Thread.java:619)
> 2014-02-18 21:47:55,140 - INFO  - 2014-02-18 21:59:33,215 - DEBUG  - [ReplicationWorker:ReplicationWorker@182] - Target Bookie[10.18.170.130:15003] found in the fragment ensemble: [10.18.170.130:15003, 10.18.170.130:15001, 10.18.170.130:15002]
> [ReplicationWorker:PerChannelBookieClient@194] - Connecting to bookie: 10.18.170.130:15002
> 2014-02-18 21:47:56,162 - ERROR - [New I/O client boss #2-1:PerChannelBookieClient$1@230] - Could not connect to bookie: [id: 0x0003f377]/10.18.170.130:15002, current state CONNECTING : 
> java.net.ConnectException: Connection refused: no further information
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
> 	at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:401)
> 	at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:370)
> 	at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:292)
> 	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
> 	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
> 	at java.lang.Thread.run(Thread.java:619)
> 2014-02-18 21:59:33,215 - DEBUG  - [ReplicationWorker:ReplicationWorker@182] - Target Bookie[10.18.170.130:15003] found in the fragment ensemble: [10.18.170.130:15003, 10.18.170.130:15001, 10.18.170.130:15002]
> {code}


