You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@bookkeeper.apache.org by "Rakesh R (JIRA)" <ji...@apache.org> on 2012/05/26 11:09:22 UTC

[jira] [Created] (BOOKKEEPER-272) Introduce chain of bookies for distributing the re-replication

Rakesh R created BOOKKEEPER-272:
-----------------------------------

             Summary: Introduce chain of bookies for distributing the re-replication
                 Key: BOOKKEEPER-272
                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
             Project: Bookkeeper
          Issue Type: Sub-task
          Components: bookkeeper-server
            Reporter: Rakesh R
            Assignee: Rakesh R


The idea is to form logical chain and each Bookie will be taking care each other. On any Bookie failure, his neighbour will act and initiate re-replication.

For example:
Assume we have 1,2,3,4,5,6 bookies and will be forming the following chain

01 <- 02 <- 03 <- 04 <- 05 <- 06 <- 01

Here, each one should take care of my immediate predecessor node. The lowest node should always care the highest node and is forming the logical closed chain.

Reference docs attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438856#comment-13438856 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

Thanks again Ivan for detailed review. Could you please give few more info on the following:

bq.In Auditor, don't call getChildren from process(). Instead all it just after the take() in main loop. In general, you shouldn't call any blocking methods from the zk event handler thread
Oh yeah, Thanks for bringing this good point. Correct, this will delay other watch notifications also.

bq.Logic change in ZkLedgerUnderreplicationManager is wrong [previous was wrong also]. the break should be a return.
If I understand correctly, on NodeExistsException we are trying to append the missingReplica to the underreplicated ledger so that will notifies about one more down bookie which contains the ledger copy.
If we return simply, then the setData() method will not be called and there is a chance of missing the info about second replica.

For Ex: L00001 ensemble BK1, BK2, BK3. 
Say BK1 fails initially, then will markUnderreplicated ledger as L000001(BK1 as the data).
Now again BK2 has failed, then while creating will get NEE, so we will append BK2 also like: L000001(BK1 BK2).

I think "break; statement" is making sense and after that the duplicate entry addition should be removed as per my latest patch.
Am I missing anything?

bq.There should be a main method in AutoRecoveryManager. 
Yeah I'll add main method also. But what about retaining start() and stop() method as public. In future this will allow others(any external entity) to manage the recovery process easily ?

bq.I think AutoRecoveryManager should be responsible for running recoveryworker as well. 
Ofcourse, I'll integrate RW also be initialized as part of ARM.

bq.auditorElector should never be null, unless initialization fails. If initialization fails, start and stop should never be run. Perhaps we should us guava service [1] here as I also suggested to Uma for BOOKKEEPER-248.

I just added null check, since start() and stop() methods are public.


I'll rework on other points.
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.3.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Ivan Kelly (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440199#comment-13440199 ] 

Ivan Kelly commented on BOOKKEEPER-272:
---------------------------------------

{quote}
L0001 contains only BK1. while marking missingReplica of BK2, got NEE. Assume there is already a missing replica. Silently return as you told
{quote}
No, In this case we need to add BK2 as a missingReplica also.

{quote}
L0001 contains only BK2. while marking missingReplica of BK2, got NEE. Assume only single auditor and no other is marking. So again we need to merge to the zkMetadata and update in zk.
{quote}
In this case we want to return. BK2 is already marked as a missingReplica, so it will be picked up for replication anyhow.

The necessary change is:
{code}
                } catch (KeeperException.NodeExistsException nee) {
                    Stat s = zkc.exists(znode, false);
                    if (s == null) {
                        continue;
                    }
                    try {
                        byte[] bytes = zkc.getData(znode, false, s);
                        TextFormat.merge(new String(bytes, UTF8), builder);
                        UnderreplicatedLedgerFormat data = builder.build();
                        for (String r : data.getReplicaList()) {
                            if (r.equals(missingReplica)) {
-                                break; // nothing to add
+                                return; // nothing to add
                            }
                        }
                        builder.addReplica(missingReplica);
                        zkc.setData(znode,
                                    TextFormat.printToString(builder.build()).getBytes(UTF8),
                                    s.getVersion());
                    } catch (KeeperException.NoNodeException nne) {
                        continue;
                    } catch (KeeperException.BadVersionException bve) {
                        continue;
                    } catch (TextFormat.ParseException pe) {
                        throw new ReplicationException.UnavailableException(
                                "Invalid data found", pe);
                    }
                }
                break;
            }
{code}



                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-auto-recovery
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.3.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294548#comment-13294548 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

Oh, seems that I did't explain clearly at my comment.
{quote}
Im not sure what you mean here. Having the failed bookie stored in the data is useful for debugging purposes, but we should do a check on the ledger beforehand anyhow to determine what to recover. Are you trying to avoid another read to the ledger znode?
{quote}
My point is, rather than reading with bookieclient and finding the badReplicas, why don't we find the fragments using the ledgermetadata present in zookeeper as we already knows the failed bookie.
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Ivan Kelly (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294606#comment-13294606 ] 

Ivan Kelly commented on BOOKKEEPER-272:
---------------------------------------

I think we should probe in any case though. For example, if a bookie has restarted, it's znode will disappear momentarily. All ledgers will be marked for checking by the auditor, but they're still available after the bookie comes back up.

                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Ivan Kelly (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438689#comment-13438689 ] 

Ivan Kelly commented on BOOKKEEPER-272:
---------------------------------------

I've had another look over this patch. I have some comments.

Logic change in ZkLedgerUnderreplicationManager is wrong [previous was wrong also]. the break should be a return.

Auditor will be a top level daemon, so it should own its create it's own LedgerManagers and ZooKeeper client. The only thing passed in to is should be the configuration.

There should be a main method in AutoRecoveryManager. I think AutoRecoveryManager should be responsible for running recoveryworker as well. Im not sure whether the node which runs auditor should run a recovery worker at the same time. This is an open question.

auditorElector should never be null, unless initialization fails. If initialization fails, start and stop should never be run. Perhaps we should us guava service [1] here as I also suggested to Uma for BOOKKEEPER-248.

Pattern "ELECTION_PATH + PATH_SEPARATOR + vote" is used in a lot of places. It would be nice to move this into a method "String getVotePath(String vote);"

In Auditor, don't call getChildren from process(). Instead all it just after the take() in main loop. In general, you shouldn't call any blocking methods from the zk event handler thread.

                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.3.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (BOOKKEEPER-272) Introduce chain of bookies for distributing the re-replication

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated BOOKKEEPER-272:
--------------------------------

    Attachment: BOOKKEEPER-272.patch
    
> Introduce chain of bookies for distributing the re-replication
> --------------------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.patch
>
>
> The idea is to form logical chain and each Bookie will be taking care each other. On any Bookie failure, his neighbour will act and initiate re-replication.
> For example:
> Assume we have 1,2,3,4,5,6 bookies and will be forming the following chain
> 01 <- 02 <- 03 <- 04 <- 05 <- 06 <- 01
> Here, each one should take care of my immediate predecessor node. The lowest node should always care the highest node and is forming the logical closed chain.
> Reference docs attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Introduce chain of bookies for distributing the re-replication

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286688#comment-13286688 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

Here's a rough draft attached for reviewing the approach. Please have a look and appreciate your comments/thoughts.
                
> Introduce chain of bookies for distributing the re-replication
> --------------------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.patch
>
>
> The idea is to form logical chain and each Bookie will be taking care each other. On any Bookie failure, his neighbour will act and initiate re-replication.
> For example:
> Assume we have 1,2,3,4,5,6 bookies and will be forming the following chain
> 01 <- 02 <- 03 <- 04 <- 05 <- 06 <- 01
> Here, each one should take care of my immediate predecessor node. The lowest node should always care the highest node and is forming the logical closed chain.
> Reference docs attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Ivan Kelly (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293507#comment-13293507 ] 

Ivan Kelly commented on BOOKKEEPER-272:
---------------------------------------

{quote}
I have used the predecessor watching approach, used to avoid the herd effect with zookeeper leader election. Rule is, bookie will be watching to my predecessor bookie based on the ephemeral seq id. At any point of time, least ephemeral znode bookie only will get the chance to become Auditor. So I thought it would be more efficient to watch on my predecessor bookie. How does it sound?
{quote}
This is fine. I just thought having all watch for children changed would make simpler code. I don't have a strong opinion on this though.

{quote}
Hope you meant, after the auditor election I'm keeping the auditor's myInfo to the auditor election path(/ledgers/auditor/election). At present there is no logic, I've just kept for debugging purpose only(using ZK znodes).
{quote}
Ok. Thats fine. Before releasing i'd like to convert the data stored in the znode as a protobuf text serialization. I've already done a conversion of ledger metadata but havent' generated a patch/jira yet. I've made these changes to make moving between bk versions easier. This is fine as it is for now though.

{quote}
Here I'd like to keep the generate & publishing suspected ledgers before entering to the waitForNotification(). Consider the scenario where the auditor bookie(say BK1) has failed and new auditor(BK2) comes, he will not recieve any notifications for the BK1 failures(from the available bookie's path).
{quote}
Yup, that sounds good.
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (BOOKKEEPER-272) Introduce chain of bookies for distributing the re-replication

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated BOOKKEEPER-272:
--------------------------------

    Attachment: BOOKKEEPER-272.2.patch
    
> Introduce chain of bookies for distributing the re-replication
> --------------------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.patch
>
>
> The idea is to form logical chain and each Bookie will be taking care each other. On any Bookie failure, his neighbour will act and initiate re-replication.
> For example:
> Assume we have 1,2,3,4,5,6 bookies and will be forming the following chain
> 01 <- 02 <- 03 <- 04 <- 05 <- 06 <- 01
> Here, each one should take care of my immediate predecessor node. The lowest node should always care the highest node and is forming the logical closed chain.
> Reference docs attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated BOOKKEEPER-272:
--------------------------------

    Attachment: BOOKKEEPER-272.Auditor.1.patch
    
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (BOOKKEEPER-272) Introduce chain of bookies for distributing the re-replication

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated BOOKKEEPER-272:
--------------------------------

    Attachment: BOOKKEEPER-272.1.patch

Modified the ServerConfiguration file and fixed complilation issues.
                
> Introduce chain of bookies for distributing the re-replication
> --------------------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.patch
>
>
> The idea is to form logical chain and each Bookie will be taking care each other. On any Bookie failure, his neighbour will act and initiate re-replication.
> For example:
> Assume we have 1,2,3,4,5,6 bookies and will be forming the following chain
> 01 <- 02 <- 03 <- 04 <- 05 <- 06 <- 01
> Here, each one should take care of my immediate predecessor node. The lowest node should always care the highest node and is forming the logical closed chain.
> Reference docs attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Flavio Junqueira (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400853#comment-13400853 ] 

Flavio Junqueira commented on BOOKKEEPER-272:
---------------------------------------------

To summarize an offline discussion I had with Ivan, there are two points that I raised during the discussion:

# If the bookie and the auditor run on the same jvm, is it possible that runtime exceptions thrown by the auditor will kill the whole jvm and consequently the bookie as well?
# If the bookie and the auditor run on the same jvm, it might be more difficult to debug. In particular, I was wondering about how to separate the log messages for postmortem analysis and debugging in general.
# The performance of a bookie is more critical than the performance of the auditor. Running them on separate jvms might improve performance isolation.

                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Ivan Kelly (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438718#comment-13438718 ] 

Ivan Kelly commented on BOOKKEEPER-272:
---------------------------------------

I was thinking along the same lines.
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.3.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Introduce chain of bookies for distributing the re-replication

Posted by "Flavio Junqueira (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284030#comment-13284030 ] 

Flavio Junqueira commented on BOOKKEEPER-272:
---------------------------------------------

Hi Rakesh, One clarification. In the design document of BOOKKEEPER-237, we say: _Auditor would be forming the recovery chain based on the myIds added to the ZooKeeper under the zNode /ledgers/bookie/myId_. If the chain is logical and formed through zookeeper, then I don't understand why the auditor needs to be involved in the chain formation. In my understanding of the proposal, the role of the auditor is essentially to assign work (replication) to bookies. What am I missing?
                
> Introduce chain of bookies for distributing the re-replication
> --------------------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>
> The idea is to form logical chain and each Bookie will be taking care each other. On any Bookie failure, his neighbour will act and initiate re-replication.
> For example:
> Assume we have 1,2,3,4,5,6 bookies and will be forming the following chain
> 01 <- 02 <- 03 <- 04 <- 05 <- 06 <- 01
> Here, each one should take care of my immediate predecessor node. The lowest node should always care the highest node and is forming the logical closed chain.
> Reference docs attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Uma Maheswara Rao G (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400586#comment-13400586 ] 

Uma Maheswara Rao G commented on BOOKKEEPER-272:
------------------------------------------------

+1, I am also more inclined to keep in bookie process for now.
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400518#comment-13400518 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

@Ivan
I've refactored as per the comments, please go through this. 

Also, I think there is a bit cross over with BOOKKEEPER-246 patch. I'll go through it.

I just created AutoRecoveryService layer which has #start #stop #isRunning apis, just to plugin as a service.

bq.For the moment, and maybe for ever, whether the auditor runs should be configurable
I haven't included in this patch, since there is still confusions whether to treat as a plugin service or not.
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436765#comment-13436765 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

Thanks Ivan,Uma,Flavio for your thoughts. It would be great if others could review my changes.


Following are the changes made in the latest patch:
- by considering Ivan's point, provided AutoRecoveryManager#start() and #stop() methods and removed the dependency with Bookie server. Anyone can plugin the service by invoking start() and stop() operational methods.
- I've raised another JIRA BOOKKEEPER-319 to finalize the way to manage the recovery service. 
- also corrected ZkLedgerUnderreplicationManager.java, where adding duplicate 'missingreplica' in case of an underreplicated ledger contains multiple bookie failures.

+Note:+ This patch has dependency with BOOKKEEPER-304 and to be applied after BOOKKEEPER-304 patch


                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.3.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440310#comment-13440310 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

Attached latest patch addressing Ivan's comment and also refactored tests. 

Hi Ivan, would be great to know your feedback.

Thanks,
Rakesh
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-auto-recovery
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.3.patch, BOOKKEEPER-272.4.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438865#comment-13438865 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

I also agree with you guys. Since Auditor is a stateless light weight daemon thread and only a single instance within the bk cluster, he can start RW service also along.
{quote}Oh, I was assuming that, every Recovery node will have (Auditor Participants + RW ) services running. But one will actually win for doing auditor work. But RW will do its job normally in all nodes. All other recovery node's auditor participants will look for the next election to win the Real Auditing job. 
I am not sure my assumption is same as others. Any other thoughts?
{quote}

                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.3.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Ivan Kelly (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291802#comment-13291802 ] 

Ivan Kelly commented on BOOKKEEPER-272:
---------------------------------------

New patch is a lot clearer. Comments follow;

- The changes to ServerConfiguration are unnesessary. The paths aren't actually configurable, so the shouldn't be in the configuration. Also, i don't think they should be configurable. I think that all auditor data should be under the auditor znode also. So eventually we'll have the three paths:
      /ledgers/auditor/election
      /ledgers/auditor/failedbookies
      /ledgers/auditor/underreplicated
      /ledgers/auditor/

- Election watcher does an election on the expired event. If you get this event, the bookie will shutdown (see Bookie.java)

- For the moment, and maybe for ever, whether the auditor runs should be configurable. During the course of 4.2 we're going to be building up this stuff incrementally. We should only enable it for tests which specifically use it (to start with).

- getMyInfo() should be called getIdentifier(). It would be great if you could modify registerBookie() to take a String rather than a port, so that the identifier for the bookie is only generated at one point ever.

- Collections.sort() on children isn't fully correct. V_1 & V_10 will sort before V_2, as it sorts alphabetically. Better to define a comparator, which splits on the '_', parses the number and does a comparison of that.

- Checking if you've won the election should be a matter of children.get(0).equals(myVote), once children is sorted correctly. 

- Check for specific exceptions. findbugs warns on catch (Exception e) because this also catches RuntimeExceptions. IOException is the wrong exception to use for this stuff also. Custom exceptions, defined under BookieException.

- Instead of watching a specific node, watch for the ChildrenChanged event on the zkAuditorElectPath. Even the current winner of the election can watch this. When triggered, doElection should be fine, though you should shutdown the Auditor thread if running if you loose the election. 

- I don't understand the point of writing auditor data to the auditor znode. It doesn't seem to do anything.

- The auditor code seems to check for failed bookies, and publishes these. We should skip a step here though, and instead of publishing the bookes, publish the ledgers which are on that bookie. In fact, it's much more efficient if it runs like this. If we only publish the failed bookies, each worker picking up a bookie to recover has to read all ledgers to find which ledgers to recover. If the auditor is maintaining a bookie -> ledger index, only one node needs to be reading all the ledgers.

- The run loop method of Auditor isn't a loop. I think the run loop should look as like:

{code}
public void run() {
   Set<String> bookies = getAvailableBookies();
   while (true) {
       waitForNotification();
       Set<String> newBookies = getAvailableBookies();
       Set<String> lostBookies = bookies;
       lostBookies.removeAll(newBookies);
       bookies = newBookies;

       if (lostBookies.size() > 0) {
           continue;
       }

       Map<String, List<Long>> bookie2ledgersMap = generateBookie2LedgersIndex();
       Set<Long> suspectedLedgers = HashSet<Long>();
       for (String b : lostBookies) {
           suspectedLedgers.addAll(bookie2ledgersMap.get(b));
       }
       publishSuspectedLedgers(suspectedLedgers);
   }
}
{code}

getAvailableBookies() should put a watch on the available znode, and fire a notification when triggered.
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Ivan Kelly (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440205#comment-13440205 ] 

Ivan Kelly commented on BOOKKEEPER-272:
---------------------------------------

Yes, the code I posted does exactly that. "Merging" is simply adding if it it doesn't already exist.

                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-auto-recovery
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.3.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440211#comment-13440211 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

Thanks Ivan. I'll do the changes by just return if it matches as you suggested. I'll investigate it later.

bq.Auditor will be a top level daemon, so it should own its create it's own LedgerManagers and ZooKeeper client. The only thing passed in to is should be the configuration.

Anyway we are planning to have initialization class which does starting Auditor and RW threads. Would you agree to create ledgerManagers there and pass it to the Auditor and RW daemons?

In that case, shall I keep the ctor as it is. (by taking managers)

{code}
public Auditor(String bookieIdentifier, AbstractConfiguration conf,
            ZooKeeper zkc, LedgerManager ledgerManager,
            LedgerUnderreplicationManager ledgerUnderreplicationManager) {
{code}

                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-auto-recovery
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.3.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Uma Maheswara Rao G (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uma Maheswara Rao G updated BOOKKEEPER-272:
-------------------------------------------

    Component/s:     (was: bookkeeper-server)
                 bookkeeper-auto-recovery
    
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-auto-recovery
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.3.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292802#comment-13292802 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

Thanks Ivan, again for the detailed comments. I'd like to know more on the following:
{quote}
Instead of watching a specific node, watch for the ChildrenChanged event on the zkAuditorElectPath. Even the current winner of the election can watch this. When triggered, doElection should be fine, though you should shutdown the Auditor thread if running if you loose the election.
{quote}
I have used the predecessor watching approach, used to avoid the herd effect with zookeeper leader election. Rule is, bookie will be watching to my predecessor bookie based on the ephemeral seq id. At any point of time, least ephemeral znode bookie only will get the chance to become Auditor. So I thought it would be more efficient to watch on my predecessor bookie. How does it sound?


{quote}
I don't understand the point of writing auditor data to the auditor znode. It doesn't seem to do anything.
{quote}
Hope you meant, after the auditor election I'm keeping the auditor's myInfo to the auditor election path(/ledgers/auditor/election). At present there is no logic, I've just kept for debugging purpose only(using ZK znodes).
Below one:
{noformat}
auditorElector.zkclient.setData(auditorElector.conf
+                    .getZkAuditorElectionPath(), auditorElector.myInfo
+                    .getBytes(), -1);

{noformat}

{quote}
The run loop method of Auditor isn't a loop. I think the run loop should look as like:
{quote}
Yup, I'll remove the volatile flag and do other modifications. 
Here I'd like to keep the generate & publishing suspected ledgers before entering to the waitForNotification(). Consider the scenario where the auditor bookie(say BK1) has failed and new auditor(BK2) comes, he will not recieve any notifications for the BK1 failures(from the available bookie's path).


                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Flavio Junqueira (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400522#comment-13400522 ] 

Flavio Junqueira commented on BOOKKEEPER-272:
---------------------------------------------

{quote}
{quote}
For the moment, and maybe for ever, whether the auditor runs should be configurable
{quote}
I haven't included in this patch, since there is still confusions whether to treat as a plugin service or not.
{quote}

I think that whether the auditor runs or not should be configurable. Is that a problem?
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439427#comment-13439427 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

Hi Ivan, Hope I'm not confusing you:-)

As per the discussion/comments, now AutoRecoveryManager will have dependency with RW(that is still under dev stage) and also will make use of guava service for cleaner. Shall I move the ARM initialization phase to BOOKKEEPER-319 and will work there?

So that will make this JIRA completely independent and will discuss only the auditing job. Are you agreeing with me?

Also I will get some more time to familiarize with the guava service.
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.3.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated BOOKKEEPER-272:
--------------------------------

    Attachment: BOOKKEEPER-272.3.patch
    
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.3.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Introduce chain of bookies for distributing the re-replication

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284884#comment-13284884 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

Hi Flavio,

Thanks for your interest and comments, actually I was doing prototype for the 'CircularChain' algorithm without a central auditor guy. Sorry for the late reply:)

But I faced a problem in handling the following situation.

1 <- 2 <- 3 <- 4 <- 5 <- 6 <- 1

Consider the scenario where 2,3,4 went down. Now 5 got the notification and marked 4 as failed and moves to 4's neighbour 3. Whenever 5 is checking about 3's status, say 3 rejoins and is alive, this inturn will stop searching. Say, immediately 3 also went down. Here the chance of missing 2's failure is high. 

One solution that comes in my mind is, when 5 identifies 3 is alive he will add watcher to 3. Here again another problem is, consider 4 has rejoined and will also try adding watcher to 3. Now if we analyse the chain, the previous watcher added by 5 also will be there to 3(as ZK doesn't has unregister of watcher). Also, the level of watcher reformation will gradually increases.

I'm bit worrying about the chances of missing watchers and isolation/race conditions with this approach. 

If I'm having an auditor, only he will look to all and inform about failure Bookies. I feel, there is no chance of missing watchers and isolation/race conditions in this approach. Only the overhead will be Auditor election. 

He(central node) will publish about the failed bookie(through znode) and after recieving the notification anyone can acquire the lock and started re-replication and cycle will continue till complete re-replication. 

I'd like to know your opinion on handling bookie failures through central entity?.

Thanks,
Rakesh
                
> Introduce chain of bookies for distributing the re-replication
> --------------------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>
> The idea is to form logical chain and each Bookie will be taking care each other. On any Bookie failure, his neighbour will act and initiate re-replication.
> For example:
> Assume we have 1,2,3,4,5,6 bookies and will be forming the following chain
> 01 <- 02 <- 03 <- 04 <- 05 <- 06 <- 01
> Here, each one should take care of my immediate predecessor node. The lowest node should always care the highest node and is forming the logical closed chain.
> Reference docs attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Hudson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13442495#comment-13442495 ] 

Hudson commented on BOOKKEEPER-272:
-----------------------------------

Integrated in bookkeeper-trunk #674 (See [https://builds.apache.org/job/bookkeeper-trunk/674/])
    BOOKKEEPER-272: Provide automatic mechanism to know bookie failures (rakeshr via ivank) (Revision 1377716)

     Result = FAILURE
ivank : 
Files : 
* /zookeeper/bookkeeper/trunk/CHANGES.txt
* /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/replication/Auditor.java
* /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/replication/AuditorElector.java
* /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/replication/AuditorBookieTest.java
* /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/replication/AuditorLedgerCheckerTest.java

                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-auto-recovery
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.3.patch, BOOKKEEPER-272.4.patch, BOOKKEEPER-272.5.patch, BOOKKEEPER-272.6.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440202#comment-13440202 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

Oh! yeah, there was a small mistake in my previous comment with BK1. Please read the scenario as:

L0001 contains only BK1. while marking missingReplica of BK2, got NEE. Assume only single auditor and no other is marking. So again we need to merge to the zkMetadata and update in zk.

                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-auto-recovery
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.3.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Introduce chain of bookies for distributing the re-replication

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290811#comment-13290811 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

I have uploaded latest patch with few test cases for knowing the algo.

??How the Bookie failure detection algo works??
This scheme is made based on distributed approach. Apart from the single bookie auditor for detecting the bookie failures all other logics are completely distributed. I'm thinking Auditor would make the detection more simple as we have existing 'available' znode in zookeeper for knowing the bookie failures.

*Following are the logical steps:*
# ??Generate BookieId:??
Each bookie will be creating a unique Id under 'bookieids' path, this is a persistent node in ZooKeeper.
Keep IP:PORT info as his data into respective 'bookieid' znode.
For ex: 0001 is the bookieId and 10.18.40.13:2181 is bokkie IP. 0001 znode contains 10.18.40.13:2181 as data.
# ??Build per bookie-ledger mappings:??
This will help to know the bookie's content very quickly and avoid parsing of all the ledgers for knowing the failed bookie's ledgers again and again. 
# ??How to build per bookie-ledger mappings:??
When creating/reforming the ensemble, the'ledgerid' is putting under the respective 'bookieid'. During ensemble formation, metadata will give us the bookies info. We will this and add the 'ledgerid' only to the respective 'bookieid'.
For ex: 
0001 is the bookieId, say it contains children znodes as ledgers like: 0001/L_001,L_005 etc. 
0002 is another, say it contains children znodes as ledgers like: 0002/L_005 etc.
# ??Elect an Auditor Bookie for the entire Bookie cluster:?? Makes it simple and minimize the duplication efforts.
Only one Auditor is monitoring the available bookies, his responsibility is to watch the bookies 'available' znode. When he detects any bookie failure through 'NodeChildrenChanged' watcher, will publish the same in 'failedbookies' path in Zookeeper. Publishing is done by creating the respective unique 'bookieid' persistent znode in the 'failedbookies' path. All the other bookies will be watching on the 'failedbookies' znode.
# ??Starts Re-replication:??
Upon receiving the 'failebookies' notification, all the bookies will compete eachother for acquiring the lock(using zk distributed lock-ephemeral znode). Whoever acquires will start re-replication, all others will look into this lock for knowing the re-replication status. When creating the lock, the replica bookie will add its IP:PORT info to it. After finishing the re-replication, he will delete the lock and so that others will takeup and continue this cycle till the end.
For ex: 0002/L_005/lock

How Re-replication works, I have commented in BookKeeper-237. Please go through the link.
https://issues.apache.org/jira/browse/BOOKKEEPER-237?focusedCommentId=13281470&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13281470

Responsible classes:
- ServerConfiguration : for zk path configurations
- LedgerCreateOp,LedgerHandle : forming bookie-ledger mappings
- Bookie : starting bookie chain
- BKLedgerMapper : for updating bookie-ledger mapping

- BookieIdGenerator : generating unique bookieId
- BookieChain : forming the detection chain
- AuditorBookieChain : by default auditor based, later it can have CircularBookieChain etc.
- AuditorElector : do auditor election
- Auditor : watch bookies
- BookieObserver : listening failedbookie notifications.

Tests:
- BookieLedgerMetadataTest
- AuditorBookieTest
- BookieDetectionTest
- BookieIdGenTest


-Rakesh

                
> Introduce chain of bookies for distributing the re-replication
> --------------------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.patch
>
>
> The idea is to form logical chain and each Bookie will be taking care each other. On any Bookie failure, his neighbour will act and initiate re-replication.
> For example:
> Assume we have 1,2,3,4,5,6 bookies and will be forming the following chain
> 01 <- 02 <- 03 <- 04 <- 05 <- 06 <- 01
> Here, each one should take care of my immediate predecessor node. The lowest node should always care the highest node and is forming the logical closed chain.
> Reference docs attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Uma Maheswara Rao G (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438711#comment-13438711 ] 

Uma Maheswara Rao G commented on BOOKKEEPER-272:
------------------------------------------------

{quote}
There should be a main method in AutoRecoveryManager. I think AutoRecoveryManager should be responsible for running recoveryworker as well. Im not sure whether the node which runs auditor should run a recovery worker at the same time. This is an open question.
{quote}
Oh, I was assuming that, every Recovery node will have (Auditor Participants + RW ) services running. But one will actually win for doing auditor work. But RW will do its job normally in all nodes. All other recovery node's auditor participants will look for the next election to win the Real Auditing job. 
I am not sure my assumption is same as others. Any other thoughts?

                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.3.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Flavio Junqueira (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400628#comment-13400628 ] 

Flavio Junqueira commented on BOOKKEEPER-272:
---------------------------------------------

bq. I think we should keep it in the bookie process and enabled by default

I've expressed this concern before that I don't understand completely the use cases for the auditor. Short-lived ledgers will not need it because the window of vulnerability is narrow. In my mind the auditor is mainly for use cases that require long-lived ledgers. What's the rationale for having it enabled by default? 

bq. There's no strong requirement for it, and it just confuses matters. 

It confuses matters to mix both. BookKeeper provides a set of guarantees, and the auditor complements that set of properties by healing the replica set of under-replicated ledgers. The auditor also does not provide any functionality in the critical path of BookKeeper client calls. Given these two observations, it makes more sense to me to keep it separated. It makes it simpler to reason about what the system is doing and may avoid undesirable interferences.   
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Ivan Kelly (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294493#comment-13294493 ] 

Ivan Kelly commented on BOOKKEEPER-272:
---------------------------------------

I don't think we need the bookie. As we need to run a check on the ledger to find which parts are underreplicated (since some segments may not include the failed bookie), we may as well just record the ledger id. Also, it'd be better to only have one worker fixing a single ledger to avoid conflicting writes when updating the ledger metadata.
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated BOOKKEEPER-272:
--------------------------------

    Attachment:     (was: BOOKKEEPER-272.patch)
    
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Comment Edited] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Flavio Junqueira (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400522#comment-13400522 ] 

Flavio Junqueira edited comment on BOOKKEEPER-272 at 6/25/12 3:17 PM:
----------------------------------------------------------------------

{quote}
bq. For the moment, and maybe for ever, whether the auditor runs should be configurable

I haven't included in this patch, since there is still confusions whether to treat as a plugin service or not.
{quote}

I think that whether the auditor runs or not should be configurable. Is that a problem?
                
      was (Author: fpj):
    {quote}
{quote}
For the moment, and maybe for ever, whether the auditor runs should be configurable
{quote}
I haven't included in this patch, since there is still confusions whether to treat as a plugin service or not.
{quote}

I think that whether the auditor runs or not should be configurable. Is that a problem?
                  
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Ivan Kelly (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400581#comment-13400581 ] 

Ivan Kelly commented on BOOKKEEPER-272:
---------------------------------------

For the moment, I think we should keep it in the bookie process and enabled by default. We shouldn't worry about making a plugin interface, or keeping it in a separate maven module. There's no strong requirement for it, and it just confuses matters. 

However, it is good to have a clean interface for starting and stopping the service. 

Regarding BOOKKEEPER-246, this patch should call the interface defined to record the failed ledgers. I'll take a look at this in a while. I'll also work out the order in which the patches will need to go in.
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated BOOKKEEPER-272:
--------------------------------

    Attachment: BOOKKEEPER-272.4.patch
    
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-auto-recovery
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.3.patch, BOOKKEEPER-272.4.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400531#comment-13400531 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

bq.I think that whether the auditor runs or not should be configurable. Is that a problem?

Hi Flavio, Following are the options to make thie feature configurable:

# Initially I was thinking to provide new configuration item 'enableAutoRecovery=true' in the bk_server.conf file. By default this feature will be disabled.
# But after seeing the comments in BOOKKEEPER-304. Whether to consider as plugin process?
https://issues.apache.org/jira/browse/BOOKKEEPER-304?focusedCommentId=13398408&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13398408
For this, we need to expose a ServicePlugin interface and will be configured in the bk_server.conf for enabling the feature.


I'm just confused to take which one is more feasible?
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated BOOKKEEPER-272:
--------------------------------

    Description: 
The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.

There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

  was:
The idea is to form logical chain and each Bookie will be taking care each other. On any Bookie failure, his neighbour will act and initiate re-replication.

For example:
Assume we have 1,2,3,4,5,6 bookies and will be forming the following chain

01 <- 02 <- 03 <- 04 <- 05 <- 06 <- 01

Here, each one should take care of my immediate predecessor node. The lowest node should always care the highest node and is forming the logical closed chain.

Reference docs attached in BookKeeper-237.

        Summary: Provide automatic mechanism to know bookie failures  (was: Introduce chain of bookies for distributing the re-replication)

I have edited the JIRA subjectline inorder to reduce the scope of this task.
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Ivan Kelly (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440181#comment-13440181 ] 

Ivan Kelly commented on BOOKKEEPER-272:
---------------------------------------

{quote}
If I understand correctly, on NodeExistsException we are trying to append the missingReplica to the underreplicated ledger so that will notifies about one more down bookie which contains the ledger copy.
If we return simply, then the setData() method will not be called and there is a chance of missing the info about second replica.

For Ex: L00001 ensemble BK1, BK2, BK3.
Say BK1 fails initially, then will markUnderreplicated ledger as L000001(BK1 as the data).
Now again BK2 has failed, then while creating will get NEE, so we will append BK2 also like: L000001(BK1 BK2).

I think "break; statement" is making sense and after that the duplicate entry addition should be removed as per my latest patch.
Am I missing anything?
{quote}
The point of this code is, that if the node already exists, then there is already a missing replica. We loop through the missingReplicas, to see if the new missingReplica is already there or not. If so, then we can assume someone else has reported this replica missing, so we return. 

{quote}
Yeah I'll add main method also. But what about retaining start() and stop() method as public. In future this will allow others(any external entity) to manage the recovery process easily ?
{quote}
start() and stop() as public is fine. But if initialization fails, the ctor should throw and exception. This way, the null checks are unneeded. 

Regarding Service, we can look at that again after this is in. See my comment on BOOKKEEPER-247.
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-auto-recovery
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.3.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated BOOKKEEPER-272:
--------------------------------

    Attachment: BOOKKEEPER-272.Auditor.patch

Thanks Ivan for the comments.I have updated the latest patch.

Auditor Bookie(single entity in bk cluster) - watching the available bookies and setup bookie failure notifications. 

Please have a look.
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Introduce chain of bookies for distributing the re-replication

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290837#comment-13290837 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

??Why BKLedgerMapping :?? Parsing of ledgers and identifying failed bookies may take long time, consider the case where lakhs of ledgers created by NN-HA edit logs. So I thought of having a reverse mapping(bk vs its ledgers). Since this mapping is constant until any re-replication, so I made it as persistent node under 'bookieid'.

??Why Auditor :?? Single instance for entire bk cluster. After analyzing multiple approaches I found this would avoid many problems, like few cases I mentioned earlier in this JIRA.Please see links:
https://issues.apache.org/jira/browse/BOOKKEEPER-272?focusedCommentId=13284884&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13284884

                
> Introduce chain of bookies for distributing the re-replication
> --------------------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.patch
>
>
> The idea is to form logical chain and each Bookie will be taking care each other. On any Bookie failure, his neighbour will act and initiate re-replication.
> For example:
> Assume we have 1,2,3,4,5,6 bookies and will be forming the following chain
> 01 <- 02 <- 03 <- 04 <- 05 <- 06 <- 01
> Here, each one should take care of my immediate predecessor node. The lowest node should always care the highest node and is forming the logical closed chain.
> Reference docs attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294839#comment-13294839 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

Its just suggestion only. I think will see whether any advantage in using failedbookies information during detection logic BOOKKEEPER-247 JIRA.

>From Auditor side, anyway it needs the data version logic to address the race condition mentioned in our above comments. So the auditor is publishing the underreplicated ledgerIds and the metadata looks like:

>> LedgerId with failedbookie details as comma separated data.

Example:
/ledgers/auditor/underreplicated/L0001 (data as -> Metadataversion, BK1, BK2)
/ledgers/auditor/underreplicated/L0002 (data as -> Metadataversion, BK1, BK2)
/ledgers/auditor/underreplicated/L0003 (data as -> Metadataversion, BK1, BK2)
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294398#comment-13294398 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

Hi,

As part of this Auditor logic, I'm planning to publish the suspected/underreplicated ledger metatdata as Ledger_BookieIP:PORT:

Assume we have three bookies BK1, BK2, BK3 and has ledgers L0001, L0002, L0003

Example:
Say, BK1 BK2 has failed and started two new bookies, then Auditor will be publishing as:
/ledgers/auditor/underreplicated/L0001_BK1
/ledgers/auditor/underreplicated/L0002_BK1
/ledgers/auditor/underreplicated/L0003_BK1
/ledgers/auditor/underreplicated/L0001_BK2
/ledgers/auditor/underreplicated/L0002_BK2
/ledgers/auditor/underreplicated/L0003_BK2

By this, I'm thinking the target re-replicators would be able to acquire distributed lock on L0001_BK1...etc and start re-replicating. Is this ok for you?

Thanks,
Rakesh
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440194#comment-13440194 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

Thanks for the clarifications. Just commented the fix I'm thinking. Could you have a look.

bq.The point of this code is, that if the node already exists, then there is already a missing replica. We loop through the missingReplicas, to see if the new missingReplica is already there or not. If so, then we can assume someone else has reported this replica missing, so we return.

so we have two scenarios:
# L0001 contains only BK1. while marking missingReplica of BK2, got NEE. Assume there is already a missing replica. Silently return as you told
# L0001 contains only BK2. while marking missingReplica of BK2, got NEE. Assume only single auditor and no other is marking. So again we need to merge to the zkMetadata and update in zk.

On NEE, 
check whether missing replica has already present in the zk urLedger metadata. If yes return otherwise merge the missing replica to the urLedger missingreplicas and call setData()

{code}
try {
       byte[] bytes = zkc.getData(znode, false, s);
       String existingMissingReplicas = new String(bytes, UTF8);
       if(existingMissingReplicas.contains(missingReplica)){
              return;
       }
       TextFormat.merge(existingMissingReplicas, builder);
       zkc.setData(znode,
                  TextFormat.printToString(builder.build()).getBytes(UTF8),
                  s.getVersion());
       return;
}catch (KeeperException.NoNodeException nne) {
{code}
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-auto-recovery
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.3.patch, BOOKKEEPER-272.Auditor.1.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Ivan Kelly (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294542#comment-13294542 ] 

Ivan Kelly commented on BOOKKEEPER-272:
---------------------------------------

{quote}
bq. I don't think we need the bookie

Here I could see one race condition. Say first Auditor is coming to publish failure of BK2 in L0001. Meantime BK4 has finished the re-replication of BK3's L0001 and about to delete the entry from /underreplicated. In this case, Auditor will silently continues by seeing L0001 and the other worker will delete the L0001 entry thinking there is no more failures.

Solution I'm thinking to check the data version before doing zk operation(similar logic we built in BKJM CurrentInProgress). I'm planning to keep data as failed bookie information.
{quote}
Yes, I think in this case, when we see BK3's failure and L0001 already exists, we should bump the version number. We shouldn't really be changing any vital data in zookeeper without checking the version number anyhow. Hopefully this will be a very rare situation anyhow, in a 3e2q ledger, two machines dropping like this would probably mean data loss.

{quote}
Yeah, I understand. I'm having one suggestion, anyway auditor knows about the failed bookies and its ledgers when publishing the underreplicated ledgers. Why don't we keep the failed bookie as data inside the underreplicated ledger. So the worker(segment checker) only looks to this bookie and get corresponding index directly from the ZK ledger metadata?.{quote}
Im not sure what you mean here. Having the failed bookie stored in the data is useful for debugging purposes, but we should do a check on the ledger beforehand anyhow to determine what to recover. Are you trying to avoid another read to the ledger znode?

                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-272) Provide automatic mechanism to know bookie failures

Posted by "Rakesh R (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294534#comment-13294534 ] 

Rakesh R commented on BOOKKEEPER-272:
-------------------------------------

{quote}
I don't think we need the bookie
{quote}
Here I could see one race condition. Say first Auditor is coming to publish failure of BK2 in L0001. Meantime BK4 has finished the re-replication of BK3's L0001 and about to delete the entry from /underreplicated. In this case, Auditor will silently continues by seeing L0001 and the other worker will delete the L0001 entry thinking there is no more failures.

Solution I'm thinking to check the data version before doing zk operation(similar logic we built in BKJM CurrentInProgress). I'm planning to keep data as failed bookie information.

{quote}
As we need to run a check on the ledger to find which parts are underreplicated (since some segments may not include the failed bookie), we may as well just record the ledger id.Also, it'd be better to only have one worker fixing a single ledger to avoid conflicting writes when updating the ledger metadata.
{quote}
Yeah, I understand. I'm having one suggestion, anyway auditor knows about the failed bookies and its ledgers when publishing the underreplicated ledgers. Why don't we keep the failed bookie as data inside the underreplicated ledger. So the worker(segment checker) only looks to this bookie and get corresponding index directly from the ZK ledger metadata?.
                
> Provide automatic mechanism to know bookie failures
> ---------------------------------------------------
>
>                 Key: BOOKKEEPER-272
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-272
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: BOOKKEEPER-272.1.patch, BOOKKEEPER-272.2.patch, BOOKKEEPER-272.Auditor.patch
>
>
> The idea is to build automatic mechanism to find out the bookie failures. Setup the bookie failure notifications to start the re-replication process.
> There are multiple approaches to findout bookie failures. Please refer the documents attached in BookKeeper-237.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira