Posted to dev@bookkeeper.apache.org by "Ivan Kelly (Created) (JIRA)" <ji...@apache.org> on 2011/11/29 18:37:39 UTC

[jira] [Created] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

EntryLogger doesn't detect when one of it's logfiles is corrupt
---------------------------------------------------------------

                 Key: BOOKKEEPER-126
                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-126
             Project: Bookkeeper
          Issue Type: Bug
            Reporter: Ivan Kelly
            Priority: Blocker
             Fix For: 4.1.0


If an entry log is corrupt, the bookie will ignore any entries past the corruption. Quorum writes stop this from being a problem at the moment, but we should detect corruption like this and rereplicate if necessary.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

Posted by "Rakesh R (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242095#comment-13242095 ] 

Rakesh R commented on BOOKKEEPER-126:
-------------------------------------

So if I understand the conclusion correctly, we have discussed and identified two cases to be implemented as part of this jira:
# *When ledger flushing fails with an IOException?*
    +Solution+ r-o (read-only) mode:
    >> On an IOE, a bookie with multiple ledger dirs (say /tmp/bk1-data, /tmp/bk2-data, etc.) should try the next ledger dir for writing and mark the dirs it has already tried as BAD_FOR_WRITE. Finally, if none of them succeeds, it switches to r-o mode.
    >> Also, if the journal fails with an IOE, switch to r-o mode immediately.
    Shall I open a subtask for the implementation?
# *Ledger entries are corrupted due to disk failures or bad sectors?*
   +Solution+ scanner approach:
   IMHO, the healing procedure would be the following sequence:
   * a) Perform a scan and work out which entries the bookie owns:
    >> On startup the bookie would contact ZK for the ledger metadata, and on every write it would update the ledger metadata map.
    >> A special data structure <ledgerDirId, <entryId, replica bookies>> needs to be designed for this, containing the ledgerId, the entries owned, the ledger dirs, etc.?

   * b) Read the entries and identify missing entries, if any:
   Yeah, the DistributionScheduling happens on the client side, and batch reading is also good.
   I am thinking that, since the ledgers are local to the server, how about reading them directly instead of using PerChannelBookieClient?

   * c) Initiate re-replication:
   The corrupted bookie first identifies a peer bookie which has a copy and sends it a notification to re-replicate. It could use ZK watchers to send the notification; for this, each bookie should listen on a specific persistent znode, say 'underreplicaEntries'. The corrupted bookie should write the data <ledgerId, missingEntryIds> to the 'underreplicaEntries' znode of the bookie which has the copy. On notification, the peer bookie should use the same DistributionScheduling logic that is present on the client side.
Is it legal for the server to depend on the client? Otherwise, the server could randomly select a bookie to re-replicate to and update the ZK ledger metadata?
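
The directory fallback described in case 1 above can be sketched roughly as follows. This is only an illustration of the idea under discussion, not existing BookKeeper code: `LedgerDirFallback`, `DirWriter`, and all method names are hypothetical, and the BAD_FOR_WRITE marking is modelled as a simple in-memory set.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper, not an existing BookKeeper class. */
class LedgerDirFallback {
    /** One write attempt against a single ledger directory. */
    interface DirWriter {
        void writeTo(String dir) throws IOException;
    }

    private final List<String> ledgerDirs;
    private final Set<String> badForWrite = new LinkedHashSet<>();
    private boolean readOnly = false;

    LedgerDirFallback(List<String> ledgerDirs) {
        this.ledgerDirs = new ArrayList<>(ledgerDirs);
    }

    /** Returns the dir that accepted the write, or null once the bookie is read-only. */
    String write(DirWriter writer) {
        if (readOnly) {
            return null;
        }
        for (String dir : ledgerDirs) {
            if (badForWrite.contains(dir)) {
                continue; // already failed once, do not retry it
            }
            try {
                writer.writeTo(dir);
                return dir;
            } catch (IOException e) {
                badForWrite.add(dir); // mark as BAD_FOR_WRITE and try the next dir
            }
        }
        readOnly = true; // every dir failed: switch to r-o mode
        return null;
    }

    /** A journal failure switches to r-o mode immediately. */
    void onJournalFailure() {
        readOnly = true;
    }

    boolean isReadOnly() {
        return readOnly;
    }

    Set<String> badDirs() {
        return badForWrite;
    }

    public static void main(String[] args) {
        LedgerDirFallback dirs =
            new LedgerDirFallback(Arrays.asList("/tmp/bk1-data", "/tmp/bk2-data"));
        // Simulate a failing first disk; the write should land in the second dir.
        String used = dirs.write(dir -> {
            if (dir.endsWith("bk1-data")) {
                throw new IOException("simulated disk failure");
            }
        });
        System.out.println("wrote to " + used + ", bad dirs: " + dirs.badDirs());
    }
}
```

Whether a dir marked bad should ever be retried (for example after an admin replaces the disk) is left open here, as it is in the discussion.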

What does the ZK ledger metadata ('nextReplicaIndexToReadFrom') look like after re-replication?
   For example:
   Say, entries 0-100 ledger metadata mapping is
   0 (A, B, C)
   50(B, C, D)
   End Ledger:100
   
   Assume entries 30 to 39 got corrupted in B and were, say, re-replicated to E. Is it like this?
   0(A, B, C)
   30(E, B, C)
   40(B, C, D)
   50(B, C, D)

If you agree with the above approaches, I will probably do a detailed write-up.


   @Sijie
   another tough thing is we need to tell closed ledger from opened/in-recovery ledger, when handling last ensemble of opened/in-recovery ledger.

   I am missing something here; could you give more details on this?
                

        

[jira] [Commented] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

Posted by "Rakesh R (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229993#comment-13229993 ] 

Rakesh R commented on BOOKKEEPER-126:
-------------------------------------

Yes, I agree with you. It would be good to be able to handle under-replication in BookKeeper.

The following thoughts come to mind; please go through them.

*Proposal-1)* As per my observation, apart from the 'bookie down' scenario (where the admin tool can be automated), a failure of any one of the following 'flush()' operations can lead to data loss. Since it is an async operation, the client will be unaware of these failures; further entries will override the data, so only these entries need to be considered 'under-replicated' and trigger the under-replica action.
+Bookie.java+
{noformat}
try {
    ledgerCache.flushLedger(true);
} catch (IOException e) {
    LOG.error("Exception flushing Ledger", e);
    flushFailed = true;
}
try {
    entryLogger.flush();
} catch (IOException e) {
    LOG.error("Exception flushing entry logger", e);
    flushFailed = true;
}
{noformat}

*Proposal-2)* Initiate recovery whenever the client finds a missing entry and then successfully gets it from the next bookie.
There is still a window for data loss: say some data gets lost/corrupted and there is no read operation in the near future.

*Proposal-3)* A daemon thread can be associated with every bookie to periodically scan its own ledgers and their entries; if it finds any errors, it can contact ZK and try to initiate replication of those entries.
In this case, a mechanism needs to be built for communication between bookies; as per my understanding, no inter-bookie protocol exists. Also, the cost of scanning will be very high if there are many ledgers/entries :-)
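
A rough sketch of the periodic scanner in Proposal-3, assuming the bookie already knows which entries it should hold and which ones it can actually read back. `PeriodicLedgerScanner` and its two input maps are hypothetical; how the maps are built (ledger metadata, index files) is exactly the open question in this thread. Only standard `java.util.concurrent` scheduling is used.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical scanner, not an existing BookKeeper class. */
class PeriodicLedgerScanner {
    private final Map<Long, Set<Long>> expected; // ledgerId -> entries this bookie should hold
    private final Map<Long, Set<Long>> readable; // ledgerId -> entries that can be read back

    PeriodicLedgerScanner(Map<Long, Set<Long>> expected, Map<Long, Set<Long>> readable) {
        this.expected = expected;
        this.readable = readable;
    }

    /** One pass over all ledgers; returns ledgerId -> sorted list of missing entry ids. */
    Map<Long, List<Long>> scanOnce() {
        Map<Long, List<Long>> missing = new TreeMap<>();
        for (Map.Entry<Long, Set<Long>> e : expected.entrySet()) {
            Set<Long> present = readable.getOrDefault(e.getKey(), Collections.emptySet());
            List<Long> gone = new ArrayList<>();
            for (Long entryId : new TreeSet<>(e.getValue())) {
                if (!present.contains(entryId)) {
                    gone.add(entryId);
                }
            }
            if (!gone.isEmpty()) {
                missing.put(e.getKey(), gone);
            }
        }
        return missing;
    }

    /** Runs scanOnce() on a daemon thread and reports non-empty results to the callback. */
    ScheduledExecutorService start(long periodSeconds, Consumer<Map<Long, List<Long>>> onMissing) {
        ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "ledger-scanner");
            t.setDaemon(true); // do not keep the JVM alive just for scanning
            return t;
        });
        exec.scheduleWithFixedDelay(() -> {
            Map<Long, List<Long>> missing = scanOnce();
            if (!missing.isEmpty()) {
                onMissing.accept(missing);
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return exec;
    }
}
```

The callback is where re-replication would be triggered; the sketch deliberately leaves that part out.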

-Rakesh
                

        

[jira] [Commented] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

Posted by "Sijie Guo (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233425#comment-13233425 ] 

Sijie Guo commented on BOOKKEEPER-126:
--------------------------------------

yup. a scanner is a possible way to handle under-replicated blocks. the following are just some thoughts of mine.

an entry might be placed multiple times in different entry log files, due to journal replaying. only the referenced entry position is recorded in the ledger index, so the scanner may scan ledger by ledger. the best place to run the scanner, I guess, would be in GarbageCollectorThread, after the gc actions (we don't need to care about the ledgers that get gc'ed).

when scanning a ledger, the bookie server should know which entries it should own, which means the bookie server needs the distribution info of the ledger. maybe we can record which DistributionSchedule a ledger used in the ledger metadata.
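
To make "which entries a bookie should own" concrete, here is a minimal sketch that assumes a round-robin schedule in which entry e is written to the quorumSize bookies starting at index e mod ensembleSize. That assumption, and the names `RoundRobinOwnership`, `ownsEntry`, and `ownedEntries`, are illustrative only; a ledger using a different schedule would need a different rule.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper illustrating round-robin entry ownership. */
class RoundRobinOwnership {
    /** True if the bookie at bookieIndex in the ensemble is in the write set of entryId. */
    static boolean ownsEntry(long entryId, int bookieIndex, int ensembleSize, int quorumSize) {
        for (int i = 0; i < quorumSize; i++) {
            if ((int) ((entryId + i) % ensembleSize) == bookieIndex) {
                return true;
            }
        }
        return false;
    }

    /** Entry ids in [first, last] that the bookie at bookieIndex is expected to hold. */
    static List<Long> ownedEntries(long first, long last, int bookieIndex,
                                   int ensembleSize, int quorumSize) {
        List<Long> owned = new ArrayList<>();
        for (long e = first; e <= last; e++) {
            if (ownsEntry(e, bookieIndex, ensembleSize, quorumSize)) {
                owned.add(e);
            }
        }
        return owned;
    }

    public static void main(String[] args) {
        // Ensemble of 3 bookies, each entry written to 2 of them.
        System.out.println(ownedEntries(0, 5, 1, 3, 2));
    }
}
```

A scanner would compare this expected set against what the ledger index can actually resolve.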

for inter-bookie communication, why not consider using PerChannelBookieClient? it may also be better to add a batch read op for performance.

another tough thing is we need to tell closed ledger from opened/in-recovery ledger, when handling last ensemble of opened/in-recovery ledger.



                

        

[jira] [Commented] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

Posted by "Rakesh R (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242264#comment-13242264 ] 

Rakesh R commented on BOOKKEEPER-126:
-------------------------------------

Oh, I think I understand now: for the corrupted/missing entries, the bookie should read from a brother bookie and try a write operation to itself (no ensemble change required).
One corner case: what if this re-write operation fails in all the ledger dirs? Probably we are forced to choose another bookie and form a new ensemble?

Secondly, for an opened/in-recovery ledger, how about checking all the brothers and, if no read succeeds for entry x, considering (x - 1) as the last entry? Packets in flight would be considered in the next pass. Is that ok?


                

        

[jira] [Commented] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

Posted by "Ivan Kelly (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229130#comment-13229130 ] 

Ivan Kelly commented on BOOKKEEPER-126:
---------------------------------------

I haven't seen this issue occur in the wild, but it's something we've reasoned is possible. So imagine that a logfile becomes corrupt, be it through truncation or being filled with junk. When the bookie tries to read an entry that was contained within the corrupted section, the read will fail as if that entry did not exist. This is safe within the system, because the entry will be read from another bookie. However, we've lost a replica, so the data is under-replicated and we don't know it. For this reason, each bookie should run some sort of fsck process at an interval to ensure that everything is replicated sufficiently.
                

        

[jira] [Commented] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

Posted by "Ivan Kelly (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230053#comment-13230053 ] 

Ivan Kelly commented on BOOKKEEPER-126:
---------------------------------------

@Sijie
Do we kill the bookie though? I think if many errors occur on flushing, we should take the bookie out of rotation, as it indicates a failing disk.

@Rakesh
Proposal-2 is interesting, but it would only run on read, by which time it could be too late. I think Proposal-3 is something we need in any case, though to start it wouldn't have to be a daemon, but a tool that an admin could run to verify the filesystem is in order.
                

        

[jira] [Commented] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

Posted by "Rakesh R (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233327#comment-13233327 ] 

Rakesh R commented on BOOKKEEPER-126:
-------------------------------------

Yeah. Having the bookkeeper client not choose the r/o bookie when writing, irrespective of whether the ledger and journal directories are on the same or different disks, seems more feasible to me.


@Sijie
flushing failure will not cause any entry to be under-replicated (journal replay will recover it). The case we need to consider is entries before lastLogMark. If corruption happens on these entries, they are under-replicated.

Regarding this point: corruption due to disk failure will be a corner case, but I feel it would be good to consider it as well, since bookkeeper is intended for very sensitive metadata (either in this jira or in a separate jira task). Here it might be necessary to have periodic scanners that handle under-replicated blocks.
                

        

[jira] [Commented] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

Posted by "Sijie Guo (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242133#comment-13242133 ] 

Sijie Guo commented on BOOKKEEPER-126:
--------------------------------------

thanks, Rakesh.

+1 for opening a new jira to discuss r-o mode on an IOE when flushing journal/ledgers.

{quote}
b) Read the entries and identify missing entries if any?
Yeah, the DistributionScheduling is happening in the client side and batch reading is also good.
I am thinking that the ledgers are local to the server and how about read them directly instead of using PerChannelBookieClient?.
{quote}

oh, it seems I didn't explain it clearly in my previous comment. my thought is that the bookie server would just find the corrupted/missing entries that it should own, then schedule a re-replication procedure itself to read the corrupted/missing entries from its brother bookie servers (in the same quorum). so the read is a remote read from another server.

in this way, we don't even need to change the metadata in zookeeper.

as in the example you explained,

{quote}
Say, entries 0-100 ledger metadata mapping is
0 (A, B, C)
50(B, C, D)
End Ledger:100
{quote}

B runs a scanner itself and finds that 30-39 is corrupted/missing. it schedules a re-replication of (30-39); the re-replication would be a remote read of (30-39) from A or C, the other bookies in the ensemble covering those entries. we don't need to change the ledger metadata; changing ledger metadata would introduce a distributed consensus issue (you can refer to the discussion in BOOKKEEPER-112).
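
A minimal sketch of how a bookie could find the brothers to read from, assuming the ledger metadata is held as a sorted map from first entry id to ensemble and that every bookie in an ensemble stores every entry in its range. `EnsembleLookup` and its methods are hypothetical names, not BookKeeper APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical lookup over a ledger's ensemble changes. */
class EnsembleLookup {
    // first entry id of each ensemble -> bookies in that ensemble
    private final TreeMap<Long, List<String>> ensembles = new TreeMap<>();

    void addEnsemble(long firstEntryId, String... bookies) {
        ensembles.put(firstEntryId, Arrays.asList(bookies));
    }

    /** The ensemble whose range contains entryId. */
    List<String> ensembleFor(long entryId) {
        Map.Entry<Long, List<String>> e = ensembles.floorEntry(entryId);
        if (e == null) {
            throw new IllegalArgumentException("no ensemble covers entry " + entryId);
        }
        return e.getValue();
    }

    /** The other bookies in the ensemble, i.e. candidates for a remote read. */
    List<String> peersFor(long entryId, String self) {
        List<String> peers = new ArrayList<>(ensembleFor(entryId));
        peers.remove(self);
        return peers;
    }

    public static void main(String[] args) {
        EnsembleLookup lookup = new EnsembleLookup();
        lookup.addEnsemble(0, "A", "B", "C");
        lookup.addEnsemble(50, "B", "C", "D");
        System.out.println(lookup.peersFor(35, "B"));
        System.out.println(lookup.peersFor(60, "B"));
    }
}
```

With the mapping quoted above, entry 35 falls in the ensemble that starts at entry 0, so the candidates for B are the other members of that ensemble. No metadata is modified at any point, which is the property argued for here.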

{quote}
@Sijie
another tough thing is we need to tell closed ledger from opened/in-recovery ledger, when handling last ensemble of opened/in-recovery ledger.

I am missing something, Could you give more details on this?
{quote}

for a closed ledger, we know the entry range of each ensemble. but for an opened/in-recovery ledger, we have no idea about the end entry of the last ensemble.
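
The difference can be illustrated with a small sketch that derives the last entry of each ensemble from the ensemble start ids. For every ensemble except the last, the end is one less than the next start; for the last ensemble it is the ledger's last entry, which is only known once the ledger is closed. `EnsembleRanges` is a hypothetical helper and `OptionalLong.empty()` stands in for "unknown".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.OptionalLong;
import java.util.TreeMap;

/** Hypothetical helper computing the entry range covered by each ensemble. */
class EnsembleRanges {
    /**
     * Maps each ensemble's first entry id to its last entry id.
     * ledgerLastEntry is empty for an opened/in-recovery ledger.
     */
    static Map<Long, OptionalLong> lastEntryPerEnsemble(TreeMap<Long, ?> ensembles,
                                                        OptionalLong ledgerLastEntry) {
        Map<Long, OptionalLong> result = new LinkedHashMap<>();
        for (Long start : ensembles.keySet()) {
            Long nextStart = ensembles.higherKey(start);
            if (nextStart != null) {
                result.put(start, OptionalLong.of(nextStart - 1));
            } else {
                result.put(start, ledgerLastEntry); // unknown unless the ledger is closed
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> ensembles = new TreeMap<>();
        ensembles.put(0L, "A,B,C");
        ensembles.put(50L, "B,C,D");
        System.out.println(lastEntryPerEnsemble(ensembles, OptionalLong.of(100)));
        System.out.println(lastEntryPerEnsemble(ensembles, OptionalLong.empty()));
    }
}
```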
                

        

[jira] [Commented] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

Posted by "Rakesh R (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229124#comment-13229124 ] 

Rakesh R commented on BOOKKEEPER-126:
-------------------------------------

Hi Ivan,

Yeah, it's pretty interesting. Could you please give more details on entry log corruption and the possible cases?

-Rakesh
                

        

[jira] [Commented] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

Posted by "Sijie Guo (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232011#comment-13232011 ] 

Sijie Guo commented on BOOKKEEPER-126:
--------------------------------------

> Can we narrow down to cases where IOException occurs on flushing ledger entries and bookie is still running. Only those entries would select as under-replicated, 

I think a flushing failure will not cause any entry to be under-replicated (journal replay will recover it). The case we need to consider is entries before lastLogMark. If corruption happens on these entries, they are under-replicated. Your proposal-2 and proposal-3 could be used for detecting/re-replicating these entries.

The only side effect of a flushing failure is that all following writes may fail, but reads could still succeed: the data that failed to flush is still buffered in EntryLogger, so it can be read.

If we don't shut down the bookie server, it would still be in the available list. Write requests can still be sent to this bookie, but they would fail, and the client would choose a new ensemble to write to, which increases write latency (as we found in BOOKKEEPER-180).

For some IOExceptions such as 'No enough disk space', we should shut down the bookie server immediately to exclude it from the available list. I am not sure whether there is any other recoverable IO exception (meaning the first flush fails with an IOException and the second succeeds). If not, I think we could shut down the bookie server when encountering an IOException while flushing data.
                

        

[jira] [Updated] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

Posted by "Ivan Kelly (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Kelly updated BOOKKEEPER-126:
----------------------------------

    Fix Version/s:     (was: 4.1.0)
                   4.2.0
    

        

[jira] [Commented] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

Posted by "Rakesh R (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230932#comment-13230932 ] 

Rakesh R commented on BOOKKEEPER-126:
-------------------------------------

@Sijie
I agree with you; the simple logic is to shut down the bookie when the threshold is reached and hand control to the bookie recovery admin tool, or restart the bookie. I just added the alternative idea of handling replicas to dig further...
                

        

[jira] [Commented] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

Posted by "Sijie Guo (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230117#comment-13230117 ] 

Sijie Guo commented on BOOKKEEPER-126:
--------------------------------------

@Ivan

a good point. currently we don't kill the bookie.

if the ledger directory and journal directory are on the same disk, the journal flush would fail and the bookie would then be killed.

I think it would be great to add such logic to shut down the bookie when encountering too many IOExceptions during flushing.
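
One possible shape for such logic, as a sketch only: count consecutive flush failures and signal shutdown once a configured threshold is reached. `FlushFailureTracker` is a hypothetical name, and counting consecutive failures rather than a total is an assumption made here, not something settled in this discussion.

```java
/** Hypothetical tracker deciding when flush failures warrant a shutdown. */
class FlushFailureTracker {
    private final int threshold;
    private int consecutiveFailures = 0;

    FlushFailureTracker(int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        this.threshold = threshold;
    }

    /**
     * Records the outcome of one flush attempt.
     * Returns true when the bookie should shut down.
     */
    boolean record(boolean flushFailed) {
        // A successful flush resets the count; a failure extends the streak.
        consecutiveFailures = flushFailed ? consecutiveFailures + 1 : 0;
        return consecutiveFailures >= threshold;
    }

    public static void main(String[] args) {
        FlushFailureTracker tracker = new FlushFailureTracker(3);
        boolean[] outcomes = {true, true, false, true, true, true};
        for (boolean failed : outcomes) {
            System.out.println("failed=" + failed + " shutdown=" + tracker.record(failed));
        }
    }
}
```

Setting the threshold to 1 gives the "shut down on the first IOException" behaviour that is also being considered.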
                

        

[jira] [Commented] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

Posted by "Sijie Guo (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230044#comment-13230044 ] 

Sijie Guo commented on BOOKKEEPER-126:
--------------------------------------

Thanks, Rakesh R.

For Proposal-1, a flush error on SyncThread would not cause data loss. If a flush error happens, we don't roll the log marker, so all the entries are still in the journal files. These journal files can be replayed when the bookie server restarts.
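
The invariant described here can be sketched as follows: the log mark advances only when both flushes succeed, so after a failure everything past the old mark is still replayable from the journal. `CheckpointSketch` and `FlushTarget` are hypothetical names and journal positions are simplified to a single long; this is not the actual SyncThread code.

```java
import java.io.IOException;

/** Hypothetical illustration of "don't roll the log mark on flush failure". */
class CheckpointSketch {
    /** Anything that can be flushed to disk and may fail doing so. */
    interface FlushTarget {
        void flush() throws IOException;
    }

    // Journal position up to which data is known to be flushed to ledger storage.
    private long lastLogMark;

    CheckpointSketch(long initialMark) {
        this.lastLogMark = initialMark;
    }

    /** Flushes both targets; rolls the mark to journalPosition only if both succeed. */
    boolean checkpoint(long journalPosition, FlushTarget ledgerCache, FlushTarget entryLogger) {
        boolean flushFailed = false;
        try {
            ledgerCache.flush();
        } catch (IOException e) {
            flushFailed = true;
        }
        try {
            entryLogger.flush();
        } catch (IOException e) {
            flushFailed = true;
        }
        if (!flushFailed) {
            lastLogMark = journalPosition; // safe: data before this point is on disk
        }
        return !flushFailed;
    }

    /** Journal replay on restart would begin from this position. */
    long lastLogMark() {
        return lastLogMark;
    }
}
```

Entries written before the mark are the ones no longer covered by replay, which is why corruption there leads to under-replication.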
                

        

[jira] [Commented] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

Posted by "Sijie Guo (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233213#comment-13233213 ] 

Sijie Guo commented on BOOKKEEPER-126:
--------------------------------------

yeah. when I was working on BOOKKEEPER-180, I had considered letting the bookie go into readonly mode. one thing to do is to reject write requests on the server side; the other is to have the bookkeeper client not choose the readonly bookie, otherwise latency would increase due to ensemble changes.
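
Both halves of that idea can be sketched together. The names below (`AddResult`, `ReadOnlyAwareness`, `handleAdd`, `chooseEnsemble`) are hypothetical, and how a client would learn which bookies are readonly is assumed to be solved elsewhere (for example through ZK); the sketch only shows the two decisions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical result codes for an add request. */
enum AddResult { OK, READ_ONLY }

/** Hypothetical illustration of server-side rejection and client-side filtering. */
class ReadOnlyAwareness {
    /** Server side: a readonly bookie rejects writes instead of attempting them. */
    static AddResult handleAdd(boolean bookieIsReadOnly) {
        return bookieIsReadOnly ? AddResult.READ_ONLY : AddResult.OK;
    }

    /** Client side: pick the first ensembleSize bookies that are not readonly. */
    static List<String> chooseEnsemble(List<String> available, Set<String> readOnly,
                                       int ensembleSize) {
        List<String> chosen = new ArrayList<>();
        for (String bookie : available) {
            if (readOnly.contains(bookie)) {
                continue; // choosing it would only force an ensemble change later
            }
            chosen.add(bookie);
            if (chosen.size() == ensembleSize) {
                return chosen;
            }
        }
        throw new IllegalStateException("not enough writable bookies for ensemble of "
            + ensembleSize);
    }
}
```

Filtering on the client avoids the extra round trip in which a write fails and a new ensemble has to be chosen.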
                

[jira] [Commented] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

Posted by "Rakesh R (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230917#comment-13230917 ] 

Rakesh R commented on BOOKKEEPER-126:
-------------------------------------

Thanks Sijie, Ivan for the suggestions :-)

I think the journal IOException would immediately reach the client as an addEntry failure, so the client would be able to act on it.

IMO, can we narrow this down to the cases where an IOException occurs while flushing ledger entries and the bookie is still running? Only those entries would be marked as under-replicated, and then either an external tool can be triggered or the bookie can be shut down.

Also, I am a bit confused about when to shut down the bookie and how to define the threshold for the number of IOExceptions. Instead of shutting down the bookie on IOException, could we make use of the ZK metadata (the entry-bookie mappings) and find a way, using ZK watchers, to notify the peer bookies that hold a replica of that entry (i.e. build an inter-bookie protocol through ZK)?
                

[jira] [Commented] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

Posted by "Sijie Guo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261323#comment-13261323 ] 

Sijie Guo commented on BOOKKEEPER-126:
--------------------------------------

Since the fsck-like tool is planned to be added in 4.2.0, how about moving this task (including its sub-task BOOKKEEPER-199) from 4.1.0 to 4.2.0?
                

[jira] [Commented] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

Posted by "Ivan Kelly (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232711#comment-13232711 ] 

Ivan Kelly commented on BOOKKEEPER-126:
---------------------------------------

Hmm, I'm not sure about shutting down now actually, because even if flushing fails, the data in the bookie which has already been flushed is still valid. It might make more sense to make the bookie read-only.
                

[jira] [Commented] (BOOKKEEPER-126) EntryLogger doesn't detect when one of it's logfiles is corrupt

Posted by "Sijie Guo (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242269#comment-13242269 ] 

Sijie Guo commented on BOOKKEEPER-126:
--------------------------------------

> One corner case, if this re-write operation is failed in all the ledger dirs?, probably forced to choose another bookie and form a new ensemble?

Maybe. A simple way is to choose a bookie to replicate all entries of the ensemble that contains the corrupted/missing entries.

{quote}
Say, entries 0-100 ledger metadata mapping is
0 (A, B, C)
50(B, C, D)
End Ledger:100
{quote}

In your example, 30-39 is corrupted and B is in the corner case you described, so we are forced to choose another bookie E. E replicates the entries in 0~50 that belong to B, and (A, B, C) is replaced with (A, E, C), as BookKeeperAdmin does.
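
The metadata side of that replacement can be sketched as follows. This is a simplified illustration, not BookKeeperAdmin's implementation: EnsembleSketch and replaceBookie are hypothetical names, the ledger metadata is modelled as a TreeMap from first entry id to ensemble, and copying the entries to E (which must happen before the metadata is updated) is assumed to be done elsewhere.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class EnsembleSketch {
    /**
     * Replaces bookie `bad` with `replacement` in the ensemble segment that covers entryId.
     * Returns the first entry id of the segment that was changed.
     */
    public static long replaceBookie(TreeMap<Long, List<String>> ensembles,
                                     long entryId, String bad, String replacement) {
        // The covering segment is the one with the greatest start id <= entryId.
        Map.Entry<Long, List<String>> segment = ensembles.floorEntry(entryId);
        if (segment == null) {
            throw new IllegalArgumentException("no ensemble covers entry " + entryId);
        }
        List<String> updated = new ArrayList<>(segment.getValue());
        int index = updated.indexOf(bad);
        if (index < 0) {
            throw new IllegalArgumentException(bad + " is not in the ensemble for entry " + entryId);
        }
        updated.set(index, replacement);
        ensembles.put(segment.getKey(), updated);
        return segment.getKey();
    }
}
```

For the quoted metadata, calling replaceBookie with entry 30, "B" and "E" changes the segment starting at 0 to (A, E, C) and leaves the segment starting at 50 as (B, C, D).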

>  Secondly, for an opened/in-recovery ledger, how about the idea to check all the brothers and if no read success corresponding to x entry, will consider (x - 1) as the last entry. Packets in flight will be considered in the next pass, is it ok?

Basically, that seems OK.
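
As a rough sketch of the quoted idea for open/in-recovery ledgers (hypothetical names throughout, with the read from a replica abstracted as a predicate rather than a real BookKeeper call): scan forward from the first unconfirmed entry, and as soon as no replica can serve entry x, treat x - 1 as the last entry.

```java
import java.util.List;
import java.util.function.BiPredicate;

public class LastEntrySketch {
    /**
     * Returns the last entry id that at least one replica can read, scanning from
     * firstUnconfirmed. canRead.test(bookie, entryId) stands in for a read attempt.
     */
    public static long findLastEntry(long firstUnconfirmed, List<String> replicas,
                                     BiPredicate<String, Long> canRead) {
        long x = firstUnconfirmed;
        while (true) {
            boolean anySuccess = false;
            for (String bookie : replicas) {
                if (canRead.test(bookie, x)) {
                    anySuccess = true;
                    break;
                }
            }
            if (!anySuccess) {
                // No replica has entry x, so x - 1 is taken as the last entry for this pass.
                return x - 1;
            }
            x++;
        }
    }
}
```

Entries still in flight during the scan are not seen here; as the quoted comment suggests, they would be picked up in the next pass.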
                