Posted to issues@bookkeeper.apache.org by GitBox <gi...@apache.org> on 2018/01/27 01:31:41 UTC

[GitHub] sijie opened a new issue #1066: Don't allow write entries with same entry id multiple times in bookie

URL: https://github.com/apache/bookkeeper/issues/1066
 
 
   **FEATURE REQUEST**
   
   1. Please describe the feature you are requesting.
   
   Multiple entries with the same entry id can be written to one bookie in the following cases:
   
   - If `ensemble change` is disabled, the client will retry writing an entry to the same bookie after a failure. Such failures can come from timeouts.
   - If `ensemble change` is flapping, e.g. [A, B, C] -> [A, B, D] -> [A, B, C].
   - Auto recovery can write entries to bookies that used to be excluded from an ensemble.
   
   Currently, bookies accept multiple writes of entries with the same entry id. That means multiple entries with the same entry id can appear in entry log files, but only one entry's location will be recorded in the ledger cache.
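
To make the behavior described above concrete, here is a toy model, not BookKeeper code. The class `ToyEntryLog` and its methods are hypothetical, and the choice that the index points at the most recently written copy is an assumption made for illustration; the issue only states that one location is kept.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: an append-only entry log plus an index from entry id to log position. */
class ToyEntryLog {
    // Every write is appended, so duplicates accumulate here.
    private final List<byte[]> log = new ArrayList<>();
    // Exactly one log position is kept per entry id.
    private final Map<Long, Integer> index = new HashMap<>();

    /** Appends the payload and points the index at the newest copy (assumed behavior). */
    void addEntry(long entryId, byte[] payload) {
        log.add(payload.clone());
        index.put(entryId, log.size() - 1);
    }

    /** Number of payload copies physically present in the log. */
    int copiesInLog() {
        return log.size();
    }

    /** Number of distinct entry ids the index knows about. */
    int indexedEntries() {
        return index.size();
    }

    /** Returns the indexed copy of the entry, or null if the id is unknown. */
    byte[] read(long entryId) {
        Integer position = index.get(entryId);
        return position == null ? null : log.get(position).clone();
    }
}
```

Writing entry id 100 twice in this model leaves two copies in the log but only one indexed location, which is the situation the rest of this issue is concerned with.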
   
   Theoretically this is okay most of the time. In practice, however, we might want to disallow this behavior, since it raises potential "inconsistency" concerns if an entry is corrupted (e.g. memory corruption, client NIC corruption) between retries.
   
   A possible improvement:
   
   - enforce write-side checksums (#1046)
   - compare the entries if a bookie receives a duplicate entry
   - if the duplicate entry is the same, respond with success without writing the duplicate entry;
   - if the duplicate entry is not the same, reject the write and respond with a special error code.
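
The proposed steps could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the bookie implementation: `DuplicateEntryCheck`, `Result` and `ENTRY_MISMATCH` are hypothetical names, and comparing CRC32 checksums stands in for "compare the entries" (the issue does not say whether the comparison would be by checksum or byte by byte).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.zip.CRC32;

/** Hypothetical sketch of the proposed duplicate-entry check; not BookKeeper code. */
class DuplicateEntryCheck {
    /** Illustrative result codes; the issue does not name the real error code. */
    enum Result { OK, ENTRY_MISMATCH }

    // Maps "ledgerId:entryId" to the checksum of the payload that was written first.
    private final Map<String, Long> checksums = new HashMap<>();
    // Counts payloads actually written; a stand-in for appending to the entry log.
    private int writes = 0;

    Result addEntry(long ledgerId, long entryId, byte[] payload) {
        long checksum = crc32(payload);
        String key = ledgerId + ":" + entryId;
        Long existing = checksums.get(key);
        if (existing == null) {
            // First time this entry id is seen: write it and remember its checksum.
            checksums.put(key, checksum);
            writes++;
            return Result.OK;
        }
        if (existing.longValue() == checksum) {
            // Same content: respond with success but do not write again.
            return Result.OK;
        }
        // Different content under the same entry id: reject the write.
        return Result.ENTRY_MISMATCH;
    }

    int writes() {
        return writes;
    }

    private static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }
}
```

With this sketch, a retried write of an identical payload is acknowledged without a second write, while a retry whose payload changed in flight is rejected instead of silently landing in the entry log.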
   
   2. Indicate the importance of this issue to you (blocker, must-have, should-have, nice-to-have). Are you currently using any workarounds to address this issue?
   
   *must-have*
   
   3. Provide any additional detail on your proposed use case for this feature.
   
   More details from the conversation between @jvrao and @sijie:
   
   ```
   jujjuri [4:55 PM]
   Hi, I have a question about bk.disableEnsembleChangeFeature.isAvailable().
   
   [4:55 PM]
   I think this is @sijie's checkin.
   [4:56 PM]
   If this feature is enabled and on bookie write failure we are doing unsetSuccessAndSendWriteRequest
   [4:56 PM]
   it is possible that the bookie write failed for various reasons including timeout
   [4:56 PM]
   so in that case we could be sending duplicate entries to the bookie
   
   sijie [4:57 PM]
   @jujjuri checking
   
   sijie [5:02 PM]
   @jujjuri: yes. that change is to reattempt sending the requests. because disableEnsembleChange is enabled.
   [5:03 PM]
   disableEnsembleChange is only enabled when you configure a feature provider
   [5:03 PM]
   the default feature provider disable all features.
   
   jujjuri [5:03 PM]
   I understand @sijie even if we enable that feature
   [5:03 PM]
   I am saying we could send duplicate entries to bookies
   
   sijie [5:04 PM]
   yes
   
   jujjuri [5:04 PM]
   and bookies can handle duplicate entries ?
   
   sijie [5:04 PM]
   we can send duplicate entries. but what is the concern of sending duplicate entries?
   [5:05 PM]
   yes it handles duplicate entries
   
   jujjuri [5:05 PM]
   what happens ? does it fail?
   [5:05 PM]
   the second write? and we keep retrying it?
   [5:06 PM]
   tell me a case -
   
   sijie [5:06 PM]
   the second write doesn't fail.
   
   jujjuri [5:06 PM]
   we are trying to write to bookie1 entryId 100
   [5:06 PM]
   it failed for timeout
   [5:06 PM]
   and we resubmitted it
   
   sijie [5:07 PM]
   you might have multiple entries in entrylog files, and only one will be indexed.
   
   jujjuri [5:07 PM]
   first write succeeded; but second write failed
   [5:07 PM]
   with entryExists or something right?
   [5:07 PM]
   hmm
   [5:07 PM]
   that is my question
   
   sijie [5:07 PM]
   no the second write doesn't fail
   
   jujjuri [5:07 PM]
   if we endup multiple entries in the entrylog
   [5:07 PM]
   what if they are different?
   
   sijie [5:08 PM]
   why they will be different
   [5:08 PM]
   it is same situation even without this code path
   
   jujjuri [5:08 PM]
   I don't know. some bug in the client code or buffer corruption;
   
   sijie [5:08 PM]
   we will still send an entry multiple times.
   [5:08 PM]
   think about ensemble changes.
   
   jujjuri [5:08 PM]
   sure
   [5:09 PM]
   but as per metadata
   
   sijie [5:09 PM]
   an ensemble is changed from [A, B, C] to [A, B, D], then back to [A, B, C]
   
   jujjuri [5:09 PM]
   the failed bookie is not considered
   [5:09 PM]
   sure..
   [5:11 PM]
   Another Q: As part of 34e8bf200c6f3797bd6fa4c5d86646e9eb7f0d3b @ivankelly added a tracking for pendingWriteRequests in PendingAddOp.java; I don't see the use of this. Buffer is refcounted and released at BookieClient level. I am wondering if there is any real use of this counter.
   
   sijie [5:12 PM]
   so the question isn't related to whether that feature is enabled or not. the question is more about whether the bookies are allowed have multiple entries of same entry id. that is a valid question/concern, we can enforce 1) write side checksum 2) compare the entries if received a duplicated entry 3) if an entry is same, respond successfully but don't write again. if an entry isn't same, reject the write and respond some error code.
   ```
