You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@bookkeeper.apache.org by GitBox <gi...@apache.org> on 2022/06/12 15:16:24 UTC

[GitHub] [bookkeeper] hangc0276 opened a new pull request, #3329: Fix data lose when configured multiple ledger directories

hangc0276 opened a new pull request, #3329:
URL: https://github.com/apache/bookkeeper/pull/3329

   ### Motivation
   We found one place where the bookie may lose data even though we turn on fsync for the journal.
   Condition:
   - One journal disk, and turn on fsync for the journal
   - Configure two ledger disks, ledger1, and ledger2
   
   Assume we write 100MB data into one bookie, 70MB data written into ledger1's write cache, and 30 MB data written into ledger2's write cache. Ledger1's write cache is full and triggers flush. In flushing the write cache, it will trigger a checkpoint to mark the journal’s lastMark position (100MB’s offset) and write the lastMark position into both ledger1 and ledger2's lastMark file.
   
   At this time, this bookie shutdown without flush write cache, such as shutdown by `kill -9` command, and ledger2's write cache (30MB) doesn’t flush into ledger disk. But ledger2's lastMark position which persisted into lastMark file has been updated to 100MB’s offset.
   
   When the bookie starts up, the journal reply position will be `min(ledger1's lastMark, ledger2's lastMark)`, and it will be 100MB’s offset. The ledger2's 30MB data won’t reply and that data will be lost.
   
   
   Discussion thread:
   https://lists.apache.org/thread/zz5vvv2yd80vqy22fv8wg5s2lqtkrzh9
   
   ### Changes
   1. Add checkpoint with specific ledgerDirManager support. When checkpoint triggered by specific ledgerDirManager, we only write `lastMark` into specific ledgerDirs 
   2. When bookie startup, read the minimal `lastMark` instead of the maximal `lastMark` as current last mark.
   3. I will add a test soon.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] eolivelli merged pull request #3329: Fix data lost when configured multiple ledger directories

Posted by "eolivelli (via GitHub)" <gi...@apache.org>.
eolivelli merged PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] hangc0276 commented on a diff in pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
hangc0276 commented on code in PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#discussion_r900845051


##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Journal.java:
##########
@@ -660,7 +660,7 @@ static void writePaddingBytes(JournalChannel jc, ByteBuf paddingBuffer, int jour
     // Should data be fsynced on disk before triggering the callback
     private final boolean syncData;
 
-    private final LastLogMark lastLogMark = new LastLogMark(0, 0);
+    private final LastLogMark lastLogMark = new LastLogMark(Long.MAX_VALUE, Long.MAX_VALUE);

Review Comment:
   Maybe @merlimat has more context about init `lastLogMark` to 0



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] hangc0276 commented on a diff in pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
hangc0276 commented on code in PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#discussion_r900843052


##########
tools/perf/src/main/java/org/apache/bookkeeper/tools/perf/journal/JournalWriter.java:
##########
@@ -272,7 +272,7 @@ void execute(Journal[] journals) throws Exception {
             for (Journal journal : journals) {
                 Checkpoint cp = journal.newCheckpoint();
                 try {
-                    journal.checkpointComplete(cp, true);
+                    journal.checkpointComplete(cp, true, null);

Review Comment:
   I have added the javadoc on the `checkpointComplete` method interface.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] wenbingshen commented on a diff in pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
wenbingshen commented on code in PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#discussion_r916071847


##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Journal.java:
##########
@@ -245,7 +246,8 @@ void readLog() {
                     }
                     bb.clear();
                     mark.readLogMark(bb);
-                    if (curMark.compare(mark) < 0) {
+                    // get the minimum mark position from all the ledger directories to ensure no data loss.
+                    if (curMark.compare(mark) > 0) {

Review Comment:
   Since we are now replaying the journal from the smallest `LogMark`, can we get the journal log mark position of the current entry to the `JournalScanner` and compare it with the checkpoint position on the ledger disk where the entry to be restored is located, and only restore the entry whose `logMark` position is greater than the `checkpoint`?  So as to avoid repeatedly writing the data that has been flushed to the disk.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] hangc0276 commented on pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
hangc0276 commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1159449889

   @merlimat  @eolivelli @nicoloboschi @gaozhangmin @lordcheng10 I have updated the code and added the test to cover this change, Please help take a look, thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] StevenLuMT commented on a diff in pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
StevenLuMT commented on code in PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#discussion_r929771426


##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/SingleDirectoryDbLedgerStorage.java:
##########
@@ -829,7 +831,7 @@ private void swapWriteCache() {
     public void flush() throws IOException {
         Checkpoint cp = checkpointSource.newCheckpoint();
         checkpoint(cp);
-        checkpointSource.checkpointComplete(cp, true);
+        checkpointSource.checkpointComplete(cp, true, ledgerDirsManager);

Review Comment:
   @hangc0276 
   SingleDirectoryDbLedgerStorage's ledgerDirsManager can't cover all directories, only currently managed directory,I think you need all ledger' dirs 
   <img width="500" alt="image" src="https://user-images.githubusercontent.com/42990025/180979609-ce424961-0d19-4e90-bddb-e29af2af6bbe.png">
   
   so the structure of LedgerDirsManager is very confusing, I am also optimizing this, see for details in mail talking with @eolivelli 
   https://lists.apache.org/thread/njw4yhdgv9db64notq5o3hc0k8bbpl8v



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] StevenLuMT commented on a diff in pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
StevenLuMT commented on code in PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#discussion_r929771426


##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/SingleDirectoryDbLedgerStorage.java:
##########
@@ -829,7 +831,7 @@ private void swapWriteCache() {
     public void flush() throws IOException {
         Checkpoint cp = checkpointSource.newCheckpoint();
         checkpoint(cp);
-        checkpointSource.checkpointComplete(cp, true);
+        checkpointSource.checkpointComplete(cp, true, ledgerDirsManager);

Review Comment:
   @hangc0276 
   SingleDirectoryDbLedgerStorage's ledgerDirsManager can't cover all directories, only currently managed directory,I think you need all ledger' dirs 
   <img width="200" alt="image" src="https://user-images.githubusercontent.com/42990025/180979609-ce424961-0d19-4e90-bddb-e29af2af6bbe.png">
   
   so the structure of LedgerDirsManager is very confusing, I am also optimizing this, see for details in mail talking with @eolivelli 
   https://lists.apache.org/thread/njw4yhdgv9db64notq5o3hc0k8bbpl8v



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] gaozhangmin commented on pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
gaozhangmin commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1155914475

   > > 2.When bookie startup, read the minimal lastMark instead of the maximal lastMark as current last mark.
   > > @hangc0276 How is this logic implemented? Do you need to modify the following logic?:
   > 
   > <img alt="image" width="1356" src="https://user-images.githubusercontent.com/19296967/173485039-9dc3959a-8e74-4b6b-97b5-85160f188fc2.png">
   
   +1, changes should also be made at `Journal.readLog`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] hangc0276 commented on pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
hangc0276 commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1159873860

   rerun failure checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] StevenLuMT commented on pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
StevenLuMT commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1225341396

   fix old workflow,please see #3455 for detail


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] hangc0276 commented on pull request #3329: Fix data lost when configured multiple ledger directories

Posted by "hangc0276 (via GitHub)" <gi...@apache.org>.
hangc0276 commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1605898652

   @eolivelli @merlimat @dlg99 @zymap I updated the code, and need your eyes for this PR, thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] hangc0276 commented on pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
hangc0276 commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1162741649

   > > > @hangc0276 I found when syncThread do flush, there exists duplicate `checkpointComplete` invoke.
   > > > SyncThread flush:
   > > > ```
   > > > private void flush() {
   > > >         Checkpoint checkpoint = checkpointSource.newCheckpoint();
   > > >         try {
   > > >             ledgerStorage.flush();
   > > >         } catch (NoWritableLedgerDirException e) {
   > > >             log.error("No writeable ledger directories", e);
   > > >             dirsListener.allDisksFull(true);
   > > >             return;
   > > >         } catch (IOException e) {
   > > >             log.error("Exception flushing ledgers", e);
   > > >             return;
   > > >         }
   > > > 
   > > >         if (disableCheckpoint) {
   > > >             return;
   > > >         }
   > > > 
   > > >         log.info("Flush ledger storage at checkpoint {}.", checkpoint);
   > > >         try {
   > > >             checkpointSource.checkpointComplete(null, checkpoint, false);
   > > >         } catch (IOException e) {
   > > >             log.error("Exception marking checkpoint as complete", e);
   > > >             dirsListener.allDisksFull(true);
   > > >         }
   > > >     }
   > > > ```
   > > > 
   > > > 
   > > >     
   > > >       
   > > >     
   > > > 
   > > >       
   > > >     
   > > > 
   > > >     
   > > >   
   > > > ```
   > > > @Override
   > > >     public void flush() throws IOException {
   > > >         Checkpoint cp = checkpointSource.newCheckpoint();
   > > >         checkpoint(cp);
   > > >         checkpointSource.checkpointComplete(ledgerDir, cp, true);
   > > >     }
   > > > ```
   > > 
   > > 
   > > I also think the checkpoint is duplicated here
   > 
   > @hangc0276 This problem is not going to be resolved here, right?
   
   @gaozhangmin  Yes,we can use another PR to solve it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] gaozhangmin commented on pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
gaozhangmin commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1164207225

   > > > > @hangc0276 I found when syncThread do flush, there exists duplicate `checkpointComplete` invoke.
   > > > > SyncThread flush:
   > > > > ```
   > > > > private void flush() {
   > > > >         Checkpoint checkpoint = checkpointSource.newCheckpoint();
   > > > >         try {
   > > > >             ledgerStorage.flush();
   > > > >         } catch (NoWritableLedgerDirException e) {
   > > > >             log.error("No writeable ledger directories", e);
   > > > >             dirsListener.allDisksFull(true);
   > > > >             return;
   > > > >         } catch (IOException e) {
   > > > >             log.error("Exception flushing ledgers", e);
   > > > >             return;
   > > > >         }
   > > > > 
   > > > >         if (disableCheckpoint) {
   > > > >             return;
   > > > >         }
   > > > > 
   > > > >         log.info("Flush ledger storage at checkpoint {}.", checkpoint);
   > > > >         try {
   > > > >             checkpointSource.checkpointComplete(null, checkpoint, false);
   > > > >         } catch (IOException e) {
   > > > >             log.error("Exception marking checkpoint as complete", e);
   > > > >             dirsListener.allDisksFull(true);
   > > > >         }
   > > > >     }
   > > > > ```
   > > > > 
   > > > > 
   > > > >     
   > > > >       
   > > > >     
   > > > > 
   > > > >       
   > > > >     
   > > > > 
   > > > >     
   > > > >   
   > > > > ```
   > > > > @Override
   > > > >     public void flush() throws IOException {
   > > > >         Checkpoint cp = checkpointSource.newCheckpoint();
   > > > >         checkpoint(cp);
   > > > >         checkpointSource.checkpointComplete(ledgerDir, cp, true);
   > > > >     }
   > > > > ```
   > > > 
   > > > 
   > > > I also think the checkpoint is duplicated here
   > > 
   > > 
   > > @hangc0276 This problem is not going to be resolved here, right?
   > 
   > @gaozhangmin Yes,we can use another PR to solve it.
   
   I submit pr #3353 to solve this issue. @hangc0276  PTAL


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] wenbingshen commented on a diff in pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
wenbingshen commented on code in PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#discussion_r916071847


##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Journal.java:
##########
@@ -245,7 +246,8 @@ void readLog() {
                     }
                     bb.clear();
                     mark.readLogMark(bb);
-                    if (curMark.compare(mark) < 0) {
+                    // get the minimum mark position from all the ledger directories to ensure no data loss.
+                    if (curMark.compare(mark) > 0) {

Review Comment:
   Since we are now replaying the journal from the smallest `LogMark`, can we get the journal log mark position of the current entry to the `JournalScanner` and compare it with the checkpoint position on the ledger disk where the entry to be restored is located, and only restore the entry whose `logMark` position is greater than the `checkpoint`?  So as to avoid repeatedly writing the data that has been flushed to the ledger disk.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] StevenLuMT commented on a diff in pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
StevenLuMT commented on code in PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#discussion_r929771426


##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/SingleDirectoryDbLedgerStorage.java:
##########
@@ -829,7 +831,7 @@ private void swapWriteCache() {
     public void flush() throws IOException {
         Checkpoint cp = checkpointSource.newCheckpoint();
         checkpoint(cp);
-        checkpointSource.checkpointComplete(cp, true);
+        checkpointSource.checkpointComplete(cp, true, ledgerDirsManager);

Review Comment:
   @hangc0276 
   SingleDirectoryDbLedgerStorage's ledgerDirsManager can't cover all directories, only currently managed directory,I think you need all ledger' dirs 
   <img width="1000" alt="image" src="https://user-images.githubusercontent.com/42990025/180979609-ce424961-0d19-4e90-bddb-e29af2af6bbe.png">
   
   so the structure of LedgerDirsManager is very confusing, I am also optimizing this, see for details in mail talking with @eolivelli 
   https://lists.apache.org/thread/njw4yhdgv9db64notq5o3hc0k8bbpl8v



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] hangc0276 commented on pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
hangc0276 commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1159982087

   rerun failure checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] lordcheng10 commented on pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
lordcheng10 commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1154661258

   > 2.When bookie startup, read the minimal lastMark instead of the maximal lastMark as current last mark.
   @hangc0276 How is this logic implemented? Do you need to modify the following logic?:
   <img width="1356" alt="image" src="https://user-images.githubusercontent.com/19296967/173485039-9dc3959a-8e74-4b6b-97b5-85160f188fc2.png">
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] gaozhangmin commented on pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
gaozhangmin commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1161450588

   > > @hangc0276 I found when syncThread do flush, there exists duplicate `checkpointComplete` invoke.
   > > SyncThread flush:
   > > ```
   > > private void flush() {
   > >         Checkpoint checkpoint = checkpointSource.newCheckpoint();
   > >         try {
   > >             ledgerStorage.flush();
   > >         } catch (NoWritableLedgerDirException e) {
   > >             log.error("No writeable ledger directories", e);
   > >             dirsListener.allDisksFull(true);
   > >             return;
   > >         } catch (IOException e) {
   > >             log.error("Exception flushing ledgers", e);
   > >             return;
   > >         }
   > > 
   > >         if (disableCheckpoint) {
   > >             return;
   > >         }
   > > 
   > >         log.info("Flush ledger storage at checkpoint {}.", checkpoint);
   > >         try {
   > >             checkpointSource.checkpointComplete(null, checkpoint, false);
   > >         } catch (IOException e) {
   > >             log.error("Exception marking checkpoint as complete", e);
   > >             dirsListener.allDisksFull(true);
   > >         }
   > >     }
   > > ```
   > > 
   > > 
   > >     
   > >       
   > >     
   > > 
   > >       
   > >     
   > > 
   > >     
   > >   
   > > ```
   > > @Override
   > >     public void flush() throws IOException {
   > >         Checkpoint cp = checkpointSource.newCheckpoint();
   > >         checkpoint(cp);
   > >         checkpointSource.checkpointComplete(ledgerDir, cp, true);
   > >     }
   > > ```
   > 
   > I also think the checkpoint is duplicated here
   
   @hangc0276  This problem is not going to be resolved  here, right?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] lordcheng10 commented on pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
lordcheng10 commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1154664769

   > @hangc0276 I found when syncThread do flush, there exists duplicate `checkpointComplete` invoke.
   > 
   > SyncThread flush:
   > 
   > ```
   > private void flush() {
   >         Checkpoint checkpoint = checkpointSource.newCheckpoint();
   >         try {
   >             ledgerStorage.flush();
   >         } catch (NoWritableLedgerDirException e) {
   >             log.error("No writeable ledger directories", e);
   >             dirsListener.allDisksFull(true);
   >             return;
   >         } catch (IOException e) {
   >             log.error("Exception flushing ledgers", e);
   >             return;
   >         }
   > 
   >         if (disableCheckpoint) {
   >             return;
   >         }
   > 
   >         log.info("Flush ledger storage at checkpoint {}.", checkpoint);
   >         try {
   >             checkpointSource.checkpointComplete(null, checkpoint, false);
   >         } catch (IOException e) {
   >             log.error("Exception marking checkpoint as complete", e);
   >             dirsListener.allDisksFull(true);
   >         }
   >     }
   > ```
   > 
   > ```
   > @Override
   >     public void flush() throws IOException {
   >         Checkpoint cp = checkpointSource.newCheckpoint();
   >         checkpoint(cp);
   >         checkpointSource.checkpointComplete(ledgerDir, cp, true);
   >     }
   > ```
   
   I also think the checkpoint is duplicated here


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] wenbingshen commented on a diff in pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
wenbingshen commented on code in PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#discussion_r916071847


##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Journal.java:
##########
@@ -245,7 +246,8 @@ void readLog() {
                     }
                     bb.clear();
                     mark.readLogMark(bb);
-                    if (curMark.compare(mark) < 0) {
+                    // get the minimum mark position from all the ledger directories to ensure no data loss.
+                    if (curMark.compare(mark) > 0) {

Review Comment:
   Since we are now replaying the journal from the smallest LogMark, can we get the journal log mark position of the current entry to the JournalScanner and compare it with the checkpoint position on the ledger disk where the entry to be restored is located, and only restore the entry whose logMark position is greater than the checkpoint? So as to avoid repeatedly writing the data that has been flushed to the disk.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] hangc0276 commented on a diff in pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
hangc0276 commented on code in PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#discussion_r929466044


##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Journal.java:
##########
@@ -245,7 +246,8 @@ void readLog() {
                     }
                     bb.clear();
                     mark.readLogMark(bb);
-                    if (curMark.compare(mark) < 0) {
+                    // get the minimum mark position from all the ledger directories to ensure no data loss.
+                    if (curMark.compare(mark) > 0) {

Review Comment:
   It will introduce complex logic for this comparison.
   1. If the ledger directory expands or shrinks, the map logic of the ledgerId to the ledger directory (`logMark`) also changed. It may be located on the wrong `logMark` file, and will lead to skipping the unflushed entries.
   2. There are many kinds of storage implementation, such as dbLedgerStorage, SortedLedgerStorage, and InterleavedLedgerStorage, we should get the ledgerId related storage instance to check the `logMark` position for each storage implementation. This operation will introduce complex logic.
   3. For the comparison, we can only save the write ledger throughput. We also need to read the data from the journal log file out.
   
   Based on the above reason, I prefer to replay all entries in the journal log file based on the min `logMark` position.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] StevenLuMT commented on pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
StevenLuMT commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1194919134

   rerun failure checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] hangc0276 commented on pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
hangc0276 commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1162741810

   rerun failure checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] hangc0276 commented on pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
hangc0276 commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1163261490

   rerun failure checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] hangc0276 commented on pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
hangc0276 commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1159449458

   > it's better to add proper test case to avoid future regressions
   
   @nicoloboschi I have added the test to cover this change, please help take a look, thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] hangc0276 commented on a diff in pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
hangc0276 commented on code in PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#discussion_r900800828


##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Journal.java:
##########
@@ -660,7 +660,7 @@ static void writePaddingBytes(JournalChannel jc, ByteBuf paddingBuffer, int jour
     // Should data be fsynced on disk before triggering the callback
     private final boolean syncData;
 
-    private final LastLogMark lastLogMark = new LastLogMark(0, 0);
+    private final LastLogMark lastLogMark = new LastLogMark(Long.MAX_VALUE, Long.MAX_VALUE);

Review Comment:
   When creating a Journal instance, it will initiate `lastLogMark` by reading each ledger directory's `lastMark` file and get the minimum position as the replay start point. So we should init lastLogMark with the MAX_VALUE.
   
   In fact, the original logic of init `lastLogMark` with 0, and get the maximum position of all the ledger directory's `lastMark` file to init the lastLogMark. IMO, it will lose data.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] hangc0276 commented on pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
hangc0276 commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1194954429

   This PR changed the default `logMark` behaviour, it lead to some unit test failure, I'm working on it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] gaozhangmin commented on a diff in pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
gaozhangmin commented on code in PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#discussion_r897477804


##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/BookieImpl.java:
##########
@@ -603,7 +604,7 @@ private void replay(Journal journal, JournalScanner scanner) throws IOException
             journalId >= markedLog.getLogFileId());
         // last log mark may be missed due to no sync up before
         // validate filtered log ids only when we have markedLogId
-        if (markedLog.getLogFileId() > 0) {
+        if (markedLog.getLogFileId() > 0 && markedLog.getLogFileId() != Long.MAX_VALUE) {

Review Comment:
   Change should also be made at `Journal.readLog`, which curMark should be set to the min mark from ledgers dir.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] eolivelli commented on a diff in pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
eolivelli commented on code in PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#discussion_r895396529


##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/BookieImpl.java:
##########
@@ -369,8 +369,9 @@ public Checkpoint newCheckpoint() {
                 }
 
                 @Override
-                public void checkpointComplete(Checkpoint checkpoint, boolean compact)
-                        throws IOException {
+                public void checkpointComplete(Checkpoint checkpoint,
+                                               boolean compact,
+                                               LedgerDirsManager ledgerDirsManager1) throws IOException {

Review Comment:
   nit: `ledgerDirsManager`



##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/BookieImpl.java:
##########
@@ -603,7 +604,7 @@ private void replay(Journal journal, JournalScanner scanner) throws IOException
             journalId >= markedLog.getLogFileId());
         // last log mark may be missed due to no sync up before
         // validate filtered log ids only when we have markedLogId
-        if (markedLog.getLogFileId() > 0) {
+        if (markedLog.getLogFileId() > 0 && markedLog.getLogFileId() != Long.MAX_VALUE) {

Review Comment:
   I wonder if it would be better to have a constant for Long.MAX_VALUE



##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/CheckpointSource.java:
##########
@@ -89,7 +89,9 @@ public String toString() {
      * @param compact
      *          Flag to compact old checkpoints.
      */
-    void checkpointComplete(Checkpoint checkpoint, boolean compact) throws IOException;
+    void checkpointComplete(Checkpoint checkpoint,
+                            boolean compact,
+                            LedgerDirsManager ledgerDirsManager) throws IOException;

Review Comment:
   we should add javadocs here and explain why we have this `ledgerDirsManager` and when it may be null



##########
tools/perf/src/main/java/org/apache/bookkeeper/tools/perf/journal/JournalWriter.java:
##########
@@ -272,7 +272,7 @@ void execute(Journal[] journals) throws Exception {
             for (Journal journal : journals) {
                 Checkpoint cp = journal.newCheckpoint();
                 try {
-                    journal.checkpointComplete(cp, true);
+                    journal.checkpointComplete(cp, true, null);

Review Comment:
   what about adding a comment here ?
   `null` means that the checkpoint is not started by a specific LedgersDirManager



##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Journal.java:
##########
@@ -660,7 +660,7 @@ static void writePaddingBytes(JournalChannel jc, ByteBuf paddingBuffer, int jour
     // Should data be fsynced on disk before triggering the callback
     private final boolean syncData;
 
-    private final LastLogMark lastLogMark = new LastLogMark(0, 0);
+    private final LastLogMark lastLogMark = new LastLogMark(Long.MAX_VALUE, Long.MAX_VALUE);

Review Comment:
   we should make a public constant here, otherwise magic numbers are easily forgotten.
   
   also, I may understand why we are changing from 0 to MAX_VALUE, but...what is the impact ?



##########
bookkeeper-server/src/test/java/org/apache/bookkeeper/bookie/BookieAccessor.java:
##########
@@ -35,6 +35,6 @@ public static void forceFlush(BookieImpl b) throws IOException {
         CheckpointSourceList source = new CheckpointSourceList(b.journals);
         Checkpoint cp = source.newCheckpoint();
         b.ledgerStorage.flush();
-        source.checkpointComplete(cp, true);
+        source.checkpointComplete(cp, true, null);

Review Comment:
   so `null` means "all"...
   we should document this in the javadocs



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] gaozhangmin commented on pull request #3329: Fix data lose when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
gaozhangmin commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1153526529

   @hangc0276 I found when syncThread do flush, there exists duplicate `checkpointComplete` invoke.
   
   SyncThread flush:
   ```
   private void flush() {
           Checkpoint checkpoint = checkpointSource.newCheckpoint();
           try {
               ledgerStorage.flush();
           } catch (NoWritableLedgerDirException e) {
               log.error("No writeable ledger directories", e);
               dirsListener.allDisksFull(true);
               return;
           } catch (IOException e) {
               log.error("Exception flushing ledgers", e);
               return;
           }
   
           if (disableCheckpoint) {
               return;
           }
   
           log.info("Flush ledger storage at checkpoint {}.", checkpoint);
           try {
               checkpointSource.checkpointComplete(null, checkpoint, false);
           } catch (IOException e) {
               log.error("Exception marking checkpoint as complete", e);
               dirsListener.allDisksFull(true);
           }
       }
   ```
   
   ```
   @Override
       public void flush() throws IOException {
           Checkpoint cp = checkpointSource.newCheckpoint();
           checkpoint(cp);
           checkpointSource.checkpointComplete(ledgerDir, cp, true);
       }
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] hangc0276 commented on pull request #3329: Fix data lost when configured multiple ledger directories

Posted by GitBox <gi...@apache.org>.
hangc0276 commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1159893852

   rerun failure checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] aloyszhang commented on a diff in pull request #3329: Fix data lost when configured multiple ledger directories

Posted by "aloyszhang (via GitHub)" <gi...@apache.org>.
aloyszhang commented on code in PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#discussion_r1194830852


##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/BookieImpl.java:
##########
@@ -872,6 +869,9 @@ int shutdown(int exitCode) {
                 // Shutdown the EntryLogger which has the GarbageCollector Thread running
                 ledgerStorage.shutdown();
 
+                // Shutdown Sync thread
+                syncThread.shutdown();

Review Comment:
   Here may through exceptions when shutdown the `syncThread` which will call `checkpoint`  of ledgerStoreage, since we have already shut down the ledgeerStorage before



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] hangc0276 commented on pull request #3329: Fix data lost when configured multiple ledger directories

Posted by "hangc0276 (via GitHub)" <gi...@apache.org>.
hangc0276 commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1547430443

   > There are two places that can trigger `checkpoint`.
   > 
   > 1. the scheduled tasks in `SyncThread.doCheckpoint`
   > 2. the ledgerDir write-cache full and then flush
   >    The second way is the root cause of data loss.
   > 
   > If removing the `checkpointSource.checkpointComplete` logic in `flush`, then at this point we will not delete journal files.
   > 
   > The scheduled task `SyncThread.doCheckpoint` will invoke `checkpointSource.checkpointComplete` and it's safe here to delete journal files since we have already flushed all write-caches for all ledger directories.
   > 
   > WDYT @hangc0276
   `
   @aloyszhang Thanks for your suggestion. Yes, making the `SyncThread.doCheckpoint` the only endpoint is the simplest way to solve this problem. I updated the PR description, please help take a look, thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] eolivelli commented on a diff in pull request #3329: Fix data lost when configured multiple ledger directories

Posted by "eolivelli (via GitHub)" <gi...@apache.org>.
eolivelli commented on code in PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#discussion_r1241227103


##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/SingleDirectoryDbLedgerStorage.java:
##########
@@ -905,7 +907,9 @@ private void swapWriteCache() {
     public void flush() throws IOException {
         Checkpoint cp = checkpointSource.newCheckpoint();
         checkpoint(cp);
-        checkpointSource.checkpointComplete(cp, true);
+        if (singleLedgerDirs) {

Review Comment:
   Please add a small comment with a quick description of the motivation for this condition



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] hangc0276 commented on a diff in pull request #3329: Fix data lost when configured multiple ledger directories

Posted by "hangc0276 (via GitHub)" <gi...@apache.org>.
hangc0276 commented on code in PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#discussion_r1200075769


##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/BookieImpl.java:
##########
@@ -872,6 +869,9 @@ int shutdown(int exitCode) {
                 // Shutdown the EntryLogger which has the GarbageCollector Thread running
                 ledgerStorage.shutdown();
 
+                // Shutdown Sync thread
+                syncThread.shutdown();

Review Comment:
   Yes, you are right. I updated the code, please help take a look, thanks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] aloyszhang commented on pull request #3329: Fix data lost when configured multiple ledger directories

Posted by "aloyszhang (via GitHub)" <gi...@apache.org>.
aloyszhang commented on PR #3329:
URL: https://github.com/apache/bookkeeper/pull/3329#issuecomment-1539798146

   There are two places that can trigger `checkpoint`.
   1. the scheduled tasks in `SyncThread.doCheckpoint` 
   2. the ledgerDir write-cache full and then flush
   The second way is the root cause of data loss.
   
   If removing the `checkpointSource.checkpointComplete` logic in `flush`, then at this point we will not delete journal files.
   
   The scheduled task `SyncThread.doCheckpoint` will invoke `checkpointSource.checkpointComplete` and it's safe here to delete journal files since we have already flushed all write-caches for all ledger directories.
   
   WDYT @hangc0276 
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org