You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@bookkeeper.apache.org by GitBox <gi...@apache.org> on 2022/07/14 08:04:43 UTC
[GitHub] [bookkeeper] wolfstudy opened a new issue, #3405: Ledger has been in IN_RECOVERY state and cannot be recovered
wolfstudy opened a new issue, #3405:
URL: https://github.com/apache/bookkeeper/issues/3405
**BUG REPORT**
***Describe the bug***
Initially, we found that a topic in Pulsar could not do normal message production and sending operations. Therefore, querying the stats and stats-internal of the current topic returns no results, as follows:
Status:
```
{
"msgRateIn" : 0.0,
"msgThroughputIn" : 0.0,
"msgRateOut" : 0.0,
"msgThroughputOut" : 0.0,
"bytesInCounter" : 0,
"msgInCounter" : 0,
"bytesOutCounter" : 0,
"msgOutCounter" : 0,
"averageMsgSize" : 0.0,
"msgChunkPublished" : false,
"storageSize" : 0,
"backlogSize" : 0,
"offloadedStorageSize" : 0,
"publishers" : [ ],
"subscriptions" : { },
"replication" : { },
"nonContiguousDeletedMessagesRanges" : 0,
"nonContiguousDeletedMessagesRangesSerializedSize" : 0,
"publishRateLimitedTimes" : 0,
"metadata" : {
"partitions" : 1
},
"partitions" : { }
}
```
Status-internal:
```
{
"metadata" : {
"partitions" : 0
},
"partitions" : { }
}
```
At this point, I went to query Bookie's log and found the following error message:
![image](https://user-images.githubusercontent.com/20965307/178931175-d21d9200-af4e-4df7-9b85-dbbb795ea9c5.png)
And the source code is here: https://github.com/apache/pulsar/blob/6704f12104/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerFactoryImpl.java#L366-L374
Here we can see that the current Ledger state has not been restored within the time of the default configuration managedLedgerMetadataOperationsTimeoutSeconds=60s.
So we tried to use the readledger command to find that the current Ledger stores nearly 700w of Entries, so when the Broker tries to load the topic, it will open the current ledger, but the Ledger is always in the state of IN_RECOVERY, So it will enter the following code logic:
```
if (doRecovery) {
lh.recover(new OrderedGenericCallback<Void>(bk.getMainWorkerPool(), ledgerId) {
@Override
public void safeOperationComplete(int rc, Void result) {
if (rc == BKException.Code.OK) {
openComplete(BKException.Code.OK, lh);
} else if (rc == BKException.Code.UnauthorizedAccessException) {
closeLedgerHandle();
openComplete(BKException.Code.UnauthorizedAccessException, null);
} else {
closeLedgerHandle();
openComplete(bk.getReturnRc(BKException.Code.LedgerRecoveryException), null);
}
}
@Override
public String toString() {
return String.format("Recover(%d)", ledgerId);
}
});
}
```
However, since there are many Entry that need to be restored, we can see that the current Ledger has not been able to be restored, so the Topic on the Broker side cannot be loaded correctly, so the entire service has been unavailable.
***To Reproduce***
I don't know how this phenomenon came about, and I don't know how to reproduce this scene
## Solution
We locally mocked a bookie client object, and then re-fetched the properties of LedgerMetadata, forcing the current Ledger state to be reset from IN_RECOVERY to CLOSED state, and the problem recovered
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] eolivelli commented on issue #3405: Ledger has been in IN_RECOVERY state and cannot be recovered
Posted by GitBox <gi...@apache.org>.
eolivelli commented on issue #3405:
URL: https://github.com/apache/bookkeeper/issues/3405#issuecomment-1184377321
are you sure that this is a bookkeeper problem ?
do you have any logs from the bookkeeper client ? "org.apache.bookkeeper" loggers (excluding Pulsar mledger logs)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] StevenLuMT commented on issue #3405: Ledger has been in IN_RECOVERY state and cannot be recovered
Posted by GitBox <gi...@apache.org>.
StevenLuMT commented on issue #3405:
URL: https://github.com/apache/bookkeeper/issues/3405#issuecomment-1191750815
I have a question what is ledger's config
EnsembleSize=?
WriteQuorum=?
AckQuorum=?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org