Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/06/27 03:44:00 UTC
[GitHub] [doris] SWJTU-ZhangLei opened a new issue, #10436: [Bug] [FE] bdb recoveryTracker should overlap or follow on disk last VLSN of 488,692 recoveryFirst= 488,694
SWJTU-ZhangLei opened a new issue, #10436:
URL: https://github.com/apache/doris/issues/10436
### Search before asking
- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.
### Version
root@regtest-15-bj:~# /mnt/hdd01/DORIS_MASTER_ASAN/be/lib/palo_be --version
trunk DEBUG (build git://regtest-15-bj/mnt/hdd01/repo_center/doris_master/be/../@516f5b17894233771e86e17e6a950296b1ac596e)
Built on Fri, 24 Jun 2022 11:17:10 CST by root@regtest-15-bj
### What's Wrong?
The FE cannot start, and fe.log prints the following:
88668 2022-06-27 09:41:04,139 INFO (main|1) [Catalog.loadRecycleBin():1817] finished replay recycleBin from image
88669 2022-06-27 09:41:04,158 INFO (main|1) [Catalog.loadGlobalVariable():1824] finished replay globalVariable from image
88670 2022-06-27 09:41:04,162 INFO (main|1) [InternalDataSource.loadCluster():3117] finished replay cluster from image
88671 2022-06-27 09:41:04,163 INFO (main|1) [Catalog.loadBrokers():4468] finished replay brokerMgr from image
88672 2022-06-27 09:41:04,165 INFO (main|1) [Catalog.loadResources():1848] finished replay resources from image
88673 2022-06-27 09:41:04,165 INFO (main|1) [Catalog.loadExportJob():1713] finished replay exportJob from image
88674 2022-06-27 09:41:04,165 INFO (main|1) [Catalog.loadSyncJobs():1721] finished replay syncJobMgr from image
88675 2022-06-27 09:41:04,209 INFO (main|1) [Catalog.loadBackupHandler():1781] finished replay backupHandler from image
88676 2022-06-27 09:41:04,210 INFO (main|1) [PaloAuth.readFields():1818] Load PaloAuth from meta version < 111, degrade UserPrivTable to CatalogPrivTable
88677 2022-06-27 09:41:04,212 INFO (main|1) [Catalog.loadPaloAuth():1794] finished replay paloAuth from image
88678 2022-06-27 09:41:04,212 INFO (main|1) [Catalog.loadTransactionState():1802] finished replay transactionState from image
88679 2022-06-27 09:41:04,213 INFO (main|1) [Catalog.loadColocateTableIndex():1830] finished replay colocateTableIndex from image
88680 2022-06-27 09:41:04,213 INFO (main|1) [Catalog.loadRoutineLoadJobs():1836] finished replay routineLoadJobs from image
88681 2022-06-27 09:41:04,213 INFO (main|1) [Catalog.loadLoadJobsV2():1842] finished replay loadJobsV2 from image
88682 2022-06-27 09:41:04,213 INFO (main|1) [Catalog.loadSmallFiles():1854] finished replay smallFiles from image
88683 2022-06-27 09:41:04,213 INFO (main|1) [Catalog.loadPlugins():4755] finished replay plugins from image
88684 2022-06-27 09:41:04,316 INFO (main|1) [Catalog.loadDeleteHandler():1787] finished replay deleteHandler from image
88685 2022-06-27 09:41:04,317 INFO (main|1) [Catalog.loadSqlBlockRule():1862] finished replay sqlBlockRule from image
88686 2022-06-27 09:41:04,323 INFO (main|1) [Catalog.loadPolicy():1873] finished replay policy from image
88687 2022-06-27 09:41:04,324 INFO (main|1) [Catalog.loadDatasource():1887] finished replay datasource from image
88688 2022-06-27 09:41:04,324 INFO (main|1) [MetaReader.read():104] finished to load image in 821 ms
88689 2022-06-27 09:41:18,175 ERROR (main|1) [BDBEnvironment.setup():199] error to open replicated environment. will exit.
88690 com.sleepycat.je.EnvironmentFailureException: (JE 18.3.12) 172.21.16.15_9810_1655561015119(-1):/mnt/hdd01/DORIS_MASTER_ASAN/fe/doris-meta/bdb recoveryTracker should overlap or follow on disk last VLSN of 488,692 recoveryFirst= 488,694 UNEXPECTED_STATE_FATAL: Unexpected internal state, unable to continue. Environment is invalid and must be closed.
88691 at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:459) ~[je-18.3.12.jar:18.3.12]
88692 at com.sleepycat.je.rep.vlsn.VLSNIndex.merge(VLSNIndex.java:1641) ~[je-18.3.12.jar:18.3.12]
88693 at com.sleepycat.je.rep.vlsn.VLSNIndex.init(VLSNIndex.java:1534) ~[je-18.3.12.jar:18.3.12]
88694 at com.sleepycat.je.rep.vlsn.VLSNIndex.<init>(VLSNIndex.java:426) ~[je-18.3.12.jar:18.3.12]
88695 at com.sleepycat.je.rep.impl.RepImpl.preRecoveryCheckpointInit(RepImpl.java:575) ~[je-18.3.12.jar:18.3.12]
88696 at com.sleepycat.je.recovery.RecoveryManager.recover(RecoveryManager.java:508) ~[je-18.3.12.jar:18.3.12]
88697 at com.sleepycat.je.dbi.EnvironmentImpl.finishInit(EnvironmentImpl.java:895) ~[je-18.3.12.jar:18.3.12]
88698 at com.sleepycat.je.dbi.DbEnvPool.getEnvironment(DbEnvPool.java:222) ~[je-18.3.12.jar:18.3.12]
88699 at com.sleepycat.je.Environment.makeEnvironmentImpl(Environment.java:278) ~[je-18.3.12.jar:18.3.12]
88700 at com.sleepycat.je.Environment.<init>(Environment.java:258) ~[je-18.3.12.jar:18.3.12]
88701 at com.sleepycat.je.rep.ReplicatedEnvironment.<init>(ReplicatedEnvironment.java:605) ~[je-18.3.12.jar:18.3.12]
88702 at com.sleepycat.je.rep.ReplicatedEnvironment.<init>(ReplicatedEnvironment.java:464) ~[je-18.3.12.jar:18.3.12]
88703 at com.sleepycat.je.rep.ReplicatedEnvironment.<init>(ReplicatedEnvironment.java:538) ~[je-18.3.12.jar:18.3.12]
88704 at org.apache.doris.journal.bdbje.BDBEnvironment.setup(BDBEnvironment.java:152) ~[palo-fe.jar:1.0-SNAPSHOT]
88705 at org.apache.doris.journal.bdbje.BDBJEJournal.open(BDBJEJournal.java:302) ~[palo-fe.jar:1.0-SNAPSHOT]
88706 at org.apache.doris.persist.EditLog.open(EditLog.java:889) ~[palo-fe.jar:1.0-SNAPSHOT]
88707 at org.apache.doris.catalog.Catalog.initialize(Catalog.java:812) ~[palo-fe.jar:1.0-SNAPSHOT]
88708 at org.apache.doris.PaloFe.start(PaloFe.java:128) ~[palo-fe.jar:1.0-SNAPSHOT]
88709 at org.apache.doris.PaloFe.main(PaloFe.java:63) ~[palo-fe.jar:1.0-SNAPSHOT]
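To make the two numbers in the exception easier to compare, here is a minimal sketch that pulls them out of the message text and counts the VLSNs that fall between them. The class name `VlsnGapCheck` and its methods are hypothetical helpers written for this report, not part of BDB JE or Doris. The interpretation that "overlap or follow" means `recoveryFirst` should be at most one past the last on-disk VLSN is an assumption drawn from the wording of the message, not from the JE source.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading the error message; not part of BDB JE or Doris.
public class VlsnGapCheck {
    // Matches the message text exactly as it appears in the fe.log excerpt above.
    private static final Pattern VLSN_PATTERN = Pattern.compile(
            "on disk last VLSN of ([\\d,]+) recoveryFirst= ([\\d,]+)");

    /** Returns {lastOnDisk, recoveryFirst}, or null if the message does not match. */
    public static long[] parse(String message) {
        Matcher m = VLSN_PATTERN.matcher(message);
        if (!m.find()) {
            return null;
        }
        // The log prints the numbers with thousands separators, so strip the commas.
        return new long[] {
            Long.parseLong(m.group(1).replace(",", "")),
            Long.parseLong(m.group(2).replace(",", ""))
        };
    }

    /** Counts the VLSNs strictly between lastOnDisk and recoveryFirst (0 if none). */
    public static long missingCount(long lastOnDisk, long recoveryFirst) {
        return Math.max(0, recoveryFirst - lastOnDisk - 1);
    }

    public static void main(String[] args) {
        String msg = "recoveryTracker should overlap or follow on disk last VLSN of "
                + "488,692 recoveryFirst= 488,694";
        long[] v = parse(msg);
        System.out.println("lastOnDisk=" + v[0]
                + " recoveryFirst=" + v[1]
                + " missing=" + missingCount(v[0], v[1]));
    }
}
```

Under that assumption, the values in this report leave exactly one VLSN (488,693) unaccounted for between the on-disk index and the recovery tracker.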
### What You Expected?
The FE starts normally.
### How to Reproduce?
I have hit this only once. In the regression test environment, I had just started and stopped the FE many times.
### Anything Else?
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org
[GitHub] [doris] SWJTU-ZhangLei commented on issue #10436: [Bug] [FE] bdb recoveryTracker should overlap or follow on disk last VLSN of 488,692 recoveryFirst= 488,694
Posted by GitBox <gi...@apache.org>.
SWJTU-ZhangLei commented on issue #10436:
URL: https://github.com/apache/doris/issues/10436#issuecomment-1170669415
> Why does this problem occur? How can it be reproduced? Is there an error in the bdbje log file, or is there a problem with our usage? If these questions are not answered, it is not recommended to modify the bdbje code.
In our regression tests, we build a 3-FE cluster, and we have hit this problem many times when updating the FE repeatedly.
I also found that others have met the same problem: https://github.com/StarRocks/bdb-je/issues/1 (published under the Apache license)