Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2020/06/16 10:57:56 UTC

[GitHub] [incubator-doris] morningman opened a new issue #3889: [Bug][RoutineLoad] Routine load pause and never run again.

morningman opened a new issue #3889:
URL: https://github.com/apache/incubator-doris/issues/3889


   **Describe the bug**
   fe.log shows the following error:
   
   ```
   2020-06-16 04:55:17,211 ERROR 27 [PublishVersionDaemon.runAfterCatalogReady():57] errors while publish version to all backends
   java.util.NoSuchElementException: No value present
           at java.util.Optional.get(Optional.java:135) ~[?:1.8.0_161]
           at org.apache.doris.load.routineload.RoutineLoadJob.afterVisible(RoutineLoadJob.java:806) ~[palo-fe.jar:?]
           at org.apache.doris.transaction.TransactionState.afterStateTransform(TransactionState.java:409) ~[palo-fe.jar:?]
           at org.apache.doris.transaction.TransactionState.afterStateTransform(TransactionState.java:392) ~[palo-fe.jar:?]
           at org.apache.doris.transaction.DatabaseTransactionMgr.finishTransaction(DatabaseTransactionMgr.java:762) ~[palo-fe.jar:?]
           at org.apache.doris.transaction.GlobalTransactionMgr.finishTransaction(GlobalTransactionMgr.java:224) ~[palo-fe.jar:?]
           at org.apache.doris.transaction.PublishVersionDaemon.publishVersion(PublishVersionDaemon.java:208) ~[palo-fe.jar:?]
           at org.apache.doris.transaction.PublishVersionDaemon.runAfterCatalogReady(PublishVersionDaemon.java:55) [palo-fe.jar:?]
           at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58) [palo-fe.jar:?]
           at org.apache.doris.common.util.Daemon.run(Daemon.java:116) [palo-fe.jar:?]
   ```
   
   **Debug**
   
   https://github.com/apache/incubator-doris/blob/0224d49842f99b58125bd07294af35bbe4a700c7/fe/src/main/java/org/apache/doris/load/routineload/RoutineLoadJob.java#L927-L936
   
   The call `routineLoadTaskInfo.setTxnStatus(txnStatus);` on line 928 should be moved before the check `if (state == JobState.RUNNING) {` on line 927. Otherwise, the error can occur through the following steps:
   
   1. A routine load task's transaction becomes COMMITTED, and `RoutineLoadJob.executeTaskOnTxnStatusChanged()` is called.
   
   2. The routine load job is PAUSED because of some other error, such as a BE going down.
   
   3. The code reaches line 927 and finds that the job's state is not RUNNING, so it does not update the transaction status saved in the `routineLoadTaskInfo`.
   
   4. The transaction is COMMITTED, but for some reason it cannot become VISIBLE for a long time. Because the transaction status in `routineLoadTaskInfo` (not the status in the transaction itself) is still PENDING, the routine load task timeout checker considers it a timed-out task.
   
   https://github.com/apache/incubator-doris/blob/0224d49842f99b58125bd07294af35bbe4a700c7/fe/src/main/java/org/apache/doris/load/routineload/RoutineLoadJob.java#L483-L489
   
   5. The timeout checker removes the `routineLoadTaskInfo` from the job.
   
   6. The PUBLISH of the transaction finally succeeds, but when `afterVisible()` is called, it cannot find the `routineLoadTaskInfo` in the job, so an exception is thrown on line 806.
   
   https://github.com/apache/incubator-doris/blob/0224d49842f99b58125bd07294af35bbe4a700c7/fe/src/main/java/org/apache/doris/load/routineload/RoutineLoadJob.java#L804-L807
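   
   The sequence above can be reproduced in a small standalone model. This is only a rough sketch under stated assumptions, not the actual Doris code: `RoutineLoadRaceSketch`, `TaskInfo`, `onTxnStatusChangedBuggy`, `onTxnStatusChangedFixed`, `removeTimedOutTasks` and `publishSucceeds` are hypothetical names, and the timeout checker is simplified to "remove every task whose recorded status is still PENDING". It contrasts the current ordering with the proposed one, where the status is recorded before the job state is checked.
   
   ```java
   import java.util.ArrayList;
   import java.util.List;
   import java.util.NoSuchElementException;
   import java.util.Optional;

   // Hypothetical, simplified model of the race described above; not the real Doris classes.
   final class RoutineLoadRaceSketch {
       enum JobState { RUNNING, PAUSED }
       enum TxnStatus { PENDING, COMMITTED, VISIBLE }

       // Stand-in for routineLoadTaskInfo: keeps its own copy of the transaction status.
       static final class TaskInfo {
           final long txnId;
           TxnStatus txnStatus = TxnStatus.PENDING;
           TaskInfo(long txnId) { this.txnId = txnId; }
       }

       JobState state = JobState.RUNNING;
       final List<TaskInfo> tasks = new ArrayList<>();

       // Ordering described in the issue: the status is only recorded while the job is RUNNING.
       void onTxnStatusChangedBuggy(TaskInfo task, TxnStatus status) {
           if (state == JobState.RUNNING) {
               task.txnStatus = status;
           }
       }

       // Proposed ordering: record the status first, then check the job state.
       void onTxnStatusChangedFixed(TaskInfo task, TxnStatus status) {
           task.txnStatus = status;
           if (state == JobState.RUNNING) {
               // Scheduling of the next task would happen here (omitted in this sketch).
           }
       }

       // Assumption: every task has already exceeded its timeout, so only the status matters.
       void removeTimedOutTasks() {
           tasks.removeIf(t -> t.txnStatus == TxnStatus.PENDING);
       }

       // Mirrors the Optional.get() failure in the stack trace when the task is gone.
       void afterVisible(long txnId) {
           Optional<TaskInfo> task = tasks.stream().filter(t -> t.txnId == txnId).findFirst();
           task.get().txnStatus = TxnStatus.VISIBLE; // NoSuchElementException if absent
       }

       // Replays steps 1 to 6 and reports whether afterVisible() completed without throwing.
       static boolean publishSucceeds(boolean fixedOrdering) {
           RoutineLoadRaceSketch job = new RoutineLoadRaceSketch();
           TaskInfo task = new TaskInfo(1L);
           job.tasks.add(task);
           job.state = JobState.PAUSED;                                  // step 2
           if (fixedOrdering) {
               job.onTxnStatusChangedFixed(task, TxnStatus.COMMITTED);   // steps 1 and 3
           } else {
               job.onTxnStatusChangedBuggy(task, TxnStatus.COMMITTED);
           }
           job.removeTimedOutTasks();                                    // steps 4 and 5
           try {
               job.afterVisible(1L);                                     // step 6
               return true;
           } catch (NoSuchElementException e) {
               return false;
           }
       }

       public static void main(String[] args) {
           System.out.println("current ordering succeeds: " + publishSucceeds(false));
           System.out.println("proposed ordering succeeds: " + publishSucceeds(true));
       }
   }
   ```
   
   In this model the current ordering leaves the task at PENDING, so the checker removes it and `afterVisible()` throws. With the proposed ordering the task is already marked COMMITTED, so it survives the checker and the publish completes.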


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] kangkaisen closed issue #3889: [Bug][RoutineLoad] Routine load pause and never run again.

Posted by GitBox <gi...@apache.org>.
kangkaisen closed issue #3889:
URL: https://github.com/apache/incubator-doris/issues/3889


   




[GitHub] [incubator-doris] kangkaisen commented on issue #3889: [Bug][RoutineLoad] Routine load pause and never run again.

Posted by GitBox <gi...@apache.org>.
kangkaisen commented on issue #3889:
URL: https://github.com/apache/incubator-doris/issues/3889#issuecomment-645115947


   Good analysis!

