Posted to dev@zeppelin.apache.org by "George Klimov (JIRA)" <ji...@apache.org> on 2018/08/10 15:46:00 UTC
[jira] [Created] (ZEPPELIN-3704) Scheduler.getJobsRunning() returns finished jobs
George Klimov created ZEPPELIN-3704:
---------------------------------------
Summary: Scheduler.getJobsRunning() returns finished jobs
Key: ZEPPELIN-3704
URL: https://issues.apache.org/jira/browse/ZEPPELIN-3704
Project: Zeppelin
Issue Type: Bug
Components: zeppelin-zengine
Reporter: George Klimov
Assignee: George Klimov
Sometimes the last paragraph of a note is marked *ABORT* for no apparent reason after it has run successfully under cron with the "After execution stop the interpreter" setting enabled. I found that the cause of this behavior is that Scheduler.getJobsRunning() returns jobs that have already finished. Has anyone else faced this problem?
Short log (with additional log info from TinkoffCreditSystems fork):
{noformat}
INFO [2018-08-10 00:08:00,000] ({DefaultQuartzScheduler_Worker-47} Notebook.java[execute]:945) - Start schedule run note: 2C68U586U, cronExpr:"0 8 0 * * ?"
INFO [2018-08-10 00:08:00,047] ({pool-2-thread-266} SchedulerFactory.java[jobStarted]:109) - Job 20170814-171621_1685490119 started by scheduler
INFO [2018-08-10 00:10:35,387] ({pool-2-thread-266} SchedulerFactory.java[jobFinished]:115) - Job 20170814-171621_1685490119 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-greenplum_pd:user:2C68U586U-shared_session
INFO [2018-08-10 00:10:35,417] ({pool-2-thread-3838} SchedulerFactory.java[jobStarted]:109) - Job 20180402-171122_400058927 started by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark:user:2C68U586U-shared_session
INFO [2018-08-10 00:11:57,428] ({pool-2-thread-3838} SchedulerFactory.java[jobFinished]:115) - Job 20180402-171122_400058927 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark:user:2C68U586U-shared_session
INFO [2018-08-10 00:11:57,445] ({pool-2-thread-996} SchedulerFactory.java[jobStarted]:109) - Job 20180413-191933_1545337614 started by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark:user:2C68U586U-shared_session
INFO [2018-08-10 00:11:57,527] ({pool-2-thread-996} NotebookServer.java[afterStatusChange]:2631) - Job 20180413-191933_1545337614 is finished successfully, status: FINISHED
INFO [2018-08-10 00:11:57,547] ({DefaultQuartzScheduler_Worker-47} Paragraph.java[execute]:343) - skip to run blank paragraph. 20180423-134725_1702290212
INFO [2018-08-10 00:11:57,547] ({DefaultQuartzScheduler_Worker-47} Notebook.java[execute]:947) - End schedule run note: 2C68U586U
INFO [2018-08-10 00:11:57,548] ({DefaultQuartzScheduler_Worker-47} ManagedInterpreterGroup.java[close]:100) - Close Session: shared_session for interpreter setting: spark
INFO [2018-08-10 00:11:57,553] ({pool-2-thread-996} VFSNotebookRepo.java[save]:196) - Saving note:2C68U586U
The third job's status changes from FINISHED to ABORT:
WARN [2018-08-10 00:11:57,555] ({DefaultQuartzScheduler_Worker-47} NotebookServer.java[afterStatusChange]:2633) - Job 20180413-191933_1545337614 is finished, status: ABORT, exception: null, result: %text 'sometext'
INFO [2018-08-10 00:11:57,577] ({pool-2-thread-996} SchedulerFactory.java[jobFinished]:115) - Job 20180413-191933_1545337614 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark:user:2C68U586U-shared_session
INFO [2018-08-10 00:11:57,585] ({DefaultQuartzScheduler_Worker-47} ManagedInterpreterGroup.java[close]:130) - Job paragraph_1523636373190_-1466164905 aborted
{noformat}
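To check whether an installation shows the same symptom, the log can be scanned for jobs that afterStatusChange reports first as FINISHED and later as ABORT. The following is only a minimal sketch: the class name AbortAfterFinishScan and the method finishedThenAborted are hypothetical helpers, not part of Zeppelin, and the regular expression assumes the message wording seen in the excerpt above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper; not part of Zeppelin.
public class AbortAfterFinishScan {
    // Assumes the afterStatusChange wording from the log excerpt:
    //   "Job <id> is finished successfully, status: FINISHED"
    //   "Job <id> is finished, status: ABORT, exception: ..."
    private static final Pattern STATUS_LINE =
        Pattern.compile("Job (\\S+) is finished(?: successfully)?, status: (\\w+)");

    /** Returns ids of jobs logged as FINISHED and later logged as ABORT, in log order. */
    public static List<String> finishedThenAborted(List<String> logLines) {
        Set<String> finished = new HashSet<>();
        List<String> suspicious = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = STATUS_LINE.matcher(line);
            if (!m.find()) {
                continue; // not a status-change line
            }
            String jobId = m.group(1);
            String status = m.group(2);
            if ("FINISHED".equals(status)) {
                finished.add(jobId);
            } else if ("ABORT".equals(status)
                    && finished.contains(jobId)
                    && !suspicious.contains(jobId)) {
                suspicious.add(jobId);
            }
        }
        return suspicious;
    }

    public static void main(String[] args) {
        // Three lines copied from the short log above.
        List<String> lines = Arrays.asList(
            "INFO [2018-08-10 00:11:57,527] ({pool-2-thread-996} NotebookServer.java[afterStatusChange]:2631) - Job 20180413-191933_1545337614 is finished successfully, status: FINISHED",
            "INFO [2018-08-10 00:11:57,547] ({DefaultQuartzScheduler_Worker-47} Notebook.java[execute]:947) - End schedule run note: 2C68U586U",
            "WARN [2018-08-10 00:11:57,555] ({DefaultQuartzScheduler_Worker-47} NotebookServer.java[afterStatusChange]:2633) - Job 20180413-191933_1545337614 is finished, status: ABORT, exception: null, result: %text 'sometext'");
        System.out.println(finishedThenAborted(lines)); // prints [20180413-191933_1545337614]
    }
}
```

A job that is only ever logged as ABORT (a genuine abort) is not reported; only the FINISHED-then-ABORT sequence is flagged.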
Full log with debug messages:
{noformat}
INFO [2018-08-10 17:31:37,193] ({pool-2-thread-123} NotebookServer.java[afterStatusChange]:2631) - Job 20180810-124513_1104099490 is finished successfully, status: FINISHED
INFO [2018-08-10 17:31:37,215] ({pool-2-thread-123} VFSNotebookRepo.java[save]:196) - Saving note:2DNHBQ5N2
INFO [2018-08-10 17:31:37,216] ({pool-2-thread-123} SchedulerFactory.java[jobFinished]:115) - Job 20180810-124513_1104099490 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark::2DNHBQ5N2-shared_session
INFO [2018-08-10 17:31:37,228] ({pool-2-thread-131} SchedulerFactory.java[jobStarted]:109) - Job 20180810-132950_1064210956 started by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark::2DNHBQ5N2-shared_session
INFO [2018-08-10 17:31:37,229] ({pool-2-thread-131} Paragraph.java[jobRun]:380) - Run paragraph [paragraph_id: 20180810-132950_1064210956, interpreter: spark.pyspark, note_id: 2DNHBQ5N2, user: user1]
INFO [2018-08-10 17:31:38,207] ({pool-2-thread-35} Paragraph.java[jobRun]:460) - End of Run paragraph [paragraph_id: 20180810-124513_1104099490, interpreter: spark.pyspark, note_id: 2DNHBQ5N2, user: user1]
INFO [2018-08-10 17:31:38,207] ({pool-2-thread-35} SchedulerFactory.java[jobFinished]:115) - Job 20180810-124513_1104099490 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark::2DNHBQ5N2-shared_session
INFO [2018-08-10 17:31:38,224] ({pool-2-thread-131} Paragraph.java[jobRun]:460) - End of Run paragraph [paragraph_id: 20180810-132950_1064210956, interpreter: spark.pyspark, note_id: 2DNHBQ5N2, user: user1]
INFO [2018-08-10 17:31:38,227] ({pool-2-thread-131} NotebookServer.java[afterStatusChange]:2631) - Job 20180810-132950_1064210956 is finished successfully, status: FINISHED
INFO [2018-08-10 17:31:38,229] ({MyScheduler_Worker-5} Paragraph.java[execute]:343) - skip to run blank paragraph. 20180810-133022_784315150
INFO [2018-08-10 17:31:38,230] ({MyScheduler_Worker-5} Notebook.java[execute]:947) - End schedule run note: 2DNHBQ5N2
INFO [2018-08-10 17:31:38,230] ({MyScheduler_Worker-5} ManagedInterpreterGroup.java[close]:102) - Close Session: shared_session for interpreter setting: spark
INFO [2018-08-10 17:31:38,230] ({MyScheduler_Worker-5} RemoteScheduler.java[getJobsRunning]:135) -
[DEBUG]
RemoteScheduler adds paragraph_1533896990379_-679637373 to running list, job status is FINISHED
[DEBUG]
INFO [2018-08-10 17:31:38,230] ({MyScheduler_Worker-5} ManagedInterpreterGroup.java[close]:130) -
[DEBUG]
job paragraph_1533896990379_-679637373 is running
[DEBUG]
INFO [2018-08-10 17:31:38,231] ({MyScheduler_Worker-5} ManagedInterpreterGroup.java[close]:132) -
[DEBUG]
job paragraph_1533896990379_-679637373 is instanceof paragraph
[DEBUG]
INFO [2018-08-10 17:31:38,231] ({MyScheduler_Worker-5} ManagedInterpreterGroup.java[close]:133) -
[DEBUG]
Job description before aborting:
ParagraphId: 20180810-132950_1064210956
Status: FINISHED
Config: {colWidth=12.0, fontSize=9.0, enabled=true, results={}, editorSetting={language=python, editOnDblClick=false, completionKey=TAB, completionSupport=true}, editorMode=ace/mode/python, editorHide=false, tableHide=true}
Json: {
"text": "%spark.pyspark\nimport os\nimport shutil \n\nshutil.copy2(\u0027/home/egklimov/IdeaProjects/tcs-zeppelin/logs/zeppelin-egklimov-ubuntu013.log\u0027, \u0027/home/egklimov/IdeaProjects/tcs-zeppelin/logs/tmp/zeppelin-egklimov-ubuntu013.log\u0027)",
"user": "user1",
"dateUpdated": "2018-08-10 16:01:58.663",
"config": {
"colWidth": 12.0,
"fontSize": 9.0,
"enabled": true,
"results": {},
"editorSetting": {
"language": "python",
"editOnDblClick": false,
"completionKey": "TAB",
"completionSupport": true
},
"editorMode": "ace/mode/python",
"editorHide": false,
"tableHide": true
},
"settings": {
"params": {},
"forms": {}
},
"results": {
"code": "SUCCESS",
"msg": [
{
"type": "TEXT",
"data": "\u0027/home/egklimov/IdeaProjects/tcs-zeppelin/logs/tmp/zeppelin-egklimov-ubuntu013.log\u0027\n"
}
]
},
"apps": [],
"jobName": "paragraph_1533896990379_-679637373",
"id": "20180810-132950_1064210956",
"dateCreated": "2018-08-10 13:29:50.379",
"dateStarted": "2018-08-10 17:31:37.229",
"dateFinished": "2018-08-10 17:31:38.225",
"status": "FINISHED",
"progressUpdateIntervalMs": 500
}
[DEBUG]
INFO [2018-08-10 17:31:38,253] ({pool-2-thread-131} VFSNotebookRepo.java[save]:196) - Saving note:2DNHBQ5N2
INFO [2018-08-10 17:31:38,254] ({pool-2-thread-131} SchedulerFactory.java[jobFinished]:115) - Job 20180810-132950_1064210956 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark::2DNHBQ5N2-shared_session
WARN [2018-08-10 17:31:38,262] ({MyScheduler_Worker-5} NotebookServer.java[afterStatusChange]:2633) - Job 20180810-132950_1064210956 is finished, status: ABORT, exception: null, result: %text '/home/egklimov/IdeaProjects/tcs-zeppelin/logs/tmp/zeppelin-egklimov-ubuntu013.log'
INFO [2018-08-10 17:31:38,275] ({MyScheduler_Worker-5} VFSNotebookRepo.java[save]:196) - Saving note:2DNHBQ5N2
INFO [2018-08-10 17:31:38,276] ({MyScheduler_Worker-5} ManagedInterpreterGroup.java[close]:165) - Job paragraph_1533896990379_-679637373 aborted
INFO [2018-08-10 17:31:38,277] ({MyScheduler_Worker-5} ManagedInterpreterGroup.java[close]:167) -
[DEBUG]
Job description after aborting:
ParagraphId: 20180810-132950_1064210956
Status: ABORT
Config: {colWidth=12.0, fontSize=9.0, enabled=true, results={}, editorSetting={language=python, editOnDblClick=false, completionKey=TAB, completionSupport=true}, editorMode=ace/mode/python, editorHide=false, tableHide=true}
Json: {
"text": "%spark.pyspark\nimport os\nimport shutil \n\nshutil.copy2(\u0027/home/egklimov/IdeaProjects/tcs-zeppelin/logs/zeppelin-egklimov-ubuntu013.log\u0027, \u0027/home/egklimov/IdeaProjects/tcs-zeppelin/logs/tmp/zeppelin-egklimov-ubuntu013.log\u0027)",
"user": "user1",
"dateUpdated": "2018-08-10 16:01:58.663",
"config": {
"colWidth": 12.0,
"fontSize": 9.0,
"enabled": true,
"results": {},
"editorSetting": {
"language": "python",
"editOnDblClick": false,
"completionKey": "TAB",
"completionSupport": true
},
"editorMode": "ace/mode/python",
"editorHide": false,
"tableHide": true
},
"settings": {
"params": {},
"forms": {}
},
"results": {
"code": "SUCCESS",
"msg": [
{
"type": "TEXT",
"data": "\u0027/home/egklimov/IdeaProjects/tcs-zeppelin/logs/tmp/zeppelin-egklimov-ubuntu013.log\u0027\n"
}
]
},
"apps": [],
"jobName": "paragraph_1533896990379_-679637373",
"id": "20180810-132950_1064210956",
"dateCreated": "2018-08-10 13:29:50.379",
"dateStarted": "2018-08-10 17:31:37.229",
"dateFinished": "2018-08-10 17:31:38.225",
"status": "ABORT",
"progressUpdateIntervalMs": 500
}
[DEBUG]
{noformat}
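The debug output above shows the job being added to the running list while its status is already FINISHED, after which close() aborts it. One possible way to make the abort step tolerant of such a stale list is to re-check the status at the moment of aborting. The following is only a sketch under assumptions: SketchJob, Status, and abortActive are hypothetical stand-ins written for illustration and are not Zeppelin's Job, Scheduler, or ManagedInterpreterGroup code, and this is not necessarily how the issue should be fixed upstream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical model of the race; none of these types are Zeppelin classes.
public class AbortGuardSketch {
    public enum Status { PENDING, RUNNING, FINISHED, ERROR, ABORT }

    /** Stand-in for a scheduler job whose status changes atomically. */
    public static final class SketchJob {
        private final String name;
        private final AtomicReference<Status> status;

        public SketchJob(String name, Status initial) {
            this.name = name;
            this.status = new AtomicReference<>(initial);
        }

        public String name() { return name; }

        public Status status() { return status.get(); }

        /** RUNNING -> FINISHED; returns false if the job was not RUNNING. */
        public boolean finish() {
            return status.compareAndSet(Status.RUNNING, Status.FINISHED);
        }

        /** Moves to ABORT only from PENDING or RUNNING; terminal statuses are left untouched. */
        public boolean abortIfActive() {
            while (true) {
                Status current = status.get();
                if (current != Status.PENDING && current != Status.RUNNING) {
                    return false; // already FINISHED, ERROR, or ABORT
                }
                if (status.compareAndSet(current, Status.ABORT)) {
                    return true;
                }
                // Status changed concurrently; re-read and decide again.
            }
        }
    }

    /** Aborts the still-active jobs in a possibly stale snapshot; returns the names actually aborted. */
    public static List<String> abortActive(List<SketchJob> snapshot) {
        List<String> aborted = new ArrayList<>();
        for (SketchJob job : snapshot) {
            if (job.abortIfActive()) {
                aborted.add(job.name());
            }
        }
        return aborted;
    }

    public static void main(String[] args) {
        SketchJob done = new SketchJob("paragraph_done", Status.RUNNING);
        SketchJob busy = new SketchJob("paragraph_busy", Status.RUNNING);
        done.finish(); // finishes after the "running" snapshot was conceptually taken

        System.out.println(abortActive(Arrays.asList(done, busy))); // prints [paragraph_busy]
        System.out.println(done.status());                          // prints FINISHED
    }
}
```

With a guard of this kind, a job that finishes between the snapshot and the abort keeps its FINISHED status. It does not address why the running list contains a finished job in the first place.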