Posted to dev@zeppelin.apache.org by "George Klimov (JIRA)" <ji...@apache.org> on 2018/08/10 15:46:00 UTC

[jira] [Created] (ZEPPELIN-3704) Scheduler.getJobsRunning() returns finished jobs

George Klimov created ZEPPELIN-3704:
---------------------------------------

             Summary: Scheduler.getJobsRunning() returns finished jobs
                 Key: ZEPPELIN-3704
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3704
             Project: Zeppelin
          Issue Type: Bug
          Components: zeppelin-zengine
            Reporter: George Klimov
            Assignee: George Klimov


Sometimes the last paragraph of a note is marked as *ABORT* for no apparent reason after a successful cron run with the "After execution stop the interpreter" setting enabled. I found that the cause of this behavior is that Scheduler.getJobsRunning() returns finished jobs. Has anyone else run into this problem?

Short log (with additional log info from TinkoffCreditSystems fork):
{noformat}
 INFO [2018-08-10 00:08:00,000] ({DefaultQuartzScheduler_Worker-47} Notebook.java[execute]:945) - Start schedule run note: 2C68U586U, cronExpr:"0 8 0 * * ?"
 INFO [2018-08-10 00:08:00,047] ({pool-2-thread-266} SchedulerFactory.java[jobStarted]:109) - Job 20170814-171621_1685490119 started by scheduler  
 INFO [2018-08-10 00:10:35,387] ({pool-2-thread-266} SchedulerFactory.java[jobFinished]:115) - Job 20170814-171621_1685490119 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-greenplum_pd:user:2C68U586U-shared_session
 INFO [2018-08-10 00:10:35,417] ({pool-2-thread-3838} SchedulerFactory.java[jobStarted]:109) - Job 20180402-171122_400058927 started by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark:user:2C68U586U-shared_session
 INFO [2018-08-10 00:11:57,428] ({pool-2-thread-3838} SchedulerFactory.java[jobFinished]:115) - Job 20180402-171122_400058927 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark:user:2C68U586U-shared_session
 INFO [2018-08-10 00:11:57,445] ({pool-2-thread-996} SchedulerFactory.java[jobStarted]:109) - Job 20180413-191933_1545337614 started by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark:user:2C68U586U-shared_session
 INFO [2018-08-10 00:11:57,527] ({pool-2-thread-996} NotebookServer.java[afterStatusChange]:2631) - Job 20180413-191933_1545337614 is finished successfully, status: FINISHED
 INFO [2018-08-10 00:11:57,547] ({DefaultQuartzScheduler_Worker-47} Paragraph.java[execute]:343) - skip to run blank paragraph. 20180423-134725_1702290212
 INFO [2018-08-10 00:11:57,547] ({DefaultQuartzScheduler_Worker-47} Notebook.java[execute]:947) - End schedule run note: 2C68U586U
 INFO [2018-08-10 00:11:57,548] ({DefaultQuartzScheduler_Worker-47} ManagedInterpreterGroup.java[close]:100) - Close Session: shared_session for interpreter setting: spark
 INFO [2018-08-10 00:11:57,553] ({pool-2-thread-996} VFSNotebookRepo.java[save]:196) - Saving note:2C68U586U

	The third job's status changes from FINISHED to ABORT:

 WARN [2018-08-10 00:11:57,555] ({DefaultQuartzScheduler_Worker-47} NotebookServer.java[afterStatusChange]:2633) - Job 20180413-191933_1545337614 is finished, status: ABORT, exception: null, result: %text 'sometext'
 INFO [2018-08-10 00:11:57,577] ({pool-2-thread-996} SchedulerFactory.java[jobFinished]:115) - Job 20180413-191933_1545337614 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark:user:2C68U586U-shared_session
 INFO [2018-08-10 00:11:57,585] ({DefaultQuartzScheduler_Worker-47} ManagedInterpreterGroup.java[close]:130) - Job paragraph_1523636373190_-1466164905 aborted 
{noformat}

Full log with debug messages:
{noformat}
 INFO [2018-08-10 17:31:37,193] ({pool-2-thread-123} NotebookServer.java[afterStatusChange]:2631) - Job 20180810-124513_1104099490 is finished successfully, status: FINISHED
 INFO [2018-08-10 17:31:37,215] ({pool-2-thread-123} VFSNotebookRepo.java[save]:196) - Saving note:2DNHBQ5N2
 INFO [2018-08-10 17:31:37,216] ({pool-2-thread-123} SchedulerFactory.java[jobFinished]:115) - Job 20180810-124513_1104099490 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark::2DNHBQ5N2-shared_session
 INFO [2018-08-10 17:31:37,228] ({pool-2-thread-131} SchedulerFactory.java[jobStarted]:109) - Job 20180810-132950_1064210956 started by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark::2DNHBQ5N2-shared_session
 INFO [2018-08-10 17:31:37,229] ({pool-2-thread-131} Paragraph.java[jobRun]:380) - Run paragraph [paragraph_id: 20180810-132950_1064210956, interpreter: spark.pyspark, note_id: 2DNHBQ5N2, user: user1]
 INFO [2018-08-10 17:31:38,207] ({pool-2-thread-35} Paragraph.java[jobRun]:460) - End of Run paragraph [paragraph_id: 20180810-124513_1104099490, interpreter: spark.pyspark, note_id: 2DNHBQ5N2, user: user1]
 INFO [2018-08-10 17:31:38,207] ({pool-2-thread-35} SchedulerFactory.java[jobFinished]:115) - Job 20180810-124513_1104099490 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark::2DNHBQ5N2-shared_session
 INFO [2018-08-10 17:31:38,224] ({pool-2-thread-131} Paragraph.java[jobRun]:460) - End of Run paragraph [paragraph_id: 20180810-132950_1064210956, interpreter: spark.pyspark, note_id: 2DNHBQ5N2, user: user1]
 INFO [2018-08-10 17:31:38,227] ({pool-2-thread-131} NotebookServer.java[afterStatusChange]:2631) - Job 20180810-132950_1064210956 is finished successfully, status: FINISHED
 INFO [2018-08-10 17:31:38,229] ({MyScheduler_Worker-5} Paragraph.java[execute]:343) - skip to run blank paragraph. 20180810-133022_784315150
 INFO [2018-08-10 17:31:38,230] ({MyScheduler_Worker-5} Notebook.java[execute]:947) - End schedule run note: 2DNHBQ5N2
 INFO [2018-08-10 17:31:38,230] ({MyScheduler_Worker-5} ManagedInterpreterGroup.java[close]:102) - Close Session: shared_session for interpreter setting: spark
 INFO [2018-08-10 17:31:38,230] ({MyScheduler_Worker-5} RemoteScheduler.java[getJobsRunning]:135) - 
[DEBUG]
		RemoteScheduler adds paragraph_1533896990379_-679637373 to running list, job status is FINISHED
[DEBUG]

 INFO [2018-08-10 17:31:38,230] ({MyScheduler_Worker-5} ManagedInterpreterGroup.java[close]:130) - 
[DEBUG]
	job paragraph_1533896990379_-679637373 is running
[DEBUG]

 INFO [2018-08-10 17:31:38,231] ({MyScheduler_Worker-5} ManagedInterpreterGroup.java[close]:132) - 
[DEBUG]
	job paragraph_1533896990379_-679637373 is instanceof paragraph
[DEBUG]

 INFO [2018-08-10 17:31:38,231] ({MyScheduler_Worker-5} ManagedInterpreterGroup.java[close]:133) - 
[DEBUG]
	Job description before aborting:

		ParagraphId: 20180810-132950_1064210956
		Status: FINISHED
		Config: {colWidth=12.0, fontSize=9.0, enabled=true, results={}, editorSetting={language=python, editOnDblClick=false, completionKey=TAB, completionSupport=true}, editorMode=ace/mode/python, editorHide=false, tableHide=true}
		Json: {
  "text": "%spark.pyspark\nimport os\nimport shutil \n\nshutil.copy2(\u0027/home/egklimov/IdeaProjects/tcs-zeppelin/logs/zeppelin-egklimov-ubuntu013.log\u0027, \u0027/home/egklimov/IdeaProjects/tcs-zeppelin/logs/tmp/zeppelin-egklimov-ubuntu013.log\u0027)",
  "user": "user1",
  "dateUpdated": "2018-08-10 16:01:58.663",
  "config": {
    "colWidth": 12.0,
    "fontSize": 9.0,
    "enabled": true,
    "results": {},
    "editorSetting": {
      "language": "python",
      "editOnDblClick": false,
      "completionKey": "TAB",
      "completionSupport": true
    },
    "editorMode": "ace/mode/python",
    "editorHide": false,
    "tableHide": true
  },
  "settings": {
    "params": {},
    "forms": {}
  },
  "results": {
    "code": "SUCCESS",
    "msg": [
      {
        "type": "TEXT",
        "data": "\u0027/home/egklimov/IdeaProjects/tcs-zeppelin/logs/tmp/zeppelin-egklimov-ubuntu013.log\u0027\n"
      }
    ]
  },
  "apps": [],
  "jobName": "paragraph_1533896990379_-679637373",
  "id": "20180810-132950_1064210956",
  "dateCreated": "2018-08-10 13:29:50.379",
  "dateStarted": "2018-08-10 17:31:37.229",
  "dateFinished": "2018-08-10 17:31:38.225",
  "status": "FINISHED",
  "progressUpdateIntervalMs": 500
}  
[DEBUG]

 INFO [2018-08-10 17:31:38,253] ({pool-2-thread-131} VFSNotebookRepo.java[save]:196) - Saving note:2DNHBQ5N2
 INFO [2018-08-10 17:31:38,254] ({pool-2-thread-131} SchedulerFactory.java[jobFinished]:115) - Job 20180810-132950_1064210956 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark::2DNHBQ5N2-shared_session
 WARN [2018-08-10 17:31:38,262] ({MyScheduler_Worker-5} NotebookServer.java[afterStatusChange]:2633) - Job 20180810-132950_1064210956 is finished, status: ABORT, exception: null, result: %text '/home/egklimov/IdeaProjects/tcs-zeppelin/logs/tmp/zeppelin-egklimov-ubuntu013.log'

 INFO [2018-08-10 17:31:38,275] ({MyScheduler_Worker-5} VFSNotebookRepo.java[save]:196) - Saving note:2DNHBQ5N2
 INFO [2018-08-10 17:31:38,276] ({MyScheduler_Worker-5} ManagedInterpreterGroup.java[close]:165) - Job paragraph_1533896990379_-679637373 aborted 
 INFO [2018-08-10 17:31:38,277] ({MyScheduler_Worker-5} ManagedInterpreterGroup.java[close]:167) - 
[DEBUG]

	Job description after aborting:

		ParagraphId: 20180810-132950_1064210956
		Status: ABORT
		Config: {colWidth=12.0, fontSize=9.0, enabled=true, results={}, editorSetting={language=python, editOnDblClick=false, completionKey=TAB, completionSupport=true}, editorMode=ace/mode/python, editorHide=false, tableHide=true}
		Json: {
  "text": "%spark.pyspark\nimport os\nimport shutil \n\nshutil.copy2(\u0027/home/egklimov/IdeaProjects/tcs-zeppelin/logs/zeppelin-egklimov-ubuntu013.log\u0027, \u0027/home/egklimov/IdeaProjects/tcs-zeppelin/logs/tmp/zeppelin-egklimov-ubuntu013.log\u0027)",
  "user": "user1",
  "dateUpdated": "2018-08-10 16:01:58.663",
  "config": {
    "colWidth": 12.0,
    "fontSize": 9.0,
    "enabled": true,
    "results": {},
    "editorSetting": {
      "language": "python",
      "editOnDblClick": false,
      "completionKey": "TAB",
      "completionSupport": true
    },
    "editorMode": "ace/mode/python",
    "editorHide": false,
    "tableHide": true
  },
  "settings": {
    "params": {},
    "forms": {}
  },
  "results": {
    "code": "SUCCESS",
    "msg": [
      {
        "type": "TEXT",
        "data": "\u0027/home/egklimov/IdeaProjects/tcs-zeppelin/logs/tmp/zeppelin-egklimov-ubuntu013.log\u0027\n"
      }
    ]
  },
  "apps": [],
  "jobName": "paragraph_1533896990379_-679637373",
  "id": "20180810-132950_1064210956",
  "dateCreated": "2018-08-10 13:29:50.379",
  "dateStarted": "2018-08-10 17:31:37.229",
  "dateFinished": "2018-08-10 17:31:38.225",
  "status": "ABORT",
  "progressUpdateIntervalMs": 500
}
[DEBUG]
{noformat}
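
The sequence in the debug log (a job with status FINISHED is reported as running and then aborted when the session closes) suggests one possible guard: re-check each job's status right before aborting it. Below is a minimal, self-contained sketch of that idea. It does not use Zeppelin's classes; Job, Status and abortStillRunning are hypothetical names introduced only for illustration, and the actual fix in Zeppelin may look different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RunningJobsSketch {
    // Hypothetical status values, modeled on the statuses seen in the logs.
    enum Status { PENDING, RUNNING, FINISHED, ABORT }

    // Hypothetical stand-in for a scheduler job; this is NOT Zeppelin's Job class.
    static final class Job {
        final String name;
        volatile Status status;

        Job(String name, Status status) {
            this.name = name;
            this.status = status;
        }

        void abort() {
            status = Status.ABORT;
        }
    }

    // Aborts only the jobs whose status is still RUNNING at the time of the check
    // and returns the names of the jobs that were aborted.
    static List<String> abortStillRunning(List<Job> reportedRunning) {
        List<String> aborted = new ArrayList<>();
        for (Job job : reportedRunning) {
            if (job.status == Status.RUNNING) {
                job.abort();
                aborted.add(job.name);
            }
        }
        return aborted;
    }

    public static void main(String[] args) {
        // Simulates a "running" list that also contains an already finished job.
        Job finished = new Job("paragraph_already_finished", Status.FINISHED);
        Job running = new Job("paragraph_still_running", Status.RUNNING);
        List<String> aborted = abortStillRunning(Arrays.asList(finished, running));
        System.out.println("Aborted: " + aborted);
        System.out.println("Finished job status: " + finished.status);
    }
}
```

Note that this check only narrows the window: a job can still finish between the status check and the abort() call, so a complete fix would probably also need the running list itself to be kept in sync with job completion.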



