Posted to user@spark.apache.org by Manuel Sopena Ballesteros <ma...@garvan.org.au> on 2019/10/21 05:46:33 UTC

driver crashes - need to find out why driver keeps crashing

Dear Apache Spark community,

My Spark driver crashes, and the logs do not give enough explanation of why it happens:

INFO [2019-10-21 16:33:37,045] ({pool-6-thread-7} SchedulerFactory.java[jobStarted]:109) - Job 20190926-163704_913596201 started by scheduler interpreter_2100843352
DEBUG [2019-10-21 16:33:37,046] ({pool-6-thread-7} RemoteInterpreterServer.java[jobRun]:632) - Script after hooks: a = "bigword"
b = "bigword"
print(a)

for i in range(10000000):
    a += b

print(a)
__zeppelin__._displayhook()
DEBUG [2019-10-21 16:33:37,046] ({pool-6-thread-7} RemoteInterpreterEventClient.java[sendEvent]:413) - Send Event: RemoteInterpreterEvent(type:META_INFOS, data:{"message":"Spark UI enabled","url":"http://r640-1-10-mlx.mlx:38863"})
DEBUG [2019-10-21 16:33:37,048] ({pool-5-thread-1} RemoteInterpreterEventClient.java[pollEvent]:366) - Send event META_INFOS
DEBUG [2019-10-21 16:33:37,054] ({Thread-34} RemoteInterpreterServer.java[onUpdate]:799) - Output Update:
DEBUG [2019-10-21 16:33:37,054] ({Thread-34} RemoteInterpreterEventClient.java[sendEvent]:413) - Send Event: RemoteInterpreterEvent(type:OUTPUT_UPDATE, data:{"data":"","index":"0","noteId":"2ENM9X82N","paragraphId":"20190926-163704_913596201","type":"TEXT"})
DEBUG [2019-10-21 16:33:37,054] ({Thread-34} RemoteInterpreterServer.java[onAppend]:789) - Output Append: bigword

DEBUG [2019-10-21 16:33:37,054] ({pool-5-thread-1} RemoteInterpreterEventClient.java[pollEvent]:366) - Send event OUTPUT_UPDATE
DEBUG [2019-10-21 16:33:37,054] ({Thread-34} RemoteInterpreterEventClient.java[sendEvent]:413) - Send Event: RemoteInterpreterEvent(type:OUTPUT_APPEND, data:{"data":"bigword\n","index":"0","noteId":"2ENM9X82N","paragraphId":"20190926-163704_913596201"})
DEBUG [2019-10-21 16:33:37,062] ({pool-5-thread-1} RemoteInterpreterEventClient.java[pollEvent]:366) - Send event OUTPUT_APPEND
DEBUG [2019-10-21 16:33:37,145] ({pool-5-thread-3} Interpreter.java[getProperty]:222) - key: zeppelin.spark.concurrentSQL, value: false
DEBUG [2019-10-21 16:33:37,145] ({pool-5-thread-3} Interpreter.java[getProperty]:222) - key: zeppelin.spark.concurrentSQL, value: false
DEBUG [2019-10-21 16:33:37,225] ({pool-5-thread-3} RemoteInterpreterServer.java[resourcePoolGetAll]:1089) - Request getAll from ZeppelinServer
ERROR [2019-10-21 16:33:40,357] ({SIGTERM handler} SignalUtils.scala[apply$mcZ$sp]:43) - RECEIVED SIGNAL TERM
INFO [2019-10-21 16:33:40,530] ({shutdown-hook-0} Logging.scala[logInfo]:54) - Invoking stop() from shutdown hook

I assume it is because the JVM runs out of memory, but I would expect an error saying so.
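For reference, the loop in the pasted paragraph grows `a` with repeated `+=`, which can reallocate and copy the growing buffer many times; if memory is the suspect, building the result once avoids that. A minimal sketch (assuming the goal is simply the final concatenated string):

```python
b = "bigword"
n = 10_000_000

# Instead of:  for i in range(n): a += b   (repeatedly grows the buffer)
# allocate the ~70 MB result in a single step:
a = b * n

# Same result as the loop, without intermediate copies:
assert len(a) == len(b) * n
```

For joining many distinct pieces, `"".join(parts)` achieves the same single-allocation behaviour.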

Any idea?

Thank you very much
NOTICE
Please consider the environment before printing this email. This message and any attachments are intended for the addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this message in error please notify us at once by return email and then delete both messages. We accept no liability for the distribution of viruses or similar in electronic communications. This notice should not be removed.

Re: driver crashes - need to find out why driver keeps crashing

Posted by Akshay Bhardwaj <ak...@gmail.com>.
Hi,

Were you able to check the executor logs for this? If the executors are
running in separate JVMs/machines, they will have log files separate from
the driver's. If the OOME is due to the concatenation of the large string,
it may be reported in the executor logs first.

How are you running this Spark job? A standalone Spark process (master set
to local[*])?
A Spark master-slave cluster?
A YARN or Mesos cluster, etc.?
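If the driver JVM dies before it can log an OutOfMemoryError, one way to surface it is to make the JVM dump the heap on OOM and give the driver more headroom. A sketch using standard Spark configuration properties (the memory value, dump path, and script name are illustrative placeholders; in Zeppelin these options would go in the Spark interpreter settings rather than on a spark-submit command line):

```shell
# Illustrative only: raise driver memory and capture a heap dump on OOM.
spark-submit \
  --driver-memory 4g \
  --conf "spark.driver.extraJavaOptions=-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp" \
  your_job.py
```

The resulting `.hprof` file under `/tmp` can then be inspected with a heap analyzer to confirm whether the driver actually ran out of memory.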


Akshay Bhardwaj
+91-97111-33849

