Posted to issues@spark.apache.org by "Michael Armbrust (JIRA)" <ji...@apache.org> on 2015/09/15 23:24:46 UTC
[jira] [Resolved] (SPARK-5060) Spark driver main thread hanging after SQL insert in Parquet file
[ https://issues.apache.org/jira/browse/SPARK-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Armbrust resolved SPARK-5060.
-------------------------------------
Resolution: Cannot Reproduce
This code has changed a lot in Spark 1.5, so I'm going to close this ticket. Please reopen if you can still reproduce.
> Spark driver main thread hanging after SQL insert in Parquet file
> -----------------------------------------------------------------
>
> Key: SPARK-5060
> URL: https://issues.apache.org/jira/browse/SPARK-5060
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Reporter: Alex Baretta
>
> Here's what the console shows:
> 15/01/01 01:12:29 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 58.0, whose tasks have all completed, from pool
> 15/01/01 01:12:29 INFO scheduler.DAGScheduler: Stage 58 (runJob at ParquetTableOperations.scala:326) finished in 5493.549 s
> 15/01/01 01:12:29 INFO scheduler.DAGScheduler: Job 41 finished: runJob at ParquetTableOperations.scala:326, took 5493.747061 s
> It is now 01:40:03, so the driver has been hanging for the last 28 minutes. The web UI, on the other hand, shows that all tasks completed successfully, and the output directory has been populated, although the _SUCCESS file is missing.
> It is worth noting that my code started this job in its own thread. The actual code looks like the following snippet, modulo some simplifications.
> def save_to_parquet(allowExisting: Boolean = false) = {
>   val threads = tables.map(table => {
>     val thread = new Thread {
>       override def run(): Unit = {
>         table.insertInto(t.table_name)
>       }
>     }
>     thread.start()
>     thread
>   })
>   threads.foreach(_.join())
> }
> As far as I can see the insertInto call never returns.
> The version of Spark I'm using is built from master, off of this commit:
> commit 815de54002f9c1cfedc398e95896fa207b4a5305
> Author: YanTangZhai <ha...@tencent.com>
> Date: Mon Dec 29 11:30:54 2014 -0800
> [SPARK-4946] [CORE] Using AkkaUtils.askWithReply in MapOutputTracker.askTracker to reduce the chance of the communicating problem
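
The report above joins each worker thread with no timeout, so a call that never returns blocks the main thread indefinitely and leaves no trace of where it is stuck. One way to narrow this down is a bounded join that prints the stack of any thread still alive. The following is only a minimal sketch in plain Scala, not the reporter's code and not a Spark API: `JoinWithTimeout`, `runAll`, the `jobs` parameter, and the `insert-` thread-name prefix are all hypothetical names introduced for illustration, and each `work` function stands in for a call such as `table.insertInto(...)`.

```scala
import java.util.concurrent.TimeUnit

object JoinWithTimeout {
  // Starts one thread per job, waits up to timeoutMs in total, and returns
  // the names of the threads that are still running after the deadline.
  // `jobs` pairs a label with the work to run (hypothetical stand-in for
  // the per-table insert in the report).
  def runAll(jobs: Seq[(String, () => Unit)], timeoutMs: Long): Seq[String] = {
    val threads = jobs.map { case (name, work) =>
      val thread = new Thread(new Runnable {
        override def run(): Unit = work()
      }, s"insert-$name")
      // Daemon threads so that a stuck worker cannot keep the JVM alive
      // in this sketch; a real job may not want this.
      thread.setDaemon(true)
      thread.start()
      thread
    }

    // Bounded join: all threads share one overall deadline.
    val deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs)
    threads.foreach { t =>
      val remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime())
      // join(0) would wait forever, so only join while time remains.
      if (remainingMs > 0) t.join(remainingMs)
    }

    // Report where each unfinished thread currently is.
    val stuck = threads.filter(_.isAlive)
    stuck.foreach { t =>
      System.err.println(s"Thread ${t.getName} still running; stack:")
      t.getStackTrace.foreach(frame => System.err.println(s"  at $frame"))
    }
    stuck.map(_.getName)
  }
}
```

The printed stack frames show which call each unfinished thread is blocked in, which is the same information a `jstack` dump of the driver process would give for the hang described in the report.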