Posted to user@spark.apache.org by "Wu, James C." <Ja...@disney.com> on 2015/08/07 21:40:39 UTC

SparkSQL: "add jar" blocks all queries

Hi,

I got into a situation where a prior "add jar" command caused Spark SQL to stop working for all users.

Does anyone know how to fix the issue?

Regards,

james

From: "Wu, James C." <ja...@disney.com>
Date: Friday, August 7, 2015 at 10:29 AM
To: "user@spark.apache.org" <us...@spark.apache.org>
Subject: SparkSQL: remove jar added by "add jar" command from dependencies

Hi,

I am using Spark SQL to run some queries on a set of Avro data. Somehow I am getting this error:

0: jdbc:hive2://n7-z01-0a2a1453> select count(*) from flume_test;

Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 26.0 failed 4 times, most recent failure: Lost task 3.3 in stage 26.0 (TID 1027, n7-z01-0a2a1457.iaas.starwave.com): java.io.IOException: Incomplete HDFS URI, no host: hdfs:////data/hive-jars/avro-mapred.jar
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:141)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
        at org.apache.spark.util.Utils$.getHadoopFileSystem(Utils.scala:1364)
        at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:498)
        at org.apache.spark.util.Utils$.fetchFile(Utils.scala:383)
        at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$6.apply(Executor.scala:350)
        at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$6.apply(Executor.scala:347)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
        at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
        at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
        at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:347)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)


I did not add the jar in this session, so I am wondering how I can get the jar removed from the dependencies so that it does not block Spark SQL queries for all sessions.
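A likely culprit, judging from the error message rather than anything confirmed in the thread: the jar was registered with a malformed HDFS URI. With four slashes the authority component of the URI is empty, so HDFS sees no namenode host when each executor tries to fetch the dependency, which is exactly the "Incomplete HDFS URI, no host" failure above. A fully qualified URI avoids that; the hostname and port below are placeholders, not values from the thread:

    -- Malformed: the extra slashes leave the URI with no host, so every fetch fails.
    ADD JAR hdfs:////data/hive-jars/avro-mapred.jar;

    -- Fully qualified: scheme, namenode host, and port are all present.
    ADD JAR hdfs://namenode.example.com:8020/data/hive-jars/avro-mapred.jar;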

Thanks,

James

Re: SparkSQL: "add jar" blocks all queries

Posted by "Wu, James C." <Ja...@disney.com>.
Hi,

The issue only seems to happen when accessing Spark via the Spark SQL Thrift Server interface.

Does anyone know a fix?

james
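That detail points at the usual explanation: the Thrift Server keeps a single long-lived SparkContext that every JDBC session shares, and "add jar" registers the jar on that shared context, so one bad URI can break queries for all users at once. As far as I know, this version of Spark has no command to remove a jar from a running context, so the practical way out is to restart the Thrift Server. A minimal sketch, assuming a standard Spark layout; the launch options are placeholders for whatever your deployment normally uses:

    # Stop and restart the Thrift Server to clear the shared jar list.
    $SPARK_HOME/sbin/stop-thriftserver.sh
    $SPARK_HOME/sbin/start-thriftserver.sh --master yarn-client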
