Posted to dev@hive.apache.org by "Xuefu Zhang (JIRA)" <ji...@apache.org> on 2014/07/29 23:13:38 UTC

[jira] [Reopened] (HIVE-7437) Check if servlet-api and jetty module in Spark library are an issue for hive-spark integration [Spark Branch]

     [ https://issues.apache.org/jira/browse/HIVE-7437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang reopened HIVE-7437:
-------------------------------


Reopening the issue for tracking purposes, as I can still hit it. I got the following stack trace when running a simple query:
{code}
14/07/28 15:41:03 ERROR Executor: Exception in task ID 7
java.lang.IllegalStateException: unread block data
	at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2418)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1379)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1988)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1912)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1795)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
	at org.apache.spark.scheduler.ShuffleMapTask.readExternal(ShuffleMapTask.scala:140)
	at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1834)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1793)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
	at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:85)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:165)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:679)
{code}

After shading servlet-api/jetty in the Spark build, the problem goes away.
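
To see which jar actually supplies the servlet-api and jetty classes on a given classpath, a small diagnostic along the following lines may help. This is only a sketch: the class name ClassOrigin and the default list of classes to probe are assumptions chosen for illustration, not something taken from the Spark or Hive builds. It should be run with the same classpath as the failing process.

```java
import java.security.CodeSource;

public class ClassOrigin {

    /** Describes where the named class is loaded from, or why it cannot be loaded. */
    static String describe(String className) {
        try {
            // Load without initializing, so no static initializers run.
            Class<?> cls = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                // Typical for JDK classes loaded by the bootstrap loader.
                return className + " -> (bootstrap or unknown location)";
            }
            return className + " -> " + src.getLocation();
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " -> not loadable: " + e;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults: the servlet API plus the Jetty 7+ and Jetty 6 server classes.
        String[] names = args.length > 0 ? args : new String[] {
            "javax.servlet.Servlet",
            "org.eclipse.jetty.server.Server",
            "org.mortbay.jetty.Server"
        };
        for (String name : names) {
            System.out.println(describe(name));
        }
    }
}
```

If the printed location for javax.servlet.Servlet points at a jar from a different component than expected, that suggests the two servlet-api versions are competing on the classpath.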

> Check if servlet-api and jetty module in Spark library are an issue for hive-spark integration [Spark Branch]
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-7437
>                 URL: https://issues.apache.org/jira/browse/HIVE-7437
>             Project: Hive
>          Issue Type: Task
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Chengxiang Li
>             Fix For: spark-branch
>
>
> Currently we use a customized Spark 1.0.0 build for the Hive on Spark project because of library conflicts. One of the conflicts found during the POC involves servlet-api and jetty: in Spark the version is 3.0, while the rest of the Hadoop components, including Hive, are still on 2.5. As a follow-up to HIVE-7371, it would be good to figure out whether this continues to be an issue.
> The corresponding Spark JIRA is SPARK-2420.
> NO PRECOMMIT TESTS. This is for spark-branch only.



--
This message was sent by Atlassian JIRA
(v6.2#6252)