Posted to user@hive.apache.org by Ophir Etzion <op...@foursquare.com> on 2015/12/15 23:26:41 UTC

Hive on Spark - Error: Child process exited before connecting back

Hi,

When trying to run Hive on Spark on CDH 5.4.3, I get the following error
when running a simple query with the Spark engine.

I've tried setting everything described here (
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started)
as well as what the CDH documentation recommends.
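
For reference, the settings I mean are along these lines (a sketch only;
the connection URL and values here are placeholders, not my exact config):

# Hive on Spark settings from the Getting Started wiki, applied in beeline
beeline -u jdbc:hive2://localhost:10000 <<'EOF'
set hive.execution.engine=spark;
set spark.master=yarn-client;
set spark.eventLog.enabled=true;
set spark.executor.memory=512m;
set spark.serializer=org.apache.spark.serializer.KryoSerializer;
EOF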

Has anyone encountered this as well? (Searching for it didn't help much.)

the error:

ERROR : Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:57)
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114)
at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:120)
at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:97)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1640)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1399)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1183)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:144)
at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:69)
at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:196)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:208)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel client '2b2d7314-e0cc-4933-82a1-992a3299d109'. Error: Child process exited before connecting back
at com.google.common.base.Throwables.propagate(Throwables.java:156)
at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:109)
at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:91)
at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:65)
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:55)
... 22 more
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel client '2b2d7314-e0cc-4933-82a1-992a3299d109'. Error: Child process exited before connecting back
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:99)
... 26 more
Caused by: java.lang.RuntimeException: Cancel client '2b2d7314-e0cc-4933-82a1-992a3299d109'. Error: Child process exited before connecting back
at org.apache.hive.spark.client.rpc.RpcServer.cancelClient(RpcServer.java:179)
at org.apache.hive.spark.client.SparkClientImpl$3.run(SparkClientImpl.java:427)
... 1 more

Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask (state=08S01,code=1)

Re: Hive on Spark - Error: Child process exited before connecting back

Posted by Xuefu Zhang <xz...@cloudera.com>.
Ophir,

Can you provide your hive.log here? Also, have you checked your Spark
application log?

When this happens, it usually means that Hive was not able to launch the
Spark application. In the case of Spark on YARN, this application is the
application master. If Hive fails to launch it, or the application master
fails before it can connect back, you will see such error messages. To get
more information, you should check the Spark application log.
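
If you are running Spark on YARN, something along these lines should pull
the application master's log (the application id is just an example; take
the real one from the ResourceManager UI or the list command):

# list recently failed YARN applications
yarn application -list -appStates FAILED
# fetch the aggregated log for the failed application (example id)
yarn logs -applicationId application_1450000000000_0001 > am.log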

--Xuefu


Re: Hive on Spark - Error: Child process exited before connecting back

Posted by Xuefu Zhang <xz...@cloudera.com>.
These missing classes are in the Hadoop jar. If you have HADOOP_HOME set,
they should be on Hive's classpath.
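
A quick sanity check (the parcel path below is illustrative; adjust it to
your install):

# FSDataInputStream lives in hadoop-common; confirm the jar is visible
export HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop
hadoop classpath | tr ':' '\n' | grep hadoop-common
# the class itself should be listed inside the hadoop-common jar
unzip -l "$HADOOP_HOME"/hadoop-common-*.jar | grep FSDataInputStream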

--Xuefu

On Thu, Dec 17, 2015 at 10:12 AM, Ophir Etzion <op...@foursquare.com> wrote:

> it seems like the problem is that the Spark client needs FSDataInputStream,
> but it is not included in the hive-exec-1.1.0-cdh5.4.3.jar that is passed
> on the classpath.
> I need to look more into spark-submit / org.apache.spark.deploy to see if
> there is a way to include more jars.

Re: Hive on Spark - Error: Child process exited before connecting back

Posted by Ophir Etzion <op...@foursquare.com>.
It seems like the problem is that the Spark client needs FSDataInputStream,
but it is not included in the hive-exec-1.1.0-cdh5.4.3.jar that is passed
on the classpath.
I need to look more into spark-submit / org.apache.spark.deploy to see if
there is a way to include more jars. (A possible workaround is sketched
after the log below.)


2015-12-17 17:34:01,679 INFO org.apache.hive.spark.client.SparkClientImpl:
Running client driver with argv:
/export/hdb3/data/cloudera/parcels/CDH-5.4.3-1.cdh5.4.3.p0.6/lib/spark/bin/spark-submit
--executor-cores 1 --executor-memory 268435456 --proxy-user anonymous
--properties-file /tmp/spark-submit.1508744664719491459.properties --class
org.apache.hive.spark.client.RemoteDriver
/export/hdb3/data/cloudera/parcels/CDH-5.4.3-1.cdh5.4.3.p0.6/jars/hive-exec-1.1.0-cdh5.4.3.jar
--remote-host ezaq6.prod.foursquare.com --remote-port 44306 --conf
hive.spark.client.connect.timeout=1000 --conf
hive.spark.client.server.connect.timeout=90000 --conf
hive.spark.client.channel.log.level=null --conf
hive.spark.client.rpc.max.size=52428800 --conf
hive.spark.client.rpc.threads=8 --conf hive.spark.client.secret.bits=256
2015-12-17 17:34:02,435 INFO org.apache.hive.spark.client.SparkClientImpl: Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
2015-12-17 17:34:02,435 INFO org.apache.hive.spark.client.SparkClientImpl: at org.apache.spark.deploy.SparkSubmitDriverBootstrapper$.main(SparkSubmitDriverBootstrapper.scala:71)
2015-12-17 17:34:02,435 INFO org.apache.hive.spark.client.SparkClientImpl: at org.apache.spark.deploy.SparkSubmitDriverBootstrapper.main(SparkSubmitDriverBootstrapper.scala)
2015-12-17 17:34:02,435 INFO org.apache.hive.spark.client.SparkClientImpl: Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream
2015-12-17 17:34:02,435 INFO org.apache.hive.spark.client.SparkClientImpl: at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
2015-12-17 17:34:02,435 INFO org.apache.hive.spark.client.SparkClientImpl: at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
2015-12-17 17:34:02,435 INFO org.apache.hive.spark.client.SparkClientImpl: at java.security.AccessController.doPrivileged(Native Method)
2015-12-17 17:34:02,435 INFO org.apache.hive.spark.client.SparkClientImpl: at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
2015-12-17 17:34:02,435 INFO org.apache.hive.spark.client.SparkClientImpl: at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
2015-12-17 17:34:02,435 INFO org.apache.hive.spark.client.SparkClientImpl: at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
2015-12-17 17:34:02,435 INFO org.apache.hive.spark.client.SparkClientImpl: at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
2015-12-17 17:34:02,435 INFO org.apache.hive.spark.client.SparkClientImpl: ... 2 more
2015-12-17 17:34:02,438 WARN org.apache.hive.spark.client.SparkClientImpl: Child process exited with code 1.
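
One thing I might try (an untested sketch; the hadoop-common jar name is a
guess based on the parcel layout) is putting the Hadoop jars on the driver
classpath explicitly when spark-submit launches the RemoteDriver:

# --driver-class-path prepends entries to the driver JVM's classpath
spark-submit \
  --class org.apache.hive.spark.client.RemoteDriver \
  --driver-class-path /export/hdb3/data/cloudera/parcels/CDH-5.4.3-1.cdh5.4.3.p0.6/jars/hadoop-common-2.6.0-cdh5.4.3.jar \
  /export/hdb3/data/cloudera/parcels/CDH-5.4.3-1.cdh5.4.3.p0.6/jars/hive-exec-1.1.0-cdh5.4.3.jar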


Re: Hive on Spark - Error: Child process exited before connecting back

Posted by Xuefu Zhang <xz...@cloudera.com>.
As to the Spark versions that are supported: Spark made incompatible API
changes in 1.5, and that's the reason why Hive 1.1.0 doesn't work with
Spark 1.5. However, the latest Hive in master or branch-1 should work with
Spark 1.5.

Also, later CDH 5.4.x versions already support Spark 1.5. CDH 5.7,
which is coming soon, will support Spark 1.6.
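
To confirm which versions are actually in play on a given node (both
commands are standard, though the output format varies by distribution):

spark-submit --version
hive --version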

--Xuefu

On Tue, Dec 15, 2015 at 3:50 PM, Mich Talebzadeh <mi...@peridale.co.uk>
wrote:

> To answer your point:
>
> “why would spark 1.5.2 specifically not work with hive?”
>
> Because I tried Spark 1.5.2 and it did not work; unfortunately, the only
> version that seems to work (albeit it requires messing around) is version
> 1.3.1 of Spark.
>
> Look at the threads on “Managed to make Hive run on Spark engine” in
> user@hive.apache.org
>
> HTH,
>
> Mich Talebzadeh
>
> From: Ophir Etzion [mailto:ophir@foursquare.com]
> Sent: 15 December 2015 22:42
> To: user@hive.apache.org
> Cc: user@spark.apache.org
> Subject: Re: Hive on Spark - Error: Child process exited before connecting back
>
> Hi,
>
> the versions are Spark 1.3.0 and Hive 1.1.0, as part of Cloudera 5.4.3.
>
> I find it weird that it would work only with the versions you mentioned,
> as there is documentation (not good documentation, but still..) on how to
> do it with Cloudera, which packages different versions.
>
> Thanks for the answer though.
>
> Why would Spark 1.5.2 specifically not work with Hive?
>
> Ophir
>
> On Tue, Dec 15, 2015 at 5:33 PM, Mich Talebzadeh <mi...@peridale.co.uk>
> wrote:
>
> Hi,
>
> The only combination with which I have managed to run Hive using the
> Spark engine is Spark 1.3.1 on Hive 1.2.1.
>
> Can you confirm the version of Spark you are running?
>
> FYI, Spark 1.5.2 will not work with Hive.
>
> HTH
>
> Mich Talebzadeh
>
>             at
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:144)
>
>             at
> org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:69)
>
>             at
> org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:196)
>
>             at java.security.AccessController.doPrivileged(Native Method)
>
>             at javax.security.auth.Subject.doAs(Subject.java:415)
>
>             at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>
>             at
> org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:208)
>
>             at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>
>             at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>
>             at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>
>             at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>
>             at java.lang.Thread.run(Thread.java:745)
>
> Caused by: java.lang.RuntimeException:
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel
> client '2b2d7314-e0cc-4933-82a1-992a3299d109'. Error: Child process exited
> before connecting back
>
>             at
> com.google.common.base.Throwables.propagate(Throwables.java:156)
>
>             at
> org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:109)
>
>             at
> org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
>
>             at
> org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:91)
>
>             at
> org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:65)
>
>             at
> org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:55)
>
>             ... 22 more
>
> Caused by: java.util.concurrent.ExecutionException:
> java.lang.RuntimeException: Cancel client
> '2b2d7314-e0cc-4933-82a1-992a3299d109'. Error: Child process exited before
> connecting back
>
>             at
> io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
>
>             at
> org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:99)
>
>             ... 26 more
>
> Caused by: java.lang.RuntimeException: Cancel client
> '2b2d7314-e0cc-4933-82a1-992a3299d109'. Error: Child process exited before
> connecting back
>
>             at
> org.apache.hive.spark.client.rpc.RpcServer.cancelClient(RpcServer.java:179)
>
>             at
> org.apache.hive.spark.client.SparkClientImpl$3.run(SparkClientImpl.java:427)
>
>             ... 1 more
>
> Error: Error while processing statement: FAILED: Execution Error, return
> code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
> (state=08S01,code=1)
>
>
>
>
>

RE: Hive on Spark - Error: Child process exited before connecting back

Posted by Mich Talebzadeh <mi...@peridale.co.uk>.
To answer your point:

“Why would Spark 1.5.2 specifically not work with Hive?”

Because I tried Spark 1.5.2 and it did not work; unfortunately, the only version that seems to work (albeit with some messing around) is Spark 1.3.1.

Look at the thread “Managed to make Hive run on Spark engine” on user@hive.apache.org.
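For reference, the “messing around” is outlined on the Hive on Spark Getting Started page linked earlier in this thread: build a Spark assembly that does not bundle Hive classes, make it visible to Hive, and switch the execution engine. A rough sketch follows; the paths, the Maven profiles and the yarn-client master below are illustrative assumptions, not values taken from this thread:

    # Build a Spark 1.3.1 assembly without Hive classes
    # (the exact profiles depend on your Hadoop version)
    ./make-distribution.sh --name hadoop2-without-hive --tgz -Pyarn,hadoop-provided,hadoop-2.4

    # Make the assembly visible to Hive, e.g. by linking it into Hive's lib directory
    # (both paths here are assumptions)
    ln -s /usr/lib/spark/lib/spark-assembly-1.3.1-hadoop2.4.0.jar /usr/lib/hive/lib/

    # Switch the execution engine for a session (these can also live in hive-site.xml)
    hive --hiveconf hive.execution.engine=spark --hiveconf spark.master=yarn-client

The same wiki page keeps a table of Hive/Spark version pairings; Hive 1.2.x is listed against Spark 1.3.1 there, which matches the combination reported to work in this thread.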

 

 

HTH,

 

 

Mich Talebzadeh

 

Sybase ASE 15 Gold Medal Award 2008

A Winning Strategy: Running the most Critical Financial Data on ASE 15

http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf

Author of the book "A Practitioner’s Guide to Upgrading to Sybase ASE 15", ISBN 978-0-9563693-0-7.

Co-author of "Sybase Transact SQL Guidelines Best Practices", ISBN 978-0-9759693-0-4.

Publications due shortly:

Complex Event Processing in Heterogeneous Environments, ISBN: 978-0-9563693-3-8

Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume one out shortly

 

http://talebzadehmich.wordpress.com

 

NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only; if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Technology Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free; therefore neither Peridale Technology Ltd, its subsidiaries nor their employees accept any responsibility.

 



Re: Hive on Spark - Error: Child process exited before connecting back

Posted by Ophir Etzion <op...@foursquare.com>.
Hi,

the versions are Spark 1.3.0 and Hive 1.1.0, as packaged in Cloudera CDH 5.4.3.

I find it odd that it would work only with the versions you mention, as there is documentation (not great documentation, but still) on how to do this with Cloudera, which packages different versions.

Thanks for the answer, though.

Why would Spark 1.5.2 specifically not work with Hive?

Ophir
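Before chasing anything deeper, it is worth confirming which binaries the shell actually resolves; the version flags below are standard Spark and Hive CLI options, nothing specific to this thread:

    # Print the Spark and Hive versions found on the PATH
    spark-submit --version
    hive --version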

On Tue, Dec 15, 2015 at 5:33 PM, Mich Talebzadeh <mi...@peridale.co.uk> wrote:

> The only version with which I have managed to run Hive using the Spark engine is Spark 1.3.1 on Hive 1.2.1.
>
> FYI, Spark 1.5.2 will not work with Hive.


RE: Hive on Spark - Error: Child process exited before connecting back

Posted by Mich Talebzadeh <mi...@peridale.co.uk>.
Hi,

The only version with which I have managed to run Hive using the Spark engine is Spark 1.3.1 on Hive 1.2.1.

Can you confirm the version of Spark you are running?

FYI, Spark 1.5.2 will not work with Hive.
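The “child process” in this error is the spark-submit launcher that Hive’s SparkClientImpl spawns; when it exits before completing the RPC handshake back to Hive, its output is redirected into the Hive (here, HiveServer2) log, which is usually where the real failure shows up. A generic sketch of the two usual checks follows; the log path is an assumption, and the timeout value is illustrative (the property itself is a real Hive setting, defaulting to 90 seconds in the Hive versions discussed here):

    # The launcher's output is logged by SparkClientImpl; search around the failure time
    grep -iE "spark-submit|SparkClientImpl" /var/log/hive/hive-server2.log

    # If the remote driver is merely slow to call back, the handshake timeout can be
    # raised (for HiveServer2 this normally belongs in hive-site.xml on the server)
    hive --hiveconf hive.spark.client.server.connect.timeout=300000ms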

 

HTH

 

Mich Talebzadeh

http://talebzadehmich.wordpress.com

 
