Posted to user@hive.apache.org by Jone Zhang <jo...@gmail.com> on 2015/12/09 10:56:35 UTC

Hive on Spark application is submitted multiple times when queue resources are not enough.

*Hi, Xuefu:*

*See attachment 1*
*When the queue resources are not enough,*
*the application application_1448873753366_121022 stays pending.*
*Two minutes later, the application application_1448873753366_121055 is
submitted and also pends.*
*And then application_1448873753366_121062 follows.*

*See attachment 2*
*When the queue resources free up,*
*the application application_1448873753366_121062 begins running.*
*application_1448873753366_121022 and application_1448873753366_121055
fail fast.*

*Logs of application_1448873753366_121022 are as follows (identical for*
*application_1448873753366_121055):*
Container: container_1448873753366_121022_03_000001 on 10.226.136.122_8041
============================================================================
LogType: stderr
LogLength: 4664
Log Contents:
Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
in the future
Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
in the future
15/12/09 16:29:45 INFO yarn.ApplicationMaster: Registered signal handlers
for [TERM, HUP, INT]
15/12/09 16:29:46 INFO yarn.ApplicationMaster: ApplicationAttemptId:
appattempt_1448873753366_121022_000003
15/12/09 16:29:47 INFO spark.SecurityManager: Changing view acls to: mqq
15/12/09 16:29:47 INFO spark.SecurityManager: Changing modify acls to: mqq
15/12/09 16:29:47 INFO spark.SecurityManager: SecurityManager:
authentication disabled; ui acls disabled; users with view permissions:
Set(mqq); users with modify permissions: Set(mqq)
15/12/09 16:29:47 INFO yarn.ApplicationMaster: Starting the user
application in a separate Thread
15/12/09 16:29:47 INFO yarn.ApplicationMaster: Waiting for spark context
initialization
15/12/09 16:29:47 INFO yarn.ApplicationMaster: Waiting for spark context
initialization ...
15/12/09 16:29:47 INFO client.RemoteDriver: Connecting to:
10.179.12.140:38842
15/12/09 16:29:48 WARN rpc.Rpc: Invalid log level null, reverting to
default.
15/12/09 16:29:48 ERROR yarn.ApplicationMaster: User class threw exception:
java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
Client closed before SASL negotiation finished.
java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
Client closed before SASL negotiation finished.
        at
io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
        at
org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
        at
org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at
org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
Caused by: javax.security.sasl.SaslException: Client closed before SASL
negotiation finished.
        at
org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:449)
        at
org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
        at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
        at
io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
        at
io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
        at
org.apache.hive.spark.client.rpc.KryoMessageCodec.channelInactive(KryoMessageCodec.java:127)
        at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
        at
io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
        at
io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
        at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
        at
io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
        at
io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:769)
        at
io.netty.channel.AbstractChannel$AbstractUnsafe$5.run(AbstractChannel.java:567)
        at
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
        at
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at java.lang.Thread.run(Thread.java:745)
15/12/09 16:29:48 INFO yarn.ApplicationMaster: Final app status: FAILED,
exitCode: 15, (reason: User class threw exception:
java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
Client closed before SASL negotiation finished.)
15/12/09 16:29:57 ERROR yarn.ApplicationMaster: SparkContext did not
initialize after waiting for 150000 ms. Please check earlier log output for
errors. Failing the application.
15/12/09 16:29:57 INFO yarn.ApplicationMaster: Unregistering
ApplicationMaster with FAILED (diag message: User class threw exception:
java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
Client closed before SASL negotiation finished.)
15/12/09 16:29:57 INFO yarn.ApplicationMaster: Deleting staging directory
.sparkStaging/application_1448873753366_121022
15/12/09 16:29:57 INFO util.Utils: Shutdown hook called

LogType: stdout
LogLength: 216
Log Contents:
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
UseCompressedStrings; support was removed in 7.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
UseCompressedStrings; support was removed in 7.0



Container: container_1448873753366_121022_01_000001 on 10.226.148.5_8041
==========================================================================
LogType: stderr
LogLength: 4284
Log Contents:
Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
in the future
Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
in the future
15/12/09 16:29:17 INFO yarn.ApplicationMaster: Registered signal handlers
for [TERM, HUP, INT]
15/12/09 16:29:18 INFO yarn.ApplicationMaster: ApplicationAttemptId:
appattempt_1448873753366_121022_000001
15/12/09 16:29:18 INFO spark.SecurityManager: Changing view acls to: mqq
15/12/09 16:29:18 INFO spark.SecurityManager: Changing modify acls to: mqq
15/12/09 16:29:18 INFO spark.SecurityManager: SecurityManager:
authentication disabled; ui acls disabled; users with view permissions:
Set(mqq); users with modify permissions: Set(mqq)
15/12/09 16:29:19 INFO yarn.ApplicationMaster: Starting the user
application in a separate Thread
15/12/09 16:29:19 INFO yarn.ApplicationMaster: Waiting for spark context
initialization
15/12/09 16:29:19 INFO yarn.ApplicationMaster: Waiting for spark context
initialization ...
15/12/09 16:29:19 INFO client.RemoteDriver: Connecting to:
10.179.12.140:38842
15/12/09 16:29:20 WARN rpc.Rpc: Invalid log level null, reverting to
default.
15/12/09 16:29:20 ERROR yarn.ApplicationMaster: User class threw exception:
java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
Client closed before SASL negotiation finished.
java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
Client closed before SASL negotiation finished.
        at
io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
        at
org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
        at
org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at
org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
Caused by: javax.security.sasl.SaslException: Client closed before SASL
negotiation finished.
        at
org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:449)
        at
org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
        at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
        at
io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
        at
io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
        at
org.apache.hive.spark.client.rpc.KryoMessageCodec.channelInactive(KryoMessageCodec.java:127)
        at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
        at
io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
        at
io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
        at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
        at
io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
        at
io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:769)
        at
io.netty.channel.AbstractChannel$AbstractUnsafe$5.run(AbstractChannel.java:567)
        at
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
        at
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at java.lang.Thread.run(Thread.java:745)
15/12/09 16:29:20 INFO yarn.ApplicationMaster: Final app status: FAILED,
exitCode: 15, (reason: User class threw exception:
java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
Client closed before SASL negotiation finished.)
15/12/09 16:29:29 ERROR yarn.ApplicationMaster: SparkContext did not
initialize after waiting for 150000 ms. Please check earlier log output for
errors. Failing the application.
15/12/09 16:29:29 INFO util.Utils: Shutdown hook called

LogType: stdout
LogLength: 216
Log Contents:
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
UseCompressedStrings; support was removed in 7.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
UseCompressedStrings; support was removed in 7.0



Container: container_1448873753366_121022_02_000001 on 10.239.243.69_8041
===========================================================================
LogType: stderr
LogLength: 4284
Log Contents:
Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
in the future
Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
in the future
15/12/09 16:29:31 INFO yarn.ApplicationMaster: Registered signal handlers
for [TERM, HUP, INT]
15/12/09 16:29:32 INFO yarn.ApplicationMaster: ApplicationAttemptId:
appattempt_1448873753366_121022_000002
15/12/09 16:29:33 INFO spark.SecurityManager: Changing view acls to: mqq
15/12/09 16:29:33 INFO spark.SecurityManager: Changing modify acls to: mqq
15/12/09 16:29:33 INFO spark.SecurityManager: SecurityManager:
authentication disabled; ui acls disabled; users with view permissions:
Set(mqq); users with modify permissions: Set(mqq)
15/12/09 16:29:33 INFO yarn.ApplicationMaster: Starting the user
application in a separate Thread
15/12/09 16:29:33 INFO yarn.ApplicationMaster: Waiting for spark context
initialization
15/12/09 16:29:33 INFO yarn.ApplicationMaster: Waiting for spark context
initialization ...
15/12/09 16:29:33 INFO client.RemoteDriver: Connecting to:
10.179.12.140:38842
15/12/09 16:29:33 WARN rpc.Rpc: Invalid log level null, reverting to
default.
15/12/09 16:29:33 ERROR yarn.ApplicationMaster: User class threw exception:
java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
Client closed before SASL negotiation finished.
java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
Client closed before SASL negotiation finished.
        at
io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
        at
org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
        at
org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at
org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
Caused by: javax.security.sasl.SaslException: Client closed before SASL
negotiation finished.
        at
org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:449)
        at
org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
        at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
        at
io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
        at
io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
        at
org.apache.hive.spark.client.rpc.KryoMessageCodec.channelInactive(KryoMessageCodec.java:127)
        at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
        at
io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
        at
io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
        at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
        at
io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
        at
io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:769)
        at
io.netty.channel.AbstractChannel$AbstractUnsafe$5.run(AbstractChannel.java:567)
        at
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
        at
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at java.lang.Thread.run(Thread.java:745)
15/12/09 16:29:33 INFO yarn.ApplicationMaster: Final app status: FAILED,
exitCode: 15, (reason: User class threw exception:
java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
Client closed before SASL negotiation finished.)
15/12/09 16:29:43 ERROR yarn.ApplicationMaster: SparkContext did not
initialize after waiting for 150000 ms. Please check earlier log output for
errors. Failing the application.
15/12/09 16:29:43 INFO util.Utils: Shutdown hook called

LogType: stdout
LogLength: 216
Log Contents:
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
UseCompressedStrings; support was removed in 7.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
UseCompressedStrings; support was removed in 7.0


*I find the submit cycle is based on*
*hive.spark.client.server.connect.timeout, which is 120000 ms in our team.*
*And spark.yarn.am.waitTime is 150 s.*

*My questions are:*
*1. Is submitting multiple times a bug or by design? How should I set
hive.spark.client.server.connect.timeout when the queue resources are
not enough?*
*2. Why does the application fail fast, caused by "SparkContext did not
initialize after waiting for 150000 ms"?*

*Thanks.*
*Best wishes.*

Re: Hive on Spark application is submitted multiple times when the queue resources are not enough.

Posted by Xuefu Zhang <xz...@cloudera.com>.
Hi Jone,

Thanks for reporting the problem. When you say there is not enough
resource, do you mean that you cannot launch YARN application masters?

I feel that we should error out right away if the application cannot be
submitted. Any attempt to resubmit seems problematic. I'm not sure if
there is such control over this, but I think that's a good direction to
look at. I will check with our Spark expert on this.

Thanks,
Xuefu

On Wed, Dec 9, 2015 at 8:48 PM, Jone Zhang <jo...@gmail.com> wrote:

> *It seems that the submit count depends on the number of stages in the
> query.*
> *This query includes three stages.*
>
> If queue resources are still *not enough after three applications are
> submitted,** the Hive client closes with:*
> *"**Failed to execute spark task, with exception
> 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark
> client.)'*
> *FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.spark.SparkTask**"*
> *At that point, the port (e.g. **34682**) on the Hive client (e.g. *
> *10.179.12.140**) used to **communicate with the RSC **is lost.*
>
> *When the queue resources free up **after a while, the AMs of the three
> applications fail fast because of "**15/12/10 12:28:43 INFO
> client.RemoteDriver: Connecting to:
> 10.179.12.140:34682...java.net.ConnectException: Connection refused:
> /10.179.12.140:34682**"*
>
> *So the application will fail if queue resources are not **enough at the
> point the query is submitted, even if resources free up **a while later.*
> *Do you have more ideas about this question?*
>
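The "Connection refused" failure above is ordinary TCP behavior: once the
Hive client process exits, nothing is listening on the RSC port any more,
so any AM attempt that starts later is rejected immediately. A minimal,
standalone illustration of just that TCP-level failure (this is not Hive
code; the class name is invented, and 34682 is simply the port from the
log):

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class RefusedDemo {
    /**
     * Tries to connect the way RemoteDriver does and reports whether the
     * peer was unreachable. Returns true when nothing is listening, which
     * surfaces as the same "Connection refused" the AM logs show.
     */
    public static boolean connectLikeRemoteDriver(String host, int port) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), 2000);
            return false; // a listener answered
        } catch (ConnectException e) {
            return true;  // "java.net.ConnectException: Connection refused"
        } catch (IOException e) {
            return true;  // other I/O failures (e.g. timeout): also unreachable
        }
    }

    public static void main(String[] args) {
        // 34682 is the port from the log above; with the Hive client gone,
        // the connect attempt is refused at once.
        System.out.println("refused: "
                + connectLikeRemoteDriver("127.0.0.1", 34682));
    }
}
```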
> *Attach the query:*
> set hive.execution.engine=spark;
> set spark.yarn.queue=tms;
> set spark.app.name=t_ad_tms_heartbeat_ok_3;
> insert overwrite table t_ad_tms_heartbeat_ok partition(ds=20151208)
> SELECT
>             NVL(a.qimei, b.qimei) AS qimei,
>             NVL(b.first_ip, a.user_ip) AS first_ip,
>             NVL(a.user_ip, b.last_ip) AS last_ip,
>             NVL(b.first_date, a.ds) AS first_date,
>             NVL(a.ds, b.last_date) AS last_date,
>             NVL(b.first_chid, a.chid) AS first_chid,
>             NVL(a.chid, b.last_chid) AS last_chid,
>             NVL(b.first_lc, a.lc) AS first_lc,
>             NVL(a.lc, b.last_lc) AS last_lc,
>             NVL(a.guid, b.guid) AS guid,
>             NVL(a.sn, b.sn) AS sn,
>             NVL(a.vn, b.vn) AS vn,
>             NVL(a.vc, b.vc) AS vc,
>             NVL(a.mo, b.mo) AS mo,
>             NVL(a.rl, b.rl) AS rl,
>             NVL(a.os, b.os) AS os,
>             NVL(a.rv, b.rv) AS rv,
>             NVL(a.qv, b.qv) AS qv,
>             NVL(a.imei, b.imei) AS imei,
>             NVL(a.romid, b.romid) AS romid,
>             NVL(a.bn, b.bn) AS bn,
>             NVL(a.account_type, b.account_type) AS account_type,
>             NVL(a.account, b.account) AS account
>     FROM
>         (SELECT
> ds,user_ip,guid,sn,vn,vc,mo,rl,chid,lcid,os,rv,qv,imei,qimei,lc,romid,bn,account_type,account
>         FROM    t_od_tms_heartbeat_ok
>         WHERE   ds = 20151208) a
>     FULL OUTER JOIN
>         (SELECT
> qimei,first_ip,last_ip,first_date,last_date,first_chid,last_chid,first_lc,last_lc,guid,sn,vn,vc,mo,rl,os,rv,qv,imei,romid,bn,account_type,account
>         FROM    t_ad_tms_heartbeat_ok
>         WHERE   last_date > 20150611
>         AND     ds = 20151207) b
>         ON   a.qimei=b.qimei;
>
> *Thanks.*
> *Best wishes.*
>
> 2015-12-09 19:51 GMT+08:00 Jone Zhang <jo...@gmail.com>:
>
>>> But in some cases all of the applications fail, which is caused
>>> by "SparkContext did not initialize after waiting for 150000 ms".
>>> See attachment (hive.spark.client.server.connect.timeout is set to 5 min).
>>
>>
>> *The error log is different from the original mail:*
>>
>> Container: container_1448873753366_113453_01_000001 on 10.247.169.134_8041
>>
>> ============================================================================
>> LogType: stderr
>> LogLength: 3302
>> Log Contents:
>> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
>> in the future
>> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
>> in the future
>> 15/12/09 02:11:48 INFO yarn.ApplicationMaster: Registered signal handlers
>> for [TERM, HUP, INT]
>> 15/12/09 02:11:48 INFO yarn.ApplicationMaster: ApplicationAttemptId:
>> appattempt_1448873753366_113453_000001
>> 15/12/09 02:11:49 INFO spark.SecurityManager: Changing view acls to: mqq
>> 15/12/09 02:11:49 INFO spark.SecurityManager: Changing modify acls to: mqq
>> 15/12/09 02:11:49 INFO spark.SecurityManager: SecurityManager:
>> authentication disabled; ui acls disabled; users with view permissions:
>> Set(mqq); users with modify permissions: Set(mqq)
>> 15/12/09 02:11:49 INFO yarn.ApplicationMaster: Starting the user
>> application in a separate Thread
>> 15/12/09 02:11:49 INFO yarn.ApplicationMaster: Waiting for spark context
>> initialization
>> 15/12/09 02:11:49 INFO yarn.ApplicationMaster: Waiting for spark context
>> initialization ...
>> 15/12/09 02:11:49 INFO client.RemoteDriver: Connecting to:
>> 10.179.12.140:58013
>> 15/12/09 02:11:49 ERROR yarn.ApplicationMaster: User class threw
>> exception: java.util.concurrent.ExecutionException:
>> java.net.ConnectException: Connection refused: /10.179.12.140:58013
>> java.util.concurrent.ExecutionException: java.net.ConnectException:
>> Connection refused: /10.179.12.140:58013
>>         at
>> io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
>>         at
>> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
>>         at
>> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>         at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>         at
>> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
>> Caused by: java.net.ConnectException: Connection refused: /
>> 10.179.12.140:58013
>>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>         at
>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>         at
>> io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:208)
>>         at
>> io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:287)
>>         at
>> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
>>         at
>> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>>         at
>> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>>         at
>> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>>         at java.lang.Thread.run(Thread.java:745)
>> 15/12/09 02:11:49 INFO yarn.ApplicationMaster: Final app status: FAILED,
>> exitCode: 15, (reason: User class threw exception:
>> java.util.concurrent.ExecutionException: java.net.ConnectException:
>> Connection refused: /10.179.12.140:58013)
>> 15/12/09 02:11:59 ERROR yarn.ApplicationMaster: SparkContext did not
>> initialize after waiting for 150000 ms. Please check earlier log output for
>> errors. Failing the application.
>> 15/12/09 02:11:59 INFO util.Utils: Shutdown hook called
>>
>> 2015-12-09 19:22 GMT+08:00 Jone Zhang <jo...@gmail.com>:
>>
>>> Hive version is 1.2.1
>>> Spark version is 1.4.1
>>> Hadoop version is 2.5.1
>>>
>>> Application application_1448873753366_121062 succeeds in the above mail.
>>>
>>> But in some cases all of the applications fail, which is caused by "SparkContext
>>> did not initialize after waiting for 150000 ms".
>>> See attachment (hive.spark.client.server.connect.timeout is set to 5 min).
>>>
>>> Thanks.
>>> Best wishes.
>>>
>>> 2015-12-09 17:56 GMT+08:00 Jone Zhang <jo...@gmail.com>:
>>>
>>>> [original message of 2015-12-09 17:56 quoted in full; snipped]
>>>> negotiation finished.
>>>>         at
>>>> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:449)
>>>>         at
>>>> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
>>>>         at
>>>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>>>         at
>>>> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>>>         at
>>>> io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>>>>         at
>>>> org.apache.hive.spark.client.rpc.KryoMessageCodec.channelInactive(KryoMessageCodec.java:127)
>>>>         at
>>>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>>>         at
>>>> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>>>         at
>>>> io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>>>>         at
>>>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>>>         at
>>>> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>>>         at
>>>> io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:769)
>>>>         at
>>>> io.netty.channel.AbstractChannel$AbstractUnsafe$5.run(AbstractChannel.java:567)
>>>>         at
>>>> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
>>>>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
>>>>         at
>>>> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>>>>         at java.lang.Thread.run(Thread.java:745)
>>>> 15/12/09 16:29:20 INFO yarn.ApplicationMaster: Final app status:
>>>> FAILED, exitCode: 15, (reason: User class threw exception:
>>>> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
>>>> Client closed before SASL negotiation finished.)
>>>> 15/12/09 16:29:29 ERROR yarn.ApplicationMaster: SparkContext did not
>>>> initialize after waiting for 150000 ms. Please check earlier log output for
>>>> errors. Failing the application.
>>>> 15/12/09 16:29:29 INFO util.Utils: Shutdown hook called
>>>>
>>>> LogType: stdout
>>>> LogLength: 216
>>>> Log Contents:
>>>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
>>>> UseCompressedStrings; support was removed in 7.0
>>>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
>>>> UseCompressedStrings; support was removed in 7.0
>>>>
>>>>
>>>>
>>>> Container: container_1448873753366_121022_02_000001 on
>>>> 10.239.243.69_8041
>>>>
>>>> ===========================================================================
>>>> LogType: stderr
>>>> LogLength: 4284
>>>> Log Contents:
>>>> Please use CMSClassUnloadingEnabled in place of
>>>> CMSPermGenSweepingEnabled in the future
>>>> Please use CMSClassUnloadingEnabled in place of
>>>> CMSPermGenSweepingEnabled in the future
>>>> 15/12/09 16:29:31 INFO yarn.ApplicationMaster: Registered signal
>>>> handlers for [TERM, HUP, INT]
>>>> 15/12/09 16:29:32 INFO yarn.ApplicationMaster: ApplicationAttemptId:
>>>> appattempt_1448873753366_121022_000002
>>>> 15/12/09 16:29:33 INFO spark.SecurityManager: Changing view acls to: mqq
>>>> 15/12/09 16:29:33 INFO spark.SecurityManager: Changing modify acls to:
>>>> mqq
>>>> 15/12/09 16:29:33 INFO spark.SecurityManager: SecurityManager:
>>>> authentication disabled; ui acls disabled; users with view permissions:
>>>> Set(mqq); users with modify permissions: Set(mqq)
>>>> 15/12/09 16:29:33 INFO yarn.ApplicationMaster: Starting the user
>>>> application in a separate Thread
>>>> 15/12/09 16:29:33 INFO yarn.ApplicationMaster: Waiting for spark
>>>> context initialization
>>>> 15/12/09 16:29:33 INFO yarn.ApplicationMaster: Waiting for spark
>>>> context initialization ...
>>>> 15/12/09 16:29:33 INFO client.RemoteDriver: Connecting to:
>>>> 10.179.12.140:38842
>>>> 15/12/09 16:29:33 WARN rpc.Rpc: Invalid log level null, reverting to
>>>> default.
>>>> 15/12/09 16:29:33 ERROR yarn.ApplicationMaster: User class threw
>>>> exception: java.util.concurrent.ExecutionException:
>>>> javax.security.sasl.SaslException: Client closed before SASL negotiation
>>>> finished.
>>>> java.util.concurrent.ExecutionException:
>>>> javax.security.sasl.SaslException: Client closed before SASL negotiation
>>>> finished.
>>>>         at
>>>> io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
>>>>         at
>>>> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
>>>>         at
>>>> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>         at
>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>         at
>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>>>         at
>>>> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
>>>> Caused by: javax.security.sasl.SaslException: Client closed before SASL
>>>> negotiation finished.
>>>>         at
>>>> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:449)
>>>>         at
>>>> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
>>>>         at
>>>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>>>         at
>>>> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>>>         at
>>>> io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>>>>         at
>>>> org.apache.hive.spark.client.rpc.KryoMessageCodec.channelInactive(KryoMessageCodec.java:127)
>>>>         at
>>>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>>>         at
>>>> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>>>         at
>>>> io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>>>>         at
>>>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>>>         at
>>>> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>>>         at
>>>> io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:769)
>>>>         at
>>>> io.netty.channel.AbstractChannel$AbstractUnsafe$5.run(AbstractChannel.java:567)
>>>>         at
>>>> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
>>>>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
>>>>         at
>>>> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>>>>         at java.lang.Thread.run(Thread.java:745)
>>>> 15/12/09 16:29:33 INFO yarn.ApplicationMaster: Final app status:
>>>> FAILED, exitCode: 15, (reason: User class threw exception:
>>>> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
>>>> Client closed before SASL negotiation finished.)
>>>> 15/12/09 16:29:43 ERROR yarn.ApplicationMaster: SparkContext did not
>>>> initialize after waiting for 150000 ms. Please check earlier log output for
>>>> errors. Failing the application.
>>>> 15/12/09 16:29:43 INFO util.Utils: Shutdown hook called
>>>>
>>>> LogType: stdout
>>>> LogLength: 216
>>>> Log Contents:
>>>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
>>>> UseCompressedStrings; support was removed in 7.0
>>>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
>>>> UseCompressedStrings; support was removed in 7.0
>>>>
>>>>
>>>> *I find that the resubmit cycle is based
>>>> on hive.spark.client.server.connect.timeout, which is 120000 ms in our setup.*
>>>> *And spark.yarn.am.waitTime is 150s.*
>>>>
>>>> *My questions are:*
>>>> *1. Is submitting the application multiple times a bug or by design? How
>>>> should I set hive.spark.client.server.connect.timeout when the queue does
>>>> not have enough resources?*
>>>> *2. Why does the application fail fast with "SparkContext did not
>>>> initialize after waiting for 150000 ms"?*
>>>>
>>>> *Thanks.*
>>>> *Best wishes.*
>>>>
>>>>
>>>
>>>
>>
>

Re: Hive on Spark application will be submited more times when the queue resources is not enough.

Posted by Jone Zhang <jo...@gmail.com>.
*It seems that the number of submissions depends on the number of stages in
the query.*
*This query includes three stages.*

*If the queue resources are still not enough after three applications have
been submitted, the Hive client will give up with:*
*"Failed to execute spark task, with exception
'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark
client.)'*
*FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.spark.SparkTask"*
*At that point, the port (e.g. 34682) that the Hive client (e.g.
10.179.12.140) uses to communicate with the RSC is closed and lost.*

*When the queue resources become free after a while, the AMs of the three
applications fail fast because of "15/12/10 12:28:43 INFO client.RemoteDriver:
Connecting to: 10.179.12.140:34682...java.net.ConnectException: Connection
refused: /10.179.12.140:34682"*
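This "Connection refused" behaviour can be reproduced in isolation: once the
Hive-client side closes the RPC server socket that the RemoteDriver is
supposed to connect back to, any later connect attempt is rejected at the TCP
level before any handshake. A minimal local sketch (loopback address and port
are stand-ins for the addresses in the logs, and this is plain TCP, not the
actual Hive RSC protocol):

```python
import socket

# 1. The Hive client opens an RPC server socket and hands its address/port
#    (like 10.179.12.140:34682 in the logs) to the pending application.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

# 2. hive.spark.client.server.connect.timeout expires and the client closes
#    the socket while the application is still pending in the queue.
server.close()

# 3. When YARN finally runs the AM, RemoteDriver connects to a dead port.
try:
    socket.create_connection(("127.0.0.1", port), timeout=2).close()
    refused = False
except ConnectionRefusedError:
    refused = True

print(refused)  # True: nothing is listening on that port any more
```

This matches the symptom above: the AM fails immediately with a connection
refusal rather than retrying, so the application is marked FAILED even though
queue resources are available by then.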

*So, the application will fail if the queue resources are not enough at the
time the query is submitted, even if the resources become free after a
while.*
*Do you have any further thoughts on this question?*

*The query is attached below:*
set hive.execution.engine=spark;
set spark.yarn.queue=tms;
set spark.app.name=t_ad_tms_heartbeat_ok_3;
insert overwrite table t_ad_tms_heartbeat_ok partition(ds=20151208)
SELECT
            NVL(a.qimei, b.qimei) AS qimei,
            NVL(b.first_ip, a.user_ip) AS first_ip,
            NVL(a.user_ip, b.last_ip) AS last_ip,
            NVL(b.first_date, a.ds) AS first_date,
            NVL(a.ds, b.last_date) AS last_date,
            NVL(b.first_chid, a.chid) AS first_chid,
            NVL(a.chid, b.last_chid) AS last_chid,
            NVL(b.first_lc, a.lc) AS first_lc,
            NVL(a.lc, b.last_lc) AS last_lc,
            NVL(a.guid, b.guid) AS guid,
            NVL(a.sn, b.sn) AS sn,
            NVL(a.vn, b.vn) AS vn,
            NVL(a.vc, b.vc) AS vc,
            NVL(a.mo, b.mo) AS mo,
            NVL(a.rl, b.rl) AS rl,
            NVL(a.os, b.os) AS os,
            NVL(a.rv, b.rv) AS rv,
            NVL(a.qv, b.qv) AS qv,
            NVL(a.imei, b.imei) AS imei,
            NVL(a.romid, b.romid) AS romid,
            NVL(a.bn, b.bn) AS bn,
            NVL(a.account_type, b.account_type) AS account_type,
            NVL(a.account, b.account) AS account
    FROM
        (SELECT
ds,user_ip,guid,sn,vn,vc,mo,rl,chid,lcid,os,rv,qv,imei,qimei,lc,romid,bn,account_type,account
        FROM    t_od_tms_heartbeat_ok
        WHERE   ds = 20151208) a
    FULL OUTER JOIN
        (SELECT
qimei,first_ip,last_ip,first_date,last_date,first_chid,last_chid,first_lc,last_lc,guid,sn,vn,vc,mo,rl,os,rv,qv,imei,romid,bn,account_type,account
        FROM    t_ad_tms_heartbeat_ok
        WHERE   last_date > 20150611
        AND     ds = 20151207) b
        ON   a.qimei=b.qimei;
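For reference, both timeouts discussed in this thread can be raised per
session the same way the settings above are applied. The values below are
illustrative assumptions only, not tested recommendations; they would need
tuning to the queue's typical pending time:

```sql
set hive.execution.engine=spark;
-- How long the Hive client keeps its RPC server open (ms) waiting for the
-- RemoteDriver to connect back; 120000 ms in our setup. Illustrative value:
set hive.spark.client.server.connect.timeout=600000;
-- How long the YARN AM waits for the SparkContext to initialize;
-- currently 150s. Illustrative value:
set spark.yarn.am.waitTime=600s;
```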

*Thanks.*
*Best wishes.*

2015-12-09 19:51 GMT+08:00 Jone Zhang <jo...@gmail.com>:

>> But in some cases all of the applications will fail, caused by "SparkContext
>> did not initialize after waiting for 150000 ms".
>> See attachment (hive.spark.client.server.connect.timeout is set to 5 min).
>
>
> *The error log is different from the original mail:*
>
> Container: container_1448873753366_113453_01_000001 on 10.247.169.134_8041
>
> ============================================================================
> LogType: stderr
> LogLength: 3302
> Log Contents:
> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
> in the future
> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
> in the future
> 15/12/09 02:11:48 INFO yarn.ApplicationMaster: Registered signal handlers
> for [TERM, HUP, INT]
> 15/12/09 02:11:48 INFO yarn.ApplicationMaster: ApplicationAttemptId:
> appattempt_1448873753366_113453_000001
> 15/12/09 02:11:49 INFO spark.SecurityManager: Changing view acls to: mqq
> 15/12/09 02:11:49 INFO spark.SecurityManager: Changing modify acls to: mqq
> 15/12/09 02:11:49 INFO spark.SecurityManager: SecurityManager:
> authentication disabled; ui acls disabled; users with view permissions:
> Set(mqq); users with modify permissions: Set(mqq)
> 15/12/09 02:11:49 INFO yarn.ApplicationMaster: Starting the user
> application in a separate Thread
> 15/12/09 02:11:49 INFO yarn.ApplicationMaster: Waiting for spark context
> initialization
> 15/12/09 02:11:49 INFO yarn.ApplicationMaster: Waiting for spark context
> initialization ...
> 15/12/09 02:11:49 INFO client.RemoteDriver: Connecting to:
> 10.179.12.140:58013
> 15/12/09 02:11:49 ERROR yarn.ApplicationMaster: User class threw
> exception: java.util.concurrent.ExecutionException:
> java.net.ConnectException: Connection refused: /10.179.12.140:58013
> java.util.concurrent.ExecutionException: java.net.ConnectException:
> Connection refused: /10.179.12.140:58013
>         at
> io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
>         at
> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
>         at
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
> Caused by: java.net.ConnectException: Connection refused: /
> 10.179.12.140:58013
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>         at
> io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:208)
>         at
> io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:287)
>         at
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
>         at
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>         at java.lang.Thread.run(Thread.java:745)
> 15/12/09 02:11:49 INFO yarn.ApplicationMaster: Final app status: FAILED,
> exitCode: 15, (reason: User class threw exception:
> java.util.concurrent.ExecutionException: java.net.ConnectException:
> Connection refused: /10.179.12.140:58013)
> 15/12/09 02:11:59 ERROR yarn.ApplicationMaster: SparkContext did not
> initialize after waiting for 150000 ms. Please check earlier log output for
> errors. Failing the application.
> 15/12/09 02:11:59 INFO util.Utils: Shutdown hook called
>
> 2015-12-09 19:22 GMT+08:00 Jone Zhang <jo...@gmail.com>:
>
>> Hive version is 1.2.1
>> Spark version is 1.4.1
>> Hadoop version is 2.5.1
>>
>> The application application_1448873753366_121062 in the above mail will succeed.
>>
>> But in some cases all of the applications will fail, caused by "SparkContext
>> did not initialize after waiting for 150000 ms".
>> See attachment (hive.spark.client.server.connect.timeout is set to 5 min).
>>
>> Thanks.
>> Best wishes.
>>
>> 2015-12-09 17:56 GMT+08:00 Jone Zhang <jo...@gmail.com>:
>>
>>>         at
>>> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>>         at
>>> io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:769)
>>>         at
>>> io.netty.channel.AbstractChannel$AbstractUnsafe$5.run(AbstractChannel.java:567)
>>>         at
>>> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
>>>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
>>>         at
>>> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>>>         at java.lang.Thread.run(Thread.java:745)
>>> 15/12/09 16:29:33 INFO yarn.ApplicationMaster: Final app status: FAILED,
>>> exitCode: 15, (reason: User class threw exception:
>>> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
>>> Client closed before SASL negotiation finished.)
>>> 15/12/09 16:29:43 ERROR yarn.ApplicationMaster: SparkContext did not
>>> initialize after waiting for 150000 ms. Please check earlier log output for
>>> errors. Failing the application.
>>> 15/12/09 16:29:43 INFO util.Utils: Shutdown hook called
>>>
>>> LogType: stdout
>>> LogLength: 216
>>> Log Contents:
>>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
>>> UseCompressedStrings; support was removed in 7.0
>>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
>>> UseCompressedStrings; support was removed in 7.0
>>>
>>>
>>> *I find that the submit cycle is based
>>> on hive.spark.client.server.connect.timeout, which is 120000 ms in our cluster.*
>>> *And spark.yarn.am.waitTime is 150s.*
>>>
>>> *My questions are:*
>>> *1. Is submitting multiple times a bug, or by design? How should I set the
>>> value of hive.spark.client.server.connect.timeout when the queue resources
>>> are not enough?*
>>> *2. Why does the application fail fast with "SparkContext did not
>>> initialize after waiting for 150000 ms"?*
>>>
>>> *Thanks.*
>>> *Best wishes.*
>>>
>>>
>>
>>
>

Re: Hive on Spark application will be submited more times when the queue resources is not enough.

Posted by Jone Zhang <jo...@gmail.com>.
>
> But in some cases all of the applications will fail, caused by
> "SparkContext did not initialize after waiting for 150000 ms".
> See attachment (hive.spark.client.server.connect.timeout is set to 5 min).


*The error log is different from the one in the original mail.*

Container: container_1448873753366_113453_01_000001 on 10.247.169.134_8041
============================================================================
LogType: stderr
LogLength: 3302
Log Contents:
Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
in the future
Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
in the future
15/12/09 02:11:48 INFO yarn.ApplicationMaster: Registered signal handlers
for [TERM, HUP, INT]
15/12/09 02:11:48 INFO yarn.ApplicationMaster: ApplicationAttemptId:
appattempt_1448873753366_113453_000001
15/12/09 02:11:49 INFO spark.SecurityManager: Changing view acls to: mqq
15/12/09 02:11:49 INFO spark.SecurityManager: Changing modify acls to: mqq
15/12/09 02:11:49 INFO spark.SecurityManager: SecurityManager:
authentication disabled; ui acls disabled; users with view permissions:
Set(mqq); users with modify permissions: Set(mqq)
15/12/09 02:11:49 INFO yarn.ApplicationMaster: Starting the user
application in a separate Thread
15/12/09 02:11:49 INFO yarn.ApplicationMaster: Waiting for spark context
initialization
15/12/09 02:11:49 INFO yarn.ApplicationMaster: Waiting for spark context
initialization ...
15/12/09 02:11:49 INFO client.RemoteDriver: Connecting to:
10.179.12.140:58013
15/12/09 02:11:49 ERROR yarn.ApplicationMaster: User class threw exception:
java.util.concurrent.ExecutionException: java.net.ConnectException:
Connection refused: /10.179.12.140:58013
java.util.concurrent.ExecutionException: java.net.ConnectException:
Connection refused: /10.179.12.140:58013
        at
io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
        at
org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
        at
org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at
org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
Caused by: java.net.ConnectException: Connection refused: /
10.179.12.140:58013
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
        at
io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:208)
        at
io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:287)
        at
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
        at
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at java.lang.Thread.run(Thread.java:745)
15/12/09 02:11:49 INFO yarn.ApplicationMaster: Final app status: FAILED,
exitCode: 15, (reason: User class threw exception:
java.util.concurrent.ExecutionException: java.net.ConnectException:
Connection refused: /10.179.12.140:58013)
15/12/09 02:11:59 ERROR yarn.ApplicationMaster: SparkContext did not
initialize after waiting for 150000 ms. Please check earlier log output for
errors. Failing the application.
15/12/09 02:11:59 INFO util.Utils: Shutdown hook called
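
For reference, the two timeouts discussed in this thread can be raised per session from the Hive CLI. The values below are illustrative only, mirroring the 5-minute setting tried above; property names are as shipped in Hive 1.2 / Spark 1.4, and the exact values you need will depend on how long your queue stays saturated:

```sql
-- Illustrative sketch: raise the Hive->Spark client handshake timeout to
-- 5 minutes (the value experimented with in this thread), and keep the
-- YARN ApplicationMaster wait time above it so the AM does not give up
-- before the handshake window closes.
set hive.spark.client.server.connect.timeout=300000;  -- milliseconds
set spark.yarn.am.waitTime=400s;                      -- should exceed the handshake timeout
```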

2015-12-09 19:22 GMT+08:00 Jone Zhang <jo...@gmail.com>:

> Hive version is 1.2.1
> Spark version is 1.4.1
> Hadoop version is 2.5.1
>
> The application application_1448873753366_121062 succeeded in the above mail.
>
> But in some cases all of the applications will fail, caused by "SparkContext
> did not initialize after waiting for 150000 ms".
> See attachment (hive.spark.client.server.connect.timeout is set to 5 min).
>
> Thanks.
> Best wishes.
>
> 2015-12-09 17:56 GMT+08:00 Jone Zhang <jo...@gmail.com>:
>
>> *Hi, Xuefu:*
>>
>> *See attachment 1*
>> *When the queue resources are not enough,*
>> *the application application_1448873753366_121022 will be pending.*
>> *Two minutes later, the application application_1448873753366_121055 will
>> be submitted and will also be pending.*
>> *And then application_1448873753366_121062.*
>>
>> *See attachment 2*
>> *When the queue resources are free,*
>> *the application application_1448873753366_121062 begins running.*
>> *application_1448873753366_121022 and application_1448873753366_121055
>> will fail fast.*
>>
>> *Logs of application_1448873753366_121022 are as follows (same as *
>> *application_1448873753366_121055**):*
>> Container: container_1448873753366_121022_03_000001 on 10.226.136.122_8041
>>
>> ============================================================================
>> LogType: stderr
>> LogLength: 4664
>> Log Contents:
>> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
>> in the future
>> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
>> in the future
>> 15/12/09 16:29:45 INFO yarn.ApplicationMaster: Registered signal handlers
>> for [TERM, HUP, INT]
>> 15/12/09 16:29:46 INFO yarn.ApplicationMaster: ApplicationAttemptId:
>> appattempt_1448873753366_121022_000003
>> 15/12/09 16:29:47 INFO spark.SecurityManager: Changing view acls to: mqq
>> 15/12/09 16:29:47 INFO spark.SecurityManager: Changing modify acls to: mqq
>> 15/12/09 16:29:47 INFO spark.SecurityManager: SecurityManager:
>> authentication disabled; ui acls disabled; users with view permissions:
>> Set(mqq); users with modify permissions: Set(mqq)
>> 15/12/09 16:29:47 INFO yarn.ApplicationMaster: Starting the user
>> application in a separate Thread
>> 15/12/09 16:29:47 INFO yarn.ApplicationMaster: Waiting for spark context
>> initialization
>> 15/12/09 16:29:47 INFO yarn.ApplicationMaster: Waiting for spark context
>> initialization ...
>> 15/12/09 16:29:47 INFO client.RemoteDriver: Connecting to:
>> 10.179.12.140:38842
>> 15/12/09 16:29:48 WARN rpc.Rpc: Invalid log level null, reverting to
>> default.
>> 15/12/09 16:29:48 ERROR yarn.ApplicationMaster: User class threw
>> exception: java.util.concurrent.ExecutionException:
>> javax.security.sasl.SaslException: Client closed before SASL negotiation
>> finished.
>> java.util.concurrent.ExecutionException:
>> javax.security.sasl.SaslException: Client closed before SASL negotiation
>> finished.
>>         at
>> io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
>>         at
>> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
>>         at
>> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>         at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>         at
>> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
>> Caused by: javax.security.sasl.SaslException: Client closed before SASL
>> negotiation finished.
>>         at
>> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:449)
>>         at
>> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
>>         at
>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>         at
>> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>         at
>> io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>>         at
>> org.apache.hive.spark.client.rpc.KryoMessageCodec.channelInactive(KryoMessageCodec.java:127)
>>         at
>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>         at
>> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>         at
>> io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>>         at
>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>         at
>> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>         at
>> io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:769)
>>         at
>> io.netty.channel.AbstractChannel$AbstractUnsafe$5.run(AbstractChannel.java:567)
>>         at
>> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
>>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
>>         at
>> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>>         at java.lang.Thread.run(Thread.java:745)
>> 15/12/09 16:29:48 INFO yarn.ApplicationMaster: Final app status: FAILED,
>> exitCode: 15, (reason: User class threw exception:
>> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
>> Client closed before SASL negotiation finished.)
>> 15/12/09 16:29:57 ERROR yarn.ApplicationMaster: SparkContext did not
>> initialize after waiting for 150000 ms. Please check earlier log output for
>> errors. Failing the application.
>> 15/12/09 16:29:57 INFO yarn.ApplicationMaster: Unregistering
>> ApplicationMaster with FAILED (diag message: User class threw exception:
>> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
>> Client closed before SASL negotiation finished.)
>> 15/12/09 16:29:57 INFO yarn.ApplicationMaster: Deleting staging directory
>> .sparkStaging/application_1448873753366_121022
>> 15/12/09 16:29:57 INFO util.Utils: Shutdown hook called
>>
>> LogType: stdout
>> LogLength: 216
>> Log Contents:
>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
>> UseCompressedStrings; support was removed in 7.0
>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
>> UseCompressedStrings; support was removed in 7.0
>>
>>
>>
>> Container: container_1448873753366_121022_01_000001 on 10.226.148.5_8041
>> ==========================================================================
>> LogType: stderr
>> LogLength: 4284
>> Log Contents:
>> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
>> in the future
>> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
>> in the future
>> 15/12/09 16:29:17 INFO yarn.ApplicationMaster: Registered signal handlers
>> for [TERM, HUP, INT]
>> 15/12/09 16:29:18 INFO yarn.ApplicationMaster: ApplicationAttemptId:
>> appattempt_1448873753366_121022_000001
>> 15/12/09 16:29:18 INFO spark.SecurityManager: Changing view acls to: mqq
>> 15/12/09 16:29:18 INFO spark.SecurityManager: Changing modify acls to: mqq
>> 15/12/09 16:29:18 INFO spark.SecurityManager: SecurityManager:
>> authentication disabled; ui acls disabled; users with view permissions:
>> Set(mqq); users with modify permissions: Set(mqq)
>> 15/12/09 16:29:19 INFO yarn.ApplicationMaster: Starting the user
>> application in a separate Thread
>> 15/12/09 16:29:19 INFO yarn.ApplicationMaster: Waiting for spark context
>> initialization
>> 15/12/09 16:29:19 INFO yarn.ApplicationMaster: Waiting for spark context
>> initialization ...
>> 15/12/09 16:29:19 INFO client.RemoteDriver: Connecting to:
>> 10.179.12.140:38842
>> 15/12/09 16:29:20 WARN rpc.Rpc: Invalid log level null, reverting to
>> default.
>> 15/12/09 16:29:20 ERROR yarn.ApplicationMaster: User class threw
>> exception: java.util.concurrent.ExecutionException:
>> javax.security.sasl.SaslException: Client closed before SASL negotiation
>> finished.
>> java.util.concurrent.ExecutionException:
>> javax.security.sasl.SaslException: Client closed before SASL negotiation
>> finished.
>>         at
>> io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
>>         at
>> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
>>         at
>> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>         at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>         at
>> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
>> Caused by: javax.security.sasl.SaslException: Client closed before SASL
>> negotiation finished.
>>         at
>> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:449)
>>         at
>> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
>>         at
>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>         at
>> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>         at
>> io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>>         at
>> org.apache.hive.spark.client.rpc.KryoMessageCodec.channelInactive(KryoMessageCodec.java:127)
>>         at
>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>         at
>> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>         at
>> io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>>         at
>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>         at
>> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>         at
>> io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:769)
>>         at
>> io.netty.channel.AbstractChannel$AbstractUnsafe$5.run(AbstractChannel.java:567)
>>         at
>> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
>>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
>>         at
>> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>>         at java.lang.Thread.run(Thread.java:745)
>> 15/12/09 16:29:20 INFO yarn.ApplicationMaster: Final app status: FAILED,
>> exitCode: 15, (reason: User class threw exception:
>> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
>> Client closed before SASL negotiation finished.)
>> 15/12/09 16:29:29 ERROR yarn.ApplicationMaster: SparkContext did not
>> initialize after waiting for 150000 ms. Please check earlier log output for
>> errors. Failing the application.
>> 15/12/09 16:29:29 INFO util.Utils: Shutdown hook called
>>
>> LogType: stdout
>> LogLength: 216
>> Log Contents:
>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
>> UseCompressedStrings; support was removed in 7.0
>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
>> UseCompressedStrings; support was removed in 7.0
>>
>>
>>
>> Container: container_1448873753366_121022_02_000001 on 10.239.243.69_8041
>>
>> ===========================================================================
>> LogType: stderr
>> LogLength: 4284
>> Log Contents:
>> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
>> in the future
>> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
>> in the future
>> 15/12/09 16:29:31 INFO yarn.ApplicationMaster: Registered signal handlers
>> for [TERM, HUP, INT]
>> 15/12/09 16:29:32 INFO yarn.ApplicationMaster: ApplicationAttemptId:
>> appattempt_1448873753366_121022_000002
>> 15/12/09 16:29:33 INFO spark.SecurityManager: Changing view acls to: mqq
>> 15/12/09 16:29:33 INFO spark.SecurityManager: Changing modify acls to: mqq
>> 15/12/09 16:29:33 INFO spark.SecurityManager: SecurityManager:
>> authentication disabled; ui acls disabled; users with view permissions:
>> Set(mqq); users with modify permissions: Set(mqq)
>> 15/12/09 16:29:33 INFO yarn.ApplicationMaster: Starting the user
>> application in a separate Thread
>> 15/12/09 16:29:33 INFO yarn.ApplicationMaster: Waiting for spark context
>> initialization
>> 15/12/09 16:29:33 INFO yarn.ApplicationMaster: Waiting for spark context
>> initialization ...
>> 15/12/09 16:29:33 INFO client.RemoteDriver: Connecting to:
>> 10.179.12.140:38842
>> 15/12/09 16:29:33 WARN rpc.Rpc: Invalid log level null, reverting to
>> default.
>> 15/12/09 16:29:33 ERROR yarn.ApplicationMaster: User class threw
>> exception: java.util.concurrent.ExecutionException:
>> javax.security.sasl.SaslException: Client closed before SASL negotiation
>> finished.
>> java.util.concurrent.ExecutionException:
>> javax.security.sasl.SaslException: Client closed before SASL negotiation
>> finished.
>>         at
>> io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
>>         at
>> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
>>         at
>> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>         at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>         at
>> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
>> Caused by: javax.security.sasl.SaslException: Client closed before SASL
>> negotiation finished.
>>         at
>> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:449)
>>         at
>> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
>>         at
>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>         at
>> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>         at
>> io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>>         at
>> org.apache.hive.spark.client.rpc.KryoMessageCodec.channelInactive(KryoMessageCodec.java:127)
>>         at
>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>         at
>> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>         at
>> io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>>         at
>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>>         at
>> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>>         at
>> io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:769)
>>         at
>> io.netty.channel.AbstractChannel$AbstractUnsafe$5.run(AbstractChannel.java:567)
>>         at
>> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
>>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
>>         at
>> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>>         at java.lang.Thread.run(Thread.java:745)
>> 15/12/09 16:29:33 INFO yarn.ApplicationMaster: Final app status: FAILED,
>> exitCode: 15, (reason: User class threw exception:
>> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
>> Client closed before SASL negotiation finished.)
>> 15/12/09 16:29:43 ERROR yarn.ApplicationMaster: SparkContext did not
>> initialize after waiting for 150000 ms. Please check earlier log output for
>> errors. Failing the application.
>> 15/12/09 16:29:43 INFO util.Utils: Shutdown hook called
>>
>> LogType: stdout
>> LogLength: 216
>> Log Contents:
>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
>> UseCompressedStrings; support was removed in 7.0
>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
>> UseCompressedStrings; support was removed in 7.0
>>
>>
>> *I find that the submit cycle is based
>> on hive.spark.client.server.connect.timeout, which is 120000 ms in our cluster.*
>> *And spark.yarn.am.waitTime is 150s.*
>>
>> *My questions are:*
>> *1. Is submitting multiple times a bug, or by design? How should I set the
>> value of hive.spark.client.server.connect.timeout when the queue resources
>> are not enough?*
>> *2. Why does the application fail fast with "SparkContext did not
>> initialize after waiting for 150000 ms"?*
>>
>> *Thanks.*
>> *Best wishes.*
>>
>>
>
>

Re: Hive on Spark application will be submited more times when the queue resources is not enough.

Posted by Jone Zhang <jo...@gmail.com>.
Hive version is 1.2.1
Spark version is 1.4.1
Hadoop version is 2.5.1

The application application_1448873753366_121062 succeeded in the above mail.

But in some cases all of the applications will fail, caused by
"SparkContext did not initialize after waiting for 150000 ms".
See attachment (hive.spark.client.server.connect.timeout is set to 5 min).
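
The submission pattern reported above can be summarized with a rough back-of-the-envelope model (this is an editorial sketch of the observed behavior, not anything from Hive's source): if the client submits one new application each time hive.spark.client.server.connect.timeout elapses while the queue still has no headroom, the number of applications grows with the queue wait:

```python
import math

def expected_submissions(queue_wait_ms, connect_timeout_ms=120_000):
    """Hypothetical model of the behavior observed in this thread:
    one initial submission, plus one re-submission each time
    hive.spark.client.server.connect.timeout elapses while the
    queue still has no free resources."""
    return 1 + math.floor(queue_wait_ms / connect_timeout_ms)

# With the thread's 120000 ms timeout and roughly a five-minute queue
# wait, the model predicts three applications, matching the three apps
# (121022, 121055, 121062) described in the attachments.
print(expected_submissions(300_000))
```

Under this model, raising the timeout shrinks the number of duplicate pending applications, at the cost of each attempt waiting longer before giving up.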

Thanks.
Best wishes.

2015-12-09 17:56 GMT+08:00 Jone Zhang <jo...@gmail.com>:

> *Hi, Xuefu:*
>
> *See attachment 1*
> *When the queue resources are not enough,*
> *the application application_1448873753366_121022 will be pending.*
> *Two minutes later, the application application_1448873753366_121055 will
> be submitted and will also be pending.*
> *And then application_1448873753366_121062.*
>
> *See attachment 2*
> *When the queue resources are free,*
> *the application application_1448873753366_121062 begins running.*
> *application_1448873753366_121022 and application_1448873753366_121055
> will fail fast.*
>
> *Logs of application_1448873753366_121022 are as follows (same as *
> *application_1448873753366_121055**):*
> Container: container_1448873753366_121022_03_000001 on 10.226.136.122_8041
>
> ============================================================================
> LogType: stderr
> LogLength: 4664
> Log Contents:
> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
> in the future
> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
> in the future
> 15/12/09 16:29:45 INFO yarn.ApplicationMaster: Registered signal handlers
> for [TERM, HUP, INT]
> 15/12/09 16:29:46 INFO yarn.ApplicationMaster: ApplicationAttemptId:
> appattempt_1448873753366_121022_000003
> 15/12/09 16:29:47 INFO spark.SecurityManager: Changing view acls to: mqq
> 15/12/09 16:29:47 INFO spark.SecurityManager: Changing modify acls to: mqq
> 15/12/09 16:29:47 INFO spark.SecurityManager: SecurityManager:
> authentication disabled; ui acls disabled; users with view permissions:
> Set(mqq); users with modify permissions: Set(mqq)
> 15/12/09 16:29:47 INFO yarn.ApplicationMaster: Starting the user
> application in a separate Thread
> 15/12/09 16:29:47 INFO yarn.ApplicationMaster: Waiting for spark context
> initialization
> 15/12/09 16:29:47 INFO yarn.ApplicationMaster: Waiting for spark context
> initialization ...
> 15/12/09 16:29:47 INFO client.RemoteDriver: Connecting to:
> 10.179.12.140:38842
> 15/12/09 16:29:48 WARN rpc.Rpc: Invalid log level null, reverting to
> default.
> 15/12/09 16:29:48 ERROR yarn.ApplicationMaster: User class threw
> exception: java.util.concurrent.ExecutionException:
> javax.security.sasl.SaslException: Client closed before SASL negotiation
> finished.
> java.util.concurrent.ExecutionException:
> javax.security.sasl.SaslException: Client closed before SASL negotiation
> finished.
>         at
> io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
>         at
> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
>         at
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL
> negotiation finished.
>         at
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:449)
>         at
> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
>         at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>         at
> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>         at
> io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>         at
> org.apache.hive.spark.client.rpc.KryoMessageCodec.channelInactive(KryoMessageCodec.java:127)
>         at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>         at
> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>         at
> io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>         at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>         at
> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>         at
> io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:769)
>         at
> io.netty.channel.AbstractChannel$AbstractUnsafe$5.run(AbstractChannel.java:567)
>         at
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
>         at
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>         at java.lang.Thread.run(Thread.java:745)
> 15/12/09 16:29:48 INFO yarn.ApplicationMaster: Final app status: FAILED,
> exitCode: 15, (reason: User class threw exception:
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
> Client closed before SASL negotiation finished.)
> 15/12/09 16:29:57 ERROR yarn.ApplicationMaster: SparkContext did not
> initialize after waiting for 150000 ms. Please check earlier log output for
> errors. Failing the application.
> 15/12/09 16:29:57 INFO yarn.ApplicationMaster: Unregistering
> ApplicationMaster with FAILED (diag message: User class threw exception:
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
> Client closed before SASL negotiation finished.)
> 15/12/09 16:29:57 INFO yarn.ApplicationMaster: Deleting staging directory
> .sparkStaging/application_1448873753366_121022
> 15/12/09 16:29:57 INFO util.Utils: Shutdown hook called
>
> LogType: stdout
> LogLength: 216
> Log Contents:
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
> UseCompressedStrings; support was removed in 7.0
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
> UseCompressedStrings; support was removed in 7.0
>
>
>
> Container: container_1448873753366_121022_01_000001 on 10.226.148.5_8041
> ==========================================================================
> LogType: stderr
> LogLength: 4284
> Log Contents:
> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
> in the future
> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
> in the future
> 15/12/09 16:29:17 INFO yarn.ApplicationMaster: Registered signal handlers
> for [TERM, HUP, INT]
> 15/12/09 16:29:18 INFO yarn.ApplicationMaster: ApplicationAttemptId:
> appattempt_1448873753366_121022_000001
> 15/12/09 16:29:18 INFO spark.SecurityManager: Changing view acls to: mqq
> 15/12/09 16:29:18 INFO spark.SecurityManager: Changing modify acls to: mqq
> 15/12/09 16:29:18 INFO spark.SecurityManager: SecurityManager:
> authentication disabled; ui acls disabled; users with view permissions:
> Set(mqq); users with modify permissions: Set(mqq)
> 15/12/09 16:29:19 INFO yarn.ApplicationMaster: Starting the user
> application in a separate Thread
> 15/12/09 16:29:19 INFO yarn.ApplicationMaster: Waiting for spark context
> initialization
> 15/12/09 16:29:19 INFO yarn.ApplicationMaster: Waiting for spark context
> initialization ...
> 15/12/09 16:29:19 INFO client.RemoteDriver: Connecting to:
> 10.179.12.140:38842
> 15/12/09 16:29:20 WARN rpc.Rpc: Invalid log level null, reverting to
> default.
> 15/12/09 16:29:20 ERROR yarn.ApplicationMaster: User class threw
> exception: java.util.concurrent.ExecutionException:
> javax.security.sasl.SaslException: Client closed before SASL negotiation
> finished.
> java.util.concurrent.ExecutionException:
> javax.security.sasl.SaslException: Client closed before SASL negotiation
> finished.
>         at
> io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
>         at
> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
>         at
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL
> negotiation finished.
>         at
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:449)
>         at
> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
>         at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>         at
> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>         at
> io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>         at
> org.apache.hive.spark.client.rpc.KryoMessageCodec.channelInactive(KryoMessageCodec.java:127)
>         at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>         at
> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>         at
> io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>         at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>         at
> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>         at
> io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:769)
>         at
> io.netty.channel.AbstractChannel$AbstractUnsafe$5.run(AbstractChannel.java:567)
>         at
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
>         at
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>         at java.lang.Thread.run(Thread.java:745)
> 15/12/09 16:29:20 INFO yarn.ApplicationMaster: Final app status: FAILED,
> exitCode: 15, (reason: User class threw exception:
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
> Client closed before SASL negotiation finished.)
> 15/12/09 16:29:29 ERROR yarn.ApplicationMaster: SparkContext did not
> initialize after waiting for 150000 ms. Please check earlier log output for
> errors. Failing the application.
> 15/12/09 16:29:29 INFO util.Utils: Shutdown hook called
>
> LogType: stdout
> LogLength: 216
> Log Contents:
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
> UseCompressedStrings; support was removed in 7.0
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
> UseCompressedStrings; support was removed in 7.0
>
>
>
> Container: container_1448873753366_121022_02_000001 on 10.239.243.69_8041
> ===========================================================================
> LogType: stderr
> LogLength: 4284
> Log Contents:
> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
> in the future
> Please use CMSClassUnloadingEnabled in place of CMSPermGenSweepingEnabled
> in the future
> 15/12/09 16:29:31 INFO yarn.ApplicationMaster: Registered signal handlers
> for [TERM, HUP, INT]
> 15/12/09 16:29:32 INFO yarn.ApplicationMaster: ApplicationAttemptId:
> appattempt_1448873753366_121022_000002
> 15/12/09 16:29:33 INFO spark.SecurityManager: Changing view acls to: mqq
> 15/12/09 16:29:33 INFO spark.SecurityManager: Changing modify acls to: mqq
> 15/12/09 16:29:33 INFO spark.SecurityManager: SecurityManager:
> authentication disabled; ui acls disabled; users with view permissions:
> Set(mqq); users with modify permissions: Set(mqq)
> 15/12/09 16:29:33 INFO yarn.ApplicationMaster: Starting the user
> application in a separate Thread
> 15/12/09 16:29:33 INFO yarn.ApplicationMaster: Waiting for spark context
> initialization
> 15/12/09 16:29:33 INFO yarn.ApplicationMaster: Waiting for spark context
> initialization ...
> 15/12/09 16:29:33 INFO client.RemoteDriver: Connecting to:
> 10.179.12.140:38842
> 15/12/09 16:29:33 WARN rpc.Rpc: Invalid log level null, reverting to
> default.
> 15/12/09 16:29:33 ERROR yarn.ApplicationMaster: User class threw
> exception: java.util.concurrent.ExecutionException:
> javax.security.sasl.SaslException: Client closed before SASL negotiation
> finished.
> java.util.concurrent.ExecutionException:
> javax.security.sasl.SaslException: Client closed before SASL negotiation
> finished.
>         at
> io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
>         at
> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
>         at
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL
> negotiation finished.
>         at
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:449)
>         at
> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
>         at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>         at
> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>         at
> io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>         at
> org.apache.hive.spark.client.rpc.KryoMessageCodec.channelInactive(KryoMessageCodec.java:127)
>         at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>         at
> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>         at
> io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
>         at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)
>         at
> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:219)
>         at
> io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:769)
>         at
> io.netty.channel.AbstractChannel$AbstractUnsafe$5.run(AbstractChannel.java:567)
>         at
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
>         at
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>         at java.lang.Thread.run(Thread.java:745)
> 15/12/09 16:29:33 INFO yarn.ApplicationMaster: Final app status: FAILED,
> exitCode: 15, (reason: User class threw exception:
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
> Client closed before SASL negotiation finished.)
> 15/12/09 16:29:43 ERROR yarn.ApplicationMaster: SparkContext did not
> initialize after waiting for 150000 ms. Please check earlier log output for
> errors. Failing the application.
> 15/12/09 16:29:43 INFO util.Utils: Shutdown hook called
>
> LogType: stdout
> LogLength: 216
> Log Contents:
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
> UseCompressedStrings; support was removed in 7.0
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
> UseCompressedStrings; support was removed in 7.0
>
>
> *I find that the resubmit cycle is based
> on hive.spark.client.server.connect.timeout, which is 120000 ms in our team.*
> *And spark.yarn.am.waitTime is 150 s.*
>
> *My questions are:*
> *1. Is submitting the application multiple times a bug or by design? How
> should I set hive.spark.client.server.connect.timeout when the queue
> resources are not enough?*
> *2. Why does the application fail fast with "SparkContext did not
> initialize after waiting for 150000 ms"?*
>
> *Thanks.*
> *Best wishes.*
>
>