Posted to user@spark.apache.org by Soni spark <so...@gmail.com> on 2016/01/21 12:41:55 UTC
spark job submission on yarn-cluster mode failing
Hi Friends,
My Spark job runs successfully in local mode but fails in cluster
mode. Below is the error message I am getting. Can anyone help me?
16/01/21 16:38:07 INFO twitter4j.TwitterStreamImpl: Establishing connection.
16/01/21 16:38:07 INFO twitter.TwitterReceiver: Twitter receiver started
16/01/21 16:38:07 INFO receiver.ReceiverSupervisorImpl: Called receiver onStart
16/01/21 16:38:07 INFO receiver.ReceiverSupervisorImpl: Waiting for receiver to be stopped
16/01/21 16:38:10 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM
16/01/21 16:38:10 INFO streaming.StreamingContext: Invoking stop(stopGracefully=false) from shutdown hook
16/01/21 16:38:10 INFO scheduler.ReceiverTracker: Sent stop signal to all 1 receivers
16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Received stop signal
16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Stopping receiver with message: Stopped by driver:
16/01/21 16:38:10 INFO twitter.TwitterReceiver: Twitter receiver stopped
16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Called receiver onStop
16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Deregistering receiver 0
16/01/21 16:38:10 ERROR scheduler.ReceiverTracker: Deregistered receiver for stream 0: Stopped by driver
16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Stopped receiver 0
16/01/21 16:38:10 INFO receiver.BlockGenerator: Stopping BlockGenerator
16/01/21 16:38:10 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
Thanks
Soniya
Re: spark job submission on yarn-cluster mode failing
Posted by Ted Yu <yu...@gmail.com>.
The exception below is logged at WARN level.
Can you check the health of HDFS?
Which Hadoop version are you using?
If your job failed, there should be another, fatal error in the logs.
Cheers
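When scanning long logs for the entry that actually ended the job, it can help to separate WARN entries from ERROR or FATAL ones. A minimal Python sketch, assuming the two log layouts that appear in this thread (`16/01/21 16:38:10 ERROR ...` and `2016-01-21 16:06:14,123 WARN ...`); the helper name `lines_at_level` is hypothetical and not part of Spark or Hadoop:

```python
import re

# Matches the two timestamp layouts seen in this thread, followed by the level:
#   16/01/21 16:38:10 ERROR yarn.ApplicationMaster: ...
#   2016-01-21 16:06:14,123 WARN org.apache.hadoop.hdfs.DFSClient: ...
LEVEL_RE = re.compile(
    r"^(?:\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}"
    r"|\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
    r" (TRACE|DEBUG|INFO|WARN|ERROR|FATAL) "
)


def lines_at_level(lines, levels=("ERROR", "FATAL")):
    """Return the log lines whose level is in `levels` (hypothetical helper)."""
    selected = []
    for line in lines:
        match = LEVEL_RE.match(line)
        if match and match.group(1) in levels:
            selected.append(line.rstrip("\n"))
    return selected


# Sample lines shortened from the logs posted in this thread.
sample = [
    "16/01/21 16:38:07 INFO twitter.TwitterReceiver: Twitter receiver started",
    "16/01/21 16:38:10 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM",
    "2016-01-21 16:06:14,123 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect",
]

for line in lines_at_level(sample):
    print(line)
```

Lines without a leading timestamp, such as stack-trace frames, are skipped, so the output lists only the log entries themselves.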
On Thu, Jan 21, 2016 at 4:50 AM, Soni spark <so...@gmail.com> wrote:
> Hi,
>
> I am facing below error msg now. please help me.
>
> 2016-01-21 16:06:14,123 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /xxx.xx.xx.xx:50010 for block, add to deadNodes and continue.
> java.nio.channels.ClosedByInterruptException
> java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>     at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:658)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
>     at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3101)
>     at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:755)
>     at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:670)
>     at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:337)
>     at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:576)
>     at org.apache.hadoop.hdfs.DFSInputStream.seekToBlockSource(DFSInputStream.java:1460)
>     at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:773)
>     at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:806)
>     at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:847)
>     at java.io.DataInputStream.read(DataInputStream.java:100)
>     at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:84)
>     at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
>     at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
>     at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366)
>     at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:265)
>     at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
>     at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
>     at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>     at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
>     at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
>
>
> Thanks
> Soniya
Re: spark job submission on yarn-cluster mode failing
Posted by Soni spark <so...@gmail.com>.
Hi,
I am now getting the error message below. Please help me.
2016-01-21 16:06:14,123 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /xxx.xx.xx.xx:50010 for block, add to deadNodes and continue.
java.nio.channels.ClosedByInterruptException
java.nio.channels.ClosedByInterruptException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:658)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
    at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3101)
    at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:755)
    at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:670)
    at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:337)
    at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:576)
    at org.apache.hadoop.hdfs.DFSInputStream.seekToBlockSource(DFSInputStream.java:1460)
    at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:773)
    at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:806)
    at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:847)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:84)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:265)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Thanks
Soniya
On Thu, Jan 21, 2016 at 5:42 PM, Ted Yu <yu...@gmail.com> wrote:
> Please also check AppMaster log.
>
> Thanks
Re: spark job submission on yarn-cluster mode failing
Posted by Ted Yu <yu...@gmail.com>.
Please also check the ApplicationMaster log.
Thanks
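One way to collect the ApplicationMaster and executor logs after the application has finished is the `yarn logs -applicationId` command, which relies on log aggregation being enabled on the cluster. A minimal Python sketch that builds and optionally runs that command; the helper names and the application ID are placeholders for illustration, not values from this thread:

```python
import subprocess


def yarn_logs_command(application_id):
    """Build the argument list for `yarn logs -applicationId <id>`."""
    return ["yarn", "logs", "-applicationId", application_id]


def fetch_yarn_logs(application_id):
    """Run the command and return its output as text.

    Assumes the `yarn` CLI is on PATH and log aggregation is enabled.
    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    result = subprocess.run(
        yarn_logs_command(application_id),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Placeholder ID; substitute the ID reported when the job was submitted.
print(" ".join(yarn_logs_command("application_0000000000000_0001")))
```

The returned text contains the logs of every container, so the ApplicationMaster container's section is the place to look for the error that preceded the SIGTERM.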
> On Jan 21, 2016, at 3:51 AM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>
> Can you look in the executor logs and see why the sparkcontext is being shutdown? Similar discussion happened here previously. http://apache-spark-user-list.1001560.n3.nabble.com/RECEIVED-SIGNAL-15-SIGTERM-td23668.html
>
> Thanks
> Best Regards
Re: spark job submission on yarn-cluster mode failing
Posted by Akhil Das <ak...@sigmoidanalytics.com>.
Can you look in the executor logs and see why the SparkContext is being
shut down? A similar discussion took place here previously:
http://apache-spark-user-list.1001560.n3.nabble.com/RECEIVED-SIGNAL-15-SIGTERM-td23668.html
Thanks
Best Regards
On Thu, Jan 21, 2016 at 5:11 PM, Soni spark <so...@gmail.com> wrote:
> Hi Friends,
>
> I spark job is successfully running on local mode but failing on cluster mode. Below is the error message i am getting. anyone can help me.
>
>
>
> 16/01/21 16:38:07 INFO twitter4j.TwitterStreamImpl: Establishing connection.
> 16/01/21 16:38:07 INFO twitter.TwitterReceiver: Twitter receiver started
> 16/01/21 16:38:07 INFO receiver.ReceiverSupervisorImpl: Called receiver onStart
> 16/01/21 16:38:07 INFO receiver.ReceiverSupervisorImpl: Waiting for receiver to be stopped
> 16/01/21 16:38:10 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM
> 16/01/21 16:38:10 INFO streaming.StreamingContext: Invoking stop(stopGracefully=false) from shutdown hook
> 16/01/21 16:38:10 INFO scheduler.ReceiverTracker: Sent stop signal to all 1 receivers
> 16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Received stop signal
> 16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Stopping receiver with message: Stopped by driver:
> 16/01/21 16:38:10 INFO twitter.TwitterReceiver: Twitter receiver stopped
> 16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Called receiver onStop
> 16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Deregistering receiver 0
> 16/01/21 16:38:10 ERROR scheduler.ReceiverTracker: Deregistered receiver for stream 0: Stopped by driver
> 16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Stopped receiver 0
> 16/01/21 16:38:10 INFO receiver.BlockGenerator: Stopping BlockGenerator
> 16/01/21 16:38:10 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
>
> Thanks
>
> Soniya
>
>