Posted to dev@slider.apache.org by Manoj Samel <ma...@gmail.com> on 2017/09/18 22:22:31 UTC

Sometimes slider commands time out in a secured cluster

CDH 5.5.1 cluster with Kerberos, slider version 0.80

Sometimes Slider commands start hanging, for example:

slider list <app> --containers

[root@s-76zyl02.sys.az1.eng.pdx.wd ~]# slider list spas --containers
2017-09-18 21:44:45,659 [main] INFO  tools.SliderUtils - JVM initialized into secure mode with kerberos realm BIGDATA
Exception: Call From <host running command>/<host_ip> to <slider_AM_HOST> failed on socket timeout exception: java.net.SocketTimeoutException: 15000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/<slider command_host>:46777 remote=<host_running_slider_am>/<IP of host running slider am>:32120]; For more details see: http://wiki.apache.org/hadoop/SocketTimeout
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:750)
    at org.apache.hadoop.ipc.Client.call(Client.java:1476)
    at org.apache.hadoop.ipc.Client.call(Client.java:1403)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
    at com.sun.proxy.$Proxy19.getLiveContainers(Unknown Source)
    at org.apache.slider.server.appmaster.rpc.SliderClusterProtocolProxy.getLiveContainers(SliderClusterProtocolProxy.java:229)
    at org.apache.slider.client.ipc.SliderClusterOperations.getContainers(SliderClusterOperations.java:458)
    at org.apache.slider.client.SliderClient.getContainers(SliderClient.java:2763)
    at org.apache.slider.client.SliderClient.actionList(SliderClient.java:2735)
    at org.apache.slider.client.SliderClient.exec(SliderClient.java:510)
    at org.apache.slider.client.SliderClient.runService(SliderClient.java:424)
    at org.apache.slider.core.main.ServiceLauncher.launchService(ServiceLauncher.java:188)
    at org.apache.slider.core.main.ServiceLauncher.launchServiceRobustly(ServiceLauncher.java:475)
    at org.apache.slider.core.main.ServiceLauncher.launchServiceAndExit(ServiceLauncher.java:403)
    at org.apache.slider.core.main.ServiceLauncher.serviceMain(ServiceLauncher.java:630)
    at org.apache.slider.Slider.main(Slider.java:49)
Caused by: java.net.SocketTimeoutException: 15000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/<Local_IP>:46777 remote=<Slider_AM_HOST>/<slider_am_host_ip>:32120]
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:515)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1075)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:970)
2017-09-18 21:45:01,499 [main] INFO  util.ExitUtil - Exiting with status 56


The Slider AM log shows no errors. The only warning I can see is from the
TGT renewer:

2017-09-18 15:40:57,009 [TGT Renewer for xyz@mydomain] WARN  security.UserGroupInformation - Exception encountered while running the renewal command. Aborting renew thread. ExitCodeException exitCode=1: kinit: Ticket expired while renewing credentials
2017-09-18 15:43:29,536 [Socket Reader #1 for port 32120] INFO  ipc.Server - Auth successful for xyz@mydomain (auth:SIMPLE)
2017-09-18 15:43:29,537 [Socket Reader #1 for port 32120] INFO  authorize.ServiceAuthorizationManager - Authorization successful for xyz@mydomain (auth:TOKEN) for protocol=interface org.apache.slider.server.appmaster.rpc.SliderClusterProtocolPB
2017-09-18 15:48:29,569 [Socket Reader #1 for port 32120] INFO  ipc.Server - Auth successful for xyz@mydomain (auth:SIMPLE)
2017-09-18 15:48:29,570 [Socket Reader #1 for port 32120] INFO  authorize.ServiceAuthorizationManager - Authorization successful for xyz@mydomain (auth:TOKEN) for protocol=interface org.apache.slider.server.appmaster.rpc.SliderClusterProtocolPB

Re: Sometimes slider commands time out in a secured cluster

Posted by Billie Rinaldi <bi...@gmail.com>.
I found a similar question answered here:
https://community.hortonworks.com/questions/53776/warn-securityusergroupinformation-exception-encoun.html

It seems like it is related to the ticket renewal / expiration, so maybe
you need to kinit again, or maybe you did kinit again but it wasn't before
the ticket expired, or maybe the renewable life of the ticket has been
exceeded. I suggest trying kdestroy before issuing another kinit to see if
that addresses the issue. You may also want to check the renewal settings
on the KDC.
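
The three cases above (ticket still renewable, ticket already expired, renewable lifetime exceeded) can be sketched as a small decision helper. This is only an illustration: the function name ticket_action and its epoch-second arguments are hypothetical, and reading the expiry and "renew until" times out of klist is left out.

```shell
#!/bin/sh
# Hypothetical helper: decide what to do with a Kerberos TGT.
# Arguments are epoch seconds: current time, ticket expiry, "renew until" time
# (the last two as reported by klist; parsing klist is not shown here).
ticket_action() {
  now=$1
  expires=$2
  renew_until=$3
  if [ "$now" -ge "$renew_until" ]; then
    # Renewable lifetime exceeded: renewal is no longer possible.
    echo "kdestroy-and-kinit"
  elif [ "$now" -ge "$expires" ]; then
    # Ticket already expired: "kinit -R" fails with
    # "Ticket expired while renewing credentials".
    echo "kdestroy-and-kinit"
  else
    # Ticket still valid and inside its renewable lifetime
    # (assuming it was issued as renewable in the first place).
    echo "kinit -R"
  fi
}

# Example: ticket expired at t=200, checked at t=250.
ticket_action 250 200 300
```

When the result is kdestroy-and-kinit, the manual equivalent is to run kdestroy followed by kinit for the affected principal, then klist to confirm the new expiry.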

On Fri, Sep 22, 2017 at 2:32 PM, Manoj Samel <ma...@gmail.com>
wrote:

>    - The issue happens intermittently. However, once the Slider AM starts
>    giving these timeout errors, it stays in that error mode. It then cannot
>    be stopped (the stop command gives the same error). The only way out is
>    to kill the Slider app using "yarn application -kill <Slider App ID>",
>    which of course kills the entire app.
>    - After the AM starts giving the timeouts, it is still possible to reach
>    the AM host:port using "nc" etc., so it does not seem to be a network
>    issue.

Re: Sometimes slider commands time out in a secured cluster

Posted by Manoj Samel <ma...@gmail.com>.
   - The issue happens intermittently. However, once the Slider AM starts
   giving these timeout errors, it stays in that error mode. It then cannot be
   stopped (the stop command gives the same error). The only way out is to
   kill the Slider app using "yarn application -kill <Slider App ID>", which
   of course kills the entire app.
   - After the AM starts giving the timeouts, it is still possible to reach
   the AM host:port using "nc" etc., so it does not seem to be a network
   issue.
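
The reachability check described above can be sketched as follows. The helper name port_open and the AM_HOST/AM_PORT placeholders are assumptions for illustration; the -z and -w options are those of common netcat variants and may differ on your system.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a TCP connection to HOST:PORT can be opened.
# This only shows that something is listening; it does not show that the
# AM's RPC server is actually answering calls.
port_open() {
  host=$1
  port=$2
  # -z: connect without sending data; -w 5: give up after 5 seconds.
  nc -z -w 5 "$host" "$port" >/dev/null 2>&1
}

# Placeholder values; substitute the AM host and the RPC port from the trace.
AM_HOST="${AM_HOST:-localhost}"
AM_PORT="${AM_PORT:-32120}"

if port_open "$AM_HOST" "$AM_PORT"; then
  echo "port reachable: the timeout is probably not a plain connectivity problem"
else
  echo "port not reachable: check network, firewall, or whether the AM is running"
fi
```

A successful connect combined with a 15-second read timeout points at the server side not answering rather than at routing.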


On Thu, Sep 21, 2017 at 12:47 PM, Gour Saha <gs...@hortonworks.com> wrote:

> That was just to see if the AM UI is accessible when the CLI fails. It
> seems like your issue is intermittent. The RPC timeout for CLI commands is
> set to 15 seconds, so there could be several reasons why the timeout
> occurs. Do you see any network/routing issues when connecting to the host
> where the AM is running?
>
> -Gour

Re: Sometimes slider commands time out in a secured cluster

Posted by Gour Saha <gs...@hortonworks.com>.
That was just to see if the AM UI is accessible when the CLI fails. It seems
like your issue is intermittent. The RPC timeout for CLI commands is set to
15 seconds, so there could be several reasons why the timeout occurs. Do you
see any network/routing issues when connecting to the host where the AM is
running?
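
One way to tell a transient timeout from a persistent one is to rerun the command a few times. The following is a minimal sketch; retry is a hypothetical helper and not part of Slider, and the attempt count and delay are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to MAX times, pausing between tries.
# Returns 0 on the first success, or the last exit status after MAX failures.
retry() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    "$@" && return 0
    status=$?
    if [ "$attempt" -ge "$max" ]; then
      return "$status"
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example (not run here): three attempts, 10 seconds apart.
# retry 3 10 slider list spas --containers
```

If every attempt fails, the helper returns the exit status of the last attempt, so the status 56 seen in the trace would be passed through to the caller.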

-Gour

On 9/21/17, 12:31 PM, "Manoj Samel" <ma...@gmail.com> wrote:

>Hi Gour,
>
>I will try to access the AM web UI the next time the issue happens. Is there
>anything specific that should be checked within the AM UI? Or is the test
>just to see whether the AM UI is accessible at all?
>
>Thanks,
>
>Manoj


Re: Sometimes slider commands time out in a secured cluster

Posted by Manoj Samel <ma...@gmail.com>.
Hi Gour,

I will try to access the AM web UI the next time the issue happens. Is there
anything specific that should be checked within the AM UI? Or is the test
just to see whether the AM UI is accessible at all?

Thanks,

Manoj

On Thu, Sep 21, 2017 at 11:26 AM, Gour Saha <gs...@hortonworks.com> wrote:

> Are you able to go to the RM UI and load the ApplicationMaster web UI for
> this app?
>
> -Gour

Re: Sometimes slider commands time out in a secured cluster

Posted by Gour Saha <gs...@hortonworks.com>.
Are you able to go to the RM UI and load the ApplicationMaster web UI for
this app?

-Gour

On 9/21/17, 11:00 AM, "Manoj Samel" <ma...@gmail.com> wrote:

>Any thoughts ?
>


Re: Sometimes slider commands time out in a secured cluster

Posted by Manoj Samel <ma...@gmail.com>.
Any thoughts?
