Posted to dev@slider.apache.org by Tim Israel <ti...@timisrael.com> on 2014/09/09 17:17:10 UTC

NullPointerException when Slider AM tries to start containers

Hi everyone,

I've been trying to deploy Storm and Accumulo on Slider on a Kerberized
cluster for the past few days.

My issue seems identical to one that Jon Maron posted in July:
http://mail-archives.apache.org/mod_mbox/incubator-slider-dev/201407.mbox/%3CB7CEFF5C-4DE4-4B26-89F1-68A9A4932A44@hortonworks.com%3E

Cluster: HDP 2.1.5 (Kerberized)
Slider version: 0.30 (from HDP Slider Tech Preview) and 0.40 (Apache)
App packages: storm_v091 (tested in 0.30 and 0.40) and accumulo_v151 (tested in 0.30 only)


I get the same error for each app-package tested.  A partial stack trace is
below (I can send a more complete one if you're interested).  It is
identical to Jon Maron's.

14/09/09 13:50:53 ERROR appmaster.SliderAppMaster: Failed to start Container container_1409855300917_0090_01_000003
org.apache.hadoop.yarn.exceptions.YarnException: java.lang.NullPointerException
	at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:41)
	at org.apache.hadoop.yarn.client.api.impl.NMClientImpl.startContainer(NMClientImpl.java:224)

...

Any help would be greatly appreciated.

Thanks,

Tim

Re: NullPointerException when Slider AM tries to start containers

Posted by Jon Maron <jm...@hortonworks.com>.
Thanks for the feedback.  I’ve also been through the process a number of times, so I’ll use this info and my own experience to draft some additional documentation.

— Jon

On Sep 10, 2014, at 4:24 PM, Tim Israel <ti...@timisrael.com> wrote:

> Jon,
> 
> Your recommendation worked. Thank you.


-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: NullPointerException when Slider AM tries to start containers

Posted by Tim Israel <ti...@timisrael.com>.
Jon,

Your recommendation worked. Thank you.

I'm currently running the following successfully:
* HDP 2.1.5 (Kerberized)
* Slider release-0.50.2-incubating-rc0
* Accumulo 1.6.0 (from app-packages)
* Storm 0.9.1 (from app-packages)

I haven't done any significant testing, but it seems to be working as
expected.

For the benefit of future readers, here are some notes on the hurdles I
encountered while debugging.

I created a separate principal to run Slider instead of using "yarn", as
most of the instructions specify, because I didn't want to modify
container-executor.cfg (which prevents certain users and UIDs from running
jobs).
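
As a rough illustration of that restriction: container-executor.cfg has
banned.users and min.user.id entries that limit who may launch containers on
a secure cluster. The Python sketch below checks a user against those two
settings only. The sample file contents, the user names and the UIDs are made
up for illustration, and the real executor applies further rules (for example
allowed.system.users) that are left out here.

```python
def parse_cfg(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def may_launch(settings, user, uid):
    """Return (allowed, reason) using only banned.users and min.user.id."""
    banned = [u.strip() for u in settings.get("banned.users", "").split(",") if u.strip()]
    if user in banned:
        return False, "user is listed in banned.users"
    # Falling back to 0 is this sketch's choice, not necessarily Hadoop's default.
    min_uid = int(settings.get("min.user.id", "0"))
    if uid < min_uid:
        return False, "uid is below min.user.id"
    return True, "ok"


# Hypothetical file contents, for illustration only.
SAMPLE_CFG = """
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=hdfs,yarn,mapred,bin
min.user.id=1000
"""

settings = parse_cfg(SAMPLE_CFG)
print(may_launch(settings, "yarn", 1005))
print(may_launch(settings, "slideruser", 1010))
```

Running the same check against the file from your own NodeManager hosts should
show whether a dedicated principal avoids the restriction without editing the
file.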

I also rearranged the layout of Slider in HDFS, which required adjusting
application.def.

In appConfig.json, I changed site.global.app_user to my executing user and
set site.global.security_enabled to true.
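
A minimal Python sketch of that appConfig.json change follows. It assumes the
two properties sit under a top-level "global" object and hold string values,
which may not match every app package. The user name "slideruser" and the
starting values are placeholders.

```python
import json


def apply_security_settings(app_config, user):
    """Return a copy of the config with the app user and security flag set."""
    updated = json.loads(json.dumps(app_config))  # simple deep copy
    global_section = updated.setdefault("global", {})
    global_section["site.global.app_user"] = user
    global_section["site.global.security_enabled"] = "true"
    return updated


# Hypothetical starting point; a real appConfig.json has many more entries.
example = {
    "global": {
        "site.global.app_user": "yarn",
        "site.global.security_enabled": "false",
    }
}

print(json.dumps(apply_security_settings(example, "slideruser"), indent=2))
```

The function returns a copy so the original dictionary can be kept for
comparison before anything is written back to disk.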

-----Without the user and security settings, Storm produces logs like this-----
14/09/09 21:54:31 INFO agent.AgentProviderService: Start of NIMBUS on container_1409855300917_0287_01_000002 delayed as dependencies have not started.
14/09/09 21:54:41 INFO agent.ComponentCommandOrder: Cannot schedule STORM_UI_SERVER START as dependency NIMBUS is INSTALLED
14/09/09 21:54:41 INFO agent.AgentProviderService: Start of STORM_UI_SERVER on container_1409855300917_0287_01_000004 delayed as dependencies have not started.
14/09/09 21:54:41 INFO agent.ComponentCommandOrder: Cannot schedule DRPC_SERVER START as dependency NIMBUS is INSTALLED
14/09/09 21:54:41 INFO agent.AgentProviderService: Start of DRPC_SERVER on container_1409855300917_0287_01_000005 delayed as dependencies have not started.
14/09/09 21:54:41 INFO agent.ComponentCommandOrder: Cannot schedule SUPERVISOR START as dependency NIMBUS is INSTALLED
14/09/09 21:54:41 INFO agent.AgentProviderService: Start of SUPERVISOR on container_1409855300917_0287_01_000006 delayed as dependencies have not started.
14/09/09 21:54:41 INFO agent.ComponentCommandOrder: Cannot schedule NIMBUS START as dependency STORM_REST_API is INSTALL_FAILED

-----Without the user and security settings, Accumulo produces logs like this-----
14/09/09 22:49:18 INFO agent.ComponentCommandOrder: Cannot schedule ACCUMULO_MONITOR START as dependency ACCUMULO_MASTER is INIT
14/09/09 22:49:18 INFO agent.AgentProviderService: Start of ACCUMULO_MONITOR on container_1409855300917_0292_01_000004 delayed as dependencies have not started.
14/09/09 22:49:18 INFO agent.ComponentCommandOrder: Cannot schedule ACCUMULO_GC START as dependency ACCUMULO_MASTER is INIT
14/09/09 22:49:18 INFO agent.AgentProviderService: Start of ACCUMULO_GC on container_1409855300917_0292_01_000005 delayed as dependencies have not started.
14/09/09 22:49:18 INFO agent.ComponentCommandOrder: Cannot schedule ACCUMULO_TRACER START as dependency ACCUMULO_MASTER is INIT
14/09/09 22:49:18 INFO agent.AgentProviderService: Start of ACCUMULO_TRACER on container_1409855300917_0292_01_000006 delayed as dependencies have not started.
14/09/09 22:49:18 INFO agent.ComponentCommandOrder: Cannot schedule ACCUMULO_TSERVER START as dependency ACCUMULO_MASTER is INIT
14/09/09 22:49:18 INFO agent.AgentProviderService: Start of ACCUMULO_TSERVER on container_1409855300917_0292_01_000003 delayed as dependencies have not started.
14/09/09 22:49:18 INFO agent.AgentProviderService: Installing ACCUMULO_MASTER on container_1409855300917_0292_01_000008.
14/09/09 22:49:18 INFO agent.AgentProviderService: Component operation. Status: IN_PROGRESS
14/09/09 22:49:19 INFO agent.AgentProviderService: Component operation. Status: COMPLETED
14/09/09 22:49:19 INFO agent.AgentProviderService: publishing PublishedConfiguration{description='LogFolders' entries = 14}
14/09/09 22:49:19 INFO agent.AgentProviderService: Starting ACCUMULO_MASTER on container_1409855300917_0292_01_000008.
14/09/09 22:49:21 INFO agent.AgentProviderService: Component operation. Status: IN_PROGRESS
14/09/09 22:49:21 INFO agent.AgentProviderService: Component operation. Status: FAILED
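
Logs like these can be easier to read if you follow the "Cannot schedule X
START as dependency Y is STATE" messages back to the first component that is
not itself waiting on something. The Python sketch below does that. The
regular expression is inferred only from the messages quoted in this thread,
and the helper names are made up, so treat it as a starting point rather than
a description of Slider's log format.

```python
import re

BLOCKED = re.compile(
    r"Cannot schedule (\w+) START as dependency (\w+) is (\w+)"
)


def blocked_components(log_text):
    """Return {component: (dependency, dependency_state)} from agent logs."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return {m.group(1): (m.group(2), m.group(3)) for m in BLOCKED.finditer(flat)}


def root_cause(blocked, component):
    """Follow the dependency chain to a component that is not itself blocked."""
    seen = set()
    state = None
    while component in blocked and component not in seen:
        seen.add(component)
        component, state = blocked[component]
    return component, state


# Two entries copied from the Storm log in this message.
SAMPLE = """
14/09/09 21:54:41 INFO agent.ComponentCommandOrder: Cannot schedule
STORM_UI_SERVER START as dependency NIMBUS is INSTALLED
14/09/09 21:54:41 INFO agent.ComponentCommandOrder: Cannot schedule
NIMBUS START as dependency STORM_REST_API is INSTALL_FAILED
"""

blocked = blocked_components(SAMPLE)
print(root_cause(blocked, "STORM_UI_SERVER"))
```

For the Storm sample the chain ends at STORM_REST_API in state INSTALL_FAILED,
which is the component whose own logs are worth checking first.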

Thanks,

Tim

On Tue, Sep 9, 2014 at 11:54 AM, Tim Israel <ti...@timisrael.com> wrote:

> I will give that a shot.  Thanks Jon.
>
> Tim

Re: NullPointerException when Slider AM tries to start containers

Posted by Tim Israel <ti...@timisrael.com>.
I will give that a shot.  Thanks Jon.

Tim

On Tue, Sep 9, 2014 at 11:38 AM, Jon Maron <jm...@hortonworks.com> wrote:

> I would try to use a newer version of Slider.  I believe the issue you’re
> encountering is SLIDER-266.
>
> — Jon

Re: NullPointerException when Slider AM tries to start containers

Posted by Jon Maron <jm...@hortonworks.com>.
I would try to use a newer version of Slider.  I believe the issue you’re encountering is SLIDER-266.

— Jon

On Sep 9, 2014, at 11:17 AM, Tim Israel <ti...@timisrael.com> wrote:

> Hi everyone,
> 
> I've been trying to deploy storm and accumulo on slider on a kerberized
> cluster for the past few days.