Posted to user@hadoop.apache.org by blah blah <tm...@gmail.com> on 2013/06/26 02:09:17 UTC

Yarn HDFS and Yarn Exceptions when processing "larger" datasets.

Hi All

First, let me apologize for the poor thread title, but I had no idea how to
express the problem in one sentence.

I have implemented a new Application Master on top of YARN. I am using an
old YARN development version: revision 1437315 from 2013-01-23
(3.0.0-SNAPSHOT). I cannot update to the current trunk version, as my
prototype deadline is soon and I don't have time to incorporate the YARN API
changes.

Currently I run my experiments in pseudo-distributed mode, using Guava
version 14.0-rc1. I have a problem with YARN and HDFS exceptions for
"larger" datasets. My AM works fine and I can execute it without a problem
for a debug dataset (1 MB in size), but when I increase the input size to
6.8 MB, I get the following exceptions:

AM_Exceptions_Stack

Exception in thread "Thread-3"
java.lang.reflect.UndeclaredThrowableException
    at
org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135)
    at
org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:77)
    at
org.apache.hadoop.yarn.client.AMRMClientImpl.allocate(AMRMClientImpl.java:194)
    at
org.tudelft.ludograph.app.AppMasterContainerRequester.sendContainerAskToRM(AppMasterContainerRequester.java:219)
    at
org.tudelft.ludograph.app.AppMasterContainerRequester.run(AppMasterContainerRequester.java:315)
    at java.lang.Thread.run(Thread.java:662)
Caused by: com.google.protobuf.ServiceException: java.io.IOException:
Failed on local exception: java.io.IOException: Response is null.; Host
Details : local host is: "linux-ljc5.site/127.0.0.1"; destination host is:
"0.0.0.0":8030;
    at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
    at $Proxy10.allocate(Unknown Source)
    at
org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:75)
    ... 4 more
Caused by: java.io.IOException: Failed on local exception:
java.io.IOException: Response is null.; Host Details : local host is:
"linux-ljc5.site/127.0.0.1"; destination host is: "0.0.0.0":8030;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:760)
    at org.apache.hadoop.ipc.Client.call(Client.java:1240)
    at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
    ... 6 more
Caused by: java.io.IOException: Response is null.
    at
org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:950)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:844)

Container_Exception

Exception in thread "org.apache.hadoop.hdfs.SocketCache@6da0d866"
java.lang.NoSuchMethodError:
com.google.common.collect.LinkedListMultimap.values()Ljava/util/List;
    at org.apache.hadoop.hdfs.SocketCache.clear(SocketCache.java:257)
    at org.apache.hadoop.hdfs.SocketCache.access$100(SocketCache.java:45)
    at org.apache.hadoop.hdfs.SocketCache$1.run(SocketCache.java:126)
    at java.lang.Thread.run(Thread.java:662)
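
I suspect the NoSuchMethodError above points at a Guava version mismatch: by
definition it means HDFS's SocketCache bytecode references
LinkedListMultimap.values() with return type java.util.List, while the Guava
actually loaded at runtime (14.0-rc1 in my setup) declares a different
return type for that method. Here is a minimal, hypothetical diagnostic (the
class and its name are mine, not part of Hadoop) that prints which Guava jar
a JVM loads and what return type that jar declares for values():

import java.lang.reflect.Method;
import com.google.common.collect.LinkedListMultimap;

public class GuavaCheck {
    public static void main(String[] args) throws Exception {
        // Which jar the Guava classes were actually loaded from.
        System.out.println(LinkedListMultimap.class
                .getProtectionDomain().getCodeSource().getLocation());
        // Declared return type of values() in the loaded Guava version.
        Method values = LinkedListMultimap.class.getMethod("values");
        System.out.println(values.getReturnType());
    }
}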


As I said, this problem does not occur for the 1 MB input. For the 6.8 MB
input nothing is changed except the input dataset. Now a little bit about
what I am doing, to give you the context of the problem. My AM starts N
containers (4 in debug runs) and each container reads its part of the input
data. When this process is finished, the containers exchange parts of the
input between themselves (exchanging the IDs of input structures, to provide
a means of communication between data structures). These exceptions occur
during this ID exchange. I start a Netty server/client on each container and
use ports 12000-12099 as the means of communicating these IDs.
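
For reference, a minimal sketch of the kind of Netty server each container
starts (this is not my actual code; I am assuming Netty 3.x here, and the
real handlers that encode/decode the exchanged IDs are omitted):

import java.net.InetSocketAddress;
import java.util.concurrent.Executors;

import org.jboss.netty.bootstrap.ServerBootstrap;
import org.jboss.netty.channel.ChannelPipeline;
import org.jboss.netty.channel.ChannelPipelineFactory;
import org.jboss.netty.channel.Channels;
import org.jboss.netty.channel.socket.nio.NioServerSocketChannelFactory;

public class IdExchangeServer {
    public static void main(String[] args) {
        // Boss/worker thread pools for accepting and serving connections.
        ServerBootstrap bootstrap = new ServerBootstrap(
                new NioServerSocketChannelFactory(
                        Executors.newCachedThreadPool(),
                        Executors.newCachedThreadPool()));
        bootstrap.setPipelineFactory(new ChannelPipelineFactory() {
            public ChannelPipeline getPipeline() {
                // Handlers that serialize the exchanged IDs would go here.
                return Channels.pipeline();
            }
        });
        // Each container binds one port from the 12000-12099 range.
        bootstrap.bind(new InetSocketAddress(12000));
    }
}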

Any help will be greatly appreciated. Sorry for any typos; if the
explanation is not clear, just ask for any details you are interested in.
It is currently after 2 AM, which I hope is a valid excuse.

regards
tmp

Re: Yarn HDFS and Yarn Exceptions when processing "larger" datasets.

Posted by Omkar Joshi <oj...@hortonworks.com>.
Also, do you see any exceptions in the RM / NM logs?

Thanks,
Omkar Joshi
*Hortonworks Inc.* <http://www.hortonworks.com>



Re: Yarn HDFS and Yarn Exceptions when processing "larger" datasets.

Posted by Omkar Joshi <oj...@hortonworks.com>.
Hi,

As I don't know your complete AM code or how your containers communicate
with each other, here are a few things which might help you in debugging.
Where are you starting your RM? Is it really listening on port 8030, and are
you sure there is no previously started RM still running there? Also, in
yarn-site.xml, can you try changing the RM scheduler address to something
like "localhost:<free-port-but-not-default>" and configuring the maximum
client thread count for handling AM requests? Only your AM is expected to
communicate with the RM over the AM-RM protocol. By any chance, are your
containers communicating directly with the RM over the AM-RM protocol?

  <property>
    <description>The address of the scheduler interface.</description>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>${yarn.resourcemanager.hostname}:8030</value>
  </property>

  <property>
    <description>Number of threads to handle scheduler interface.</description>
    <name>yarn.resourcemanager.scheduler.client.thread-count</name>
    <value>50</value>
  </property>
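
As a quick check (this snippet is mine, just an illustration, not YARN
code), you can print the scheduler address your AM-side configuration
actually resolves; the destination "0.0.0.0":8030 in your trace looks like a
built-in default for yarn.resourcemanager.scheduler.address rather than a
real RM host:

import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class SchedulerAddressCheck {
    public static void main(String[] args) {
        // Picks up yarn-default.xml and yarn-site.xml from the classpath,
        // the same way the AM's configuration does.
        YarnConfiguration conf = new YarnConfiguration();
        System.out.println(conf.get("yarn.resourcemanager.scheduler.address",
                "0.0.0.0:8030"));
    }
}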


Thanks,
Omkar Joshi
*Hortonworks Inc.* <http://www.hortonworks.com>


On Fri, Jun 28, 2013 at 5:35 AM, blah blah <tm...@gmail.com> wrote:

> Hi
>
> Sorry to reply so late. I don't have the data you requested (sorry, I have
> no time; my deadline is within 3 days). However, I have observed that this
> issue occurs not only for the "larger" datasets (6.8 MB) but for all
> datasets and all jobs in general. However, for smaller datasets (1 MB) the
> AM does not throw the exception; only the containers throw exceptions (the
> same as in the previous e-mail). When these exceptions are thrown, my code
> (AM and containers) does not perform any operations on HDFS; it only
> performs in-memory computation and communication. Also, I have observed
> that these exceptions occur at "random"; I couldn't find any pattern. I
> can execute a job successfully, then resubmit the job to repeat the
> experiment, and these exceptions occur (with no change made to the source
> code, the input dataset, or the execution/input parameters).
>
> As for the high network usage, as I said, I don't have the data. But YARN
> is running on nodes which are exclusive to my experiments; no other
> software runs on these nodes (only the OS and YARN). Besides, I don't
> think that 20 containers working on a 1 MB dataset (in total) can be
> called high network usage.
>
> regards
> tmp
>
>
>
> 2013/6/26 Devaraj k <de...@huawei.com>
>
>> Hi,
>>
>> Could you check the network usage in the cluster when this problem
>> occurs? It is probably caused by high network usage.
>>
>> Thanks
>> Devaraj k
>>
>> ** **
>>
>> *From:* blah blah [mailto:tmp5330@gmail.com]
>> *Sent:* 26 June 2013 05:39
>> *To:* user@hadoop.apache.org
>> *Subject:* Yarn HDFS and Yarn Exceptions when processing "larger"
>> datasets.****
>>
>> ** **
>>
>> Hi All****
>>
>> First let me excuse for the poor thread title but I have no idea how to
>> express the problem in one sentence. ****
>>
>> I have implemented new Application Master with the use of Yarn. I am
>> using old Yarn development version. Revision 1437315, from 2013-01-23
>> (SNAPSHOT 3.0.0). I can not update to current trunk version, as prototype
>> deadline is soon, and I don't have time to include Yarn API changes.****
>>
>> Currently I execute experiments in pseudo-distributed mode, I use guava
>> version 14.0-rc1. I have a problem with Yarn's and HDFS Exceptions for
>> "larger" datasets. My AM works fine and I can execute it without a problem
>> for a debug dataset (1MB size). But when I increase the size of input to
>> 6.8 MB, I am getting the following exceptions:****
>>
>> AM_Exceptions_Stack
>>
>> Exception in thread "Thread-3"
>> java.lang.reflect.UndeclaredThrowableException
>>     at
>> org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135)
>>     at
>> org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:77)
>>     at
>> org.apache.hadoop.yarn.client.AMRMClientImpl.allocate(AMRMClientImpl.java:194)
>>     at
>> org.tudelft.ludograph.app.AppMasterContainerRequester.sendContainerAskToRM(AppMasterContainerRequester.java:219)
>>     at
>> org.tudelft.ludograph.app.AppMasterContainerRequester.run(AppMasterContainerRequester.java:315)
>>     at java.lang.Thread.run(Thread.java:662)
>> Caused by: com.google.protobuf.ServiceException: java.io.IOException:
>> Failed on local exception: java.io.IOException: Response is null.; Host
>> Details : local host is: "linux-ljc5.site/127.0.0.1"; destination host
>> is: "0.0.0.0":8030;
>>     at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
>>     at $Proxy10.allocate(Unknown Source)
>>     at
>> org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:75)
>>     ... 4 more
>> Caused by: java.io.IOException: Failed on local exception:
>> java.io.IOException: Response is null.; Host Details : local host is:
>> "linux-ljc5.site/127.0.0.1"; destination host is: "0.0.0.0":8030;
>>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:760)
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1240)
>>     at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>>     ... 6 more
>> Caused by: java.io.IOException: Response is null.
>>     at
>> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:950)
>>     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:844)****
>>
>> Container_Exception
>>
>> Exception in thread "org.apache.hadoop.hdfs.SocketCache@6da0d866"
>> java.lang.NoSuchMethodError:
>> com.google.common.collect.LinkedListMultimap.values()Ljava/util/List;
>>     at org.apache.hadoop.hdfs.SocketCache.clear(SocketCache.java:257)
>>     at org.apache.hadoop.hdfs.SocketCache.access$100(SocketCache.java:45)
>>     at org.apache.hadoop.hdfs.SocketCache$1.run(SocketCache.java:126)
>>     at java.lang.Thread.run(Thread.java:662)
>>
>> ****
>>
>> As I said this problem does not occur for the 1MB input. For the 6MB
>> input nothing is changed except the input dataset. Now a little bit of what
>> am I doing, to give you the context of the problem. My AM starts N (debug
>> 4) containers and each container reads its input data part. When this
>> process is finished I am exchanging parts of input between containers
>> (exchanging IDs of input structures, to provide means for communication
>> between data structures). During the process of exchanging IDs these
>> exceptions occur. I start Netty Server/Client on each container and I use
>> ports 12000-12099 as mean of communicating these IDs. ****
>>
>> Any help will be greatly appreciated. Sorry for any typos and if the
>> explanation is not clear just ask for any details you are interested in.
>> Currently it is after 2 AM I hope this will be a valid excuse.****
>>
>> regards****
>>
>> tmp****
>>
>
>

Re: Yarn HDFS and Yarn Exceptions when processing "larger" datasets.

Posted by Omkar Joshi <oj...@hortonworks.com>.
Hi,

As I don't know your complete AM code and how your containers are
communicating with each other...Certain things which might help you in
debugging.... where you are starting your RM (is it really running on
8030???? are you sure there is no previously started RM still running
there?) Also in yarn-site.xml can you try changing RM address to something
like "localhost:<free-port-but-not-default>" and configure maximum client
thread size for handling AM requests? only your AM is expected to
communicate with RM on AM-RM protocol.. by any chance in your code; are
containers directly communicating with RM on AM-RM protocol??

  <property>

    <description>The address of the scheduler interface.</description>

    <name>yarn.resourcemanager.scheduler.address</name>

    <value>${yarn.resourcemanager.hostname}:8030</value>

  </property>


  <property>

    <description>Number of threads to handle scheduler interface.</
description>

    <name>yarn.resourcemanager.scheduler.client.thread-count</name>

    <value>50</value>

  </property>


Thanks,
Omkar Joshi
*Hortonworks Inc.* <http://www.hortonworks.com>


On Fri, Jun 28, 2013 at 5:35 AM, blah blah <tm...@gmail.com> wrote:

> Hi
>
> Sorry to reply so late. I don't have the data you requested (sorry I have
> no time, my deadline is within 3 days). However I have observed that this
> issue occurs not only for the "larger" datasets (6.8MB), but for all
> datasets and all jobs in general. However for smaller datasets (1MB) the AM
> does not throw the Exception, only containers throw exceptions (same as in
> previous e-mail). When these exception are throws my code (AM and
> containers) does not perform any operations on HDFS, they only perform
> in-memory computation and communication. Also I have observed that these
> exception occur at "random", I couldn't observe any pattern. I can execute
> job successfully, then resubmit the job repeating the experiment and these
> exceptions occur (no change was made to src code, input dataset,or
> execution/input parameters).
>
> As for the high network usage, as I said I don't have the data. But YARN
> is running on nodes which are exclusive for my experiments no other
> software runs on these nodes (only OS and YARN). Besides I don't think that
> 20 containers working on 1MB dataset (total) can be called high network
> usage.
>
> regards
> tmp
>
>
>
> 2013/6/26 Devaraj k <de...@huawei.com>
>
>>  Hi,****
>>
>> ** **
>>
>>    Could you check the network usage in the cluster when this problem
>> occurs? Probably it is causing due to high network usage. ****
>>
>> ** **
>>
>> Thanks****
>>
>> Devaraj k****
>>
>> ** **
>>
>> *From:* blah blah [mailto:tmp5330@gmail.com]
>> *Sent:* 26 June 2013 05:39
>> *To:* user@hadoop.apache.org
>> *Subject:* Yarn HDFS and Yarn Exceptions when processing "larger"
>> datasets.****
>>
>> ** **
>>
>> Hi All****
>>
>> First let me excuse for the poor thread title but I have no idea how to
>> express the problem in one sentence. ****
>>
>> I have implemented new Application Master with the use of Yarn. I am
>> using old Yarn development version. Revision 1437315, from 2013-01-23
>> (SNAPSHOT 3.0.0). I can not update to current trunk version, as prototype
>> deadline is soon, and I don't have time to include Yarn API changes.****
>>
>> Currently I execute experiments in pseudo-distributed mode, I use guava
>> version 14.0-rc1. I have a problem with Yarn's and HDFS Exceptions for
>> "larger" datasets. My AM works fine and I can execute it without a problem
>> for a debug dataset (1MB size). But when I increase the size of input to
>> 6.8 MB, I am getting the following exceptions:****
>>
>> AM_Exceptions_Stack
>>
>> Exception in thread "Thread-3"
>> java.lang.reflect.UndeclaredThrowableException
>>     at
>> org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135)
>>     at
>> org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:77)
>>     at
>> org.apache.hadoop.yarn.client.AMRMClientImpl.allocate(AMRMClientImpl.java:194)
>>     at
>> org.tudelft.ludograph.app.AppMasterContainerRequester.sendContainerAskToRM(AppMasterContainerRequester.java:219)
>>     at
>> org.tudelft.ludograph.app.AppMasterContainerRequester.run(AppMasterContainerRequester.java:315)
>>     at java.lang.Thread.run(Thread.java:662)
>> Caused by: com.google.protobuf.ServiceException: java.io.IOException:
>> Failed on local exception: java.io.IOException: Response is null.; Host
>> Details : local host is: "linux-ljc5.site/127.0.0.1"; destination host
>> is: "0.0.0.0":8030;
>>     at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
>>     at $Proxy10.allocate(Unknown Source)
>>     at
>> org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:75)
>>     ... 4 more
>> Caused by: java.io.IOException: Failed on local exception:
>> java.io.IOException: Response is null.; Host Details : local host is:
>> "linux-ljc5.site/127.0.0.1"; destination host is: "0.0.0.0":8030;
>>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:760)
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1240)
>>     at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>>     ... 6 more
>> Caused by: java.io.IOException: Response is null.
>>     at
>> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:950)
>>     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:844)****
>>
>> Container_Exception
>>
>> Exception in thread "org.apache.hadoop.hdfs.SocketCache@6da0d866"
>> java.lang.NoSuchMethodError:
>> com.google.common.collect.LinkedListMultimap.values()Ljava/util/List;
>>     at org.apache.hadoop.hdfs.SocketCache.clear(SocketCache.java:257)
>>     at org.apache.hadoop.hdfs.SocketCache.access$100(SocketCache.java:45)
>>     at org.apache.hadoop.hdfs.SocketCache$1.run(SocketCache.java:126)
>>     at java.lang.Thread.run(Thread.java:662)
>>
>> ****
>>
>> As I said this problem does not occur for the 1MB input. For the 6MB
>> input nothing is changed except the input dataset. Now a little bit of what
>> am I doing, to give you the context of the problem. My AM starts N (debug
>> 4) containers and each container reads its input data part. When this
>> process is finished I am exchanging parts of input between containers
>> (exchanging IDs of input structures, to provide means for communication
>> between data structures). During the process of exchanging IDs these
>> exceptions occur. I start Netty Server/Client on each container and I use
>> ports 12000-12099 as mean of communicating these IDs. ****
>>
>> Any help will be greatly appreciated. Sorry for any typos and if the
>> explanation is not clear just ask for any details you are interested in.
>> Currently it is after 2 AM I hope this will be a valid excuse.****
>>
>> regards****
>>
>> tmp****
>>
>
>

Re: Yarn HDFS and Yarn Exceptions when processing "larger" datasets.

Posted by Omkar Joshi <oj...@hortonworks.com>.
Hi,

As I don't know your complete AM code and how your containers are
communicating with each other...Certain things which might help you in
debugging.... where you are starting your RM (is it really running on
8030???? are you sure there is no previously started RM still running
there?) Also in yarn-site.xml can you try changing RM address to something
like "localhost:<free-port-but-not-default>" and configure maximum client
thread size for handling AM requests? only your AM is expected to
communicate with RM on AM-RM protocol.. by any chance in your code; are
containers directly communicating with RM on AM-RM protocol??

  <property>

    <description>The address of the scheduler interface.</description>

    <name>yarn.resourcemanager.scheduler.address</name>

    <value>${yarn.resourcemanager.hostname}:8030</value>

  </property>


  <property>

    <description>Number of threads to handle scheduler interface.</
description>

    <name>yarn.resourcemanager.scheduler.client.thread-count</name>

    <value>50</value>

  </property>


Thanks,
Omkar Joshi
*Hortonworks Inc.* <http://www.hortonworks.com>


On Fri, Jun 28, 2013 at 5:35 AM, blah blah <tm...@gmail.com> wrote:

> Hi
>
> Sorry to reply so late. I don't have the data you requested (sorry I have
> no time, my deadline is within 3 days). However I have observed that this
> issue occurs not only for the "larger" datasets (6.8MB), but for all
> datasets and all jobs in general. However for smaller datasets (1MB) the AM
> does not throw the Exception, only containers throw exceptions (same as in
> previous e-mail). When these exception are throws my code (AM and
> containers) does not perform any operations on HDFS, they only perform
> in-memory computation and communication. Also I have observed that these
> exception occur at "random", I couldn't observe any pattern. I can execute
> job successfully, then resubmit the job repeating the experiment and these
> exceptions occur (no change was made to src code, input dataset,or
> execution/input parameters).
>
> As for the high network usage, as I said I don't have the data. But YARN
> is running on nodes which are exclusive for my experiments no other
> software runs on these nodes (only OS and YARN). Besides I don't think that
> 20 containers working on 1MB dataset (total) can be called high network
> usage.
>
> regards
> tmp
>
>
>
> 2013/6/26 Devaraj k <de...@huawei.com>
>
>>  Hi,****
>>
>> ** **
>>
>>    Could you check the network usage in the cluster when this problem
>> occurs? Probably it is causing due to high network usage. ****
>>
>> ** **
>>
>> Thanks****
>>
>> Devaraj k****
>>
>> ** **
>>
>> *From:* blah blah [mailto:tmp5330@gmail.com]
>> *Sent:* 26 June 2013 05:39
>> *To:* user@hadoop.apache.org
>> *Subject:* Yarn HDFS and Yarn Exceptions when processing "larger"
>> datasets.****
>>
>> ** **
>>
>> Hi All****
>>
>> First let me excuse for the poor thread title but I have no idea how to
>> express the problem in one sentence. ****
>>
>> I have implemented new Application Master with the use of Yarn. I am
>> using old Yarn development version. Revision 1437315, from 2013-01-23
>> (SNAPSHOT 3.0.0). I can not update to current trunk version, as prototype
>> deadline is soon, and I don't have time to include Yarn API changes.****
>>
>> Currently I execute experiments in pseudo-distributed mode, I use guava
>> version 14.0-rc1. I have a problem with Yarn's and HDFS Exceptions for
>> "larger" datasets. My AM works fine and I can execute it without a problem
>> for a debug dataset (1MB size). But when I increase the size of input to
>> 6.8 MB, I am getting the following exceptions:****
>>
>> AM_Exceptions_Stack
>>
>> Exception in thread "Thread-3"
>> java.lang.reflect.UndeclaredThrowableException
>>     at
>> org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135)
>>     at
>> org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:77)
>>     at
>> org.apache.hadoop.yarn.client.AMRMClientImpl.allocate(AMRMClientImpl.java:194)
>>     at
>> org.tudelft.ludograph.app.AppMasterContainerRequester.sendContainerAskToRM(AppMasterContainerRequester.java:219)
>>     at
>> org.tudelft.ludograph.app.AppMasterContainerRequester.run(AppMasterContainerRequester.java:315)
>>     at java.lang.Thread.run(Thread.java:662)
>> Caused by: com.google.protobuf.ServiceException: java.io.IOException:
>> Failed on local exception: java.io.IOException: Response is null.; Host
>> Details : local host is: "linux-ljc5.site/127.0.0.1"; destination host
>> is: "0.0.0.0":8030;
>>     at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
>>     at $Proxy10.allocate(Unknown Source)
>>     at
>> org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:75)
>>     ... 4 more
>> Caused by: java.io.IOException: Failed on local exception:
>> java.io.IOException: Response is null.; Host Details : local host is:
>> "linux-ljc5.site/127.0.0.1"; destination host is: "0.0.0.0":8030;
>>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:760)
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1240)
>>     at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>>     ... 6 more
>> Caused by: java.io.IOException: Response is null.
>>     at
>> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:950)
>>     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:844)****
>>
>> Container_Exception
>>
>> Exception in thread "org.apache.hadoop.hdfs.SocketCache@6da0d866"
>> java.lang.NoSuchMethodError:
>> com.google.common.collect.LinkedListMultimap.values()Ljava/util/List;
>>     at org.apache.hadoop.hdfs.SocketCache.clear(SocketCache.java:257)
>>     at org.apache.hadoop.hdfs.SocketCache.access$100(SocketCache.java:45)
>>     at org.apache.hadoop.hdfs.SocketCache$1.run(SocketCache.java:126)
>>     at java.lang.Thread.run(Thread.java:662)
>>
>> ****
>>
>> As I said this problem does not occur for the 1MB input. For the 6MB
>> input nothing is changed except the input dataset. Now a little bit of what
>> am I doing, to give you the context of the problem. My AM starts N (debug
>> 4) containers and each container reads its input data part. When this
>> process is finished I am exchanging parts of input between containers
>> (exchanging IDs of input structures, to provide means for communication
>> between data structures). During the process of exchanging IDs these
>> exceptions occur. I start Netty Server/Client on each container and I use
>> ports 12000-12099 as mean of communicating these IDs. ****
>>
>> Any help will be greatly appreciated. Sorry for any typos and if the
>> explanation is not clear just ask for any details you are interested in.
>> Currently it is after 2 AM I hope this will be a valid excuse.****
>>
>> regards****
>>
>> tmp****
>>
>
>

Re: Yarn HDFS and Yarn Exceptions when processing "larger" datasets.

Posted by Omkar Joshi <oj...@hortonworks.com>.
Hi,

As I don't know your complete AM code and how your containers are
communicating with each other...Certain things which might help you in
debugging.... where you are starting your RM (is it really running on
8030???? are you sure there is no previously started RM still running
there?) Also in yarn-site.xml can you try changing RM address to something
like "localhost:<free-port-but-not-default>" and configure maximum client
thread size for handling AM requests? only your AM is expected to
communicate with RM on AM-RM protocol.. by any chance in your code; are
containers directly communicating with RM on AM-RM protocol??

  <property>

    <description>The address of the scheduler interface.</description>

    <name>yarn.resourcemanager.scheduler.address</name>

    <value>${yarn.resourcemanager.hostname}:8030</value>

  </property>


  <property>

    <description>Number of threads to handle scheduler interface.</
description>

    <name>yarn.resourcemanager.scheduler.client.thread-count</name>

    <value>50</value>

  </property>


Thanks,
Omkar Joshi
*Hortonworks Inc.* <http://www.hortonworks.com>


On Fri, Jun 28, 2013 at 5:35 AM, blah blah <tm...@gmail.com> wrote:

> Hi
>
> Sorry to reply so late. I don't have the data you requested (sorry, I have
> no time; my deadline is within 3 days). However, I have observed that this
> issue occurs not only for the "larger" datasets (6.8MB) but for all
> datasets and all jobs in general. For smaller datasets (1MB) the AM does
> not throw the exception; only the containers throw exceptions (the same
> as in the previous e-mail). When these exceptions are thrown, my code (AM
> and containers) does not perform any operations on HDFS; it only performs
> in-memory computation and communication. Also, I have observed that these
> exceptions occur at "random"; I couldn't observe any pattern. I can
> execute a job successfully, then resubmit the job to repeat the
> experiment, and these exceptions occur (no change was made to the source
> code, the input dataset, or the execution/input parameters).
>
> As for the high network usage, as I said, I don't have the data. But YARN
> is running on nodes which are exclusive to my experiments; no other
> software runs on these nodes (only the OS and YARN). Besides, I don't
> think that 20 containers working on a 1MB dataset (total) can be called
> high network usage.
>
> regards
> tmp
>
>
>
> 2013/6/26 Devaraj k <de...@huawei.com>
>
>>  Hi,
>>
>>    Could you check the network usage in the cluster when this problem
>> occurs? It is probably being caused by high network usage.
>>
>> Thanks
>> Devaraj k

Re: Yarn HDFS and Yarn Exceptions when processing "larger" datasets.

Posted by blah blah <tm...@gmail.com>.
Hi

Sorry to reply so late. I don't have the data you requested (sorry, I have
no time; my deadline is within 3 days). However, I have observed that this
issue occurs not only for the "larger" datasets (6.8MB) but for all
datasets and all jobs in general. For smaller datasets (1MB) the AM does
not throw the exception; only the containers throw exceptions (the same as
in the previous e-mail). When these exceptions are thrown, my code (AM and
containers) does not perform any operations on HDFS; it only performs
in-memory computation and communication. Also, I have observed that these
exceptions occur at "random"; I couldn't observe any pattern. I can execute
a job successfully, then resubmit the job to repeat the experiment, and
these exceptions occur (no change was made to the source code, the input
dataset, or the execution/input parameters).

As for the high network usage, as I said, I don't have the data. But YARN
is running on nodes which are exclusive to my experiments; no other
software runs on these nodes (only the OS and YARN). Besides, I don't think
that 20 containers working on a 1MB dataset (total) can be called high
network usage.

regards
tmp



2013/6/26 Devaraj k <de...@huawei.com>

>  Hi,
>
>    Could you check the network usage in the cluster when this problem
> occurs? It is probably being caused by high network usage.
>
> Thanks
> Devaraj k

RE: Yarn HDFS and Yarn Exceptions when processing "larger" datasets.

Posted by Devaraj k <de...@huawei.com>.
Hi,

   Could you check the network usage in the cluster when this problem occurs? It is probably being caused by high network usage.
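
If you have no monitoring tools handy, a rough Java sketch like the one
below can do it on a Linux node; the class name, the 5-second window, and
the receive-only view are arbitrary choices for illustration. It samples
the cumulative byte counters in /proc/net/dev twice and prints
per-interface receive throughput:

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.HashMap;
import java.util.Map;

public class NetUsageSketch {
    // Reads the cumulative received-bytes counter for each interface.
    static Map<String, Long> readRxBytes() throws Exception {
        Map<String, Long> rx = new HashMap<String, Long>();
        BufferedReader r = new BufferedReader(new FileReader("/proc/net/dev"));
        try {
            String line;
            while ((line = r.readLine()) != null) {
                int colon = line.indexOf(':');
                if (colon < 0) continue;  // skip the two header lines
                String iface = line.substring(0, colon).trim();
                String[] f = line.substring(colon + 1).trim().split("\\s+");
                rx.put(iface, Long.parseLong(f[0]));  // field 0 = rx bytes
            }
        } finally {
            r.close();
        }
        return rx;
    }

    public static void main(String[] args) throws Exception {
        Map<String, Long> before = readRxBytes();
        Thread.sleep(5000);  // sample window
        Map<String, Long> after = readRxBytes();
        for (String iface : after.keySet()) {
            Long base = before.get(iface);
            if (base == null) continue;  // interface appeared mid-sample
            long perSec = (after.get(iface) - base) / 5;
            System.out.println(iface + ": " + perSec + " B/s received");
        }
    }
}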

Thanks
Devaraj k


Re: Yarn HDFS and Yarn Exceptions when processing "larger" datasets.

Posted by blah blah <tm...@gmail.com>.
Hi

Just a quick short reply (tomorrow is my prototype presentation).

@Omkar Joshi
- The RM is already running on port 8030 when I start my AM
- I'll try the client thread-count setting for the AM
- Only the AM communicates with the RM
- RM/NM: no exceptions there (as far as I remember; I will check later
[sorry])

Furthermore, in fully distributed mode the AM doesn't throw exceptions
anymore, only the containers do.

@John Lilley
Yes, the problem is with my code (I don't want to imply that it is YARN's
problem). I have successfully run Distributed Shell and YARN's MapReduce
jobs with much bigger datasets than 1MB ;). I just don't know where to
start looking for the problem, especially for the container exceptions, as
they occur after my containers are "done" with HDFS (up to the point where
they store the final output).

The only "idea" I have is that these exceptions occur during container
communication. Instead of sending multiple messages, my containers
aggregate all messages per container into one "big" message (the biggest
is around 8k-10k chars), so each container sends only one message to each
other container (which includes multiple messages). I don't know if this
information is important, but I am planning to see what happens if I
partition the messages (1024). I got this "idea" from the containers'
"org.apache.hadoop.hdfs.SocketCache" exception; I am using SocketChannels
to send these "big" messages, so maybe I am creating some socket
"conflict".
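
To make the partitioning idea concrete, here is a sketch, assuming the
"1024" means a 1024-byte chunk size; the class and method names are made
up for illustration and are not my actual code:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

public final class ChunkedSender {
    private static final int CHUNK_SIZE = 1024;  // assumed partition size

    // Splits one aggregated "big" message into CHUNK_SIZE pieces and
    // writes them to the channel, looping because write() may be partial.
    static void sendInChunks(SocketChannel channel, String bigMessage)
            throws IOException {
        byte[] bytes = bigMessage.getBytes("UTF-8");
        for (int off = 0; off < bytes.length; off += CHUNK_SIZE) {
            int len = Math.min(CHUNK_SIZE, bytes.length - off);
            ByteBuffer buf = ByteBuffer.wrap(bytes, off, len);
            while (buf.hasRemaining()) {
                channel.write(buf);
            }
        }
    }
}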

regards
tmp



2013/7/2 John Lilley <jo...@redpoint.net>

>  Blah blah,
>
> Can you build and run the DistributedShell example?  If it does not run
> correctly, this would tend to implicate your configuration.  If it runs
> correctly, then your code is suspect.
>
> John

Re: Yarn HDFS and Yarn Exceptions when processing "larger" datasets.

Posted by blah blah <tm...@gmail.com>.
Hi

Just a quick short reply (tomorrow is my prototype presentation).

@Omkar Joshi
- RM port 8030 already running when I start my AM
- I'll do the client thread size AM
- Only AM communicates with RM
- RM/NM no exceptions there (as far as I remember will check later [sorry])

Furthermore in fully distributed mode AM doesn't throw exceptions anymore,
only Containers.

@John Lilley
Yes the problem is with my code (I don't want to imply that it is YARN's
problem). I have successfully run Distributed Shell and YARN's MapReduce
jobs with much bigger datasets than 1mb ;). I just don't know where to
start looking for the problem, especially for the Containers exceptions as
they occur after my containers are "done" with HDFS (until they store final
output).

The only "idea" I have is that these exceptions occur during Containers
communication. Instead of sending multiple messages my containers aggregate
all messages per container into one "big" message (the biggest around
8k-10k chars), thus each container sends only 1 message to other container
(which includes multiple messages). I don't know if this information is
important, but I am planning to see what will happen if I partition the
messages (1024). I got this "idea" from the Containers exception "
org.apache.hadoop.hdfs.SocketCache", I am using SocketChannels to send
these "big" messages, so maybe I am creating some Socket "conflict" .

regards
tmp



2013/7/2 John Lilley <jo...@redpoint.net>

>  Blah blah,****
>
> Can you build and run the DistributedShell example?  If it does not run
> correctly this would tend to implicate your configuration.  If it run
> correctly then your code is suspect.****
>
> John****
>
> ** **
>
> ** **
>
> *From:* blah blah [mailto:tmp5330@gmail.com]
> *Sent:* Tuesday, June 25, 2013 6:09 PM
>
> *To:* user@hadoop.apache.org
> *Subject:* Yarn HDFS and Yarn Exceptions when processing "larger"
> datasets.****
>
> ** **
>
> Hi All****
>
> First let me excuse for the poor thread title but I have no idea how to
> express the problem in one sentence. ****
>
> I have implemented new Application Master with the use of Yarn. I am using
> old Yarn development version. Revision 1437315, from 2013-01-23 (SNAPSHOT
> 3.0.0). I can not update to current trunk version, as prototype deadline is
> soon, and I don't have time to include Yarn API changes.****
>
> Currently I execute experiments in pseudo-distributed mode, I use guava
> version 14.0-rc1. I have a problem with Yarn's and HDFS Exceptions for
> "larger" datasets. My AM works fine and I can execute it without a problem
> for a debug dataset (1MB size). But when I increase the size of input to
> 6.8 MB, I am getting the following exceptions:****
>
> AM_Exceptions_Stack
>
> Exception in thread "Thread-3"
> java.lang.reflect.UndeclaredThrowableException
>     at
> org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135)
>     at
> org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:77)
>     at
> org.apache.hadoop.yarn.client.AMRMClientImpl.allocate(AMRMClientImpl.java:194)
>     at
> org.tudelft.ludograph.app.AppMasterContainerRequester.sendContainerAskToRM(AppMasterContainerRequester.java:219)
>     at
> org.tudelft.ludograph.app.AppMasterContainerRequester.run(AppMasterContainerRequester.java:315)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: com.google.protobuf.ServiceException: java.io.IOException:
> Failed on local exception: java.io.IOException: Response is null.; Host
> Details : local host is: "linux-ljc5.site/127.0.0.1"; destination host
> is: "0.0.0.0":8030;
>     at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
>     at $Proxy10.allocate(Unknown Source)
>     at
> org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:75)
>     ... 4 more
> Caused by: java.io.IOException: Failed on local exception:
> java.io.IOException: Response is null.; Host Details : local host is:
> "linux-ljc5.site/127.0.0.1"; destination host is: "0.0.0.0":8030;
>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:760)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1240)
>     at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>     ... 6 more
> Caused by: java.io.IOException: Response is null.
>     at
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:950)
>     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:844)****
>
> Container_Exception
>
> Exception in thread "org.apache.hadoop.hdfs.SocketCache@6da0d866"
> java.lang.NoSuchMethodError:
> com.google.common.collect.LinkedListMultimap.values()Ljava/util/List;
>     at org.apache.hadoop.hdfs.SocketCache.clear(SocketCache.java:257)
>     at org.apache.hadoop.hdfs.SocketCache.access$100(SocketCache.java:45)
>     at org.apache.hadoop.hdfs.SocketCache$1.run(SocketCache.java:126)
>     at java.lang.Thread.run(Thread.java:662)
>
> ****
>
> As I said this problem does not occur for the 1MB input. For the 6MB input
> nothing is changed except the input dataset. Now a little bit of what am I
> doing, to give you the context of the problem. My AM starts N (debug 4)
> containers and each container reads its input data part. When this process
> is finished I am exchanging parts of input between containers (exchanging
> IDs of input structures, to provide means for communication between data
> structures). During the process of exchanging IDs these exceptions occur. I
> start Netty Server/Client on each container and I use ports 12000-12099 as
> mean of communicating these IDs. ****
>
> Any help will be greatly appreciated. Sorry for any typos and if the
> explanation is not clear just ask for any details you are interested in.
> Currently it is after 2 AM I hope this will be a valid excuse.****
>
> regards****
>
> tmp****
>

Re: Yarn HDFS and Yarn Exceptions when processing "larger" datasets.

Posted by blah blah <tm...@gmail.com>.
Hi

Just a quick short reply (tomorrow is my prototype presentation).

@Omkar Joshi
- RM port 8030 already running when I start my AM
- I'll do the client thread size AM
- Only AM communicates with RM
- RM/NM no exceptions there (as far as I remember will check later [sorry])

Furthermore in fully distributed mode AM doesn't throw exceptions anymore,
only Containers.

@John Lilley
Yes the problem is with my code (I don't want to imply that it is YARN's
problem). I have successfully run Distributed Shell and YARN's MapReduce
jobs with much bigger datasets than 1mb ;). I just don't know where to
start looking for the problem, especially for the Containers exceptions as
they occur after my containers are "done" with HDFS (until they store final
output).

The only "idea" I have is that these exceptions occur during Containers
communication. Instead of sending multiple messages my containers aggregate
all messages per container into one "big" message (the biggest around
8k-10k chars), thus each container sends only 1 message to other container
(which includes multiple messages). I don't know if this information is
important, but I am planning to see what will happen if I partition the
messages (1024). I got this "idea" from the Containers exception "
org.apache.hadoop.hdfs.SocketCache", I am using SocketChannels to send
these "big" messages, so maybe I am creating some Socket "conflict" .

regards
tmp



2013/7/2 John Lilley <jo...@redpoint.net>

>  Blah blah,****
>
> Can you build and run the DistributedShell example?  If it does not run
> correctly this would tend to implicate your configuration.  If it run
> correctly then your code is suspect.****
>
> John****
>
> ** **
>
> ** **
>
> *From:* blah blah [mailto:tmp5330@gmail.com]
> *Sent:* Tuesday, June 25, 2013 6:09 PM
>
> *To:* user@hadoop.apache.org
> *Subject:* Yarn HDFS and Yarn Exceptions when processing "larger"
> datasets.****
>
> ** **
>
> Hi All****
>
> First let me excuse for the poor thread title but I have no idea how to
> express the problem in one sentence. ****
>
> I have implemented new Application Master with the use of Yarn. I am using
> old Yarn development version. Revision 1437315, from 2013-01-23 (SNAPSHOT
> 3.0.0). I can not update to current trunk version, as prototype deadline is
> soon, and I don't have time to include Yarn API changes.****
>
> Currently I execute experiments in pseudo-distributed mode, I use guava
> version 14.0-rc1. I have a problem with Yarn's and HDFS Exceptions for
> "larger" datasets. My AM works fine and I can execute it without a problem
> for a debug dataset (1MB size). But when I increase the size of input to
> 6.8 MB, I am getting the following exceptions:****
>
> AM_Exceptions_Stack
>
> Exception in thread "Thread-3"
> java.lang.reflect.UndeclaredThrowableException
>     at
> org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135)
>     at
> org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:77)
>     at
> org.apache.hadoop.yarn.client.AMRMClientImpl.allocate(AMRMClientImpl.java:194)
>     at
> org.tudelft.ludograph.app.AppMasterContainerRequester.sendContainerAskToRM(AppMasterContainerRequester.java:219)
>     at
> org.tudelft.ludograph.app.AppMasterContainerRequester.run(AppMasterContainerRequester.java:315)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: com.google.protobuf.ServiceException: java.io.IOException:
> Failed on local exception: java.io.IOException: Response is null.; Host
> Details : local host is: "linux-ljc5.site/127.0.0.1"; destination host
> is: "0.0.0.0":8030;
>     at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
>     at $Proxy10.allocate(Unknown Source)
>     at
> org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.allocate(AMRMProtocolPBClientImpl.java:75)
>     ... 4 more
> Caused by: java.io.IOException: Failed on local exception:
> java.io.IOException: Response is null.; Host Details : local host is:
> "linux-ljc5.site/127.0.0.1"; destination host is: "0.0.0.0":8030;
>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:760)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1240)
>     at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>     ... 6 more
> Caused by: java.io.IOException: Response is null.
>     at
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:950)
>     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:844)****
>
> Container_Exception
>
> Exception in thread "org.apache.hadoop.hdfs.SocketCache@6da0d866"
> java.lang.NoSuchMethodError:
> com.google.common.collect.LinkedListMultimap.values()Ljava/util/List;
>     at org.apache.hadoop.hdfs.SocketCache.clear(SocketCache.java:257)
>     at org.apache.hadoop.hdfs.SocketCache.access$100(SocketCache.java:45)
>     at org.apache.hadoop.hdfs.SocketCache$1.run(SocketCache.java:126)
>     at java.lang.Thread.run(Thread.java:662)
>
> ****
>
> As I said this problem does not occur for the 1MB input. For the 6MB input
> nothing is changed except the input dataset. Now a little bit of what am I
> doing, to give you the context of the problem. My AM starts N (debug 4)
> containers and each container reads its input data part. When this process
> is finished I am exchanging parts of input between containers (exchanging
> IDs of input structures, to provide means for communication between data
> structures). During the process of exchanging IDs these exceptions occur. I
> start Netty Server/Client on each container and I use ports 12000-12099 as
> mean of communicating these IDs. ****
>
> Any help will be greatly appreciated. Sorry for any typos and if the
> explanation is not clear just ask for any details you are interested in.
> Currently it is after 2 AM I hope this will be a valid excuse.****
>
> regards****
>
> tmp****
>

Re: Yarn HDFS and Yarn Exceptions when processing "larger" datasets.

Posted by blah blah <tm...@gmail.com>.
Hi

Just a quick short reply (tomorrow is my prototype presentation).

@Omkar Joshi
- RM port 8030 already running when I start my AM
- I'll do the client thread size AM
- Only AM communicates with RM
- RM/NM no exceptions there (as far as I remember will check later [sorry])

Furthermore in fully distributed mode AM doesn't throw exceptions anymore,
only Containers.

@John Lilley
Yes the problem is with my code (I don't want to imply that it is YARN's
problem). I have successfully run Distributed Shell and YARN's MapReduce
jobs with much bigger datasets than 1mb ;). I just don't know where to
start looking for the problem, especially for the Containers exceptions as
they occur after my containers are "done" with HDFS (until they store final
output).

The only "idea" I have is that these exceptions occur during Containers
communication. Instead of sending multiple messages my containers aggregate
all messages per container into one "big" message (the biggest around
8k-10k chars), thus each container sends only 1 message to other container
(which includes multiple messages). I don't know if this information is
important, but I am planning to see what will happen if I partition the
messages (1024). I got this "idea" from the Containers exception "
org.apache.hadoop.hdfs.SocketCache", I am using SocketChannels to send
these "big" messages, so maybe I am creating some Socket "conflict" .

regards
tmp



2013/7/2 John Lilley <jo...@redpoint.net>

> Blah blah,
>
> Can you build and run the DistributedShell example? If it does not run
> correctly, this would tend to implicate your configuration. If it runs
> correctly, then your code is suspect.
>
> John
>
> From: blah blah [mailto:tmp5330@gmail.com]
> Sent: Tuesday, June 25, 2013 6:09 PM
> To: user@hadoop.apache.org
> Subject: Yarn HDFS and Yarn Exceptions when processing "larger" datasets.
>
> [...]

RE: Yarn HDFS and Yarn Exceptions when processing "larger" datasets.

Posted by John Lilley <jo...@redpoint.net>.
Blah blah,
Can you build and run the DistributedShell example? If it does not run correctly, this would tend to implicate your configuration. If it runs correctly, then your code is suspect.
John
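
For reference, launching it from a 3.0.0-SNAPSHOT build looks roughly like
the line below (the jar path and name will vary with your build, so treat
this as a sketch rather than exact syntax):

hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-*.jar \
  org.apache.hadoop.yarn.applications.distributedshell.Client \
  -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-*.jar \
  -shell_command date -num_containers 4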


From: blah blah [mailto:tmp5330@gmail.com]
Sent: Tuesday, June 25, 2013 6:09 PM
To: user@hadoop.apache.org
Subject: Yarn HDFS and Yarn Exceptions when processing "larger" datasets.

[...]

RE: Yarn HDFS and Yarn Exceptions when processing "larger" datasets.

Posted by Devaraj k <de...@huawei.com>.
Hi,

   Could you check the network usage in the cluster when this problem occurs? It is probably caused by high network usage.

Thanks
Devaraj k
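
(For example, sysstat's sar, if installed on the nodes, shows per-interface
throughput while the job runs; this samples once per second, five times:)

sar -n DEV 1 5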

From: blah blah [mailto:tmp5330@gmail.com]
Sent: 26 June 2013 05:39
To: user@hadoop.apache.org
Subject: Yarn HDFS and Yarn Exceptions when processing "larger" datasets.

[...]
