Posted to dev@hama.apache.org by "Edward J. Yoon" <ed...@oracle.com> on 2012/09/19 08:22:51 UTC

Hadoop RPC as a default

Let's change the default RPC in hama-default.xml to Hadoop RPC. I am
testing Hadoop RPC and Avro RPC on a 4-rack cluster; Avro RPC is
problematic, and there's no significant performance difference between the
two.
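
For reference, the change would look roughly like the following in
hama-default.xml. This is only a sketch: the property name and the Hadoop
RPC implementation class are assumptions for illustration; only
AvroMessageManagerImpl's package path is confirmed by the patch later in
this thread.

<!-- hypothetical hama-default.xml entry; names are assumptions -->
<property>
  <name>hama.messenger.class</name>
  <!-- replaces org.apache.hama.bsp.message.AvroMessageManagerImpl -->
  <value>org.apache.hama.bsp.message.HadoopMessageManagerImpl</value>
</property>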

-- 
Best Regards, Edward J. Yoon
@eddieyoon


Re: Hadoop RPC as a default

Posted by "Edward J. Yoon" <ed...@apache.org>.
Sorry, it's the same on the large cluster :( Let's fix this for the 0.7 release.

On Wed, Sep 26, 2012 at 10:46 AM, Edward J. Yoon <ed...@apache.org> wrote:
> P.S., ignore the "+      data = null;" line in the patch; nulling data
> before returning ByteBuffer.wrap(data) would throw a NullPointerException.



-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: Hadoop RPC as a default

Posted by "Edward J. Yoon" <ed...@apache.org>.
P.S., ignore the "+      data = null;" line in the patch below; nulling data
before returning ByteBuffer.wrap(data) would throw a NullPointerException.

On Wed, Sep 26, 2012 at 10:16 AM, Edward J. Yoon <ed...@apache.org> wrote:
> FYI, the problem is fixed by the small patch below.
>
> Index: src/main/java/org/apache/hama/bsp/message/AvroMessageManagerImpl.java
> ===================================================================
> --- src/main/java/org/apache/hama/bsp/message/AvroMessageManagerImpl.java  (revision 1389695)
> +++ src/main/java/org/apache/hama/bsp/message/AvroMessageManagerImpl.java  (working copy)
> @@ -135,12 +135,15 @@
>        ByteArrayInputStream inArray = new ByteArrayInputStream(byteArray);
>        DataInputStream in = new DataInputStream(inArray);
>        msg.readFields(in);
> +      in.close();
> +      inArray.close();
>      } else {
>        peer.incrementCounter(BSPPeerImpl.PeerCounter.COMPRESSED_BYTES_RECEIVED,
>            byteArray.length);
>        msg = compressor.decompressBundle(new BSPCompressedBundle(byteArray));
>      }
>
> +    byteArray = null;
>      return msg;
>    }
>
> @@ -154,12 +157,14 @@
>        byte[] byteArray = outArray.toByteArray();
>        peer.incrementCounter(BSPPeerImpl.PeerCounter.MESSAGE_BYTES_TRANSFERED,
>            byteArray.length);
> +      outArray.close();
>        return ByteBuffer.wrap(byteArray);
>      } else {
>        BSPCompressedBundle compMsgBundle = compressor.compressBundle(msg);
>        byte[] data = compMsgBundle.getData();
>        peer.incrementCounter(BSPPeerImpl.PeerCounter.COMPRESSED_BYTES_SENT,
>            data.length);
> +      data = null;
>        return ByteBuffer.wrap(data);
>      }
>    }
>



-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: Hadoop RPC as a default

Posted by "Edward J. Yoon" <ed...@apache.org>.
FYI, the problem is fixed by the small patch below.

Index: src/main/java/org/apache/hama/bsp/message/AvroMessageManagerImpl.java
===================================================================
--- src/main/java/org/apache/hama/bsp/message/AvroMessageManagerImpl.java  (revision 1389695)
+++ src/main/java/org/apache/hama/bsp/message/AvroMessageManagerImpl.java  (working copy)
@@ -135,12 +135,15 @@
       ByteArrayInputStream inArray = new ByteArrayInputStream(byteArray);
       DataInputStream in = new DataInputStream(inArray);
       msg.readFields(in);
+      in.close();
+      inArray.close();
     } else {
       peer.incrementCounter(BSPPeerImpl.PeerCounter.COMPRESSED_BYTES_RECEIVED,
           byteArray.length);
       msg = compressor.decompressBundle(new BSPCompressedBundle(byteArray));
     }

+    byteArray = null;
     return msg;
   }

@@ -154,12 +157,14 @@
       byte[] byteArray = outArray.toByteArray();
       peer.incrementCounter(BSPPeerImpl.PeerCounter.MESSAGE_BYTES_TRANSFERED,
           byteArray.length);
+      outArray.close();
       return ByteBuffer.wrap(byteArray);
     } else {
       BSPCompressedBundle compMsgBundle = compressor.compressBundle(msg);
       byte[] data = compMsgBundle.getData();
       peer.incrementCounter(BSPPeerImpl.PeerCounter.COMPRESSED_BYTES_SENT,
           data.length);
+      data = null;
       return ByteBuffer.wrap(data);
     }
   }
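
As a side note, the same deserialization path can be written with Java 7
try-with-resources, which closes both streams even when readFields() throws.
A hedged sketch, assuming the messages implement Hadoop's Writable (as the
msg.readFields(in) call above suggests); MessageCodec and deserialize() are
hypothetical names, not Hama's API:

import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

public final class MessageCodec {

  // Populate a Writable message from its raw bytes; the streams are
  // closed automatically when the try block exits.
  public static <M extends Writable> M deserialize(M msg, byte[] byteArray)
      throws IOException {
    try (DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(byteArray))) {
      msg.readFields(in);
      return msg;
    }
  }
}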

On Thu, Sep 20, 2012 at 4:22 PM, Edward J. Yoon <ed...@apache.org> wrote:
> It seems we didn't test the 0.5 release thoroughly.
>
> Let's give ourselves more time.



-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: Hadoop RPC as a default

Posted by "Edward J. Yoon" <ed...@apache.org>.
It seems we didn't test the 0.5 release thoroughly.

Let's give ourselves more time.

On Wed, Sep 19, 2012 at 8:00 PM, Thomas Jungblut
<th...@gmail.com> wrote:
> Wait, the problem had multiple roots. We just fixed a few. I will sit down
> and take a look at the messaging.
> If this is scalable, we can remove the multi-step partitioning.



-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: Hadoop RPC as a default

Posted by Thomas Jungblut <th...@gmail.com>.
Wait, the problem had multiple roots. We just fixed a few. I will sit down
and take a look at the messaging.
If this is scalable, we can remove the multi-step partitioning.
On 19.09.2012 12:21, "Edward J. Yoon" <ed...@apache.org> wrote:

> P.S., since the graph job's memory issue will be fixed by Thomas's
> HAMA-642, I'll remove my dirty multi-step partitioning code from the graph
> module if there's no problem with Hadoop RPC tomorrow.

Re: Hadoop RPC as a default

Posted by "Edward J. Yoon" <ed...@apache.org>.
P.S., since the graph job's memory issue will be fixed by Thomas's
HAMA-642, I'll remove my dirty multi-step partitioning code from the graph
module if there's no problem with Hadoop RPC tomorrow.

On Wed, Sep 19, 2012 at 5:53 PM, Thomas Jungblut
<th...@gmail.com> wrote:
> I will give you more details on what I have planned for the interface
> changes once I'm back from my lecture.



-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: Hadoop RPC as a default

Posted by Thomas Jungblut <th...@gmail.com>.
I will give you more details on what I have planned for the interface
changes once I'm back from my lecture.

2012/9/19 Suraj Menon <su...@apache.org>

> As a beginning, we should have a spilling queue, and the same with a
> combiner running in batch if possible.
> I have been looking into implementing the spilling queue. Chalking out the
> requirements, we should look into the following:
>
> A queue should persist all the data if required by the framework for fault
> tolerance. (I feel it would be a bad idea for the framework to spend
> resources on making a separate copy of the file.)
> Asynchronous communication is our next important feature required for
> performance. Hence we would need this queue, with a combiner on the sender
> side, to batch the messages before sending. This implies we need to
> support both concurrent reads and writes.
>
> -Suraj

Re: Hadoop RPC as a default

Posted by Suraj Menon <su...@apache.org>.
As a beginning, we should have a spilling queue, and the same with a
combiner running in batch if possible.
I have been looking into implementing the spilling queue. Chalking out the
requirements, we should look into the following:

A queue should persist all the data if required by the framework for fault
tolerance. (I feel it would be a bad idea for the framework to spend
resources on making a separate copy of the file.)
Asynchronous communication is our next important feature required for
performance. Hence we would need this queue, with a combiner on the sender
side, to batch the messages before sending. This implies we need to support
both concurrent reads and writes.
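
A minimal sketch of such a spilling queue, under stated assumptions:
messages arrive already serialized as byte arrays, a single spill file is
enough, and coarse synchronized methods stand in for a real concurrent
read/write design. All names here are hypothetical, not Hama APIs.

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch: a FIFO message queue that spills to disk once the
// in-memory buffer is full.
public class SpillingQueue {
  private final Queue<byte[]> memory = new ArrayDeque<byte[]>();
  private final int memoryLimit;      // max messages held in RAM
  private final File spillFile;
  private DataOutputStream spillOut;  // opened lazily on first spill
  private DataInputStream spillIn;    // opened lazily on first disk read
  private int spilled;                // records currently on disk

  public SpillingQueue(int memoryLimit, File spillFile) {
    this.memoryLimit = memoryLimit;
    this.spillFile = spillFile;
  }

  // Add a serialized message; once RAM is full (or a spill has begun, to
  // preserve FIFO order) the message is appended to disk, length-prefixed.
  public synchronized void add(byte[] msg) throws IOException {
    if (memory.size() < memoryLimit && spilled == 0) {
      memory.add(msg);
      return;
    }
    if (spillOut == null) {
      spillOut = new DataOutputStream(
          new BufferedOutputStream(new FileOutputStream(spillFile)));
    }
    spillOut.writeInt(msg.length);
    spillOut.write(msg);
    spillOut.flush();  // so a concurrent drain can already see the record
    spilled++;
  }

  // Return the next message in arrival order, or null when empty.
  public synchronized byte[] poll() throws IOException {
    if (!memory.isEmpty()) {
      return memory.poll();
    }
    if (spilled == 0) {
      return null;
    }
    if (spillIn == null) {
      spillIn = new DataInputStream(
          new BufferedInputStream(new FileInputStream(spillFile)));
    }
    int len = spillIn.readInt();
    byte[] msg = new byte[len];
    spillIn.readFully(msg);
    spilled--;
    return msg;
  }
}

A sender-side combiner would hook in just before the spillOut.write() call,
so that batches hit disk already combined.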

-Suraj

On Wed, Sep 19, 2012 at 4:21 AM, Thomas Jungblut
<th...@gmail.com> wrote:

> Oh okay, very interesting. Just another argument for making the messaging
> more scalable ;)

Re: Hadoop RPC as a default

Posted by Thomas Jungblut <th...@gmail.com>.
Oh okay, very interesting. Just another argument for making the messaging
more scalable ;)

2012/9/19 Edward J. Yoon <ed...@oracle.com>

> I didn't check memory usage because each machine has 48 GB of memory, but
> I guess there's no big difference.
>
> In short, "bin/hama bench 16 10000 32" was the maximum capacity (see [1]).
> If the number of messages or nodes is increased, the job always fails.
> Hadoop RPC is OK.
>
> I will need time to debug this.
>
> 1. http://wiki.apache.org/hama/Benchmarks#Random_Communication_Benchmark

Re: Hadoop RPC as a default

Posted by "Edward J. Yoon" <ed...@oracle.com>.
I didn't check memory usage because each machine has 48 GB of memory, but I
guess there's no big difference.

In short, "bin/hama bench 16 10000 32" was the maximum capacity (see [1]).
If the number of messages or nodes is increased, the job always fails.
Hadoop RPC is OK.

I will need time to debug this.

1. http://wiki.apache.org/hama/Benchmarks#Random_Communication_Benchmark

On 9/19/2012 4:34 PM, Thomas Jungblut wrote:
> BTW, after HAMA-642 <https://issues.apache.org/jira/browse/HAMA-642> I will
> redesign our messaging system to be completely disk-based with caching.
> I will file a follow-up issue for this. I plan to get rid of the RPC
> anyway; I think it is more efficient to stream the messages from disk over
> the network to the other host via NIO (we can later replace it with
> Netty). This also saves us checkpointing time, because checkpointing can
> be combined with the disk-based messaging pretty well. RPC requires the
> whole bundle to be in RAM, which is totally bad.
> Will follow up with more details later.

-- 
Best Regards, Edward J. Yoon
@eddieyoon


Re: Hadoop RPC as a default

Posted by Thomas Jungblut <th...@gmail.com>.
BTW, after HAMA-642 <https://issues.apache.org/jira/browse/HAMA-642> I will
redesign our messaging system to be completely disk-based with caching.
I will file a follow-up issue for this. I plan to get rid of the RPC
anyway; I think it is more efficient to stream the messages from disk over
the network to the other host via NIO (we can later replace it with
Netty). This also saves us checkpointing time, because checkpointing can
be combined with the disk-based messaging pretty well. RPC requires the
whole bundle to be in RAM, which is totally bad.
Will follow up with more details later.
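
A minimal sketch of that idea, assuming the outgoing bundle has already
been spilled to a file: NIO's zero-copy transferTo() streams the file to
the peer without ever pulling the whole bundle into RAM. The class name and
wiring are hypothetical:

import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;

// Hypothetical sketch: stream a spilled message file to a peer over NIO.
public class NioMessageStreamer {

  public static void streamTo(File spillFile, InetSocketAddress peer)
      throws IOException {
    try (RandomAccessFile raf = new RandomAccessFile(spillFile, "r");
         FileChannel file = raf.getChannel();
         SocketChannel socket = SocketChannel.open(peer)) {
      long position = 0;
      long remaining = file.size();
      // transferTo() may send fewer bytes than requested, so loop.
      while (remaining > 0) {
        long sent = file.transferTo(position, remaining, socket);
        position += sent;
        remaining -= sent;
      }
    }
  }
}

Since the spill file already sits on disk, copying it to stable storage
doubles as the checkpoint, which is the combination described above.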

2012/9/19 Thomas Jungblut <th...@gmail.com>:
> What is more memory efficient?

Re: Hadoop RPC as a default

Posted by Thomas Jungblut <th...@gmail.com>.
What is more memory efficient?
On 19.09.2012 08:23, "Edward J. Yoon" <ed...@oracle.com> wrote:

> Let's change the default RPC in hama-default.xml to Hadoop RPC. I am
> testing Hadoop RPC and Avro RPC on a 4-rack cluster; Avro RPC is
> problematic, and there's no significant performance difference between the
> two.
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon
>
>