Posted to dev@hama.apache.org by "Edward J. Yoon" <ed...@apache.org> on 2011/09/22 03:39:38 UTC

Awesome bench results after removing Thread.sleep in sync() method.

ChiaHung's HAMA-387.patch fixes the hang problem.

Also, in the same environment (1 rack, 256 cores), the bench example
result improved dramatically: 184.076 seconds, down from 307.129
seconds.

----
# core/bin/hama jar examples/target/hama-examples-0.4.0-incubating-SNAPSHOT.jar bench 16 1000 512
..
11/09/22 10:27:32 INFO bsp.BSPJobClient: Current supersteps number: 504
11/09/22 10:27:35 INFO bsp.BSPJobClient: Current supersteps number: 508
11/09/22 10:27:38 INFO bsp.BSPJobClient: Current supersteps number: 512
11/09/22 10:27:38 INFO bsp.BSPJobClient: The total number of supersteps: 512
Job Finished in 184.076 seconds

Hama 0.4 (r.1163903) was:

16 bytes | 1000 | 512 | 307.129 seconds
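
As a rough sanity check on these two figures, the sketch below divides each total by the 512 supersteps reported in the log. It works out to roughly 0.24 seconds saved per superstep and a speedup of about 1.67x. The class and method names are made up for illustration; only the three numbers come from this message.

```java
// Figures are copied from the message above; the names are illustrative only.
final class BenchSummary {
    /** Average wall-clock seconds spent per superstep. */
    static double perSuperstep(double totalSeconds, int supersteps) {
        return totalSeconds / supersteps;
    }

    /** Ratio of the old run time to the new run time. */
    static double speedup(double beforeSeconds, double afterSeconds) {
        return beforeSeconds / afterSeconds;
    }

    public static void main(String[] args) {
        double before = 307.129; // Hama 0.4 (r.1163903)
        double after = 184.076;  // with HAMA-387.patch applied
        int supersteps = 512;    // "The total number of supersteps: 512"

        System.out.printf("before: %.4f s/superstep%n", perSuperstep(before, supersteps));
        System.out.printf("after:  %.4f s/superstep%n", perSuperstep(after, supersteps));
        System.out.printf("speedup: %.2fx%n", speedup(before, after));
    }
}
```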

-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: Awesome bench results after removing Thread.sleep in sync() method.

Posted by "Edward J. Yoon" <ed...@apache.org>.
I tested debug mode vs. non-debug mode many times.

There is only a small difference:
170-180 secs vs. 210-220 secs.

----
11/09/23 09:15:52 INFO bsp.BSPJobClient: Current supersteps number: 493
11/09/23 09:15:55 INFO bsp.BSPJobClient: Current supersteps number: 504
11/09/23 09:15:58 INFO bsp.BSPJobClient: Current supersteps number: 506
11/09/23 09:16:01 INFO bsp.BSPJobClient: Current supersteps number: 512
11/09/23 09:16:01 INFO bsp.BSPJobClient: The total number of supersteps: 512
Job Finished in 214.089 seconds


On Thu, Sep 22, 2011 at 8:58 PM, Thomas Jungblut
<th...@googlemail.com> wrote:
> Scripted a fix version:
>
> http://pastebin.com/WbWWxd2R
>
> You can test this as well if you like.



-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: Awesome bench results after removing Thread.sleep in sync() method.

Posted by Thomas Jungblut <th...@googlemail.com>.
I scripted a fixed version:

http://pastebin.com/WbWWxd2R

You can test this as well if you like.

2011/9/22 Thomas Jungblut <th...@googlemail.com>

> And I think we should change this benchmark from random to a stable
> implementation.
> So we should communicate with all of the other peers, not just a random
> peername.
> Then we can precompute (or cache) the tag message and we save the string
> concat operations. We can put them into a directly allocated ByteBuffer as
> well, this will save the serialization.
> But then we cannot compare the results with the versions before.
>
> Additional to the log level change, we can receive a superior performance
> improvement ;)



-- 
Thomas Jungblut
Berlin

mobile: 0170-3081070

business: thomas.jungblut@testberichte.de
private: thomas.jungblut@gmail.com

Re: Awesome bench results after removing Thread.sleep in sync() method.

Posted by Thomas Jungblut <th...@googlemail.com>.
I also think we should change this benchmark from a random to a stable
(deterministic) implementation: each peer should communicate with all of
the other peers, not just a randomly chosen peer name.
Then we can precompute (or cache) the tag message and save the string
concatenation operations. We can also put them into a directly allocated
ByteBuffer, which would save the serialization.
The drawback is that we could no longer compare the results with earlier
versions.

In addition to the log level change, this should give us a substantial
performance improvement ;)
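
A minimal sketch of what this could look like, under several assumptions: the peer names, the "self->peer" tag format, and the 64-byte buffer size are all hypothetical, and the hand-off to Hama's messaging layer is left out because it is not shown in this thread. The sketch only illustrates building the tags once, iterating over all other peers in a fixed order, and reusing one directly allocated ByteBuffer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class StableBenchSketch {
    /** Builds one tag per destination peer up front, so no string concat happens per message. */
    static List<byte[]> precomputeTags(String self, List<String> peers) {
        List<byte[]> tags = new ArrayList<>();
        for (String peer : peers) {
            if (!peer.equals(self)) {
                tags.add((self + "->" + peer).getBytes(StandardCharsets.UTF_8));
            }
        }
        return tags;
    }

    /** Refills the reusable buffer with tag + payload and returns the number of bytes ready to send. */
    static int fill(ByteBuffer buffer, byte[] tag, byte[] payload) {
        buffer.clear();
        buffer.put(tag).put(payload);
        buffer.flip();
        return buffer.remaining();
    }

    public static void main(String[] args) {
        List<String> peers = List.of("peer-0", "peer-1", "peer-2"); // hypothetical peer names
        String self = "peer-0";
        byte[] payload = new byte[16]; // 16-byte messages, as in the bench run
        List<byte[]> tags = precomputeTags(self, peers);
        ByteBuffer buffer = ByteBuffer.allocateDirect(64); // allocated once, reused for every message

        long bytes = 0;
        for (int superstep = 0; superstep < 3; superstep++) {
            for (byte[] tag : tags) { // every other peer, in a fixed order
                bytes += fill(buffer, tag, payload);
                // a real benchmark would pass the buffer to the messaging layer here
            }
        }
        System.out.println("bytes prepared: " + bytes);
    }
}
```

Because every peer sends to the same destinations in the same order on every run, timings from such a benchmark would be repeatable, at the cost of no longer matching the random variant.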

2011/9/22 Edward J. Yoon <ed...@apache.org>

> Haha, obviously it can't be ignored.
>
> Unfortunately, I can't access my test machines now. I'll check tomorrow.



-- 
Thomas Jungblut
Berlin

mobile: 0170-3081070

business: thomas.jungblut@testberichte.de
private: thomas.jungblut@gmail.com

Re: Awesome bench results after removing Thread.sleep in sync() method.

Posted by "Edward J. Yoon" <ed...@apache.org>.
Haha, obviously it can't be ignored.

Unfortunately, I can't access my test machines now. I'll check tomorrow.

On Thu, Sep 22, 2011 at 5:05 PM, Thomas Jungblut
<th...@googlemail.com> wrote:
> You're going to laugh, but we spend 80% of the time, logging the messages.
> Let's change the log level to debug or remove the logging in the bench
> example.
>
> Sadly I still receive
>
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode =
>> NoNode for /bsp/job_201109220959_0001/224/ready
>>
>
> and it hangs forever. Current version is after you committed ChiaHung's
> patch.
> I'm in pseudo-distributed mode with 3 tasks.
>
> Are you going to bench this without the logging? That would be interesting
> though ;D



-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: Awesome bench results after removing Thread.sleep in sync() method.

Posted by Thomas Jungblut <th...@googlemail.com>.
I think that if we just change the log level, log4j will take care of the
if(isEnabled) check, so we don't need to fragment our code.
Yes, the current rev in trunk contains this snippet. Here is the rest of
the exception:

org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /bsp/job_201109220959_0001/224/ready
         at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
         at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:728)
         at org.apache.hama.bsp.BSPPeer$1.process(BSPPeer.java:396)
         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:488)

Here is the relevant part of the log of our ZooKeeper daemon:

2011-09-22 09:59:59,435 INFO org.apache.zookeeper.server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x1329025208e0003 type:delete cxid:0xc01 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:/bsp/job_201109220959_0001/222/ready Error:KeeperErrorCode = NoNode for /bsp/job_201109220959_0001/222/ready
2011-09-22 09:59:59,499 INFO org.apache.zookeeper.server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x1329025208e0003 type:create cxid:0xc0e zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:/bsp/job_201109220959_0001/223/ready Error:KeeperErrorCode = NodeExists for /bsp/job_201109220959_0001/223/ready
2011-09-22 09:59:59,627 INFO org.apache.zookeeper.server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x1329025208e0004 type:delete cxid:0xc22 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:/bsp/job_201109220959_0001/224/ready Error:KeeperErrorCode = NoNode for /bsp/job_201109220959_0001/224/ready

2011/9/22 ChiaHung Lin <ch...@nuk.edu.tw>

> We might need to change log method by adding
>
> if(LOG.isInfoEnabled()){
>  ...
> }
>
> at least it can prevent string concatenation for performance optimization.
> (debug can be changed to if(LOG.isDebugEnabled()){} for performance
> optimization, too.)
>
> In addition, can you help check if enterBarrier() contains the following
> code snippet?
>
>   ...
>   zk.exists(pathToSuperstepZnode+"/ready", new Watcher() {
>      @Override
>      public void process(WatchedEvent event) {
>          // check if /ready znode exists, then delete it.
>          ...
>          } catch(KeeperException.NoNodeException nne) {
>            LOG.warn("Ignore because znode may be deleted.", nne);
>          }...
>      }
>    });
>    zk.create(getNodeName(), null, Ids.OPEN_ACL_UNSAFE,
> CreateMode.EPHEMERAL);
>    ...
>
> It looks like bsp peer is trying to remove /ready znode which may have
> already been removed by other bsp peer. Or stack trace in log would be
> helpful.



-- 
Thomas Jungblut
Berlin

mobile: 0170-3081070

business: thomas.jungblut@testberichte.de
private: thomas.jungblut@gmail.com

Re: Awesome bench results after removing Thread.sleep in sync() method.

Posted by ChiaHung Lin <ch...@nuk.edu.tw>.
We might need to change the logging calls by wrapping them in a guard:

if(LOG.isInfoEnabled()){
  ...
}

At the very least, this prevents the string concatenation when the level is disabled. (Debug calls can be guarded with if(LOG.isDebugEnabled()){} in the same way.)
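
To illustrate why the guard matters, here is a small self-contained sketch. It uses java.util.logging as a stand-in for the LOG object discussed in this thread, so isLoggable(Level.FINE) plays the role of isDebugEnabled(); the logger name and the describe() helper are hypothetical. The point is that the message argument is built before the logging method is called, so the logger's internal level check cannot avoid that cost; only the explicit guard does.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class GuardedLogging {
    static final Logger LOG = Logger.getLogger("bench.sketch"); // hypothetical logger name
    static int described = 0; // counts how often the message text was actually built

    /** Stands in for any costly message construction (string concat, toString, ...). */
    static String describe(int superstep) {
        described++;
        return "superstep " + superstep;
    }

    /** The argument is evaluated first, so describe() runs even if FINE is disabled. */
    static void unguarded(int superstep) {
        LOG.fine("sent message in " + describe(superstep));
    }

    /** The guard skips message construction entirely when FINE is disabled. */
    static void guarded(int superstep) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("sent message in " + describe(superstep));
        }
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO); // FINE (debug-like) output is switched off
        unguarded(1);
        guarded(2);
        System.out.println("messages built: " + described);
    }
}
```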

In addition, can you check whether enterBarrier() contains the following code snippet?

   ...
   zk.exists(pathToSuperstepZnode+"/ready", new Watcher() {
      @Override
      public void process(WatchedEvent event) {
          // check if /ready znode exists, then delete it. 
          ... 
          } catch(KeeperException.NoNodeException nne) {
            LOG.warn("Ignore because znode may be deleted.", nne);
          }...
      }
    });
    zk.create(getNodeName(), null, Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
    ...

It looks like a BSP peer is trying to remove the /ready znode, which may already have been removed by another BSP peer. Otherwise, a stack trace from the log would be helpful.
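
The race described here can be sketched without a ZooKeeper server. The code below is only an illustration under loud assumptions: FakeStore is an in-memory stand-in, not the ZooKeeper client API, and the NoNodeException class, the path, and deleteIfPresent() are hypothetical names. It shows the idea of treating "node already gone" as a normal outcome when two peers both try to remove the same /ready node.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class ReadyNodeSketch {
    /** Thrown by the stand-in store when the path is missing, mimicking a NoNode error. */
    static final class NoNodeException extends Exception {
        NoNodeException(String path) {
            super("NoNode for " + path);
        }
    }

    /** Minimal in-memory stand-in for a znode store; NOT the ZooKeeper client API. */
    static final class FakeStore {
        private final Set<String> nodes = ConcurrentHashMap.newKeySet();

        void create(String path) {
            nodes.add(path);
        }

        void delete(String path) throws NoNodeException {
            if (!nodes.remove(path)) {
                throw new NoNodeException(path);
            }
        }
    }

    /** Returns true if this caller removed the node, false if another peer got there first. */
    static boolean deleteIfPresent(FakeStore store, String path) {
        try {
            store.delete(path);
            return true;
        } catch (NoNodeException alreadyGone) {
            return false; // another peer removed it; ignore and continue
        }
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        String ready = "/bsp/job/1/ready"; // hypothetical path
        store.create(ready);
        System.out.println("peer A removed: " + deleteIfPresent(store, ready));
        System.out.println("peer B removed: " + deleteIfPresent(store, ready));
    }
}
```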


-----Original message-----
From:Thomas Jungblut <th...@googlemail.com>
To:hama-dev@incubator.apache.org
Date:Thu, 22 Sep 2011 10:05:52 +0200
Subject:Re: Awesome bench results after removing Thread.sleep in sync() method.

You're going to laugh, but we spend 80% of the time, logging the messages.
Let's change the log level to debug or remove the logging in the bench
example.

Sadly I still receive

org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode =
> NoNode for /bsp/job_201109220959_0001/224/ready
>

and it hangs forever. Current version is after you committed ChiaHung's
patch.
I'm in pseudo-distributed mode with 3 tasks.

Are you going to bench this without the logging? That would be interesting
though ;D


--
ChiaHung Lin
Department of Information Management
National University of Kaohsiung
Taiwan

Re: Awesome bench results after removing Thread.sleep in sync() method.

Posted by Thomas Jungblut <th...@googlemail.com>.
You're going to laugh, but we spend 80% of the time logging the messages.
Let's change the log level to debug or remove the logging in the bench
example.

Sadly I still receive

org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /bsp/job_201109220959_0001/224/ready

and it hangs forever. Current version is after you committed ChiaHung's
patch.
I'm in pseudo-distributed mode with 3 tasks.

Are you going to bench this without the logging? That would be interesting
though ;D

2011/9/22 Thomas Jungblut <th...@googlemail.com>

> That is great. I think we can push this under 200s.
> I attach a profiler and send you a list of hotspots.
>
> lg.



-- 
Thomas Jungblut
Berlin

mobile: 0170-3081070

business: thomas.jungblut@testberichte.de
private: thomas.jungblut@gmail.com

Re: Awesome bench results after removing Thread.sleep in sync() method.

Posted by Thomas Jungblut <th...@googlemail.com>.
That is great. I think we can push this under 200s.
I'll attach a profiler and send you a list of hotspots.

Regards.

2011/9/22 Edward J. Yoon <ed...@apache.org>

> By ChiaHung's HAMA-387.patch, hang problem is fixed.
>
> And also, on same environment (1 rack, 256 cores), a bench example
> result is dramatically improved. (184.076 seconds from 307.129
> seconds)
>
> ----
> # core/bin/hama jar
> examples/target/hama-examples-0.4.0-incubating-SNAPSHOT.jar bench 16
> 1000 512
> ..
> 11/09/22 10:27:32 INFO bsp.BSPJobClient: Current supersteps number: 504
> 11/09/22 10:27:35 INFO bsp.BSPJobClient: Current supersteps number: 508
> 11/09/22 10:27:38 INFO bsp.BSPJobClient: Current supersteps number: 512
> 11/09/22 10:27:38 INFO bsp.BSPJobClient: The total number of supersteps:
> 512
> Job Finished in 184.076 seconds
>
> Hama 0.4 (r.1163903) was:
>
> 16 bytes | 1000 | 512 | 307.129 seconds
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon
>



-- 
Thomas Jungblut
Berlin

mobile: 0170-3081070

business: thomas.jungblut@testberichte.de
private: thomas.jungblut@gmail.com