Posted to mapreduce-user@hadoop.apache.org by "Rajen Bhatt (RBEI/EST1)" <Ra...@in.bosch.com> on 2011/09/29 11:13:46 UTC

Execution Timing Comparison

Guys:
My student and I are new to the MapReduce and Hadoop world, so my question may be a very basic one.
We have taken one HD-resolution image and performed some basic image processing operations by partitioning it and sending the parts to multiple worker nodes. We are able to send JPEG and PNG images, and we are able to receive an image from the reducer after the basic operations.
We were debating whether it is OK to compare the timing of such an operation on Hadoop with its timing on a single machine.
Currently Hadoop seems to take more time, maybe because we are not actually running it on a cluster but on our quad-core laptop.
We also have the feeling that when we perform such operations on thousands or millions of images, we may get better timings than when checking a single image.
It would be good if the experts in this group could comment on that.
Thanks and Regards,

~~
Dr. Rajen Bhatt
(Corporate Research @ Robert Bosch, India)
Off: +91-80-4191-2025
Mob: +91-9901241005

Re: Execution Timing Comparison

Posted by Kamesh <ka...@imaginea.com>.

On Thursday 29 September 2011 02:56 PM, Tanweiguo wrote:
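> We also have the feeling that when we perform such operations on
> thousands or millions of images, we may get better timings than when
> checking a single image.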

When we run a short job on Hadoop, the framework overhead is larger
than the job's actual processing time. When MapReduce executes a job,
the TaskTracker (TT) needs to spawn at least one process (at least a map
JVM, if the job has no reducers). When you execute the same short job
as a non-Hadoop job, there is no framework overhead. I feel that could
be the reason for what you see in your experiment.
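
To make the overhead argument concrete, here is a rough toy model in Python. It is only a sketch: the function names and every number in it (20 s job setup, 1 s JVM startup per task, 2 s of work per image, 4 parallel slots, one map task per image) are made-up assumptions, not measurements from any cluster, and the single-machine side is modelled as purely sequential.

```python
# Toy cost model. All names and numbers are illustrative assumptions.

def local_time(n_images, work_per_image):
    """Sequential processing on one machine, with no framework overhead."""
    return n_images * work_per_image


def hadoop_time(n_images, work_per_image, job_overhead, task_overhead, slots):
    """Fixed job setup plus per-task JVM startup, one map task per image.

    Tasks run in waves of `slots` tasks at a time.
    """
    waves = -(-n_images // slots)  # ceiling division
    return job_overhead + waves * (task_overhead + work_per_image)


if __name__ == "__main__":
    for n in (1, 10, 1000):
        print(n, local_time(n, 2.0), hadoop_time(n, 2.0, 20.0, 1.0, 4))
```

With these assumed numbers the single image is much slower under the framework, while for 1000 images the fixed setup cost is spread over many tasks and the parallel run comes out ahead. Where the crossover lies depends entirely on the real overheads, which you would have to measure.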

Watch out for more comments.
-- 
Thanks & Regards,
Bh.V.S.Kamesh

Re: Execution Timing Comparison

Posted by Tanweiguo <ta...@huawei.com>.

You can check how many mappers and reducers were executed in the job. If there was only one of each, the job is expected to run slower than the same operation on a single machine without Hadoop.
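
As a rough guide to how many mappers to expect, the sketch below estimates the number of input splits for one splittable file as about one per HDFS block. It is an approximation under stated assumptions: the helper name is hypothetical, the 64 MB block size is an assumed default you should check against your own configuration, and the model ignores the small slack Hadoop allows on the last split. A custom InputFormat (for example one that partitions an image itself) would change the count.

```python
import math

# Assumed block size; check the value configured on your own cluster.
ASSUMED_BLOCK_SIZE = 64 * 1024 * 1024


def estimated_map_tasks(file_size_bytes, block_size_bytes=ASSUMED_BLOCK_SIZE):
    """Approximate number of map tasks for one splittable, non-empty file."""
    if file_size_bytes <= 0:
        raise ValueError("file size must be positive")
    return math.ceil(file_size_bytes / block_size_bytes)


if __name__ == "__main__":
    # A single image of a few megabytes fits in one block.
    print(estimated_map_tasks(6 * 1024 * 1024))
    print(estimated_map_tasks(200 * 1024 * 1024))
```

Under these assumptions a single image of a few megabytes yields one map task, so nothing runs in parallel and only the framework overhead is added.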
