Posted to common-dev@hadoop.apache.org by sree deepya <md...@gmail.com> on 2009/05/20 13:22:55 UTC

doubt regarding multiprocessing in hadoop

Hi,
   I have a small question about whether multiprocessing is possible in
Hadoop.
 To be more specific, I have some processes that take around a week or two
to execute. Is it possible to use Hadoop to reduce this execution time by
dividing these processes into a number of smaller processes and executing
them simultaneously on different nodes? Or is there any other way in which
my objective can be achieved using Hadoop?

   If so, could you please shed some light on this?

Thanks in advance,

regards,
SreeDeepya.

Re: doubt regarding multiprocessing in hadoop

Posted by Steve Loughran <st...@apache.org>.

Ted Dunning wrote:

> - you should realize that using hadoop, which is typical open source
> software, puts a substantial burden on you to find solutions.  Members of
> the community will help you, but they will help you enormously more if you
> make a strong effort to inform yourself first.
> 

Ted makes a good point here. You not only get the right to fiddle with 
the source, you get the responsibility to track down the problems that 
happen on your machines. The bugs may get fixed by other people -maybe- 
but without detailed diagnostics and stack traces, your problems will 
remain your problems, and not those of anyone else.

Re: doubt regarding multiprocessing in hadoop

Posted by Ted Dunning <te...@gmail.com>.

Have you read the papers on map-reduce?  It seems that you might be able to
answer the question yourself.

In particular, though,

- if you want real-time response, map-reduce in general and Hadoop
specifically are probably not the answer you need.

- if your problem is just to process independent chunks which can then be
later recombined and you care about throughput instead of latency, then
map-reduce might be very good.

- you should realize that using hadoop, which is typical open source
software, puts a substantial burden on you to find solutions.  Members of
the community will help you, but they will help you enormously more if you
make a strong effort to inform yourself first.
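
The second point above (independent chunks that are processed separately and later recombined) can be sketched locally. This is only an illustration under stated assumptions, not Hadoop code: `process_chunk`, `split_into_chunks` and `run_job` are hypothetical names, the upper-casing stands in for the real per-chunk work, and a thread pool on one machine stands in for the nodes of a cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def process_chunk(chunk):
    """Hypothetical stand-in for the expensive per-chunk work."""
    return chunk.upper()


def split_into_chunks(items, size):
    """Cut the input into independent pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def run_job(text, chunk_size=4, workers=3):
    """Process every chunk independently, then recombine the results."""
    chunks = split_into_chunks(text, chunk_size)
    # No chunk needs another chunk's result, so they can run concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        results = list(pool.map(process_chunk, chunks))
    # Recombine the partial results into one output.
    return "".join(results)
```

The pattern only pays off when the chunks really are independent; if one chunk needs the output of another, the work cannot be spread out this way.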

On Wed, May 20, 2009 at 11:50 AM, sree deepya <md...@gmail.com> wrote:

>   The process is related to speech synthesis. Chunks of speech blocks have
> to be processed, which may take time.
>

Re: doubt regarding multiprocessing in hadoop

Posted by sree deepya <md...@gmail.com>.

Hi,

   The process is related to speech synthesis. Chunks of speech blocks have
to be processed, which may take time.

regards,
SreeDeepya.

On Wed, May 20, 2009 at 8:20 PM, Peter Chacko <pe...@gmail.com> wrote:

> If your process is not inherently designed for concurrency, with
> multiple logical components running as separate threads of execution
> that can be dispatched to multiple cores in parallel, you cannot make
> it multi-core aware just by splitting it into threads. (If those
> threads are sequential by nature, you get the same compute latency.)
>
> If your process is computation-intensive, you can try using Intel
> CMP-compilers; if it is I/O-intensive, there is not much you can do
> unless it is a NUMA system (or any asymmetric cluster where you can
> leverage node-locality benefits).
>
> I am just curious to know what process this is that takes 1-2 weeks
> to execute on the CPUs currently available on the market.
>
> thanks

Re: doubt regarding multiprocessing in hadoop

Posted by Peter Chacko <pe...@gmail.com>.

If your process is not inherently designed for concurrency, with
multiple logical components running as separate threads of execution
that can be dispatched to multiple cores in parallel, you cannot make
it multi-core aware just by splitting it into threads. (If those
threads are sequential by nature, you get the same compute latency.)
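
One way to put numbers on this point is Amdahl's law, which is not mentioned in the thread but expresses the same idea: the speedup from adding workers is limited by the fraction of the work that has to run sequentially. The sketch below is an illustration only; the function name is hypothetical and any fractions fed to it are assumptions, not measurements of the process discussed here.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Speedup predicted by Amdahl's law.

    parallel_fraction: share of the run time that can be parallelised (0..1)
    workers: number of cores or nodes working on the parallel part
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is not sped up; the parallel part is divided among workers.
    return 1.0 / (serial_fraction + parallel_fraction / workers)
```

With a fully sequential process (`parallel_fraction` of 0) the result is 1 however many workers are added, which is the "same compute latency" case described above. As a purely hypothetical example, a job that is 90% parallelisable would be sped up by a factor of roughly 5.3 on 10 workers, not by 10.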

If your process is computation-intensive, you can try using Intel
CMP-compilers; if it is I/O-intensive, there is not much you can do
unless it is a NUMA system (or any asymmetric cluster where you can
leverage node-locality benefits).

I am just curious to know what process this is that takes 1-2 weeks
to execute on the CPUs currently available on the market.

thanks

Re: doubt regarding multiprocessing in hadoop

Posted by Steve Loughran <st...@apache.org>.

sree deepya wrote:
> Hi,
>    I have a small question about whether multiprocessing is possible in
> Hadoop.
>  To be more specific, I have some processes that take around a week or two
> to execute. Is it possible to use Hadoop to reduce this execution time by
> dividing these processes into a number of smaller processes and executing
> them simultaneously on different nodes? Or is there any other way in which
> my objective can be achieved using Hadoop?
> 
>    If so, could you please shed some light on this?
> 
> Thanks in advance,
> 
> regards,
> SreeDeepya.
> 

If you can rewrite your algorithms as Maps and Reduces, then yes. If 
not, then no.

http://wiki.apache.org/hadoop/MapReduce
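
To make "rewrite your algorithms as Maps and Reduces" more concrete, here is a minimal local sketch of the map, group-by-key and reduce steps in plain Python. It does not use the Hadoop API. The record layout `(utterance_id, chunk_index, text)`, all function names, and the upper-casing that stands in for the real per-chunk processing are assumptions made for illustration.

```python
from collections import defaultdict


def map_chunk(record):
    """Map: take one input record and emit (key, value) pairs."""
    utterance_id, chunk_index, text = record
    processed = text.upper()  # hypothetical placeholder for the real work
    yield utterance_id, (chunk_index, processed)


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_utterance(key, values):
    """Reduce: put one utterance's chunks back in order and join them."""
    ordered = [piece for _, piece in sorted(values)]
    return key, " ".join(ordered)


def run_local(records):
    """Run map, shuffle and reduce in one process, for illustration only."""
    pairs = [pair for record in records for pair in map_chunk(record)]
    grouped = shuffle(pairs)
    return dict(reduce_utterance(key, values) for key, values in grouped.items())
```

Each call to `map_chunk` depends only on its own record, so the map calls could run on different nodes; the reduce step only ever sees the values that share a key. An algorithm that cannot be expressed in this shape falls under the "If not, then no" case above.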