Posted to user@ignite.apache.org by dpolacza <dp...@gmail.com> on 2017/02/10 19:01:23 UTC

ignite as Hadoop's Map Reduce engine - how it works?

Hi,
Can someone describe how MapReduce is implemented when a Hadoop job is
processed?
How does shuffling work?
How is the reduce node chosen?


Is it possible to speed up job scheduling by changing the default settings? I
saw in the debug log that some transactions are initiated. Is it possible to
turn them off?

Regards



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/ignite-as-Hadoop-s-Map-Reduce-engine-how-it-works-tp10551.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: ignite as Hadoop's Map Reduce engine - how it works?

Posted by Vladimir Ozerov <vo...@gridgain.com>.

Hi,

Our Hadoop Accelerator has its own shuffle algorithm. When a job request
arrives, we assign mappers and reducers to the most appropriate nodes in
terms of data locality and available resources.
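
To make the assignment idea concrete, here is a rough sketch in plain Java. It
is not the accelerator's planner: the class PlannerSketch, the method names and
the scoring rule (prefer the least-loaded node that holds the split's data,
otherwise the least-loaded node overall) are all assumptions made for
illustration only.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from Ignite.
class PlannerSketch {
    /**
     * Assigns each input split to a node.
     * splitHosts: split id -> hosts that store the split's data.
     * nodes: hosts that are part of the cluster.
     */
    static Map<String, String> assignMappers(Map<String, List<String>> splitHosts,
                                             List<String> nodes) {
        if (nodes.isEmpty())
            throw new IllegalArgumentException("no nodes available");

        // "Available resources" is reduced here to a count of assigned mappers.
        Map<String, Integer> load = new LinkedHashMap<>();
        for (String n : nodes)
            load.put(n, 0);

        Map<String, String> plan = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : splitHosts.entrySet()) {
            // Data locality first: try the hosts that already hold the split.
            String best = leastLoaded(e.getValue(), load);
            // No cluster node holds the data: fall back to any node.
            if (best == null)
                best = leastLoaded(nodes, load);
            plan.put(e.getKey(), best);
            load.merge(best, 1, Integer::sum);
        }
        return plan;
    }

    /** Returns the candidate with the lowest load, or null if none is a cluster node. */
    static String leastLoaded(List<String> candidates, Map<String, Integer> load) {
        String best = null;
        for (String c : candidates) {
            Integer l = load.get(c);
            if (l == null)
                continue; // host is not part of the cluster
            if (best == null || l < load.get(best))
                best = c;
        }
        return best;
    }
}
```

A real planner would weigh more than a mapper count and would place reducers as
well; the sketch only shows how locality and load can be combined in one
decision.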

The shuffle itself adds K-V pairs destined for a local reducer to a sorted
collection right away. K-V pairs destined for remote reducers are packed into
batches and sent to them, where they are then added to a sorted collection.
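
The two paths described above can be sketched roughly as follows. This is a
simplified model in plain Java, not the accelerator's code: ShuffleSketch,
Reducer, Shuffle, batchSize and batchesSent are hypothetical names, the target
reducer is passed in explicitly instead of being computed by a partitioner, and
the "send" is an in-process call standing in for a network message.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; none of these names come from Ignite.
class ShuffleSketch {
    /** A reducer's input: values grouped by key, kept sorted by key. */
    static class Reducer {
        final TreeMap<String, List<Integer>> sorted = new TreeMap<>();

        void add(String key, int value) {
            sorted.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
    }

    /** Mapper-side shuffle running on the node that hosts reducer localId. */
    static class Shuffle {
        final int localId;
        final int batchSize;
        final Reducer[] reducers;
        // Pending batches for remote reducers: reducer id -> buffered pairs.
        final Map<Integer, List<Object[]>> pending = new HashMap<>();
        int batchesSent = 0;

        Shuffle(int localId, int batchSize, Reducer[] reducers) {
            this.localId = localId;
            this.batchSize = batchSize;
            this.reducers = reducers;
        }

        void emit(int target, String key, int value) {
            if (target == localId) {
                // Local reducer: goes into the sorted collection right away.
                reducers[target].add(key, value);
                return;
            }
            // Remote reducer: buffer the pair until the batch is full.
            List<Object[]> batch = pending.computeIfAbsent(target, t -> new ArrayList<>());
            batch.add(new Object[] {key, value});
            if (batch.size() >= batchSize)
                send(target);
        }

        /** Delivers one batch; the receiver adds its pairs to the sorted collection. */
        void send(int target) {
            List<Object[]> batch = pending.remove(target);
            if (batch == null || batch.isEmpty())
                return;
            batchesSent++;
            for (Object[] p : batch)
                reducers[target].add((String) p[0], (Integer) p[1]);
        }

        /** Sends whatever is still buffered once the mapper has finished. */
        void flush() {
            for (Integer t : new ArrayList<>(pending.keySet()))
                send(t);
        }
    }
}
```

Batching trades a little latency for fewer messages, which is presumably why
only the remote path is buffered while local pairs skip it entirely.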

Internally we share job state between nodes with the help of a cache; this is
why you may see occasional transactions.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/ignite-as-Hadoop-s-Map-Reduce-engine-how-it-works-tp10551p10592.html