Posted to hdfs-user@hadoop.apache.org by jamal sasha <ja...@gmail.com> on 2013/11/15 09:44:23 UTC

Dealing with stragglers in Hadoop

Hi,
I have a very simple use case: I have an edge list and I am trying to
convert it into an adjacency list. The input looks like this:

src  target
a    b
a    c
b    d
b    e

and so on. What I am trying to build is:

a [b,c]
b [d,e]
and so on.
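
To make the transformation concrete, here is a minimal local sketch in plain Python (not the actual MapReduce job). The function name build_adjacency is made up for illustration; it groups targets by source node, which is roughly what a reducer keyed on node id would do.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Group (src, target) pairs by src, like a reducer keyed on node id.

    Hypothetical helper for illustration only; a real job would emit
    (src, target) from the mapper and collect the targets in the reducer.
    """
    adjacency = defaultdict(list)
    for src, target in edges:
        adjacency[src].append(target)
    return dict(adjacency)


if __name__ == "__main__":
    edges = [("a", "b"), ("a", "c"), ("b", "d"), ("b", "e")]
    for node, targets in build_adjacency(edges).items():
        # Prints one adjacency line per source node, e.g. "a [b,c]"
        print(node, "[" + ",".join(targets) + "]")
```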

But every now and then I hit a super node, which has millions of edges.

Keying on just the node id therefore results in poor MR execution, because
the one reducer that receives the super node becomes a straggler.
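
A rough way to see the skew, as a sketch under some assumptions: the simulation below mimics the arithmetic of Hadoop's default HashPartitioner, (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks, in plain Python. The helper names, the node names, and the edge counts are all invented for illustration, and the string hash only matches Java for keys made of ordinary (BMP) characters.

```python
from collections import Counter


def java_string_hash(s):
    """Mimic Java's String.hashCode() as a signed 32-bit int.

    Assumption: keys contain only BMP characters, so ord() matches
    Java's UTF-16 code units.
    """
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - (1 << 32) if h >= (1 << 31) else h


def default_partition(key, num_reducers):
    """Same arithmetic as the default hash partitioning on the key."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reducers


def reducer_load(edges, num_reducers):
    """Count how many edges each reducer would receive when keyed on src."""
    load = Counter()
    for src, _target in edges:
        load[default_partition(src, num_reducers)] += 1
    return load


if __name__ == "__main__":
    # Made-up data: one super node with 1000 edges, 100 small nodes with 2 each.
    edges = [("super", "t%d" % i) for i in range(1000)]
    edges += [("n%d" % i, "t%d" % j) for i in range(100) for j in range(2)]
    for reducer, count in sorted(reducer_load(edges, 4).items()):
        print("reducer", reducer, "gets", count, "edges")
```

Because every edge of the super node carries the same key, all of them land on the same reducer no matter how many reducers the job has.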

I have been trying to understand the partitioner, but I am at a loss as to
how to use it here.

How do I solve this straggler issue?
Thanks