Posted to hdfs-user@hadoop.apache.org by jamal sasha <ja...@gmail.com> on 2013/11/15 09:44:23 UTC
Dealing with stragglers in hadoop
Hi,
I have a very simple use case.
I have an edge list and I am trying to convert it into an adjacency
list.
The input looks like this:
src target
a b
a c
b d
b e
and so on..
What I am trying to build is:
a [b,c]
b [d,e]
.. and so on..
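To make the transformation concrete, here is a minimal single-machine sketch of the grouping in plain Python. It is not the MapReduce job itself; the function name build_adjacency is just an illustrative choice, and the edges are the sample ones from above.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Group (src, target) pairs by src, keeping targets in input order."""
    adjacency = defaultdict(list)
    for src, target in edges:
        adjacency[src].append(target)
    return dict(adjacency)


# Sample edge list from the post.
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("b", "e")]
adjacency = build_adjacency(edges)

for node, targets in adjacency.items():
    print(node, targets)
```

In the MR version the map step would emit (src, target) and the reduce step would collect all targets for one src, which is why every edge of a given node ends up in the same reducer.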
But every now and then I hit a super node, which has millions of edges.
Keying on just the node ID then results in poor MR execution, because the
reducer that receives that node becomes a straggler.
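A rough sketch of why this happens, assuming keys are assigned to reducers by hash modulo the number of reducers (which, as far as I understand, is what the default hash partitioning does). The data here is made up for illustration: one hypothetical "hub" node with 10,000 edges plus 1,000 ordinary nodes with one edge each, and zlib.crc32 is only a deterministic stand-in for the real hash function.

```python
import zlib
from collections import Counter


def reducer_for(key, num_reducers):
    # Stand-in for hash partitioning: hash(key) mod number of reducers.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reducer_loads(edges, num_reducers):
    """Count how many (src, target) records each reducer would receive."""
    loads = Counter()
    for src, _target in edges:
        loads[reducer_for(src, num_reducers)] += 1
    return loads


# Hypothetical data: one super node and many small nodes.
hub_edges = [("hub", "n%d" % i) for i in range(10000)]
normal_edges = [("s%d" % i, "t%d" % i) for i in range(1000)]

loads = reducer_loads(hub_edges + normal_edges, num_reducers=8)
for reducer_id in sorted(loads):
    print(reducer_id, loads[reducer_id])
```

Whichever reducer the hub key hashes to receives all 10,000 of its records, no matter how many reducers there are, since all records with the same key go to the same place.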
I have been trying to understand the Partitioner, but I am at a loss as to
how to use it here.
How do I solve this straggler issue?
Thanks