Posted to user@hama.apache.org by Praveen Sripati <pr...@gmail.com> on 2012/04/19 03:01:03 UTC

Partitioning in Hama

1) Let's say the input is partitioned into part0, part1, part2, part3 and
part4. How is it ensured that bsp0 processes part0, bsp1 processes part1,
and so on, with no mix-up? We don't want bsp0 to process part1.

private void send(BSPPeerProtocol peer, BSPMessage msg) throws IOException {
  int mod = ((Integer) msg.getTag()) % peer.getAllPeerNames().length;
  peer.send(peer.getAllPeerNames()[mod], msg);
}

2) If a partition file is larger than the HDFS block size and more than one
bsp task processes a single partition, how is this scenario handled?

Thanks,
Praveen

Re: Partitioning in Hama

Posted by Thomas Jungblut <th...@googlemail.com>.
Sorry for the delay.

1) This works similarly to how Hadoop distributes keys to the reducers:
a HashPartitioner rewrites the vertices into n files, where n is the number
of tasks.
2) The block size doesn't matter in this case, because a file split will be
associated with the partitioned files.
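
To illustrate point 1, here is a minimal sketch in plain Java of hash partitioning into n buckets. It is not Hama's actual partitioning code; the class and method names (HashPartitionSketch, partitionFor, partition) are hypothetical, and the only thing borrowed is the formula Hadoop's HashPartitioner uses (mask the sign bit of hashCode(), then take the modulo of the number of tasks). Each bucket stands in for one of the n partition files.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Hama or Hadoop API.
final class HashPartitionSketch {

  // Same formula as Hadoop's HashPartitioner: clear the sign bit so the
  // result is never negative, then take the modulo of the task count.
  static int partitionFor(Object key, int numTasks) {
    return (key.hashCode() & Integer.MAX_VALUE) % numTasks;
  }

  // Groups keys into numTasks buckets; bucket i stands in for the
  // partition file that task i would read.
  static <K> List<List<K>> partition(List<K> keys, int numTasks) {
    List<List<K>> parts = new ArrayList<>();
    for (int i = 0; i < numTasks; i++) {
      parts.add(new ArrayList<>());
    }
    for (K key : keys) {
      parts.get(partitionFor(key, numTasks)).add(key);
    }
    return parts;
  }
}
```

With five tasks, integer vertex IDs 2 and 7 both land in bucket 2. If messages are routed by the same modulo rule, as in the send() method from the question, a message tagged with a non-negative vertex ID goes to the peer at the same index as that vertex's bucket, assuming the peer order matches the partition order.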

On 19 April 2012 at 03:01, Praveen Sripati <pr...@gmail.com> wrote:

> 1) Let's say the input is partitioned into part0, part1, part2, part3 and
> part4. How is it ensured that bsp0 processes part0, bsp1 processes part1,
> and so on, with no mix-up? We don't want bsp0 to process part1.
>
> private void send(BSPPeerProtocol peer, BSPMessage msg) throws IOException
> {
>  int mod = ((Integer) msg.getTag()) % peer.getAllPeerNames().length;
>  peer.send(peer.getAllPeerNames()[mod], msg);
> }
>
> 2) If a partition file is larger than the HDFS block size and more than one
> bsp task processes a single partition, how is this scenario handled?
>
> Thanks,
> Praveen
>



-- 
Thomas Jungblut
Berlin <th...@gmail.com>