Posted to user@tez.apache.org by Maria <li...@126.com> on 2016/04/24 17:22:08 UTC

How does Tez know which reduce task a map task's outputs route to after auto-reduce is done?

Hi all,
     Recently I have been studying the Tez shuffle logic. I read "Apache Tez: Dynamic Graph Reconfiguration" and have two questions:
(1) I am not clear about this passage: "The data samples could be sent via the VertexManager events to the vertex manager that can create the key-range histogram and determine the correct number of partitions. It can then assign the appropriate key-ranges to each partition". How does Tez assign the appropriate key-ranges to each partition? Is it done by events?
(2) Before auto-reduce, the partitioner decides which reduce task a map task's outputs go to. But after auto-reduce, the number of reduce tasks decreases. How does Tez decide which reduce task a map task's outputs are routed to once auto-reduce is done?
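
To make question (2) concrete, here is a minimal sketch of my understanding. It is not Tez code: the function names are hypothetical, and the hash-modulo rule is an assumption modeled on a typical hash partitioner.

```python
def java_string_hash(s: str) -> int:
    """Deterministic 32-bit hash in the style of Java's String.hashCode() (ASCII keys assumed)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h


def hash_partition(key: str, num_partitions: int) -> int:
    """Assumed partitioner rule: non-negative hash modulo the partition count."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_partitions


keys = ["a", "b", "c", "d"]

# Partition chosen on the map side when the reduce vertex is planned with 4 tasks.
before = {k: hash_partition(k, 4) for k in keys}

# Partition the same keys would get if the partitioner were re-run with 2 tasks.
after = {k: hash_partition(k, 2) for k in keys}

print(before)  # {'a': 1, 'b': 2, 'c': 3, 'd': 0}
print(after)   # {'a': 1, 'b': 0, 'c': 1, 'd': 0}
```

The map outputs were already written with 4 partitions, so re-running the partitioner with 2 does not look possible once the maps have finished. What I want to know is how each of the existing partitions gets assigned to one of the remaining reduce tasks.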

Thanks in advance for any reply.


Maria.Lu