Posted to issues@tez.apache.org by "Siddharth Seth (JIRA)" <ji...@apache.org> on 2014/04/30 01:24:20 UTC

[jira] [Updated] (TEZ-661) Implement a non-sorted partitioned output

     [ https://issues.apache.org/jira/browse/TEZ-661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated TEZ-661:
-------------------------------

    Summary: Implement a non-sorted partitioned output  (was: Implement a non-sorted scatter-gather input/output)

> Implement a non-sorted partitioned output
> -----------------------------------------
>
>                 Key: TEZ-661
>                 URL: https://issues.apache.org/jira/browse/TEZ-661
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Daniel Dai
>            Assignee: Siddharth Seth
>         Attachments: TEZ-661.1.txt
>
>
> When implementing Pig union, we need to gather data from two or more upstream vertices without sorting. The vertex itself might consist of several tasks. Ideally, it should use OnFileUnorderedKVOutput with DataMovementType.SCATTER_GATHER. However, according to [~hitesh], this combination does not work, so we need to implement it. Also, the key is meaningless in this scenario; we just want to distribute the output records evenly across tasks.
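
For illustration only, the "evenly distribute the output records to tasks" requirement could be met by a partitioner that ignores the key and assigns records round-robin. The sketch below is self-contained Java and does not use any Tez classes; the class name RoundRobinPartitioner and its getPartition signature are assumptions made for this example, not the API added by this issue.

```java
/**
 * Hypothetical sketch (not part of the Tez API): assigns each record to the
 * next partition in turn, ignoring the key, so that output is spread evenly
 * across the consuming tasks without any sorting.
 */
class RoundRobinPartitioner {
    // Index of the partition that the next record will be sent to.
    private int next = 0;

    /**
     * Returns the partition for a record. The key and value are accepted
     * only to mirror a typical partitioner signature; neither is inspected.
     */
    int getPartition(Object key, Object value, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // The modulo keeps the result in range even if numPartitions changes.
        int partition = next % numPartitions;
        next = (partition + 1) % numPartitions;
        return partition;
    }

    public static void main(String[] args) {
        RoundRobinPartitioner partitioner = new RoundRobinPartitioner();
        int numPartitions = 3;
        int[] counts = new int[numPartitions];
        // Every record carries the same key, yet the records still spread out.
        for (int i = 0; i < 9; i++) {
            counts[partitioner.getPartition("same-key", i, numPartitions)]++;
        }
        for (int p = 0; p < numPartitions; p++) {
            System.out.println("partition " + p + ": " + counts[p] + " records");
        }
    }
}
```

A hash of the key would skew the distribution whenever keys repeat, which is why this sketch uses a simple counter instead; each producing task would hold its own counter.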



--
This message was sent by Atlassian JIRA
(v6.2#6252)