Posted to issues@tez.apache.org by "Bikas Saha (JIRA)" <ji...@apache.org> on 2014/08/14 00:42:16 UTC

[jira] [Comment Edited] (TEZ-1132) Consistent naming of Input and Outputs

    [ https://issues.apache.org/jira/browse/TEZ-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096248#comment-14096248 ] 

Bikas Saha edited comment on TEZ-1132 at 8/13/14 10:41 PM:
-----------------------------------------------------------

LocalOnFileSorterOutput should probably be removed.
OnFileSortedOutput -> OnFileOrderedPartitionedKVOutput
Change KV to KeyValue in all names.

Do we need the OnFile prefix on these? These could potentially write to HDFS?

LocalMergedInput should probably be moved out.
SortedGroupedMergedInput -> OrderedGroupedMergedInput
ShuffledMergedInput -> ShuffledOrderedGroupedInput
ShuffledMergedInputLegacy -> ShuffledOrderedGroupedInput

Is the Shuffled prefix needed? The reader threads could potentially read from HDFS?


[~zjffdu] Do you mind if I take this over? This may be easier to do in PST, as most of the Hive/Pig people whose code will be broken by this change are in the same time zone; they could iterate on it faster and ask for help if needed.





> Consistent naming of Input and Outputs
> --------------------------------------
>
>                 Key: TEZ-1132
>                 URL: https://issues.apache.org/jira/browse/TEZ-1132
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Jeff Zhang
>            Priority: Blocker
>
> In some places we use Sorted Partitioned. In others we use Shuffled. We should use a consistent naming scheme based on Sorted, Grouped, Partitioned sub-terms so that the function is clear from the name.


