Posted to issues@tez.apache.org by "Bikas Saha (JIRA)" <ji...@apache.org> on 2013/10/08 03:15:43 UTC

[jira] [Updated] (TEZ-534) Add an InputFormat that combines original splits into groups

     [ https://issues.apache.org/jira/browse/TEZ-534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bikas Saha updated TEZ-534:
---------------------------

    Issue Type: Improvement  (was: Bug)

> Add an InputFormat that combines original splits into groups
> ------------------------------------------------------------
>
>                 Key: TEZ-534
>                 URL: https://issues.apache.org/jira/browse/TEZ-534
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>
> It would be useful to take the original input splits generated by a FileFormat and combine them, using location information, into a desired number of grouped splits. This keeps the advantage of using the format's native logic to create the original splits, as opposed to CombineFileInputFormat, which does its own split calculation. However, native splits may be too numerous for efficient task execution. Combining them into a desired number of grouped splits will help.
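
The grouping idea described above could be sketched roughly as follows. This is a minimal illustration only, not the Tez implementation: the SplitGrouper class, its Split type, and the group method are hypothetical names invented for this sketch, each split is assumed to carry a single preferred host, and no Hadoop classes are used. Splits are bucketed by host, then packed greedily into groups until each group reaches the target size (total bytes divided by the desired group count).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: not part of Tez or Hadoop.
public class SplitGrouper {

    /** Stand-in for an input split: a name, a length in bytes, and one preferred host (assumption). */
    public static final class Split {
        public final String name;
        public final long length;
        public final String host;

        public Split(String name, long length, String host) {
            this.name = name;
            this.length = length;
            this.host = host;
        }
    }

    /**
     * Combines the given splits into roughly desiredGroups groups.
     * Splits on the same host are kept adjacent, so most groups stay host-local.
     */
    public static List<List<Split>> group(List<Split> splits, int desiredGroups) {
        if (desiredGroups <= 0) {
            throw new IllegalArgumentException("desiredGroups must be positive");
        }

        // Bucket splits by host; TreeMap gives a deterministic host order.
        long totalLength = 0;
        Map<String, List<Split>> byHost = new TreeMap<>();
        for (Split s : splits) {
            totalLength += s.length;
            byHost.computeIfAbsent(s.host, h -> new ArrayList<>()).add(s);
        }

        // Target bytes per group, rounded up so we do not overshoot the group count.
        long target = Math.max(1, (totalLength + desiredGroups - 1) / desiredGroups);

        List<List<Split>> groups = new ArrayList<>();
        List<Split> current = new ArrayList<>();
        long currentLength = 0;

        // Walk hosts in order and close a group once it reaches the target size.
        for (List<Split> hostSplits : byHost.values()) {
            for (Split s : hostSplits) {
                current.add(s);
                currentLength += s.length;
                if (currentLength >= target) {
                    groups.add(current);
                    current = new ArrayList<>();
                    currentLength = 0;
                }
            }
        }
        if (!current.isEmpty()) {
            groups.add(current); // leftover, under-filled group
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Split> splits = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            splits.add(new Split("a" + i, 100, "hostA"));
            splits.add(new Split("b" + i, 100, "hostB"));
        }
        for (List<Split> g : group(splits, 2)) {
            System.out.println(g.size() + " splits, first host " + g.get(0).host);
        }
    }
}
```

A group that fills up at a host boundary may mix splits from two hosts. A real implementation would likely also consider rack locality and splits with several replica locations, which this sketch ignores.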



--
This message was sent by Atlassian JIRA
(v6.1#6144)