Posted to dev@hive.apache.org by "Navis (JIRA)" <ji...@apache.org> on 2012/07/23 10:02:34 UTC

[jira] [Created] (HIVE-3290) BucketizedHiveInputFormat should support combining files having same bucket number

Navis created HIVE-3290:
---------------------------

             Summary: BucketizedHiveInputFormat should support combining files having same bucket number
                 Key: HIVE-3290
                 URL: https://issues.apache.org/jira/browse/HIVE-3290
             Project: Hive
          Issue Type: Improvement
          Components: Query Processor
    Affects Versions: 0.10.0
            Reporter: Navis
            Assignee: Navis
            Priority: Minor


Currently, BucketizedHiveInputFormat creates one split per input file, which can result in too many map tasks. If the input files are small (below some threshold, perhaps configurable?), combining files that share the same bucket number and the same input format could help reduce total execution time.
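
A rough sketch of the grouping idea, for illustration only. It is not Hive code and does not use any Hive API: the class BucketCombineSketch, the InputFile type, and the maxSplitBytes parameter are all hypothetical names, and the bucket number of each file is assumed to be already known. Files are grouped by (bucket number, input format), and within each group small files are packed greedily into one combined split until the size threshold would be exceeded.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BucketCombineSketch {

    /** Hypothetical description of one input file; not a Hive class. */
    public static final class InputFile {
        public final String path;
        public final int bucket;          // assumed to be known up front
        public final String inputFormat;  // e.g. an input format class name
        public final long length;         // file size in bytes

        public InputFile(String path, int bucket, String inputFormat, long length) {
            this.path = path;
            this.bucket = bucket;
            this.inputFormat = inputFormat;
            this.length = length;
        }
    }

    /**
     * Groups files by (bucket, inputFormat) and packs each group into
     * combined splits of at most maxSplitBytes. A single file larger than
     * the threshold still gets a split of its own.
     */
    public static List<List<InputFile>> combine(List<InputFile> files, long maxSplitBytes) {
        // LinkedHashMap keeps groups in first-seen order, so output is deterministic.
        Map<String, List<InputFile>> groups = new LinkedHashMap<>();
        for (InputFile f : files) {
            String key = f.bucket + "|" + f.inputFormat;
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(f);
        }

        List<List<InputFile>> splits = new ArrayList<>();
        for (List<InputFile> group : groups.values()) {
            List<InputFile> current = new ArrayList<>();
            long currentBytes = 0;
            for (InputFile f : group) {
                // Close the current split if adding this file would exceed the threshold.
                if (!current.isEmpty() && currentBytes + f.length > maxSplitBytes) {
                    splits.add(current);
                    current = new ArrayList<>();
                    currentBytes = 0;
                }
                current.add(f);
                currentBytes += f.length;
            }
            if (!current.isEmpty()) {
                splits.add(current);
            }
        }
        return splits;
    }
}
```

With this sketch, files from different buckets or with different input formats never land in the same split, so each combined split could still be handled by a single map task reading a single bucket.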
