Posted to dev@hive.apache.org by "Namit Jain (JIRA)" <ji...@apache.org> on 2009/11/13 03:14:39 UTC

[jira] Created: (HIVE-929) hive.map.mergefiles increases the size in some cases

hive.map.mergefiles increases the size in some cases
----------------------------------------------------

                 Key: HIVE-929
                 URL: https://issues.apache.org/jira/browse/HIVE-929
             Project: Hadoop Hive
          Issue Type: Bug
            Reporter: Namit Jain


Due to random clustering, the size is increased in some cases.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HIVE-929) hive.map.mergefiles increases the size in some cases

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain reassigned HIVE-929:
-------------------------------

    Assignee: He Yongqiang

> hive.map.mergefiles increases the size in some cases
> ----------------------------------------------------
>
>                 Key: HIVE-929
>                 URL: https://issues.apache.org/jira/browse/HIVE-929
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Namit Jain
>            Assignee: He Yongqiang
>
> Due to random clustering, the size is increased in some cases.



[jira] Resolved: (HIVE-929) hive.map.mergefiles increases the size in some cases

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain resolved HIVE-929.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.5.0
     Hadoop Flags: [Reviewed]

Committed. Thanks, Yongqiang.

> hive.map.mergefiles increases the size in some cases
> ----------------------------------------------------
>
>                 Key: HIVE-929
>                 URL: https://issues.apache.org/jira/browse/HIVE-929
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Namit Jain
>            Assignee: He Yongqiang
>             Fix For: 0.5.0
>
>         Attachments: hive-929-2009-11-13.patch
>
>
> Due to random clustering, the size is increased in some cases.



[jira] Commented: (HIVE-929) hive.map.mergefiles increases the size in some cases

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777348#action_12777348 ] 

Namit Jain commented on HIVE-929:
---------------------------------

Currently, we use only one size:

"hive.merge.size.per.task"

whose default value is 256M.

We should add another parameter

"hive.merge.smallfiles.avgsize"

whose default value can be much smaller, say 16M.

We will merge only if the current average file size is less than "hive.merge.smallfiles.avgsize".

This ensures that merging happens only in very bad cases, when the output files are very small on average.
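
As a rough illustration of the proposed check, here is a minimal Python sketch. It is not the code from the attached patch: the function name should_merge and the constant names are hypothetical, and the 16M threshold is simply the example value suggested above. It only shows the decision rule, i.e. merge when the average file size falls below the threshold.

```python
from typing import Sequence

MB = 1024 * 1024

# Hypothetical constant: 16M is the example default suggested in the comment
# for "hive.merge.smallfiles.avgsize"; the committed default may differ.
MERGE_SMALLFILES_AVGSIZE = 16 * MB


def should_merge(file_sizes: Sequence[int],
                 avg_size_threshold: int = MERGE_SMALLFILES_AVGSIZE) -> bool:
    """Return True only when the average file size is below the threshold."""
    if not file_sizes:
        # Nothing was written, so there is nothing to merge.
        return False
    average = sum(file_sizes) / len(file_sizes)
    return average < avg_size_threshold


if __name__ == "__main__":
    many_small = [1 * MB] * 100        # average 1M: merging is worthwhile
    few_large = [64 * MB, 128 * MB]    # average 96M: skip the merge
    print(should_merge(many_small))
    print(should_merge(few_large))
```

With this rule, a job that already produces reasonably sized files skips the merge step entirely, which avoids the size increase described in this issue.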

> hive.map.mergefiles increases the size in some cases
> ----------------------------------------------------
>
>                 Key: HIVE-929
>                 URL: https://issues.apache.org/jira/browse/HIVE-929
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Namit Jain
>
> Due to random clustering, the size is increased in some cases.



[jira] Commented: (HIVE-929) hive.map.mergefiles increases the size in some cases

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777810#action_12777810 ] 

Namit Jain commented on HIVE-929:
---------------------------------

+1

looks good - will commit if the tests pass

> hive.map.mergefiles increases the size in some cases
> ----------------------------------------------------
>
>                 Key: HIVE-929
>                 URL: https://issues.apache.org/jira/browse/HIVE-929
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Namit Jain
>            Assignee: He Yongqiang
>         Attachments: hive-929-2009-11-13.patch
>
>
> Due to random clustering, the size is increased in some cases.



[jira] Updated: (HIVE-929) hive.map.mergefiles increases the size in some cases

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-929:
------------------------------

    Attachment: hive-929-2009-11-13.patch

> hive.map.mergefiles increases the size in some cases
> ----------------------------------------------------
>
>                 Key: HIVE-929
>                 URL: https://issues.apache.org/jira/browse/HIVE-929
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Namit Jain
>            Assignee: He Yongqiang
>         Attachments: hive-929-2009-11-13.patch
>
>
> Due to random clustering, the size is increased in some cases.
