Posted to dev@hive.apache.org by "Ning Zhang (JIRA)" <ji...@apache.org> on 2011/03/09 08:13:59 UTC

[jira] Created: (HIVE-2037) Merge result file size should honor hive.merge.size.per.task

Merge result file size should honor hive.merge.size.per.task
------------------------------------------------------------

                 Key: HIVE-2037
                 URL: https://issues.apache.org/jira/browse/HIVE-2037
             Project: Hive
          Issue Type: Bug
            Reporter: Ning Zhang
            Assignee: Ning Zhang
         Attachments: HIVE-2037.patch

The merge job sets mapred.min.split.size to the value of hive.merge.size.per.task, which is roughly equal to the output file size. However, the input split size is also determined by mapred.min.split.size.per.node, mapred.min.split.size.per.rack, and mapred.max.split.size. These three properties should also be set to the value of hive.merge.size.per.task.
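
As a rough illustration of the proposed behavior (not the contents of HIVE-2037.patch), the sketch below copies the merge target size into all four split-size properties named above. The plain dict standing in for the job configuration and the function name align_split_sizes are assumptions made for this example only; the property names come from the description.

```python
# Property names are taken from the issue description. The dict is a
# stand-in for a Hadoop/Hive job configuration (an assumption of this sketch).
MERGE_SIZE_KEY = "hive.merge.size.per.task"
SPLIT_SIZE_KEYS = (
    "mapred.min.split.size",
    "mapred.min.split.size.per.node",
    "mapred.min.split.size.per.rack",
    "mapred.max.split.size",
)


def align_split_sizes(conf):
    """Return a copy of conf with every split-size property set to the merge target."""
    target = conf[MERGE_SIZE_KEY]
    aligned = dict(conf)  # leave the caller's configuration untouched
    for key in SPLIT_SIZE_KEYS:
        aligned[key] = target
    return aligned


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the effect.
    job_conf = {
        "hive.merge.size.per.task": "256000000",
        "mapred.max.split.size": "64000000",
    }
    for name, value in sorted(align_split_sizes(job_conf).items()):
        print(name, "=", value)
```

With all four properties pinned to the same value, no single split bound can pull the merge input splits away from the intended size per task.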

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Updated: (HIVE-2037) Merge result file size should honor hive.merge.size.per.task

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-2037:
-----------------------------

    Attachment: HIVE-2037.patch

[jira] Commented: (HIVE-2037) Merge result file size should honor hive.merge.size.per.task

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004852#comment-13004852 ] 

Ning Zhang commented on HIVE-2037:
----------------------------------

@joy, the unit tests are clean. 

[jira] Updated: (HIVE-2037) Merge result file size should honor hive.merge.size.per.task

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-2037:
-----------------------------

    Status: Patch Available  (was: Open)

[jira] Commented: (HIVE-2037) Merge result file size should honor hive.merge.size.per.task

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004866#comment-13004866 ] 

Joydeep Sen Sarma commented on HIVE-2037:
-----------------------------------------

committed. thanks Ning.

[jira] Commented: (HIVE-2037) Merge result file size should honor hive.merge.size.per.task

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004802#comment-13004802 ] 

Joydeep Sen Sarma commented on HIVE-2037:
-----------------------------------------

looks ok - please run the tests and i will commit.

[jira] Updated: (HIVE-2037) Merge result file size should honor hive.merge.size.per.task

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joydeep Sen Sarma updated HIVE-2037:
------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.8.0
           Status: Resolved  (was: Patch Available)
