Posted to dev@mrql.apache.org by "Leonidas Fegaras (JIRA)" <ji...@apache.org> on 2014/10/19 00:45:34 UTC

[jira] [Resolved] (MRQL-54) Adjust the split size of a map-reduce input file based on the number of requested nodes

     [ https://issues.apache.org/jira/browse/MRQL-54?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Leonidas Fegaras resolved MRQL-54.
----------------------------------
    Resolution: Fixed

The patch was applied to the git master.

> Adjust the split size of a map-reduce input file based on the number of requested nodes
> ---------------------------------------------------------------------------------------
>
>                 Key: MRQL-54
>                 URL: https://issues.apache.org/jira/browse/MRQL-54
>             Project: MRQL
>          Issue Type: Improvement
>          Components: Run-Time/MapReduce
>    Affects Versions: 0.9.4
>            Reporter: Leonidas Fegaras
>            Assignee: Leonidas Fegaras
>            Priority: Critical
>         Attachments: MRQL-54.patch
>
>
> This patch fixes a performance problem reported by Eldon Carman. It improves the degree of parallelism of map tasks in map-reduce mode. Previously, mapred.min.split.size was set to 256 MB before each map-reduce task, which prevented mappers from using all requested cluster nodes (although the number of reducers was set correctly using setNumReduceTasks). Now mapred.min.split.size and mapred.max.split.size are both set based on the input size and the number of requested nodes.
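
For illustration only, the idea described above can be sketched as follows. This is not the code from MRQL-54.patch: the class and method names are hypothetical, and the formula (input size divided by the number of requested nodes, rounded up) is an assumption about what "based on the input size and the number of requested nodes" might look like. Only the two property names come from the issue text. A plain Map stands in for the Hadoop job configuration so the sketch runs without Hadoop on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; NOT the code from MRQL-54.patch.
class SplitSizeSketch {

    // Assumed formula: one split per requested node, i.e. the ceiling of
    // inputBytes / requestedNodes, and never smaller than 1 byte.
    static long splitSize(long inputBytes, int requestedNodes) {
        if (inputBytes < 0 || requestedNodes <= 0) {
            throw new IllegalArgumentException("need inputBytes >= 0 and requestedNodes > 0");
        }
        long size = inputBytes / requestedNodes;
        if (inputBytes % requestedNodes != 0) {
            size += 1; // round up so the splits cover the whole input
        }
        return Math.max(1L, size);
    }

    // Builds the two settings named in the issue. Setting min and max to the
    // same value pins the split size; a Map is used here in place of the
    // real job configuration object.
    static Map<String, String> splitSettings(long inputBytes, int requestedNodes) {
        long size = splitSize(inputBytes, requestedNodes);
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("mapred.min.split.size", Long.toString(size));
        settings.put("mapred.max.split.size", Long.toString(size));
        return settings;
    }

    public static void main(String[] args) {
        long oneGiB = 1L << 30;
        // A 1 GiB input on 8 requested nodes gives 128 MiB splits under
        // the assumed formula.
        System.out.println(splitSettings(oneGiB, 8));
    }
}
```

Under this assumption, a fixed 256 MB minimum would yield only 4 splits for a 1 GiB input, leaving half of 8 requested nodes without a map task, whereas the per-node split size yields 8.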



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)