Posted to mapreduce-issues@hadoop.apache.org by "Chris Douglas (JIRA)" <ji...@apache.org> on 2009/12/04 08:55:20 UTC

[jira] Commented: (MAPREDUCE-1257) Ability to grab the number of spills

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785805#action_12785805 ] 

Chris Douglas commented on MAPREDUCE-1257:
------------------------------------------

I disagree with the premise of this issue. What information does the number of spills provide? What's important for performance is how many records are hitting local disk as intermediate data. Consider applications running with a combiner; only a small fraction of data spilled may hit disk. The number of spills is a crude approximation of what the existing metrics provide: guidance on how to set io.sort.record.percent, which will be removed in 0.22.
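
To make the combiner point concrete, here is a minimal sketch of a toy spill model. It is not Hadoop code: the class SpillSketch, the method simulate, and the record-count buffer are all hypothetical names and simplifications made up for illustration (the real map-side buffer is sized in bytes and spills at a configurable threshold). The model assumes a combiner that collapses all buffered records sharing a key into one record at spill time.

```java
import java.util.HashMap;
import java.util.Map;

public class SpillSketch {
    /** Outcome of one simulated map task (hypothetical type, for illustration only). */
    static final class Result {
        final int spills;          // how many times the buffer was flushed
        final long recordsWritten; // how many records actually reached "disk"

        Result(int spills, long recordsWritten) {
            this.spills = spills;
            this.recordsWritten = recordsWritten;
        }
    }

    /**
     * Toy model: the buffer holds bufferRecords records and is flushed when full.
     * With a combiner, each flush writes one record per distinct key in the buffer;
     * without one, every buffered record is written.
     */
    static Result simulate(String[] keys, int bufferRecords, boolean useCombiner) {
        int spills = 0;
        long written = 0;
        int buffered = 0;
        Map<String, Integer> perKey = new HashMap<>();

        for (String key : keys) {
            perKey.merge(key, 1, Integer::sum);
            buffered++;
            if (buffered == bufferRecords) {
                spills++;
                written += useCombiner ? perKey.size() : buffered;
                perKey.clear();
                buffered = 0;
            }
        }
        if (buffered > 0) { // flush whatever is left at the end of the task
            spills++;
            written += useCombiner ? perKey.size() : buffered;
        }
        return new Result(spills, written);
    }

    public static void main(String[] args) {
        // 1000 input records spread over only 4 distinct keys.
        String[] keys = new String[1000];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = "k" + (i % 4);
        }
        Result plain = simulate(keys, 100, false);
        Result combined = simulate(keys, 100, true);
        System.out.println("no combiner: spills=" + plain.spills
                + " records=" + plain.recordsWritten);
        System.out.println("combiner:    spills=" + combined.spills
                + " records=" + combined.recordsWritten);
    }
}
```

In this model both runs report the same number of spills, while the number of records written differs by a large factor. That is the sense in which a spill count alone says little about how much intermediate data reaches local disk.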

> Ability to grab the number of spills
> ------------------------------------
>
>                 Key: MAPREDUCE-1257
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1257
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>    Affects Versions: 0.22.0
>            Reporter: Sriranjan Manjunath
>            Assignee: Todd Lipcon
>             Fix For: 0.22.0
>
>         Attachments: mapreduce-1257.txt
>
>
> The counters should have information about the number of spills in addition to the number of spill records.
