Posted to common-dev@hadoop.apache.org by "Doug Cutting (JIRA)" <ji...@apache.org> on 2008/05/31 01:04:45 UTC

[jira] Commented: (HADOOP-3473) io.sort.factor should default to 100 instead of 10

    [ https://issues.apache.org/jira/browse/HADOOP-3473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12601290#action_12601290 ] 

Doug Cutting commented on HADOOP-3473:
--------------------------------------

Changing this has memory implications, no?  A buffer is allocated for each stream being merged.  Each buffer should be large enough that transfer time dominates seek time: at 10ms/seek and 100MB/s transfer, seek time equals transfer time for a 1MB buffer.  So for merging not to be seek-bound with 100 buffers, the total buffer size needs to be substantially larger than 100MB, which is currently the default for io.sort.mb.  So I can see increasing this to 50 w/o changing the default for io.sort.mb.
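
To make the arithmetic above concrete, here is a rough back-of-envelope sketch in Python.  It is not Hadoop code: the function names are made up for illustration, MB is taken as 10^6 bytes, and it assumes the whole of io.sort.mb is split evenly across the streams being merged, which is a simplification.

```python
MB = 1_000_000  # assumption: decimal megabytes, to match the round numbers above


def break_even_bytes(seek_ms, transfer_bytes_per_s):
    """Buffer size at which reading the buffer takes as long as one seek."""
    return seek_ms * transfer_bytes_per_s // 1000


def transfer_share(buffer_bytes, seek_ms, transfer_bytes_per_s):
    """Fraction of each (seek + read) cycle spent actually transferring data."""
    transfer_ms = buffer_bytes * 1000 / transfer_bytes_per_s
    return transfer_ms / (seek_ms + transfer_ms)


if __name__ == "__main__":
    seek_ms = 10            # 10ms/seek, as in the comment
    rate = 100 * MB         # 100MB/s transfer, as in the comment
    sort_buffer = 100 * MB  # current default for io.sort.mb

    print("break-even buffer:", break_even_bytes(seek_ms, rate), "bytes")
    for factor in (10, 50, 100):
        per_stream = sort_buffer // factor  # assumed even split across streams
        share = transfer_share(per_stream, seek_ms, rate)
        print(f"factor={factor}: {per_stream // MB}MB per stream, "
              f"{share:.0%} of time transferring")
```

Under these assumptions a factor of 100 leaves 1MB per stream, exactly the break-even point (half the time is spent seeking), while a factor of 50 leaves 2MB per stream, so roughly two thirds of the time goes to transfer.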

BTW, you've proposed a solution in the description rather than a problem.  The problem, I assume, is that the sort-factor is non-optimal.  Perhaps a better solution to this problem is to not specify the sort factor at all, but rather to have the sort code determine it automatically based on io.sort.mb?  So if you increase io.sort.mb, you'd get a larger sort factor.  Of course, then we'd have to make some assumptions about disk performance...
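
A minimal sketch of what deriving the factor automatically might look like, again in Python rather than as a patch.  Everything here is hypothetical: auto_sort_factor, its parameters, and the default disk numbers (10ms/seek, 100MB/s) are the assumptions mentioned above, not anything in the sort code today.  transfer_to_seek_ratio says how many times larger than the break-even size each stream's buffer should be.

```python
def auto_sort_factor(sort_buffer_mb, seek_ms=10, transfer_mb_per_s=100,
                     transfer_to_seek_ratio=2, min_factor=2):
    """Pick a merge factor from the sort buffer size (hypothetical helper).

    Each stream gets a buffer of transfer_to_seek_ratio times the break-even
    size, i.e. the size at which transfer time equals seek time.
    """
    # Break-even buffer size in MB under the assumed disk performance.
    break_even_mb = seek_ms * transfer_mb_per_s / 1000
    per_stream_mb = transfer_to_seek_ratio * break_even_mb
    # Never go below a two-way merge, however small the buffer is.
    return max(min_factor, int(sort_buffer_mb // per_stream_mb))


if __name__ == "__main__":
    for mb in (100, 200, 400):
        print(f"io.sort.mb={mb} -> sort factor {auto_sort_factor(mb)}")
```

With the default io.sort.mb of 100 and a ratio of 2 this gives 50; doubling io.sort.mb doubles the factor.  The ratio and the disk numbers are the knobs that would need agreeing on.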

> io.sort.factor should default to 100 instead of 10
> --------------------------------------------------
>
>                 Key: HADOOP-3473
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3473
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: conf
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 0.18.0
>
>
> 10 is *really* conservative and can make merges much much more expensive.
