You are viewing a plain text version of this content. The canonical link for it is here.

Posted to common-dev@hadoop.apache.org by "Arun C Murthy (JIRA)" <ji...@apache.org> on 2007/10/15 12:34:50 UTC

[jira] Created: (HADOOP-2054) Improve memory model for map-side sorts

Improve memory model for map-side sorts
---------------------------------------

                 Key: HADOOP-2054
                 URL: https://issues.apache.org/jira/browse/HADOOP-2054
             Project: Hadoop
          Issue Type: Improvement
          Components: mapred
            Reporter: Arun C Murthy
            Assignee: Arun C Murthy
             Fix For: 0.16.0


{{MapTask#MapOutputBuffer}} uses a plain-jane {{DataOutputBuffer}} which defaults to a buffer of size 32-bytes, and the {{DataOutputBuffer#write}} call doubles the underlying byte-array when it needs more space.

However for maps which output any decent amount of data (e.g. 128MB in examples/Sort.java) this means the buffer grows painfully slowly from 2^6 to 2^28, and each time this results in a new array being created, followed by an array-copy:

{noformat}
    public void write(DataInput in, int len) throws IOException {
      int newcount = count + len;
      if (newcount > buf.length) {
        byte newbuf[] = new byte[Math.max(buf.length << 1, newcount)];
        System.arraycopy(buf, 0, newbuf, 0, count);
        buf = newbuf;
      }
      in.readFully(buf, count, len);
      count = newcount;
    }
{noformat}

I reckon we could do much better in the {{MapTask}}, specifically... 

For e.g. we start with a buffer of size 1/4KB and quadruple, rather than double, upto, say 4/8/16MB. Then we resume doubling (or less).

This means that it quickly ramps up to minimize no. of {{System.arrayCopy}} calls and small-sized buffers to GC; and later start doubling to ensure we don't ramp-up too quickly to minimize memory wastage due to fragmentation.

Of course, this issue is about benchmarking and figuring if all this is worth it, and, if so, what are the right set of trade-offs to make.

Thoughts?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-2054) Improve memory model for map-side sorts

Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534860 ] 

Koji Noguchi commented on HADOOP-2054:
--------------------------------------

Does map output always go through  DataOutputBuffer.Buffer.write?

>From HADOOP-2053 stack trace, 

<stack>
task_200710112103_0001_m_000015_1: java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2786)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
at java.io.DataOutputStream.write(DataOutputStream.java:90)
at org.apache.hadoop.io.Text.write(Text.java:243)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:340)
</stack>

Text.write is directly calling DataOutputStream.write in which calls ByteArrayOutputStream.write. 
What I expected was  DataOutputBuffer.write --> DataOutputBufffer.Buffer.write. 



> Improve memory model for map-side sorts
> ---------------------------------------
>
>                 Key: HADOOP-2054
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2054
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>             Fix For: 0.16.0
>
>
> {{MapTask#MapOutputBuffer}} uses a plain-jane {{DataOutputBuffer}} which defaults to a buffer of size 32-bytes, and the {{DataOutputBuffer#write}} call doubles the underlying byte-array when it needs more space.
> However for maps which output any decent amount of data (e.g. 128MB in examples/Sort.java) this means the buffer grows painfully slowly from 2^6 to 2^28, and each time this results in a new array being created, followed by an array-copy:
> {noformat}
>     public void write(DataInput in, int len) throws IOException {
>       int newcount = count + len;
>       if (newcount > buf.length) {
>         byte newbuf[] = new byte[Math.max(buf.length << 1, newcount)];
>         System.arraycopy(buf, 0, newbuf, 0, count);
>         buf = newbuf;
>       }
>       in.readFully(buf, count, len);
>       count = newcount;
>     }
> {noformat}
> I reckon we could do much better in the {{MapTask}}, specifically... 
> For e.g. we start with a buffer of size 1/4KB and quadruple, rather than double, upto, say 4/8/16MB. Then we resume doubling (or less).
> This means that it quickly ramps up to minimize no. of {{System.arrayCopy}} calls and small-sized buffers to GC; and later start doubling to ensure we don't ramp-up too quickly to minimize memory wastage due to fragmentation.
> Of course, this issue is about benchmarking and figuring if all this is worth it, and, if so, what are the right set of trade-offs to make.
> Thoughts?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-2054) Improve memory model for map-side sorts

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-2054:
----------------------------------

    Fix Version/s:     (was: 0.16.0)

Pushing this to 0.17.0 and beyond...

> Improve memory model for map-side sorts
> ---------------------------------------
>
>                 Key: HADOOP-2054
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2054
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>
> {{MapTask#MapOutputBuffer}} uses a plain-jane {{DataOutputBuffer}} which defaults to a buffer of size 32-bytes, and the {{DataOutputBuffer#write}} call doubles the underlying byte-array when it needs more space.
> However for maps which output any decent amount of data (e.g. 128MB in examples/Sort.java) this means the buffer grows painfully slowly from 2^6 to 2^28, and each time this results in a new array being created, followed by an array-copy:
> {noformat}
>     public void write(DataInput in, int len) throws IOException {
>       int newcount = count + len;
>       if (newcount > buf.length) {
>         byte newbuf[] = new byte[Math.max(buf.length << 1, newcount)];
>         System.arraycopy(buf, 0, newbuf, 0, count);
>         buf = newbuf;
>       }
>       in.readFully(buf, count, len);
>       count = newcount;
>     }
> {noformat}
> I reckon we could do much better in the {{MapTask}}, specifically... 
> For e.g. we start with a buffer of size 1/4KB and quadruple, rather than double, upto, say 4/8/16MB. Then we resume doubling (or less).
> This means that it quickly ramps up to minimize no. of {{System.arrayCopy}} calls and small-sized buffers to GC; and later start doubling to ensure we don't ramp-up too quickly to minimize memory wastage due to fragmentation.
> Of course, this issue is about benchmarking and figuring if all this is worth it, and, if so, what are the right set of trade-offs to make.
> Thoughts?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.