Posted to common-dev@hadoop.apache.org by "Hairong Kuang (JIRA)" <ji...@apache.org> on 2007/04/03 01:53:32 UTC
[jira] Created: (HADOOP-1193) Map/reduce job gets OutOfMemoryException when set map out to be compressed
Map/reduce job gets OutOfMemoryException when set map out to be compressed
--------------------------------------------------------------------------
Key: HADOOP-1193
URL: https://issues.apache.org/jira/browse/HADOOP-1193
Project: Hadoop
Issue Type: Bug
Components: mapred
Affects Versions: 0.12.2
Reporter: Hairong Kuang
Assigned To: Arun C Murthy
Fix For: 0.13.0
One of my jobs quickly fails with an OutOfMemoryException when I set the map output to be compressed, but it worked fine with release 0.10.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1193) Map/reduce job gets OutOfMemoryException when set map out to be compressed
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-1193:
----------------------------------
Attachment: HADOOP-1193_3_20070611.patch
Thanks for the review, Devaraj. Here is a new patch incorporating the comments...
[jira] Commented: (HADOOP-1193) Map/reduce job gets OutOfMemoryException when set map out to be compressed
Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491475 ]
Hairong Kuang commented on HADOOP-1193:
---------------------------------------
More details about the failed job:
1. It uses record-level compression.
2. mapred.child.java.opts is set to the default value: -Xmx512m.
3. For the map output, each key is a Text and very small, but each value is a Jute record with an average size of approximately 25 KB. Some may be as large as several megabytes.
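For reference, a job configured as described above would look roughly like the following sketch. This uses the old org.apache.hadoop.mapred API of that era; MyJob is a placeholder, and the actual failing job's code is not shown anywhere in this thread, so treat this as an illustration only:

```java
// Illustrative configuration matching the description above (not the
// reporter's actual job): record-level map-output compression with the
// default 512 MB child heap.
JobConf conf = new JobConf(MyJob.class);  // MyJob is a hypothetical job class
conf.setCompressMapOutput(true);
conf.setMapOutputCompressionType(SequenceFile.CompressionType.RECORD);
conf.set("mapred.child.java.opts", "-Xmx512m");  // the default at the time
```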
[jira] Commented: (HADOOP-1193) Map/reduce job gets OutOfMemoryException when set map out to be compressed
Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12486182 ]
Hairong Kuang commented on HADOOP-1193:
---------------------------------------
Forgot to mention that the cluster was running with the native compression library.
[jira] Updated: (HADOOP-1193) Map/reduce job gets OutOfMemoryException when set map out to be compressed
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-1193:
----------------------------------
Status: Open (was: Patch Available)
[jira] Updated: (HADOOP-1193) Map/reduce job gets OutOfMemoryException when set map out to be compressed
Posted by "Marco Nicosia (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marco Nicosia updated HADOOP-1193:
----------------------------------
Priority: Blocker (was: Major)
[jira] Updated: (HADOOP-1193) Map/reduce job gets OutOfMemoryException when set map out to be compressed
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-1193:
----------------------------------
Attachment: HADOOP-1193_2_20070524.patch
Here is an updated version of the patch with the changes I made to BigMapOutput to help test it (basically I made it extend ToolBase and added an option to create the large map input too)...
I have tested this with large map inputs (>2G) and it seems to hold up well, i.e. the codec pool ensures we create only 1 compressor and a very small number of decompressors (fewer than 10), even for extremely large map inputs (>2G).
[jira] Commented: (HADOOP-1193) Map/reduce job gets OutOfMemoryException when set map out to be compressed
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12504876 ]
Hadoop QA commented on HADOOP-1193:
-----------------------------------
+1
http://issues.apache.org/jira/secure/attachment/12359765/HADOOP-1193_4_20070614.patch applied and successfully tested against trunk revision r547159.
Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/284/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/284/console
[jira] Updated: (HADOOP-1193) Map/reduce job gets OutOfMemoryException when set map out to be compressed
Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Doug Cutting updated HADOOP-1193:
---------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
I just committed this. Thanks, Arun!
[jira] Updated: (HADOOP-1193) Map/reduce job gets OutOfMemoryException when set map out to be compressed
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-1193:
----------------------------------
Attachment: HADOOP-1193_4_20070614.patch
Updated patch to reflect changes to trunk...
[jira] Commented: (HADOOP-1193) Map/reduce job gets OutOfMemoryException when set map out to be compressed
Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12498994 ]
Devaraj Das commented on HADOOP-1193:
-------------------------------------
One comment: it would be nice to have the 'tmpReader' logic triggered when such a flag is passed to the Reader constructor. Apart from that, there are some whitespace changes that should be removed.
[jira] Updated: (HADOOP-1193) Map/reduce job gets OutOfMemoryException when set map out to be compressed
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-1193:
----------------------------------
Status: Patch Available (was: Open)
[jira] Commented: (HADOOP-1193) Map/reduce job gets OutOfMemoryException when set map out to be compressed
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12506043 ]
Hadoop QA commented on HADOOP-1193:
-----------------------------------
Integrated in Hadoop-Nightly #127 (See [http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/127/])
[jira] Updated: (HADOOP-1193) Map/reduce job gets OutOfMemoryException when set map out to be compressed
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-1193:
----------------------------------
Attachment: HADOOP-1193_1_20070517.patch
Here is a patch while I continue further testing... Hairong, could you try it and see if it works for you? Thanks!
Basically I went ahead and implemented a 'codec pool' to reuse the direct-buffer-based codecs, so as not to create too many of them...
Results from sorting 1 million records via TestSequenceFile with RECORD compression:

                    trunk    H-1193
    Compressors:     1382         3
    Decompressors:   1520        12
    -------------------------------
    Total:           2902        15
Results are even more dramatic for BLOCK compression (with BLOCK compression we need 4 codecs per Reader, for key, keyLen, val & valLen)... in fact I have gone ahead and bumped up the default direct-buffer size for zlib from 1K to 64K, which should lead to improved performance too, on the back of this patch.
Appreciate any review/feedback.
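The pooling idea described above can be sketched as a small generic object pool. This is a simplified, self-contained illustration, not the actual Hadoop CodecPool class: the Supplier-based factory, the byte[] stand-in for a codec, and the created() counter are invented for the example, but the borrow/release mechanics are the essence of why only a handful of codecs ever get constructed:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Simplified sketch of a codec pool (not Hadoop's actual CodecPool):
// expensive-to-create objects are parked on an idle list when released,
// so repeated borrow/release cycles construct only a handful of instances
// instead of one per record.
public class CodecPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private int created = 0;           // instances actually constructed

    public CodecPool(Supplier<T> factory) { this.factory = factory; }

    public synchronized T borrow() {
        T t = idle.poll();             // reuse an idle instance if any
        if (t == null) {
            t = factory.get();         // otherwise construct a new one
            created++;
        }
        return t;
    }

    public synchronized void release(T t) { idle.push(t); }

    public synchronized int created() { return created; }

    public static void main(String[] args) {
        // Stand-in for a codec holding a 64K direct buffer.
        CodecPool<byte[]> pool = new CodecPool<>(() -> new byte[64 * 1024]);
        for (int i = 0; i < 1_000_000; i++) {  // one borrow/release per record
            byte[] codec = pool.borrow();
            pool.release(codec);
        }
        System.out.println("constructed: " + pool.created());  // constructed: 1
    }
}
```

Without the pool, the per-record path would construct a fresh instance each iteration; with it, a million borrow/release cycles build exactly one, which mirrors the 2902-vs-15 numbers reported above.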
[jira] Updated: (HADOOP-1193) Map/reduce job gets OutOfMemoryException when set map out to be compressed
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-1193:
----------------------------------
Fix Version/s: 0.14.0
Status: Patch Available (was: Open)
[jira] Commented: (HADOOP-1193) Map/reduce job gets OutOfMemoryException when set map out to be compressed
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12503875 ]
Hadoop QA commented on HADOOP-1193:
-----------------------------------
-1, build or testing failed
2 attempts failed to build and test the latest attachment http://issues.apache.org/jira/secure/attachment/12359395/HADOOP-1193_3_20070611.patch against trunk revision r546310.
Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/269/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/269/console
Please note that this message is automatically generated and may represent a problem with the automation system and not the patch.