Posted to dev@pig.apache.org by "Olga Natkovich (JIRA)" <ji...@apache.org> on 2011/02/17 21:51:12 UTC

[jira] Commented: (PIG-1803) Maps are failing if combiner is enabled

    [ https://issues.apache.org/jira/browse/PIG-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12996062#comment-12996062 ] 

Olga Natkovich commented on PIG-1803:
-------------------------------------

Alex, did you get a chance to check whether your script works with the latest code on the Pig 0.8 branch or on Pig 0.9? We will be releasing Pig 0.8.1, which will address the problem that Thejas fixed.

If it does work for you, would you be able to move to 0.8? We have no plans to backport the fix to Pig 0.7, but you could apply the patch yourself and see whether it works as is or with small tweaks.

Please let us know how you want to proceed and whether we can close this ticket. Thanks.

> Maps are failing if combiner is enabled
> ---------------------------------------
>
>                 Key: PIG-1803
>                 URL: https://issues.apache.org/jira/browse/PIG-1803
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Alex Rovner
>             Fix For: 0.7.0
>
>
> We are constantly hitting the Java heap space memory issue if the combiner is enabled on our jobs.
> Configs:
> pig.cachedbag.memusage=20
> io.sort.mb=300
> pig.exec.nocombiner=false
> mapred.child.java.opts=-Xmx750m
> Sample job:
> {noformat} 
> A = LOAD '$INPUT' USING com.contextweb.pig.CWHeaderLoader('$WORK_DIR/schema/rpt.xml');
> AA = foreach A GENERATE checkPointStart, PublisherId, TagId,
> ContextCategoryId,Impressions, Clicks, Actions;
> DESCRIBE AA;
> B = GROUP AA BY (checkPointStart, PublisherId, TagId,
> ContextCategoryId);
> result = FOREACH B GENERATE group, SUM(AA.Impressions) as Impressions, SUM(AA.Clicks) as Clicks, SUM(AA.Actions) as Actions;
> DESCRIBE result;
> STORE result INTO '$OUTPUT' USING com.contextweb.pig.CWHeaderStore();
> {noformat} 
> Mapper Error Log:
> 2011-01-12 18:43:22,084 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
> 	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:799)
> 	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:549)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:631)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:315)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:211)
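
For context on the settings quoted above: the trace fails in MapOutputBuffer.<init>, which, as far as I know, is where Hadoop allocates the map-side sort buffer sized by io.sort.mb. The Python sketch below only does the arithmetic on the two values from the report (io.sort.mb=300 and -Xmx750m) to show how much of the child heap that buffer would claim. The helper name max_heap_mb is hypothetical, and this is a rough illustration, not a diagnosis of the ticket.

```python
import re

# Values copied from the "Configs" list in the quoted report.
config = {
    "io.sort.mb": "300",
    "mapred.child.java.opts": "-Xmx750m",
}

def max_heap_mb(java_opts):
    """Return the -Xmx value in MB, or None if no -Xmx flag is present.

    Hypothetical helper: handles only the k/m/g suffixes.
    """
    m = re.search(r"-Xmx(\d+)([kKmMgG])", java_opts)
    if m is None:
        return None
    value, unit = int(m.group(1)), m.group(2).lower()
    return {"k": value / 1024, "m": value, "g": value * 1024}[unit]

heap_mb = max_heap_mb(config["mapred.child.java.opts"])
sort_mb = int(config["io.sort.mb"])

# Share of the maximum child heap requested for the sort buffer alone.
sort_fraction = sort_mb / heap_mb
print(f"sort buffer: {sort_mb} MB of {heap_mb} MB heap ({sort_fraction:.0%})")
```

With the reported values, the sort buffer alone accounts for 300 MB of the 750 MB maximum heap, before any memory used by Pig's bags or by the combiner.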
