Posted to dev@pig.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2009/12/10 02:22:18 UTC

[jira] Commented: (PIG-1144) set default_parallelism construct does not set the number of reducers correctly

    [ https://issues.apache.org/jira/browse/PIG-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788434#action_12788434 ] 

Daniel Dai commented on PIG-1144:
---------------------------------

Hi, Viraj,
Default parallelism is set in JobControlCompiler, which runs after MRCompiler. Also, if you only run explain, this code is not invoked. Did you see it actually use 1 reducer on a real cluster?
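
To illustrate the ordering described above, here is a minimal toy model in Java. It is only a sketch: the class and method names are hypothetical stand-ins, not Pig's real MapReduceOper or JobControlCompiler, and the use of -1 for "no PARALLEL clause" and the fallback to one reducer are assumptions made for the example. It shows how a printer that runs right after the first compile stage can report no parallelism for the sort job even though a later stage would apply the default.

```java
// Toy model only: hypothetical classes, NOT Pig's actual implementation.
public class ParallelismSketch {

    /** Stand-in for a compiled MapReduce operator. */
    static final class MROp {
        final String name;
        // Assumption: -1 means the statement carried no PARALLEL clause.
        final int requestedParallelism;

        MROp(String name, int requestedParallelism) {
            this.name = name;
            this.requestedParallelism = requestedParallelism;
        }
    }

    /** What a plan printer sees after the first compile stage: only the per-operator request. */
    static int parallelismSeenByExplain(MROp op) {
        return op.requestedParallelism;
    }

    /** What a later job-building stage might do: fall back to the script-wide default. */
    static int reducersAtJobSubmission(MROp op, int defaultParallel) {
        if (op.requestedParallelism > 0) {
            return op.requestedParallelism; // explicit PARALLEL wins
        }
        if (defaultParallel > 0) {
            return defaultParallel;         // value from "set default_parallel N"
        }
        return 1;                           // assumption: single reducer when nothing is set
    }

    public static void main(String[] args) {
        int defaultParallel = 100;                     // as in "set default_parallel 100"
        MROp sortJob = new MROp("order-by", -1);       // ORDER BY without PARALLEL
        MROp explicitJob = new MROp("order-by", 20);   // ORDER BY ... PARALLEL 20

        System.out.println("explain view:    " + parallelismSeenByExplain(sortJob));
        System.out.println("submission view: " + reducersAtJobSubmission(sortJob, defaultParallel));
        System.out.println("explicit clause: " + reducersAtJobSubmission(explicitJob, defaultParallel));
    }
}
```

Under these assumptions, the value printed at explain time says nothing about the reducer count the submitted job would get, which is why checking the job on a real cluster is the more reliable test.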

> set default_parallelism construct does not set the number of reducers correctly
> -------------------------------------------------------------------------------
>
>                 Key: PIG-1144
>                 URL: https://issues.apache.org/jira/browse/PIG-1144
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.7.0
>         Environment: Hadoop 20 cluster with multi-node installation
>            Reporter: Viraj Bhat
>             Fix For: 0.7.0
>
>         Attachments: brokenparallel.out, genericscript_broken_parallel.pig
>
>
> Hi all,
>  I have a Pig script where I set the parallelism using the following set construct: "set default_parallel 100". I modified "MRPrinter.java" to print out the parallelism:
> {code}
> ...
> public void visitMROp(MapReduceOper mr)
> mStream.println("MapReduce node " + mr.getOperatorKey().toString() + " Parallelism " + mr.getRequestedParallelism());
> ...
> {code}
> When I run an explain on the script, I see that the last job, which does the actual sort, runs as a single-reducer job. This can be corrected by adding the PARALLEL keyword to the ORDER BY statement.
> Attaching the script and the explain output.
> Viraj
