Posted to dev@pig.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2009/12/10 04:01:18 UTC
[jira] Assigned: (PIG-1144) set default_parallelism construct does not set the number of reducers correctly
[ https://issues.apache.org/jira/browse/PIG-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai reassigned PIG-1144:
-------------------------------
Assignee: Daniel Dai
> set default_parallelism construct does not set the number of reducers correctly
> -------------------------------------------------------------------------------
>
> Key: PIG-1144
> URL: https://issues.apache.org/jira/browse/PIG-1144
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.6.0
> Environment: Hadoop 20 cluster with multi-node installation
> Reporter: Viraj Bhat
> Assignee: Daniel Dai
> Fix For: 0.7.0
>
> Attachments: brokenparallel.out, genericscript_broken_parallel.pig, PIG-1144-1.patch
>
>
> Hi all,
> I have a Pig script where I set the parallelism using the following set construct: "set default_parallel 100". I modified "MRPrinter.java" to print out the parallelism:
> {code}
> ...
> public void visitMROp(MapReduceOper mr) {
>     // Debug output: print each MapReduce operator's requested parallelism.
>     mStream.println("MapReduce node " + mr.getOperatorKey().toString() + " Parallelism " + mr.getRequestedParallelism());
>     ...
> }
> {code}
> When I run an explain on the script, I see that the last job, which does the actual sort, runs as a single-reducer job. This can be corrected by adding a PARALLEL clause to the ORDER BY statement.
> Attaching the script and the explain output.
> Viraj
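
For readers without the attachments, the workaround described in the report can be sketched roughly as follows. This is a minimal illustration, not the attached genericscript_broken_parallel.pig: the input path, output path, relation names and schema are all hypothetical, and only the "set default_parallel 100" line and the use of PARALLEL on the ORDER BY come from the report itself.

```pig
-- Hypothetical paths and schema, for illustration only.
set default_parallel 100;

A = LOAD 'input/data.txt' AS (name:chararray, score:int);

-- Workaround from the report: state the reducer count explicitly on the sort,
-- since the default_parallel setting is reportedly not applied to the final sort job.
B = ORDER A BY score PARALLEL 100;

STORE B INTO 'output/sorted';
```

Running explain on a script of this shape, with and without the PARALLEL clause, should show whether the final sort job picks up the requested parallelism.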
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.