Posted to dev@pig.apache.org by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org> on 2011/09/22 06:52:26 UTC
[jira] [Commented] (PIG-2298) Accumulative mode is turned off when MultiQueryOptimizer merges jobs
[ https://issues.apache.org/jira/browse/PIG-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112325#comment-13112325 ]
Dmitriy V. Ryaboy commented on PIG-2298:
----------------------------------------
Consider this script:
{code}
data = load 'tmp/numbers.txt' as (number:int, key:chararray);
grp1 = foreach (group data by key parallel 1)
    generate group, SomethingAccumulative(data.number);
grp2 = foreach (group data all)
    generate group, SomethingAccumulative(data.number);
store grp1 into '/tmp/grp1';
store grp2 into '/tmp/grp2';
{code}
When the jobs for grp1 and grp2 are run independently, the reducer runs in accumulative mode in each of them. When both stores are in the same script and the jobs get merged, accumulative mode does not kick in. I suspect that's because the accumulative-mode check does not match the Demux operator at the top of the reduce plan:
{code}
Reduce Plan
Demux [2] scope-44
|   |
|   grp1: Store(/tmp/grp1:org.apache.pig.builtin.PigStorage) - scope-22
|   |
|   |---grp1: New For Each(false,false)[bag] - scope-21
|       |   |
|       |   Project[chararray][0] - scope-15
|       |   |
|       |   POUserFunc(com.twitter.twadoop.pig.SomethingAccumulative)[double] - scope-19
|       |   |
|       |   |---Project[bag][0] - scope-18
|       |       |
|       |       |---Project[bag][1] - scope-17
|   |
|   grp2: Store(/tmp/grp2:org.apache.pig.builtin.PigStorage) - scope-36
|   |
|   |---grp2: New For Each(false,false)[bag] - scope-35
|       |   |
|       |   Project[chararray][0] - scope-29
|       |   |
|       |   POUserFunc(com.twitter.twadoop.pig.SomethingAccumulative)[double] - scope-33
|       |   |
|       |   |---Project[bag][0] - scope-32
|       |       |
|       |       |---Project[bag][1] - scope-31
|
|---MultiQuery Package [[true, true]] - scope-45
    |
    |---1-10: Package[tuple]{chararray} - scope-12
    |
    |---1-11: Package[tuple]{chararray} - scope-26--------
{code}
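To make the suspicion concrete, here is a toy Java model of the two plan shapes. It is only a sketch under stated assumptions: `Op`, `matchesSimpleShape` and `innerPlansLookEligible` are hypothetical names invented for illustration, not Pig's physical operators or its optimizer code. The assumption being modeled is that the accumulative-mode check only recognizes a plain Package feeding a ForEach directly, so a plan where a MultiQuery Package feeds a Demux is skipped even though every inner plan would qualify on its own.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: hypothetical stand-ins, not Pig's real operators or optimizer.
class AccumulativeCheckSketch {

    /** One operator in a simplified reduce pipeline, in data-flow order. */
    static final class Op {
        final String kind;                // e.g. "Package", "ForEach", "Demux"
        final boolean accumulativeUdf;    // true if a ForEach calls an accumulative UDF
        final List<List<Op>> innerPlans;  // non-empty only for a Demux

        Op(String kind, boolean accumulativeUdf, List<List<Op>> innerPlans) {
            this.kind = kind;
            this.accumulativeUdf = accumulativeUdf;
            this.innerPlans = innerPlans;
        }
    }

    static Op op(String kind) {
        return new Op(kind, false, Collections.<List<Op>>emptyList());
    }

    static Op forEachWithAccumulativeUdf() {
        return new Op("ForEach", true, Collections.<List<Op>>emptyList());
    }

    /** Shape of grp1 or grp2 run alone: Package -> ForEach -> Store. */
    static List<Op> standalonePlan() {
        return Arrays.asList(op("Package"), forEachWithAccumulativeUdf(), op("Store"));
    }

    /** Shape after merging: MultiQueryPackage -> Demux with two inner plans. */
    static List<Op> mergedPlan() {
        List<Op> inner1 = Arrays.asList(forEachWithAccumulativeUdf(), op("Store"));
        List<Op> inner2 = Arrays.asList(forEachWithAccumulativeUdf(), op("Store"));
        Op demux = new Op("Demux", false, Arrays.asList(inner1, inner2));
        return Arrays.asList(op("MultiQueryPackage"), demux);
    }

    /** Assumed check: only a plain Package feeding a ForEach directly is recognized. */
    static boolean matchesSimpleShape(List<Op> plan) {
        return plan.size() >= 2
                && plan.get(0).kind.equals("Package")
                && plan.get(1).kind.equals("ForEach")
                && plan.get(1).accumulativeUdf;
    }

    /** Looks inside a Demux: does every inner plan start with an accumulative ForEach? */
    static boolean innerPlansLookEligible(List<Op> plan) {
        if (plan.size() < 2 || !plan.get(1).kind.equals("Demux")) {
            return false;
        }
        List<List<Op>> inner = plan.get(1).innerPlans;
        if (inner.isEmpty()) {
            return false;
        }
        for (List<Op> p : inner) {
            if (p.isEmpty() || !p.get(0).kind.equals("ForEach") || !p.get(0).accumulativeUdf) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("standalone matches simple shape: " + matchesSimpleShape(standalonePlan()));
        System.out.println("merged matches simple shape:     " + matchesSimpleShape(mergedPlan()));
        System.out.println("merged inner plans eligible:     " + innerPlansLookEligible(mergedPlan()));
    }
}
```

In this model the standalone plan passes the simple check and the merged plan does not, while looking inside the Demux shows both inner plans would be eligible. Whether the real check behaves this way is exactly the open question in this comment.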
> Accumulative mode is turned off when MultiQueryOptimizer merges jobs
> --------------------------------------------------------------------
>
> Key: PIG-2298
> URL: https://issues.apache.org/jira/browse/PIG-2298
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.1, 0.9.0, 0.10
> Reporter: Dmitriy V. Ryaboy
>
> Accumulator mode does not kick in when multiple MR jobs are merged into one by the optimizer.