Posted to dev@pig.apache.org by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org> on 2011/09/22 06:52:26 UTC

[jira] [Commented] (PIG-2298) Accumulative mode is turned off when MultiQueryOptimizer merges jobs

    [ https://issues.apache.org/jira/browse/PIG-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112325#comment-13112325 ] 

Dmitriy V. Ryaboy commented on PIG-2298:
----------------------------------------

Consider this script:

{code}

data = load 'tmp/numbers.txt' as (number:int, key:chararray);

grp1 = foreach (group data by key parallel 1)
  generate group, SomethingAccumulative(data.number);

grp2 = foreach (group data all)
  generate group, SomethingAccumulative(data.number);

store grp1 into '/tmp/grp1';
store grp2 into '/tmp/grp2';
{code}

When run on their own, the jobs for grp1 and grp2 each execute the reducer in accumulative mode. However, when both stores are in the same script and the multi-query optimizer merges them into one job, accumulative mode does not kick in. I suspect that's because we do not match the Demux operator at the top of the reduce plan?

{code}

Reduce Plan
Demux [2] scope-44
|   |
|   grp1: Store(/tmp/grp1:org.apache.pig.builtin.PigStorage) - scope-22
|   |
|   |---grp1: New For Each(false,false)[bag] - scope-21
|       |   |
|       |   Project[chararray][0] - scope-15
|       |   |
|       |   POUserFunc(com.twitter.twadoop.pig.SomethingAccumulative)[double] - scope-19
|       |   |
|       |   |---Project[bag][0] - scope-18
|       |       |
|       |       |---Project[bag][1] - scope-17
|   |
|   grp2: Store(/tmp/grp2:org.apache.pig.builtin.PigStorage) - scope-36
|   |
|   |---grp2: New For Each(false,false)[bag] - scope-35
|       |   |
|       |   Project[chararray][0] - scope-29
|       |   |
|       |   POUserFunc(com.twitter.twadoop.pig.SomethingAccumulative)[double] - scope-33
|       |   |
|       |   |---Project[bag][0] - scope-32
|       |       |
|       |       |---Project[bag][1] - scope-31
|
|---MultiQuery Package [[true, true]] - scope-45
    |
    |---1-10: Package[tuple]{chararray} - scope-12
    |
    |---1-11: Package[tuple]{chararray} - scope-26--------
{code}
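
To make the suspicion concrete, here is a toy Java sketch of the kind of shape check I have in mind. It is not Pig code: the Op class, the operator kind strings, and the rule in canUseAccumulativeMode are all hypothetical simplifications, assuming the check only accepts a reduce plan that starts with a plain Package feeding a single ForEach whose UDFs all support accumulate(). Under that assumption the merged plan fails the check simply because a Demux sits between the package and the ForEach operators.

```java
import java.util.Arrays;
import java.util.List;

public class AccumulativeCheckSketch {

    /** Hypothetical, simplified stand-in for a physical operator; not a Pig class. */
    public static final class Op {
        final String kind;               // "Package", "ForEach", "Demux", ...
        final boolean allUdfsAccumulate; // only meaningful for "ForEach"
        final List<Op> next;             // operators that consume this one's output

        Op(String kind, boolean allUdfsAccumulate, Op... next) {
            this.kind = kind;
            this.allUdfsAccumulate = allUdfsAccumulate;
            this.next = Arrays.asList(next);
        }
    }

    /**
     * Assumed rule (a guess, not the real optimizer): the plan must start with
     * a plain Package whose only consumer is a ForEach with accumulator-capable UDFs.
     */
    public static boolean canUseAccumulativeMode(Op root) {
        if (!root.kind.equals("Package") || root.next.size() != 1) {
            return false;
        }
        Op consumer = root.next.get(0);
        return consumer.kind.equals("ForEach") && consumer.allUdfsAccumulate;
    }

    /** Reduce plan for one alias run alone: Package -> ForEach -> Store. */
    public static Op standalonePlan(String alias) {
        Op store = new Op("Store:" + alias, false);
        Op forEach = new Op("ForEach", true, store);
        return new Op("Package", false, forEach);
    }

    /** Merged plan shaped like the explain output above: MultiQueryPackage -> Demux -> two ForEach/Store branches. */
    public static Op mergedPlan() {
        Op branch1 = new Op("ForEach", true, new Op("Store:grp1", false));
        Op branch2 = new Op("ForEach", true, new Op("Store:grp2", false));
        Op demux = new Op("Demux", false, branch1, branch2);
        return new Op("MultiQueryPackage", false, demux);
    }

    public static void main(String[] args) {
        System.out.println("grp1 alone: " + canUseAccumulativeMode(standalonePlan("grp1")));
        System.out.println("grp2 alone: " + canUseAccumulativeMode(standalonePlan("grp2")));
        System.out.println("merged:     " + canUseAccumulativeMode(mergedPlan()));
    }
}
```

Both ForEach branches in the merged plan are still accumulator-capable in this model; the check rejects the plan only because of its shape. If the real cause is similar, a fix would presumably need to look inside the Demux and evaluate each inner plan separately.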

> Accumulative mode is turned off when MultiQueryOptimizer merges jobs
> --------------------------------------------------------------------
>
>                 Key: PIG-2298
>                 URL: https://issues.apache.org/jira/browse/PIG-2298
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1, 0.9.0, 0.10
>            Reporter: Dmitriy V. Ryaboy
>
> Accumulator mode does not kick in when multiple MR jobs are merged into one by the optimizer.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira