Posted to dev@pig.apache.org by "Pradeep Kamath (JIRA)" <ji...@apache.org> on 2009/06/05 02:09:07 UTC

[jira] Created: (PIG-835) Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)

Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)
----------------------------------------------------------------------------------------------------------------------------------------------

                 Key: PIG-835
                 URL: https://issues.apache.org/jira/browse/PIG-835
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.2.1
            Reporter: Pradeep Kamath
            Assignee: Pradeep Kamath
             Fix For: 0.3.0


A query like the following results in an exception on execution:
{noformat}
a = load 'mult.input' as (name, age, gpa);
b = group a ALL;
c = foreach b generate group, COUNT(a);
store c into 'foo';
d = group a by (name, gpa);
e = foreach d generate flatten(group), MIN(a.age);
store e into 'bar';
{noformat}

Exception on execution:
09/06/04 16:56:11 INFO mapred.TaskInProgress: Error from attempt_200906041655_0001_r_000000_3: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.Tuple
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:312)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:254)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:204)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getNext(POStore.java:117)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODemux.runPipeline(PODemux.java:248)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODemux.getNext(PODemux.java:238)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:320)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:288)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:268)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:142)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:318)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (PIG-835) Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PIG-835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-835:
-------------------------------

    Status: Patch Available  (was: Open)

>         Attachments: PIG-835-v2.patch, PIG-835.patch

[jira] Updated: (PIG-835) Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PIG-835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-835:
-------------------------------

    Attachment: PIG-846-v2.patch

New patch. The only change is that it no longer adds extra information in POLocalRearrange.name(). The earlier patch added that information only to make explain output more informative, but it breaks some unit tests.

The TestHBaseStorage unit test still fails for me, but the failure is not related to the changes in the patch; I am assuming it is an environment issue on my machine.

>         Attachments: PIG-835-v2.patch, PIG-835.patch, PIG-846-v2.patch

[jira] Updated: (PIG-835) Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PIG-835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-835:
-------------------------------

    Status: Patch Available  (was: Open)

>         Attachments: PIG-835.patch

[jira] Commented: (PIG-835) Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/PIG-835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717512#action_12717512 ] 

Hadoop QA commented on PIG-835:
-------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12410186/PIG-835-v2.patch
  against trunk revision 782790.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/77/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/77/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/77/console

This message is automatically generated.

>         Attachments: PIG-835-v2.patch, PIG-835.patch

[jira] Commented: (PIG-835) Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/PIG-835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717817#action_12717817 ] 

Olga Natkovich commented on PIG-835:
------------------------------------

+1, the patch looks good.

>         Attachments: PIG-835-v2.patch, PIG-835.patch

[jira] Updated: (PIG-835) Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PIG-835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-835:
-------------------------------

    Attachment: PIG-835.patch

The root cause lies in how the current multiQueryOptimizer handles map keys when it merges map plans. It checks whether the map key has the same type in all the map plans it merges. If the types differ, it makes the key type tuple for all map plans: keys that are not tuples are wrapped in an extra tuple, and keys that are already of Tuple type are left alone (this is ensured in POLocalRearrange). However, the Demux operator, which passes the key and the bag of values to the merged reduce plan, currently always unwraps the tuple whenever the map key types differ. As a result, keys that were originally tuples are unwrapped even though they should not be.

The attached patch fixes this by storing an array of boolean flags in the Demux operator to indicate which map keys are wrapped and which are not, so that unwrapping occurs only when the original map key was not already a tuple and was therefore wrapped.
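
As a rough illustration of the idea described above (not the code in the patch), the following self-contained Java sketch models the wrap and selective-unwrap steps. Everything in it is an assumption made for illustration: tuples are modeled as java.util.List rather than Pig's Tuple class, and the names DemuxKeySketch, wrapIfNeeded, unwrapIfNeeded and isKeyWrapped are hypothetical and do not come from the Pig code base.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: tuples are modeled as java.util.List, not org.apache.pig.data.Tuple.
public class DemuxKeySketch {

    // Map side: a non-tuple key is wrapped in a one-field tuple,
    // a key that is already a tuple is left alone.
    public static Object wrapIfNeeded(Object key) {
        return (key instanceof List) ? key : Arrays.asList(key);
    }

    // Reduce side: unwrap only when the flag for this plan says the
    // original key was wrapped; otherwise pass the tuple key through.
    public static Object unwrapIfNeeded(Object key, boolean[] isKeyWrapped, int planIndex) {
        if (isKeyWrapped[planIndex]) {
            return ((List<?>) key).get(0);
        }
        return key;
    }

    public static void main(String[] args) {
        // Plan 0 stands in for "group a ALL" (a single non-tuple key),
        // plan 1 for "group a by (name, gpa)" (a tuple key).
        Object allKey = "all";
        Object tupleKey = Arrays.asList("alice", 3.5);

        // One flag per merged plan: true if the key had to be wrapped.
        boolean[] isKeyWrapped = {
            !(allKey instanceof List),
            !(tupleKey instanceof List)
        };

        Object shuffled0 = wrapIfNeeded(allKey);
        Object shuffled1 = wrapIfNeeded(tupleKey);

        System.out.println("plan 0 key: " + unwrapIfNeeded(shuffled0, isKeyWrapped, 0));
        System.out.println("plan 1 key: " + unwrapIfNeeded(shuffled1, isKeyWrapped, 1));
    }
}
```

In this model, unwrapping unconditionally would turn the (name, gpa) key of plan 1 into its first field, a plain String, which is consistent with the ClassCastException in the report when a later operator expects a tuple.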

>         Attachments: PIG-835.patch

[jira] Updated: (PIG-835) Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PIG-835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-835:
-------------------------------

    Attachment:     (was: PIG-846-v2.patch)

>         Attachments: PIG-835-v2.patch, PIG-835.patch

[jira] Updated: (PIG-835) Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)

Posted by "Giridharan Kesavan (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PIG-835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Giridharan Kesavan updated PIG-835:
-----------------------------------

    Status: Patch Available  (was: Open)

Resubmitting the patch.

>         Attachments: PIG-835.patch

[jira] Commented: (PIG-835) Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)

Posted by "Hudson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/PIG-835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12718042#action_12718042 ] 

Hudson commented on PIG-835:
----------------------------

Integrated in Pig-trunk #469 (See [http://hudson.zones.apache.org/hudson/job/Pig-trunk/469/])
    added entry in CHANGES.txt into 0.3 branch section for  since  has been committed to the 0.3 branch also


[jira] Updated: (PIG-835) Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PIG-835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-835:
-------------------------------

    Comment: was deleted

(was: New patch - the only change is to stop adding extra information in POLocalRearrange.name(). The earlier patch added it only to make explain output more informative, but it breaks some unit tests.

The TestHBaseStorage unit test still fails for me, but the failure is not related to the changes in the patch; I am assuming it is an environment issue on my machine.)

[jira] Updated: (PIG-835) Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PIG-835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-835:
-------------------------------

    Status: Patch Available  (was: Open)

[jira] Commented: (PIG-835) Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/PIG-835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717456#action_12717456 ] 

Hadoop QA commented on PIG-835:
-------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12410010/PIG-835.patch
  against trunk revision 782703.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 2 new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/75/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/75/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/75/console

This message is automatically generated.

[jira] Updated: (PIG-835) Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)

Posted by "Giridharan Kesavan (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PIG-835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Giridharan Kesavan updated PIG-835:
-----------------------------------

    Status: Open  (was: Patch Available)

[jira] Updated: (PIG-835) Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PIG-835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-835:
-------------------------------

    Attachment: PIG-835-v2.patch

New patch with findbugs warnings addressed - essentially findbugs wanted the public static members in PigNullableWritable to be marked final.
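As an illustration of the kind of change findbugs asks for here, the sketch below declares public static members as `final` so that no other code can reassign them. It is not taken from the patch: `FlagConstants`, the field names, and the values are all hypothetical and do not reflect the actual members of the class named in the comment above.

```java
public class FlagConstants {
    // Hypothetical constants; names and values are illustrative only.
    // Without `final`, any class could reassign these shared fields,
    // which is what the static-analysis warning is about.
    public static final byte MULTI_QUERY_FLAG = (byte) 0x80;
    public static final byte INDEX_MASK = (byte) 0x7F;

    // True when the high bit is set on the index byte.
    static boolean hasFlag(byte index) {
        return (index & MULTI_QUERY_FLAG) != 0;
    }

    // Strips the flag bit and returns the remaining index value.
    static int plainIndex(byte index) {
        return index & INDEX_MASK;
    }

    public static void main(String[] args) {
        byte tagged = (byte) (5 | MULTI_QUERY_FLAG);
        System.out.println(hasFlag(tagged) + " " + plainIndex(tagged)); // prints "true 5"
    }
}
```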

[jira] Updated: (PIG-835) Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PIG-835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-835:
-------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

Patch committed to both trunk and branch-0.3.

[jira] Updated: (PIG-835) Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PIG-835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-835:
-------------------------------

    Status: Open  (was: Patch Available)

[jira] Updated: (PIG-835) Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PIG-835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-835:
-------------------------------

    Status: Open  (was: Patch Available)
