Posted to dev@pig.apache.org by "Viraj Bhat (JIRA)" <ji...@apache.org> on 2009/10/30 01:48:59 UTC

[jira] Created: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Behvaiour of COGROUP with and without schema when using "*" operator
--------------------------------------------------------------------

                 Key: PIG-1064
                 URL: https://issues.apache.org/jira/browse/PIG-1064
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.6.0
            Reporter: Viraj Bhat
             Fix For: 0.6.0


I have two tab-separated files, "1.txt" and "2.txt":

$ cat 1.txt
====================
1       2
2       3
====================
$ cat 2.txt
1       2
2       3

I use the COGROUP feature of Pig in the following way:

$ java -cp pig.jar:$HADOOP_HOME org.apache.pig.Main

{code}
grunt> A = load '1.txt';            
grunt> B = load '2.txt' as (b0, b1);
grunt> C = cogroup A by *, B by *;  
{code}

2009-10-29 12:46:04,150 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1012: Each COGroup input has to have the same number of inner plans
Details at logfile: pig_1256845224752.log
==========================================================

If I reverse the order of the schemas:
{code}
grunt> A = load '1.txt' as (a0, a1);
grunt> B = load '2.txt';            
grunt> C = cogroup A by *, B by *;  
{code}
2009-10-29 12:49:27,869 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1013: Grouping attributes can either be star (*) or a list of expressions, but not both.
Details at logfile: pig_1256845224752.log

==========================================================
Now running without a schema on either input:
{code}
grunt> A = load '1.txt';            
grunt> B = load '2.txt';            
grunt> C = cogroup A by *, B by *;
grunt> dump C; 
{code}

2009-10-29 12:55:37,202 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - Successfully stored result in: "file:/tmp/temp-319926700/tmp-1990275961"
2009-10-29 12:55:37,202 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - Records written : 2
2009-10-29 12:55:37,202 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - Bytes written : 154
2009-10-29 12:55:37,202 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - 100% complete!
2009-10-29 12:55:37,202 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - Success!!

((1,2),{(1,2)},{(1,2)})
((2,3),{(2,3)},{(2,3)})
==========================================================
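
For reference, the result above is consistent with grouping both inputs on the whole tuple. The following is only a minimal Python sketch of that behaviour, not Pig's implementation: the function name cogroup_by_star is hypothetical, the file contents are hard-coded, and values are modelled as ints although Pig would load them as bytearrays when no schema is given.

```python
from collections import defaultdict


def cogroup_by_star(*relations):
    """Model of COGROUP ... BY * : the grouping key is the entire tuple."""
    # For each key keep one bag (list) per input relation.
    groups = defaultdict(lambda: [[] for _ in relations])
    for idx, relation in enumerate(relations):
        for row in relation:
            groups[tuple(row)][idx].append(tuple(row))
    # One output record per distinct key: (key, bag_from_A, bag_from_B, ...)
    return [(key, *bags) for key, bags in sorted(groups.items())]


A = [(1, 2), (2, 3)]  # assumed contents of 1.txt
B = [(1, 2), (2, 3)]  # assumed contents of 2.txt
C = cogroup_by_star(A, B)
for record in C:
    print(record)
```

Each printed record has the same shape as the dump output: the key tuple, then one bag per input.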

Is this a bug or a feature?

Viraj

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777747#action_12777747 ] 

Pradeep Kamath commented on PIG-1064:
-------------------------------------

I can't make out from the report above what is wrong with the unit tests. I am running them all on my local box and will update with the results.

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064-4.patch, PIG-1064.patch


[jira] Commented: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779693#action_12779693 ] 

Hadoop QA commented on PIG-1064:
--------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12425360/PIG-1064-5.patch
  against trunk revision 881008.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/160/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/160/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/160/console

This message is automatically generated.

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064-4.patch, PIG-1064-5.patch, PIG-1064.patch


[jira] Updated: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1064:
----------------------------

    Attachment: PIG-1064-4.patch

Attaching a patch to fix the TestSecondarySort unit test failure.

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064-4.patch, PIG-1064.patch


[jira] Updated: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-1064:
--------------------------------

    Status: Open  (was: Patch Available)

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064.patch


[jira] Commented: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777699#action_12777699 ] 

Hadoop QA commented on PIG-1064:
--------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12424878/PIG-1064-4.patch
  against trunk revision 835499.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 15 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/155/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/155/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/155/console

This message is automatically generated.

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064-4.patch, PIG-1064.patch


[jira] Commented: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778430#action_12778430 ] 

Daniel Dai commented on PIG-1064:
---------------------------------

With this patch, "group by *" without a schema does not work anymore. I think there could be valid use cases for it, e.g., people may want to use it to count the occurrences of each distinct value with the statement "group by *; foreach generate group, COUNT(*);". It is much safer to keep "group by *" working and only disallow "cogroup by *".
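
The counting use case mentioned here can be illustrated outside Pig. This is a small Python sketch under stated assumptions, not Pig code: group_by_star_count is a hypothetical helper and the input rows are made up for illustration.

```python
from collections import Counter


def group_by_star_count(relation):
    """Model of grouping by the whole tuple and counting each group."""
    # Counter keys are the distinct tuples; values are their occurrence counts.
    return sorted(Counter(tuple(row) for row in relation).items())


rows = [(1, 2), (2, 3), (1, 2)]  # illustrative data, not from the issue
counts = group_by_star_count(rows)
for group, count in counts:
    print(group, count)
```

The sketch needs no field names, which is why the use case applies to inputs loaded without a schema.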

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064-4.patch, PIG-1064.patch


[jira] Updated: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-1064:
--------------------------------

    Status: Open  (was: Patch Available)

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064.patch


[jira] Updated: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-1064:
--------------------------------

    Attachment: PIG-1064.patch

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064.patch


[jira] Updated: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-1064:
--------------------------------

    Status: Open  (was: Patch Available)

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064-4.patch, PIG-1064-5.patch, PIG-1064.patch
>
>


[jira] Updated: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-1064:
--------------------------------

    Attachment: PIG-1064-3.patch

A recent patch (PIG-1038) added a couple of new tests that use group by star, and those tests broke with this patch. The attached patch fixes the tests.

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064.patch
>
>


[jira] Updated: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1064:
----------------------------

    Status: Patch Available  (was: Open)

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064-4.patch, PIG-1064.patch
>
>


[jira] Commented: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777265#action_12777265 ] 

Hadoop QA commented on PIG-1064:
--------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12424755/PIG-1064-2.patch
  against trunk revision 835499.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 12 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/153/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/153/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/153/console

This message is automatically generated.

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064.patch
>
>


[jira] Commented: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776527#action_12776527 ] 

Pradeep Kamath commented on PIG-1064:
-------------------------------------

The last paragraph in my previous comment should read:
If we feel that users should not cogroup on star we should prevent it in the parser. The proposed fix is easy enough that I don't think we need to restrict the use of star.

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>             Fix For: 0.6.0
>
>


[jira] Commented: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12775976#action_12775976 ] 

Pradeep Kamath commented on PIG-1064:
-------------------------------------

A proposal to fix this is to catch the case where the user specifies '*' as the cogrouping key but the corresponding input to the cogroup has no schema. In that case we would error out with the message "Cogroup by * is only allowed if the input has a schema".
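
For illustration only, the check proposed in this comment can be modelled in a few lines of Python. This is a sketch under stated assumptions, not Pig's actual Java implementation and not the contents of any attached patch: the names expand_key, plan_cogroup and CogroupStarError are hypothetical, and treating '*' as "expand to every field in the schema" is an assumption made to show why a schema is needed. The two error strings are the ones quoted in this issue.

```python
from typing import List, Optional, Sequence, Tuple


class CogroupStarError(ValueError):
    """Hypothetical error: '*' used as a cogroup key on an input with no schema."""


def expand_key(key: str, schema: Optional[Sequence[str]]) -> List[str]:
    """Expand a cogroup key into a list of column names (hypothetical helper).

    A named key stays as a single column. '*' is assumed to expand to every
    field in the schema, which is only possible when a schema was declared.
    """
    if key != "*":
        return [key]
    if schema is None:
        # Message text taken from the proposal in this comment.
        raise CogroupStarError(
            "Cogroup by * is only allowed if the input has a schema"
        )
    return list(schema)


def plan_cogroup(
    inputs: Sequence[Tuple[str, Optional[Sequence[str]], str]]
) -> List[Tuple[str, List[str]]]:
    """Validate cogroup inputs given as (alias, schema or None, key) triples."""
    expanded = [(alias, expand_key(key, schema)) for alias, schema, key in inputs]
    # Every input must contribute the same number of grouping columns.
    if len({len(cols) for _, cols in expanded}) > 1:
        raise ValueError(
            "Each COGroup input has to have the same number of inner plans"
        )
    return expanded
```

In this model, calling plan_cogroup with A and B both declared with two-field schemas succeeds, while passing None as the schema of either input raises CogroupStarError before any column counts are compared.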

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>             Fix For: 0.6.0
>
>


[jira] Commented: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780169#action_12780169 ] 

Pradeep Kamath commented on PIG-1064:
-------------------------------------

Patch committed to trunk.

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064-4.patch, PIG-1064-5.patch, PIG-1064.patch
>
>


[jira] Updated: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-1064:
--------------------------------

    Status: Patch Available  (was: Open)

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064.patch
>
>


[jira] Updated: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-1064:
--------------------------------

    Status: Patch Available  (was: Open)

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064.patch
>
>


[jira] Updated: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-1064:
--------------------------------

    Status: Patch Available  (was: Open)

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064-4.patch, PIG-1064-5.patch, PIG-1064.patch
>
>


[jira] Commented: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777392#action_12777392 ] 

Hadoop QA commented on PIG-1064:
--------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12424792/PIG-1064-3.patch
  against trunk revision 835499.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 15 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/154/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/154/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/154/console

This message is automatically generated.

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064.patch
>
>


[jira] Updated: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-1064:
--------------------------------

    Attachment: PIG-1064-5.patch

Attached a patch to ensure that group by star without a schema still works.

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064-4.patch, PIG-1064-5.patch, PIG-1064.patch
>
>


[jira] Updated: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-1064:
--------------------------------

    Assignee: Pradeep Kamath
      Status: Patch Available  (was: Open)

The patch implements the proposal to catch the case where the user specifies '*' as the cogrouping key but the corresponding input to the cogroup has no schema. In that case we issue the error message "Cogroup by * is only allowed if the input has a schema" and error out.
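
As a rough illustration of the check described above (not the code in the patch), the rule can be modeled in a few lines of Python. The names CogroupInput, FrontEndError, validate_cogroup and is_accepted are hypothetical; only the error message text is taken from the comment.

```python
from typing import List, NamedTuple, Optional


class CogroupInput(NamedTuple):
    alias: str
    schema: Optional[List[str]]  # None models a load without an "as (...)" clause
    by_star: bool                # True when the grouping key for this input is '*'


class FrontEndError(Exception):
    """Hypothetical stand-in for an error raised before the job is launched."""


ERROR_MESSAGE = "Cogroup by * is only allowed if the input has a schema"


def validate_cogroup(inputs: List[CogroupInput]) -> None:
    # Reject the plan as soon as one input groups by '*' without a schema.
    for item in inputs:
        if item.by_star and item.schema is None:
            raise FrontEndError(ERROR_MESSAGE)


def is_accepted(inputs: List[CogroupInput]) -> bool:
    # Convenience wrapper: True if validation passes, False if it errors out.
    try:
        validate_cogroup(inputs)
        return True
    except FrontEndError:
        return False
```

Under this model, the script from the report is accepted only when both A and B are loaded with a schema; a missing schema on either side is rejected.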

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064.patch
>
>


[jira] Commented: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776832#action_12776832 ] 

Hadoop QA commented on PIG-1064:
--------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12424676/PIG-1064.patch
  against trunk revision 835005.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/149/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/149/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/149/console

This message is automatically generated.

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064.patch
>
>


[jira] Updated: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-1064:
--------------------------------

    Attachment: PIG-1064-2.patch

The attached patch addresses the unit test failures. The failures were in other tests in which cogroup by * without a schema was valid in the front end. With the changes in this patch, that is no longer the case. I have removed these test cases, retaining one of them because it tests with different loadfuncs.

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064.patch
>
>


[jira] Commented: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776525#action_12776525 ] 

Pradeep Kamath commented on PIG-1064:
-------------------------------------

Cogroup needs the same arity for the grouping key from both inputs. If there is a cogroup by *, the '*' needs to be expanded so we know the arity. This is done in ProjectStarTranslator - the current code leaves the '*' as is when there is no schema. This causes problems in the backend - hence the proposed fix to catch this and error out.

If we feel that users should not cogroup on '*' we should prevent it in the parser. The proposed fix is easy enough that I don't think we need to restrict the use of '*'.
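
The arity argument can be illustrated with a small Python model. This is a simplified sketch, not Pig's ProjectStarTranslator: expand_star and key_arities are hypothetical names, and the model assumes that an unexpanded '*' counts as a single key expression. It only shows when the key arities disagree, not which error Pig reports.

```python
from typing import List, Optional

STAR = "*"


def expand_star(key: List[str], schema: Optional[List[str]]) -> List[str]:
    # With a schema, '*' becomes one key expression per field.
    # Without a schema there is nothing to expand to, so '*' is left as is.
    if key == [STAR] and schema is not None:
        return list(schema)
    return key


def key_arities(keys: List[List[str]],
                schemas: List[Optional[List[str]]]) -> List[int]:
    # Number of grouping expressions per input after star expansion.
    return [len(expand_star(k, s)) for k, s in zip(keys, schemas)]


star_keys = [[STAR], [STAR]]
one_schema = key_arities(star_keys, [None, ["b0", "b1"]])            # [1, 2]
no_schema = key_arities(star_keys, [None, None])                     # [1, 1]
both_schemas = key_arities(star_keys, [["a0", "a1"], ["b0", "b1"]])  # [2, 2]
```

With the inputs from the report, a schema on only one side gives arities that differ, while no schema on either side gives matching arities of 1, which would be consistent with that case getting past the front end.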

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>             Fix For: 0.6.0
>
>


[jira] Updated: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-1064:
--------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064-4.patch, PIG-1064-5.patch, PIG-1064.patch
>
>


[jira] Updated: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1064:
----------------------------

    Status: Open  (was: Patch Available)

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064-4.patch, PIG-1064.patch
>
>


[jira] Commented: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776520#action_12776520 ] 

Alan Gates commented on PIG-1064:
---------------------------------

Why is cogrouping on * without a schema causing trouble?  Because we can't guarantee that inputs have the same number of fields?

Why would anyone ever want to cogroup on *?  Do we need to spend any effort fixing this?

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>             Fix For: 0.6.0



[jira] Commented: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779795#action_12779795 ] 

Daniel Dai commented on PIG-1064:
---------------------------------

+1

> Behvaiour of COGROUP with and without schema when using "*" operator
> --------------------------------------------------------------------
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064-4.patch, PIG-1064-5.patch, PIG-1064.patch
