Posted to dev@pig.apache.org by "Araceli Henley (Created) (JIRA)" <ji...@apache.org> on 2011/11/16 00:31:54 UTC

[jira] [Created] (PIG-2374) streaming regression with dotNext

streaming regression with dotNext
---------------------------------

                 Key: PIG-2374
                 URL: https://issues.apache.org/jira/browse/PIG-2374
             Project: Pig
          Issue Type: Bug
         Environment: hadoopApache Pig version 0.9.2.1111101150 (r1200499)
compiled Nov 10 2011, 19:50:15
 -bash-3.1$ hadoop version
Hadoop 0.23.0.1111080202

Subversion http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.23.0/hadoop-common-project/hadoop-common -r 1196973
Compiled by hadoopqa on Tue Nov  8 02:12:04 PST 2011
From source with checksum 4e42b2d96c899a98a8ab8c7cc23f27ae

            Reporter: Araceli Henley
             Fix For: 0.9.2


Streaming seems to be broken in dotNext; several tests are failing.
In the script below, the output of C is clean.
The output of D, which is C streamed through CMD, contains control characters in some of the output.

define CMD `perl GroupBy.pl '\t' 0` ship('/homes/monster/pigtest/pigtest_next/pigharness/dist/pig_harness/libexec/PigTest/GroupBy.pl');
A = load '/user/user1/pig/tests/data/singlefile/studenttab10k';
B = group A by $0;
C = foreach B generate flatten(A);
D = stream C through CMD;
store C into '/user/user1/pig/out/user1.1321117428/ComputeSpec_7_C.out';
store D into '/user/user1/pig/out/user1.1321117428/ComputeSpec_7_D.out';



Other streaming tests that fail with control characters:
TEST FAILED <ComputeSpec_7>
TEST FAILED <ComputeSpec_8>
TEST FAILED <ComputeSpec_10>
TEST FAILED <ComputeSpec_11>
TEST FAILED <ComputeSpec_12>
TEST FAILED <JobManagement_2>
TEST FAILED <JobManagement_3>
TEST FAILED <StreamingIO_4>
TEST FAILED <NonStreaming_1>
TEST FAILED <MultiQuery_21>
...


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (PIG-2374) streaming regression with dotNext

Posted by "Daniel Dai (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-2374:
----------------------------

    Attachment: PIG-2374-1.patch
    

        

[jira] [Commented] (PIG-2374) streaming regression with dotNext

Posted by "Olga Natkovich (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164769#comment-13164769 ] 

Olga Natkovich commented on PIG-2374:
-------------------------------------

I think Ashutosh is bringing up a really good point. We always seem to be fixing things in Pig because, understandably, it is easier for us. However, if Hadoop is breaking the contract, they should be fixing this, especially if we have to pay a performance penalty for it.
                
> streaming regression with dotNext
> ---------------------------------
>
>                 Key: PIG-2374
>                 URL: https://issues.apache.org/jira/browse/PIG-2374
>             Project: Pig
>          Issue Type: Bug
>         Environment: hadoopApache Pig version 0.9.2.1111101150 (r1200499)
> compiled Nov 10 2011, 19:50:15
>  -bash-3.1$ hadoop version
> Hadoop 0.23.0.1111080202
> Subversion http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.23.0/hadoop-common-project/hadoop-common -r 1196973
> Compiled by hadoopqa on Tue Nov  8 02:12:04 PST 2011
> From source with checksum 4e42b2d96c899a98a8ab8c7cc23f27ae
>            Reporter: Araceli Henley
>            Assignee: Daniel Dai
>              Labels: hadoop2.0
>             Fix For: 0.9.2
>
>         Attachments: PIG-2374-1.patch

        

[jira] [Commented] (PIG-2374) streaming regression with dotNext

Posted by "Daniel Dai (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161358#comment-13161358 ] 

Daniel Dai commented on PIG-2374:
---------------------------------

This is caused by HADOOP-6109 (0.21 and beyond). After HADOOP-6109, Text.getBytes() can return a byte array larger than Text.getLength(). In Pig code, OutputHandler:92, we use the byte array and ignore the length. We need to either:
1. Ask Hadoop to roll back HADOOP-6109
2. Hunt down every place in Pig where we use getBytes() but ignore the length
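
To make the failure mode concrete, here is a minimal sketch in Java. GrowableText is a hypothetical stand-in written for this illustration, not Hadoop's Text class; it only mimics the relevant behaviour, namely that getBytes() exposes a backing array whose capacity can exceed getLength(). The doubling growth policy is an assumption chosen for the demo and is not taken from HADOOP-6109.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for a Text-like buffer; NOT Hadoop's implementation.
final class GrowableText {
    private byte[] bytes = new byte[0];
    private int length = 0;

    // Append UTF-8 bytes, growing the backing array geometrically
    // (assumed policy), so capacity may end up larger than length.
    void append(String s) {
        byte[] src = s.getBytes(StandardCharsets.UTF_8);
        int needed = length + src.length;
        if (needed > bytes.length) {
            // Arrays.copyOf pads the new tail with zero bytes.
            bytes = Arrays.copyOf(bytes, Math.max(needed, bytes.length * 2));
        }
        System.arraycopy(src, 0, bytes, length, src.length);
        length = needed;
    }

    byte[] getBytes() { return bytes; }   // whole backing array
    int getLength() { return length; }    // number of valid bytes
}

final class GetBytesDemo {
    // Buggy pattern: decodes the whole backing array, padding included.
    static String decodeIgnoringLength(GrowableText t) {
        return new String(t.getBytes(), StandardCharsets.UTF_8);
    }

    // Safe pattern: decodes only the first getLength() bytes.
    static String decodeWithLength(GrowableText t) {
        return new String(t.getBytes(), 0, t.getLength(), StandardCharsets.UTF_8);
    }

    // Safe pattern when a right-sized byte[] is required (costs one copy).
    static byte[] copyValidBytes(GrowableText t) {
        return Arrays.copyOf(t.getBytes(), t.getLength());
    }

    public static void main(String[] args) {
        GrowableText t = new GrowableText();
        t.append("ab");
        t.append("c"); // capacity grows to 4 while length is 3
        System.out.println(decodeIgnoringLength(t).length()); // 4: includes one NUL
        System.out.println(decodeWithLength(t));              // abc
    }
}
```

The trailing NUL in the first result is the kind of control character that would show up in streamed output if the length were ignored.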
                

        

[jira] [Commented] (PIG-2374) streaming regression with dotNext

Posted by "Daniel Dai (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164738#comment-13164738 ] 

Daniel Dai commented on PIG-2374:
---------------------------------

Yes, this is a break of contract and might hit other projects as well. 
                

        

[jira] [Resolved] (PIG-2374) streaming regression with dotNext

Posted by "Daniel Dai (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-2374.
-----------------------------

      Resolution: Fixed
    Hadoop Flags: Reviewed

Unit tests pass. test-patch:
     [exec] -1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
     [exec]                         Please justify why no tests are needed for this patch.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     -1 release audit.  The applied patch generated 463 release audit warnings (more than the trunk's current 456 warnings).

No tests are included since this is a regression. No new files were added, so the release audit warning can be ignored.

Patch committed to trunk/0.10/0.9
                

        

[jira] [Commented] (PIG-2374) streaming regression with dotNext

Posted by "Ashutosh Chauhan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164724#comment-13164724 ] 

Ashutosh Chauhan commented on PIG-2374:
---------------------------------------

We should push for backward compatibility of getBytes() on Hadoop for this. The way it is fixed with this patch will necessitate an extra buffer copy in Pig, an unnecessary performance hit.
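
The cost being discussed can be sketched as follows. This is an illustration only and does not reproduce PIG-2374-1.patch; BoundedWriteDemo and its method names are hypothetical. It contrasts trimming the buffer with an extra copy against passing an offset and length to a consumer that accepts them, using the standard java.io.OutputStream.write(byte[], int, int). Where the downstream API only accepts a whole byte[], the copy cannot be avoided this way.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

// Hypothetical helper names; a sketch of the trade-off, not Pig's code.
final class BoundedWriteDemo {
    // Allocates a right-sized array first: one extra copy per record.
    static void writeWithCopy(byte[] buf, int len, OutputStream out) throws IOException {
        byte[] exact = Arrays.copyOf(buf, len);
        out.write(exact);
    }

    // Writes only the valid prefix: no intermediate array is allocated.
    static void writeBounded(byte[] buf, int len, OutputStream out) throws IOException {
        out.write(buf, 0, len);
    }

    public static void main(String[] args) throws IOException {
        // Backing array with two padding bytes after three valid bytes.
        byte[] buf = {'a', 'b', 'c', 0, 0};
        ByteArrayOutputStream copied = new ByteArrayOutputStream();
        ByteArrayOutputStream bounded = new ByteArrayOutputStream();
        writeWithCopy(buf, 3, copied);
        writeBounded(buf, 3, bounded);
        // Both streams now hold the same three valid bytes.
        System.out.println(Arrays.equals(copied.toByteArray(), bounded.toByteArray()));
    }
}
```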
                

        

[jira] [Commented] (PIG-2374) streaming regression with dotNext

Posted by "Thejas M Nair (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163940#comment-13163940 ] 

Thejas M Nair commented on PIG-2374:
------------------------------------

+1
                

        

[jira] [Commented] (PIG-2374) streaming regression with dotNext

Posted by "Daniel Dai (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162945#comment-13162945 ] 

Daniel Dai commented on PIG-2374:
---------------------------------

Unit tests pass. No tests are included because the current e2e tests already cover this.
                

        

[jira] [Assigned] (PIG-2374) streaming regression with dotNext

Posted by "Daniel Dai (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai reassigned PIG-2374:
-------------------------------

    Assignee: Daniel Dai
    

        

[jira] [Commented] (PIG-2374) streaming regression with dotNext

Posted by "Daniel Dai (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162539#comment-13162539 ] 

Daniel Dai commented on PIG-2374:
---------------------------------

PIG-2374-1.patch uses approach 2.
                