Posted to dev@pig.apache.org by "Feng Peng (JIRA)" <ji...@apache.org> on 2012/09/27 23:37:07 UTC

[jira] [Created] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Feng Peng created PIG-2937:
------------------------------

             Summary: generated field in nested foreach does not inherit the variable name as the field name
                 Key: PIG-2937
                 URL: https://issues.apache.org/jira/browse/PIG-2937
             Project: Pig
          Issue Type: Bug
            Reporter: Feng Peng


{code}
raw_data = load 'xyz' using Loader() as (field_a, field_b, field_c);
records = foreach raw_data {
  generated_field = (field_a is null ? '-' : someUDF(field_b)); 
  GENERATE
    field_c,
    generated_field
  ;
}
describe records;
{code}

One would expect generated_field to have a field name, just as field_c keeps its name from the original relation. However, Pig currently doesn't assign a field name by default. It would be nice if the variable name were assigned as the default field name.
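
Until the variable name is picked up automatically, one possible workaround is to name the field explicitly with AS in the GENERATE clause. This is only a sketch based on the script above; Loader() and someUDF are the same placeholders used there, and it has not been run against any particular Pig version.

{code}
raw_data = load 'xyz' using Loader() as (field_a, field_b, field_c);
records = foreach raw_data {
  generated_field = (field_a is null ? '-' : someUDF(field_b));
  GENERATE
    field_c,
    -- explicit alias, so the output field is named regardless of the default
    generated_field AS generated_field
  ;
}
describe records;
{code}

The explicit AS keeps downstream statements that reference records.generated_field independent of whichever default naming Pig applies.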


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Posted by "Jonathan Coveney (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487217#comment-13487217 ] 

Jonathan Coveney commented on PIG-2937:
---------------------------------------

Bump. Would love it if someone could take a look or opine.
                
> generated field in nested foreach does not inherit the variable name as the field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>            Assignee: Jonathan Coveney
>             Fix For: 0.11, 0.12
>
>         Attachments: PIG-2937-0.patch, PIG-2937-1.patch
>
>

[jira] [Commented] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Posted by "Jonathan Coveney (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481841#comment-13481841 ] 

Jonathan Coveney commented on PIG-2937:
---------------------------------------

This is definitely important and useful.

To my eye, whenever a field has no schema (here, generated_field inside the GENERATE), we should do our best to fill it in. For a binary conditional and similar expressions we know the return type, which gives us the type, and the variable name (i.e. generated_field) gives us the field name.

I don't think this is a deep change, but it is a tricky one: Pig doesn't currently thread this Schema information through, and getting it to do so can be hard.
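
As a sketch of the behaviour described above (the input name, field names and types below are made up for illustration): both branches of the binary conditional are chararray, so the type is known, and the nested variable would supply the name.

{code}
-- hypothetical input with a declared schema
users = load 'users' as (id:int, nickname:chararray);
named = foreach users {
  -- bincond whose branches are both chararray
  display_name = (nickname is null ? 'anonymous' : nickname);
  generate id, display_name;
}
-- with the proposed behaviour, one would hope to see the second field
-- reported with the name display_name and the type chararray
describe named;
{code}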
                
> generated field in nested foreach does not inherit the variable name as the field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>

[jira] [Updated] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Posted by "Jonathan Coveney (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Coveney updated PIG-2937:
----------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

It's in. Thanks Rohini!
                
> generated field in nested foreach does not inherit the variable name as the field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>            Assignee: Jonathan Coveney
>             Fix For: 0.11, 0.12
>
>         Attachments: PIG-2937-0.patch, PIG-2937-1.patch, PIG-2937-2.patch, PIG-2937-3_nowhitespace.patch, PIG-2937-3_whitespace.patch, PIG-2937-4_whitespace.patch
>
>

[jira] [Commented] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Posted by "Rohini Palaniswamy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494830#comment-13494830 ] 

Rohini Palaniswamy commented on PIG-2937:
-----------------------------------------

Sorry about the delay, Jon. The patch looks good and the logical plan schema is now correct. It would be good if we could add a test case. I just had a recent reminder from Santhosh to give a +1 only after ensuring there is a unit test.
                
> generated field in nested foreach does not inherit the variable name as the field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>            Assignee: Jonathan Coveney
>             Fix For: 0.11, 0.12
>
>         Attachments: PIG-2937-0.patch, PIG-2937-1.patch, PIG-2937-2.patch
>
>

[jira] [Updated] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Posted by "Jonathan Coveney (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Coveney updated PIG-2937:
----------------------------------

    Attachment: PIG-2937-4_whitespace.patch
    
> generated field in nested foreach does not inherit the variable name as the field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>            Assignee: Jonathan Coveney
>             Fix For: 0.11, 0.12
>
>         Attachments: PIG-2937-0.patch, PIG-2937-1.patch, PIG-2937-2.patch, PIG-2937-3_nowhitespace.patch, PIG-2937-3_whitespace.patch, PIG-2937-4_whitespace.patch
>
>

[jira] [Commented] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Posted by "Rohini Palaniswamy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488444#comment-13488444 ] 

Rohini Palaniswamy commented on PIG-2937:
-----------------------------------------

Jon, 
   I compiled the logical plan for the given query with your patch. The schema for the generated field still does not have a variable name associated with it. I did not dig further to see where the problem is, though.
   Also, can we include a unit test? Thanks.
                
> generated field in nested foreach does not inherit the variable name as the field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>            Assignee: Jonathan Coveney
>             Fix For: 0.11, 0.12
>
>         Attachments: PIG-2937-0.patch, PIG-2937-1.patch
>
>

[jira] [Updated] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Posted by "Jonathan Coveney (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Coveney updated PIG-2937:
----------------------------------

    Attachment: PIG-2937-2.patch

Rohini,

Not sure what happened. I thought it worked locally, but maybe I was wrong. Either way, I uploaded a version that should work. Would love some eyes. Thanks!
                
> generated field in nested foreach does not inherit the variable name as the field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>            Assignee: Jonathan Coveney
>             Fix For: 0.11, 0.12
>
>         Attachments: PIG-2937-0.patch, PIG-2937-1.patch, PIG-2937-2.patch
>
>

[jira] [Assigned] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Posted by "Jonathan Coveney (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Coveney reassigned PIG-2937:
-------------------------------------

    Assignee: Jonathan Coveney
    
> generated field in nested foreach does not inherit the variable name as the field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>            Assignee: Jonathan Coveney
>             Fix For: 0.11, 0.12
>
>         Attachments: PIG-2937-0.patch
>
>

[jira] [Updated] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Posted by "Jonathan Coveney (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Coveney updated PIG-2937:
----------------------------------

    Attachment: PIG-2937-0.patch

I've attached a patch that fixes this and passes test-commit.

The approach: in a generate, when a field references a valid LogicalExpression that for some reason has no alias associated with it (this is uncommon; as far as I know it only really happens in the context Feng posted), the alias name becomes the alias in the Schema.
                
> generated field in nested foreach does not inherit the variable name as the field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>         Attachments: PIG-2937-0.patch
>
>

[jira] [Updated] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Posted by "Jonathan Coveney (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Coveney updated PIG-2937:
----------------------------------

    Attachment: PIG-2937-3_whitespace.patch
                PIG-2937-3_nowhitespace.patch

Rohini,

Thanks for taking a look. I've attached a patch with and without whitespace changes. I added a test, and refactored TestLogicalPlanGenerator (where I put the test) a little bit. The patch without whitespace is to make reviewing easier.

Please let me know your thoughts.
                
> generated field in nested foreach does not inherit the variable name as the field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>            Assignee: Jonathan Coveney
>             Fix For: 0.11, 0.12
>
>         Attachments: PIG-2937-0.patch, PIG-2937-1.patch, PIG-2937-2.patch, PIG-2937-3_nowhitespace.patch, PIG-2937-3_whitespace.patch
>
>

[jira] [Commented] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Posted by "Feng Peng (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497417#comment-13497417 ] 

Feng Peng commented on PIG-2937:
--------------------------------

Great, thanks guys!
                
> generated field in nested foreach does not inherit the variable name as the field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>            Assignee: Jonathan Coveney
>             Fix For: 0.11, 0.12
>
>         Attachments: PIG-2937-0.patch, PIG-2937-1.patch, PIG-2937-2.patch, PIG-2937-3_nowhitespace.patch, PIG-2937-3_whitespace.patch
>
>

[jira] [Commented] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Posted by "Rohini Palaniswamy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497319#comment-13497319 ] 

Rohini Palaniswamy commented on PIG-2937:
-----------------------------------------

Thanks Jon. +1. Patch looks good. 

One minor nitpick: it would be nice to give the test a more relevant name than testAutomaticallyMadeName, something like testRelationAliasForBinCond. But I am not going to insist.
                
> generated field in nested foreach does not inherit the variable name as the field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>            Assignee: Jonathan Coveney
>             Fix For: 0.11, 0.12
>
>         Attachments: PIG-2937-0.patch, PIG-2937-1.patch, PIG-2937-2.patch, PIG-2937-3_nowhitespace.patch, PIG-2937-3_whitespace.patch
>
>

[jira] [Commented] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Posted by "Feng Peng (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483456#comment-13483456 ] 

Feng Peng commented on PIG-2937:
--------------------------------

Thanks [~jcoveney]!

* The change in ProjectExpression.java seems to be a whitespace change only.
* The change in LogicalPlanBuilder.java seems to be a reasonable best-effort to me.

Non-committer +1
                
> generated field in nested foreach does not inherit the variable name as the field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>            Assignee: Jonathan Coveney
>             Fix For: 0.11, 0.12
>
>         Attachments: PIG-2937-0.patch, PIG-2937-1.patch
>
>

[jira] [Commented] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Posted by "Jonathan Coveney (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483465#comment-13483465 ] 

Jonathan Coveney commented on PIG-2937:
---------------------------------------

I fixed a typo in ProjectExpression :)
                
> generated field in nested foreach does not inherit the variable name as the field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>            Assignee: Jonathan Coveney
>             Fix For: 0.11, 0.12
>
>         Attachments: PIG-2937-0.patch, PIG-2937-1.patch
>
>

[jira] [Updated] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Posted by "Jonathan Coveney (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Coveney updated PIG-2937:
----------------------------------

    Attachment: PIG-2937-1.patch
    
> generated field in nested foreach does not inherit the variable name as the field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>            Assignee: Jonathan Coveney
>             Fix For: 0.11, 0.12
>
>         Attachments: PIG-2937-0.patch, PIG-2937-1.patch
>
>

[jira] [Updated] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Posted by "Jonathan Coveney (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Coveney updated PIG-2937:
----------------------------------

    Fix Version/s: 0.12
                   0.11
           Status: Patch Available  (was: In Progress)
    
> generated field in nested foreach does not inherit the variable name as the field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>            Assignee: Jonathan Coveney
>             Fix For: 0.11, 0.12
>
>         Attachments: PIG-2937-0.patch
>
>

[jira] [Commented] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Posted by "Feng Peng (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488045#comment-13488045 ] 

Feng Peng commented on PIG-2937:
--------------------------------

Co-bump. Can any committer take a look at the change? Thanks!
                
> generated field in nested foreach does not inherit the variable name as the field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>            Assignee: Jonathan Coveney
>             Fix For: 0.11, 0.12
>
>         Attachments: PIG-2937-0.patch, PIG-2937-1.patch


[jira] [Commented] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Posted by "Jonathan Coveney (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497329#comment-13497329 ] 

Jonathan Coveney commented on PIG-2937:
---------------------------------------

I am horrible at naming. I'll change the name to something more like that, and then commit.
                
> generated field in nested foreach does not inherit the variable name as the field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>            Assignee: Jonathan Coveney
>             Fix For: 0.11, 0.12
>
>         Attachments: PIG-2937-0.patch, PIG-2937-1.patch, PIG-2937-2.patch, PIG-2937-3_nowhitespace.patch, PIG-2937-3_whitespace.patch


[jira] [Work started] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

Posted by "Jonathan Coveney (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on PIG-2937 started by Jonathan Coveney.

> generated field in nested foreach does not inherit the variable name as the field name
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-2937
>                 URL: https://issues.apache.org/jira/browse/PIG-2937
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Feng Peng
>            Assignee: Jonathan Coveney
>             Fix For: 0.11, 0.12
>
>         Attachments: PIG-2937-0.patch
