Posted to dev@pig.apache.org by "Santhosh Srinivasan (JIRA)" <ji...@apache.org> on 2009/02/10 20:17:59 UTC
[jira] Created: (PIG-664) Semantics of * is not consistent
Semantics of * is not consistent
--------------------------------
Key: PIG-664
URL: https://issues.apache.org/jira/browse/PIG-664
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: types_branch
Reporter: Santhosh Srinivasan
Assignee: Santhosh Srinivasan
Fix For: types_branch
The semantics of * is not consistent in PIG. The use of * with generate results in all the columns of the record being flattened. However, the use of * as an input to a UDF results in a tuple (wrapped in another tuple). For consistency, * should always result in all the columns of the record (i.e., flattened). The use of * occurs in:
1. Foreach generate: E.g.: foreach input generate *;
2. Input to UDFs: E.g. foreach input generate myUDF(*);
3. Order by: E.g.: order input by *;
4. (Co)Group: E.g.: group a by *; cogroup a by *, b by *;
In terms of implementation, this involves rolling back the fix introduced in PIG-597 and fixing the following builtin UDFs:
1. ARITY - Should return the size of the input tuple instead of extracting the first column of the input tuple
2. SIZE - Should return the size of the input tuple instead of extracting the first column of the input tuple
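The difference between the wrapped and flattened forms, and why ARITY and SIZE need to change with it, can be sketched with a small model. This is only an illustration in Python, not Pig code: plain tuples stand in for Pig records, and every function name below is hypothetical rather than part of Pig's API.

```python
# Illustrative model only: Python tuples stand in for Pig records.
# None of these helper names exist in Pig; they are assumptions for the sketch.

def star_wrapped(record):
    """Behaviour described for UDF input: the record arrives as one tuple
    nested inside the argument tuple."""
    return (tuple(record),)

def star_flattened(record):
    """Behaviour proposed for every use of *: the argument tuple holds the
    record's columns directly."""
    return tuple(record)

def size_first_column(args):
    """SIZE/ARITY as described before the fix: extract the first column of
    the input tuple and measure that."""
    return len(args[0])

def size_of_input(args):
    """SIZE/ARITY as proposed: return the size of the input tuple itself."""
    return len(args)

record = ("alice", 30, "NYC")

# Wrapped input needs the "first column" variant to see three columns.
print(size_first_column(star_wrapped(record)))   # 3
# Flattened input gives the same answer with the simpler variant.
print(size_of_input(star_flattened(record)))     # 3
# Mixing the two conventions silently reports one column.
print(size_of_input(star_wrapped(record)))       # 1
```

The last line shows why the two changes have to land together: a UDF that measures its input directly returns 1 if * is still passed wrapped.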
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-664) Semantics of * is not consistent
Posted by "Yiping Han (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672372#action_12672372 ]
Yiping Han commented on PIG-664:
--------------------------------
I second Santhosh. In PIG 1.x, * in a UDF parameter list expands to a flattened list of columns. When converting to PIG 2.0, this creates a lot of inconvenience. * should always generate flattened columns.
[jira] Commented: (PIG-664) Semantics of * is not consistent
Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676789#action_12676789 ]
Olga Natkovich commented on PIG-664:
------------------------------------
I am reviewing these changes. The patch looks good. I am running the tests now and will commit once they complete.
[jira] Updated: (PIG-664) Semantics of * is not consistent
Posted by "Santhosh Srinivasan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Santhosh Srinivasan updated PIG-664:
------------------------------------
Patch Info: [Patch Available]
[jira] Resolved: (PIG-664) Semantics of * is not consistent
Posted by "Santhosh Srinivasan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Santhosh Srinivasan resolved PIG-664.
-------------------------------------
Resolution: Fixed
Hadoop Flags: [Reviewed]
Patch has been committed.
[jira] Updated: (PIG-664) Semantics of * is not consistent
Posted by "Santhosh Srinivasan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Santhosh Srinivasan updated PIG-664:
------------------------------------
Attachment: PIG-664.patch
Attached patch fixes the issue. All unit test cases pass.