Posted to dev@pig.apache.org by "Allan Avendaño (JIRA)" <ji...@apache.org> on 2012/06/06 18:22:23 UTC
[jira] [Created] (PIG-2743) Output Schema
Allan Avendaño created PIG-2743:
-----------------------------------
Summary: Output Schema
Key: PIG-2743
URL: https://issues.apache.org/jira/browse/PIG-2743
Project: Pig
Issue Type: Sub-task
Reporter: Allan Avendaño
Assignee: Allan Avendaño
For the rank operator, I was considering the following output schema.
For example:
A = load 'data' as (x:int,y:chararray,z:int,rz:chararray);
C = rank A by x;
So the output schema could be:
C: {x: int,y: chararray,z: int,rz: chararray,A::rank: int}
In general:
{<schema_of_working_alias>,<alias>::rank#int}
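To make the proposed layout concrete, here is a minimal Python sketch that models it. It is not Pig code and not the operator's implementation: the helper names `appended_rank_schema` and `describe` are hypothetical, and the schema is represented simply as a list of (name, type) pairs.

```python
from typing import List, Tuple

Field = Tuple[str, str]  # (field name, Pig type name); an assumed representation

def appended_rank_schema(alias: str, fields: List[Field]) -> List[Field]:
    """Return the input fields followed by '<alias>::rank' of type int."""
    return list(fields) + [(f"{alias}::rank", "int")]

def describe(alias: str, fields: List[Field]) -> str:
    """Format a schema the way the example above writes it."""
    body = ",".join(f"{name}: {ftype}" for name, ftype in fields)
    return f"{alias}: {{{body}}}"

# Fields of alias A, taken from the load statement in the example.
A_FIELDS = [("x", "int"), ("y", "chararray"), ("z", "int"), ("rz", "chararray")]

print(describe("C", appended_rank_schema("A", A_FIELDS)))
```

Under these assumptions the printed schema matches the one proposed for C, with the rank as the last field.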
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (PIG-2743) Output Schema
Posted by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gianmarco De Francisci Morales resolved PIG-2743.
-------------------------------------------------
Resolution: Fixed
[jira] [Commented] (PIG-2743) Output Schema
Posted by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291801#comment-13291801 ]
Gianmarco De Francisci Morales commented on PIG-2743:
-----------------------------------------------------
The alternative would be to prepend the rank to the tuple (akin to line numbers).
The advantage is that you always know where the rank field ends up (i.e. $0).
I have no strong opinion on it, though.
Does anybody else care to comment?
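The positional argument can be illustrated with a small Python model. This is only a sketch under assumed names (`with_rank` is hypothetical and rows are plain tuples), not how Pig builds its tuples.

```python
from typing import Tuple

def with_rank(row: Tuple, rank: int, prepend: bool) -> Tuple:
    """Attach a rank to a row, either as the first or the last field."""
    return (rank, *row) if prepend else (*row, rank)

narrow = (7, "a")          # a two-field input row
wide = (7, "a", 3, "b")    # a four-field input row

# Appended: the rank's index equals the input width, so it moves with the schema.
print(len(with_rank(narrow, 1, prepend=False)) - 1)  # 2
print(len(with_rank(wide, 1, prepend=False)) - 1)    # 4

# Prepended: the rank is always at index 0, i.e. $0 in positional notation.
print(with_rank(narrow, 1, prepend=True)[0])  # 1
print(with_rank(wide, 1, prepend=True)[0])    # 1
```

With the rank appended, a script that refers to it by position has to change whenever the input gains or loses a field; with the rank prepended, $0 stays valid.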
[jira] [Work started] (PIG-2743) Output Schema
Posted by "Allan Avendaño (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on PIG-2743 started by Allan Avendaño.
[jira] [Commented] (PIG-2743) Output Schema
Posted by "Allan Avendaño (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400066#comment-13400066 ]
Allan Avendaño commented on PIG-2743:
-------------------------------------
Currently, I put the rank field at the beginning of the tuple.
I also made two small changes to the schema:
1. The rank field is a long.
2. The field is named "rank".
All changes are reflected on ReviewBoard: https://reviews.apache.org/r/5523/diff/#index_header
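As a rough illustration of what this comment describes, the following Python sketch models the revised layout for the example from the issue. It only mirrors the comment (rank first, type long, name "rank"); the helper names are hypothetical, and the schema in the reviewed patch or in a released Pig version may differ.

```python
from typing import List, Tuple

Field = Tuple[str, str]  # (field name, Pig type name); an assumed representation

def prepended_rank_schema(fields: List[Field]) -> List[Field]:
    """Return a 'rank' field of type long followed by the input fields."""
    return [("rank", "long")] + list(fields)

def describe(alias: str, fields: List[Field]) -> str:
    """Format a schema in the style used in the issue description."""
    body = ",".join(f"{name}: {ftype}" for name, ftype in fields)
    return f"{alias}: {{{body}}}"

# Fields of alias A from the issue's load statement.
A_FIELDS = [("x", "int"), ("y", "chararray"), ("z", "int"), ("rz", "chararray")]

print(describe("C", prepended_rank_schema(A_FIELDS)))
```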
[jira] [Commented] (PIG-2743) Output Schema
Posted by "Cristina L. Abad (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399517#comment-13399517 ]
Cristina L. Abad commented on PIG-2743:
---------------------------------------
I would also find it helpful if the rank ends up in position $0. Does anybody think this would have any disadvantages?