Posted to dev@pig.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2010/12/08 00:45:01 UTC

[jira] Created: (PIG-1758) Deep cast of complex type

Deep cast of complex type
-------------------------

                 Key: PIG-1758
                 URL: https://issues.apache.org/jira/browse/PIG-1758
             Project: Pig
          Issue Type: New Feature
          Components: impl
    Affects Versions: 0.8.0
            Reporter: Daniel Dai
            Assignee: Daniel Dai
             Fix For: 0.9.0


Pig does not handle deep casts from bag -> bag or tuple -> tuple. For example, the following script does not produce the desired result:
{code}
a = load '1.txt' as (a0:bag{t:tuple(i0:double)});
b = foreach a generate (bag{tuple(int)})a0;
dump b;
{code}

The tuples inside the resulting bag still contain double values, not int.

PIG-613 fixed the case where we cast bytearray -> bag/tuple, in which the complex type including its inner types is taken into account, but bag -> bag and tuple -> tuple casts still have no effect.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] [Commented] (PIG-1758) Deep cast of complex type

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011718#comment-13011718 ] 

Dmitriy V. Ryaboy commented on PIG-1758:
----------------------------------------

The Cassandra fellows ran into a need for this in their CassandraStorage implementation. Any objections to backporting this into 0.8.1?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (PIG-1758) Deep cast of complex type

Posted by "Santhosh Srinivasan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011888#comment-13011888 ] 

Santhosh Srinivasan commented on PIG-1758:
------------------------------------------

Should be fine if the patch applies cleanly and all the unit tests pass. I am concerned about having patches in JIRAs that are not documented in release notes as known issues (with patches).

[jira] Updated: (PIG-1758) Deep cast of complex type

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1758:
----------------------------

    Attachment: PIG-1758-2.patch

PIG-1758-2.patch addresses findbugs warnings.

[jira] [Commented] (PIG-1758) Deep cast of complex type

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012796#comment-13012796 ] 

Dmitriy V. Ryaboy commented on PIG-1758:
----------------------------------------

In other words, it's impossible to cast a bag of tuples of bytearrays into a bag of tuples of longs without this patch. 
It simply makes the POCast operator recurse into nested structures and keep applying the caster as needed.
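
As a rough illustration of that recursion (this is not the actual POCast code), here is a minimal Java sketch in which a bag is modeled as a List of tuples and a tuple as a List of fields. The class DeepCastSketch, the Schema types, and TO_INT are hypothetical names invented for this example; Pig's real operator works on its own data and schema classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical model, not Pig's classes: bag = List of tuples, tuple = List of fields.
final class DeepCastSketch {

    /** A node of the target schema (stand-in for a field schema). */
    interface Schema {}

    /** Leaf node: a scalar caster, for example double -> int. */
    static final class Scalar implements Schema {
        final Function<Object, Object> caster;
        Scalar(Function<Object, Object> caster) { this.caster = caster; }
    }

    /** Tuple node: one child schema per field position. */
    static final class TupleSchema implements Schema {
        final List<Schema> fields;
        TupleSchema(List<Schema> fields) { this.fields = fields; }
    }

    /** Bag node: every tuple in the bag shares one tuple schema. */
    static final class BagSchema implements Schema {
        final TupleSchema inner;
        BagSchema(TupleSchema inner) { this.inner = inner; }
    }

    /** Scalar caster used in the example: any Number to Integer (truncating). */
    static final Scalar TO_INT =
        new Scalar(v -> Integer.valueOf(((Number) v).intValue()));

    /** Walks value and target schema together, applying casters at the leaves. */
    static Object deepCast(Object value, Schema target) {
        if (value == null) {
            return null;
        }
        if (target instanceof Scalar) {
            return ((Scalar) target).caster.apply(value);
        }
        List<?> items = (List<?>) value;
        List<Object> out = new ArrayList<>(items.size());
        if (target instanceof BagSchema) {
            // Recurse into each tuple of the bag with the bag's inner schema.
            for (Object tuple : items) {
                out.add(deepCast(tuple, ((BagSchema) target).inner));
            }
            return out;
        }
        // Tuple: recurse into each field with the schema at the same position.
        TupleSchema ts = (TupleSchema) target;
        if (ts.fields.size() != items.size()) {
            throw new IllegalArgumentException("tuple size does not match schema");
        }
        for (int i = 0; i < items.size(); i++) {
            out.add(deepCast(items.get(i), ts.fields.get(i)));
        }
        return out;
    }

    public static void main(String[] args) {
        // Mirrors (bag{tuple(int)})a0 where a0 was loaded as bag{tuple(double)}.
        Schema target = new BagSchema(new TupleSchema(Arrays.<Schema>asList(TO_INT)));
        Object a0 = Arrays.asList(Arrays.asList(1.0), Arrays.asList(2.5));
        System.out.println(deepCast(a0, target));
    }
}
```

Run as-is, main prints [[1], [2]]: the caster reaches the doubles inside the tuples of the bag, which a shallow cast of the outer bag alone would leave untouched.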

[jira] [Commented] (PIG-1758) Deep cast of complex type

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012791#comment-13012791 ] 

Daniel Dai commented on PIG-1758:
---------------------------------

Casting bytes into a bag/tuple is handled by PIG-613. This JIRA handles casting bag to bag and tuple to tuple.

[jira] Updated: (PIG-1758) Deep cast of complex type

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1758:
----------------------------

    Attachment: PIG-1758-1.patch

[jira] Resolved: (PIG-1758) Deep cast of complex type

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-1758.
-----------------------------

      Resolution: Fixed
    Release Note: Enable deep cast of a tuple/bag to a tuple/bag of different inner schema
    Hadoop Flags: [Reviewed]

Review notes:
https://reviews.apache.org/r/152/

Patch committed to trunk.

[jira] [Commented] (PIG-1758) Deep cast of complex type

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012133#comment-13012133 ] 

Alan Gates commented on PIG-1758:
---------------------------------

I have some concerns on this.  I want to review it more completely before I vote one way or another.  If I promise to get to it later today can you hold any checkins until then?  Thanks.

[jira] [Commented] (PIG-1758) Deep cast of complex type

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011726#comment-13011726 ] 

Olga Natkovich commented on PIG-1758:
-------------------------------------

This looks like a pretty involved change and not a bug fix but a new feature. Also, I am not sure if it is completely backward compatible. I would be hesitant to backport this into the 0.8 branch.

How about just making the patch available for the 0.8 branch so that they can use it with the branch if they so choose?

[jira] Commented: (PIG-1758) Deep cast of complex type

Posted by "Yan Zhou (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969982#action_12969982 ] 

Yan Zhou commented on PIG-1758:
-------------------------------

+1

[jira] [Commented] (PIG-1758) Deep cast of complex type

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012237#comment-13012237 ] 

Alan Gates commented on PIG-1758:
---------------------------------

> LoadCaster interface provides methods to cast complex types -- but they are silently ignored by Pig

So before this patch casting of bytearray to tuple or bag fails in 0.8?  That is definitely a fix worth back porting.

I have a hard time thinking of the addition of deep casting as a bug fix, since we have never had it in the past.

Since this is relatively small and did not work in the past I am ok with back porting it.  I do think we should add something in the release notes that says the feature is experimental.

I have a concern in general about when we back port features and when we do not, but I will start a thread on that on the dev list.


[jira] [Commented] (PIG-1758) Deep cast of complex type

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011758#comment-13011758 ] 

Dmitriy V. Ryaboy commented on PIG-1758:
----------------------------------------

Olga, I would contend that this is indeed a bug. The fact that you cannot provide a cast of a complex type is not documented anywhere in 0.8, and in fact the LoadCaster interface provides methods to cast complex types -- but they are silently ignored by Pig, which cost the Cassandra developers (and me) a good 3 days of debugging.

I will post a 0.8 version of this patch tomorrow; let's look at how deep it has to go.

[jira] [Commented] (PIG-1758) Deep cast of complex type

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011838#comment-13011838 ] 

Dmitriy V. Ryaboy commented on PIG-1758:
----------------------------------------

Actually, as it turns out, this patch applies cleanly to the 0.8 branch, and the tests pass.
I think this is safe to commit. 
I'll wait a day since Olga expressed reservations in case someone wants to -1 this.
