Posted to dev@pig.apache.org by "Thejas M Nair (JIRA)" <ji...@apache.org> on 2010/08/26 19:23:53 UTC

[jira] Created: (PIG-1572) change default datatype when relations are used as scalar to bytearray

change default datatype when relations are used as scalar to bytearray
----------------------------------------------------------------------

                 Key: PIG-1572
                 URL: https://issues.apache.org/jira/browse/PIG-1572
             Project: Pig
          Issue Type: Bug
            Reporter: Thejas M Nair
             Fix For: 0.8.0


When relations are cast to scalar, the current default type is chararray. This is inconsistent with the behavior in the rest of Pig Latin.
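
To illustrate the inconsistency: elsewhere in Pig Latin a field with no declared type defaults to bytearray, and a bytearray operand in an arithmetic expression is implicitly cast to the type of the other operand, whereas a chararray operand is rejected. The following is a minimal Python sketch of that rule under simplifying assumptions. It is not Pig's implementation, and the names `scalar_field_type` and `arithmetic_result_type` are hypothetical helpers that only model type resolution.

```python
# Toy model of Pig's type resolution; all names here are hypothetical.
NUMERIC = ("int", "long", "float", "double")  # ordered from narrowest to widest


def scalar_field_type(declared_type, default="bytearray"):
    """Return the type of a relation field used as a scalar.

    `declared_type` is None when the schema gives the field no type.
    `default` models the fallback: bytearray after PIG-1572, chararray before.
    """
    return declared_type if declared_type is not None else default


def arithmetic_result_type(left, right):
    """Model the result type of `left + right` under implicit casting."""
    if left == "bytearray" and right == "bytearray":
        return "double"  # both operands untyped: treated as double
    if left == "bytearray" and right in NUMERIC:
        return right  # bytearray is cast to the other operand's type
    if right == "bytearray" and left in NUMERIC:
        return left
    if left in NUMERIC and right in NUMERIC:
        # Numeric operands are promoted to the wider of the two types.
        return NUMERIC[max(NUMERIC.index(left), NUMERIC.index(right))]
    raise TypeError(f"cannot apply arithmetic to {left} and {right}")


# New default: the untyped scalar is bytearray and needs no explicit cast.
print(arithmetic_result_type("int", scalar_field_type(None)))

# Old default: chararray is rejected unless the script casts it explicitly.
try:
    arithmetic_result_type("int", scalar_field_type(None, default="chararray"))
except TypeError as err:
    print("error:", err)
```

With the bytearray default, an untyped scalar behaves like any other untyped field, so scripts no longer need an explicit cast before using it in arithmetic.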


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-1572) change default datatype when relations are used as scalar to bytearray

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905346#action_12905346 ] 

Thejas M Nair commented on PIG-1572:
------------------------------------

> Yes, the changes to UserFuncExpression.getFieldSchema() are no longer required because a cast to the appropriate type is now inserted. But while thinking about that, I believe I have found an issue with the handling of non-PigStorage load functions.
> Since this patch addresses a bunch of issues, I will commit it and create a new JIRA to address that, and also look at the utility of this change to UserFuncExpression.getFieldSchema().

Created PIG-1595 to address the issue.

> change default datatype when relations are used as scalar to bytearray
> ----------------------------------------------------------------------
>
>                 Key: PIG-1572
>                 URL: https://issues.apache.org/jira/browse/PIG-1572
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Thejas M Nair
>            Assignee: Thejas M Nair
>             Fix For: 0.8.0
>
>         Attachments: PIG-1572.1.patch, PIG-1572.2.patch
>
>
> When relations are cast to scalar, the current default type is chararray. This is inconsistent with the behavior in the rest of Pig Latin.


[jira] Commented: (PIG-1572) change default datatype when relations are used as scalar to bytearray

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905332#action_12905332 ] 

Thejas M Nair commented on PIG-1572:
------------------------------------

Patch committed to the 0.8 branch as well.


[jira] Updated: (PIG-1572) change default datatype when relations are used as scalar to bytearray

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1572:
-------------------------------

          Status: Resolved  (was: Patch Available)
    Hadoop Flags: [Reviewed]
      Resolution: Fixed

Patch committed to trunk.


[jira] Commented: (PIG-1572) change default datatype when relations are used as scalar to bytearray

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905293#action_12905293 ] 

Daniel Dai commented on PIG-1572:
---------------------------------

Patch looks good. One minor doubt: once we migrate to the new logical plan, UserFuncExpression already has the necessary cast inserted, so it seems we do not need to change the new logical plan's UserFuncExpression.getFieldSchema(). Am I right?


[jira] Updated: (PIG-1572) change default datatype when relations are used as scalar to bytearray

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1572:
-------------------------------

    Attachment: PIG-1572.2.patch

PIG-1572.2.patch:
- Fixed loss of lineage information in translation during the explain call.
- Added a cast on the output of ReadScalars so that type information is not lost during the schema reset from the optimizer.

Unit tests and test-patch have passed. The patch is ready for review.

     [exec] +1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.


[jira] Assigned: (PIG-1572) change default datatype when relations are used as scalar to bytearray

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich reassigned PIG-1572:
-----------------------------------

    Assignee: Thejas M Nair


[jira] Updated: (PIG-1572) change default datatype when relations are used as scalar to bytearray

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1572:
-------------------------------

    Status: Patch Available  (was: Open)


[jira] Commented: (PIG-1572) change default datatype when relations are used as scalar to bytearray

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905324#action_12905324 ] 

Thejas M Nair commented on PIG-1572:
------------------------------------

Yes, the changes to UserFuncExpression.getFieldSchema() are no longer required because a cast to the appropriate type is now inserted. But while thinking about that, I believe I have found an issue with the handling of non-PigStorage load functions.
Since this patch addresses a bunch of issues, I will commit it and create a new JIRA to address that, and also look at the utility of this change to UserFuncExpression.getFieldSchema().



[jira] Updated: (PIG-1572) change default datatype when relations are used as scalar to bytearray

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1572:
-------------------------------

    Attachment: PIG-1572.1.patch

Summary of changes:
- Changed the default type (i.e. the type used when the input relation to the scalar has no type) to bytearray.
- Replaced PigStorage with InterStorage for load/store of scalar data, so that typed data is stored.
- Changes to track the lineage of the ReadScalars UDF to the load function(s).
- Removed unnecessary casts on the output of ReadScalars.
- The PigServer code for "describe alias;" now checks the alias of the leaf logical operators.
- Changed test cases: an explicit cast is no longer required when bytearray is used in arithmetic operations. Moved some of the tests to local mode to reduce test run time.
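
The storage change in the list above is easier to see with a small round trip. The following Python sketch is only an analogy: tab-delimited text stands in for a text store that is read back without a schema, and `pickle` stands in for a type-preserving binary format. Neither function reflects the actual PigStorage or InterStorage code, and both function names are hypothetical.

```python
import pickle


def text_round_trip(row):
    """Store a row as tab-delimited text and read it back without a schema."""
    line = "\t".join(str(value) for value in row).encode("utf-8")
    # Without a schema, every field comes back as raw bytes.
    return tuple(line.split(b"\t"))


def typed_round_trip(row):
    """Store a row in a type-preserving binary form (pickle as a stand-in)."""
    return pickle.loads(pickle.dumps(row))


row = (42, 3.5, "pig")
print(text_round_trip(row))   # fields are untyped bytes
print(typed_round_trip(row))  # fields keep their original types
```

In this model, the text round trip discards the types that the scalar's schema had already established, while the typed round trip hands them back unchanged, which is the property the patch relies on.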


[jira] Updated: (PIG-1572) change default datatype when relations are used as scalar to bytearray

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1572:
-------------------------------

    Release Note: 
This changes the following part of the release note in PIG-1434: "Also, please, note that when the schema can't be inferred chararray rather than bytearray is used."

The bytearray datatype is now used when the schema can't be inferred.


