Posted to dev@pig.apache.org by "Thejas M Nair (JIRA)" <ji...@apache.org> on 2009/10/07 21:39:31 UTC

[jira] Created: (PIG-999) sorting on map-value fails if map-value is not of bytearray type

sorting on map-value fails if map-value is not of bytearray type
----------------------------------------------------------------

                 Key: PIG-999
                 URL: https://issues.apache.org/jira/browse/PIG-999
             Project: Pig
          Issue Type: Bug
            Reporter: Thejas M Nair


When Pig creates the query execution plan, it assumes that map values are of type bytearray, because no schema information is associated with map fields.
At run time, however, the loader might return the actual type, which results in a ClassCastException.
This points to the larger issue of how Pig handles types for map values.

This issue should be fixed in the context of revisiting the frontend logic and Pig Latin semantics.

This is related to PIG-880. The patch in PIG-880 changed PigStorage to always return bytearray for map values to work around this, but other loaders such as BinStorage can return the actual type, which causes this issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (PIG-999) sorting on map-value fails if map-value is not of bytearray type

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-999:
-------------------------------

    Fix Version/s: 0.9.0




[jira] Updated: (PIG-999) sorting on map-value fails if map-value is not of bytearray type

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-999:
------------------------------

    Issue Type: Sub-task  (was: Bug)
        Parent: PIG-998




[jira] Updated: (PIG-999) sorting on map-value fails if map-value is not of bytearray type

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-999:
------------------------------

    Fix Version/s: 0.7.0




[jira] Commented: (PIG-999) sorting on map-value fails if map-value is not of bytearray type

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763234#action_12763234 ] 

Thejas M Nair commented on PIG-999:
-----------------------------------

{code}
l = load 'st_attr2.bin' using BinStorage();
f = foreach l generate $1, $4#'origin';  -- $4#'origin' is stored as chararray
o = order f by $2;
dump o; 
{code}

This results in a map-reduce failure with the following error:

java.lang.ClassCastException: org.apache.pig.impl.io.NullableText cannot be cast to org.apache.pig.impl.io.NullableBytesWritable
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigBytesRawComparator.compare(PigBytesRawComparator.java:94)
        at java.util.Arrays.binarySearch0(Arrays.java:2105)
        at java.util.Arrays.binarySearch(Arrays.java:2043)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:64)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:53)
        at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:108)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:251)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:240)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:93)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
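
The trace suggests that the range partitioner binary-searches its quantile keys with a comparator chosen for bytearray keys, while the keys actually produced at run time are text. Below is a minimal Java sketch of that failure mode. It does not use Pig's classes: BytesKey, TextKey, BYTES_COMPARATOR and searchFails are hypothetical stand-ins introduced only for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;

class MapValueSortSketch {
    // Hypothetical stand-in for a bytearray sort key (not a Pig class).
    static final class BytesKey {
        final byte[] bytes;
        BytesKey(byte[] bytes) { this.bytes = bytes; }
    }

    // Hypothetical stand-in for a chararray sort key (not a Pig class).
    static final class TextKey {
        final String text;
        TextKey(String text) { this.text = text; }
    }

    // Comparator fixed at plan time, assuming every key is a bytearray.
    static final Comparator<Object> BYTES_COMPARATOR = new Comparator<Object>() {
        @Override
        public int compare(Object a, Object b) {
            byte[] x = ((BytesKey) a).bytes; // throws ClassCastException for a TextKey
            byte[] y = ((BytesKey) b).bytes;
            int n = Math.min(x.length, y.length);
            for (int i = 0; i < n; i++) {
                int cmp = (x[i] & 0xff) - (y[i] & 0xff); // unsigned byte comparison
                if (cmp != 0) {
                    return cmp;
                }
            }
            return x.length - y.length;
        }
    };

    // Returns true if locating the key among the quantiles fails with a
    // ClassCastException, as in the partitioner frames of the trace.
    static boolean searchFails(Object[] quantiles, Object key) {
        try {
            Arrays.binarySearch(quantiles, key, BYTES_COMPARATOR);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object[] textQuantiles = { new TextKey("a"), new TextKey("m") };
        Object[] bytesQuantiles = {
            new BytesKey(new byte[] {1}), new BytesKey(new byte[] {9})
        };
        // Keys of the type the loader actually returned: the search fails.
        System.out.println(searchFails(textQuantiles, new TextKey("k")));
        // Keys of the type the plan assumed: the search succeeds.
        System.out.println(searchFails(bytesQuantiles, new BytesKey(new byte[] {5})));
    }
}
```

In this sketch the comparator is correct for the type the plan assumed; the failure comes only from the run-time key type differing from that assumption.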





[jira] Updated: (PIG-999) sorting on map-value fails if map-value is not of bytearray type

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-999:
-------------------------------

    Fix Version/s:     (was: 0.7.0)

Delaying; will address as part of semantic cleanup.




[jira] Assigned: (PIG-999) sorting on map-value fails if map-value is not of bytearray type

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates reassigned PIG-999:
------------------------------

    Assignee: Alan Gates




[jira] Commented: (PIG-999) sorting on map-value fails if map-value is not of bytearray type

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763236#action_12763236 ] 

Thejas M Nair commented on PIG-999:
-----------------------------------


In the previous comment, the statement
{code}
o = order f by $2;
{code}
should have been:
{code}
o = order f by $1;
{code}


