Posted to dev@pig.apache.org by "liyunzhang_intel (JIRA)" <ji...@apache.org> on 2017/04/06 22:27:41 UTC

[jira] [Commented] (PIG-5197) Replace IndexedKey with PigNullableWritable in spark branch

    [ https://issues.apache.org/jira/browse/PIG-5197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959895#comment-15959895 ] 

liyunzhang_intel commented on PIG-5197:
---------------------------------------

[~rohini]: We cannot replace IndexedKey with PigNullableWritable. I replaced IndexedKey with PigNullableWritable in PIG-5197.patch; just run TestSecondarySortSpark to verify. TestSecondarySortSpark#testNestedSortMultiQueryEndToEnd3 fails with an exception like
{code}
had a not serializable result: org.apache.hadoop.io.Text$
{code}
This happens because of the call chain {code} HDataType.getWritableComparableTypes -> org.apache.pig.impl.io.NullableText#NullableText(java.lang.String) -> org.apache.hadoop.io.Text {code}, so the key ends up wrapping an org.apache.hadoop.io.Text, which does not implement java.io.Serializable.

The exception is thrown whenever the key type is chararray. Can you suggest a way to solve this, or should we keep IndexedKey in the spark package?
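
To illustrate the failure mode, here is a minimal sketch that does not depend on Pig, Hadoop, or Spark. It assumes the result is written with plain Java serialization (Spark's default for task results). FakeText, WritableBackedKey, and PlainKey are hypothetical stand-ins, not the real classes: FakeText plays the role of org.apache.hadoop.io.Text, WritableBackedKey the role of a key that wraps such an object, and PlainKey the role of a key that holds an ordinary serializable Java object, which is how IndexedKey is assumed to behave here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializableKeySketch {

    // Hypothetical stand-in for org.apache.hadoop.io.Text:
    // it holds a string but does NOT implement java.io.Serializable.
    static class FakeText {
        final String value;
        FakeText(String value) { this.value = value; }
    }

    // Hypothetical key that is Serializable itself but keeps a
    // non-serializable field, like a key wrapping a Hadoop Writable.
    static class WritableBackedKey implements Serializable {
        private static final long serialVersionUID = 1L;
        final byte index;
        final FakeText key;
        WritableBackedKey(byte index, FakeText key) {
            this.index = index;
            this.key = key;
        }
    }

    // Hypothetical key that stores the value as a plain Java object
    // (a String for chararray), so every field is serializable.
    static class PlainKey implements Serializable {
        private static final long serialVersionUID = 1L;
        final byte index;
        final Object key;
        PlainKey(byte index, Object key) {
            this.index = index;
            this.key = key;
        }
    }

    // Returns true if Java serialization can write the whole object graph.
    static boolean isJavaSerializable(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            // Some object in the graph does not implement Serializable.
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Fails: the FakeText field cannot be serialized.
        System.out.println(isJavaSerializable(
            new WritableBackedKey((byte) 0, new FakeText("a"))));
        // Succeeds: the key is a plain String.
        System.out.println(isJavaSerializable(new PlainKey((byte) 0, "a")));
    }
}
```

The sketch only shows why a chararray key backed by a Text-like object fails while a key backed by a String does not. It says nothing about how the comparators of the two real classes differ.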
 

> Replace IndexedKey with PigNullableWritable in spark branch
> -----------------------------------------------------------
>
>                 Key: PIG-5197
>                 URL: https://issues.apache.org/jira/browse/PIG-5197
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>             Fix For: spark-branch
>
>
> IndexedKey and PigNullableWritable serve a similar purpose.
> The difference between the two is that IndexedKey contains index and key, while PigNullableWritable contains index, key, and value.
> Besides, the comparators for PigNullableWritable handle many conditions for the different data types, and IndexedKey can miss some of them. We can try to replace IndexedKey with PigNullableWritable.


