Posted to dev@pig.apache.org by "Nandor Kollar (JIRA)" <ji...@apache.org> on 2016/11/03 13:21:58 UTC

[jira] [Commented] (PIG-5048) HiveUDTF fail if it is the first expression in projection

    [ https://issues.apache.org/jira/browse/PIG-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15632699#comment-15632699 ] 

Nandor Kollar commented on PIG-5048:
------------------------------------

I attached a new version of my patch. It includes these changes:
- test cases for Hive UDFs, UDTFs and UDAFs
- extracted the UnlimitedNullTuple to a constant in POForEach
- added the Hive contrib package to the dependencies, so that the tests can use GenericUDTFCount2
- UnlimitedNullTuple's size method no longer throws an exception; it returns Integer.MAX_VALUE instead
- In the HiveUDTF class, the collector is reused by both close and normal processing. If init only clears the current bag instead of creating a new one, close() erases the result of normal processing.
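
To illustrate the last point, here is a minimal, self-contained sketch of the aliasing problem. It is not the actual HiveUDTF code: the Collector class, its method names and the use of a plain List in place of a Pig bag are all assumptions made only for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the collector shared by process and close.
// None of these names come from the Pig or Hive code base.
class CollectorReuseSketch {

    static class Collector {
        private List<String> bag = new ArrayList<>();

        // Variant 1: init clears the existing bag in place.
        void initByClearing() { bag.clear(); }

        // Variant 2: init allocates a fresh bag and leaves the old one alone.
        void initWithNewBag() { bag = new ArrayList<>(); }

        void collect(String row) { bag.add(row); }

        List<String> getBag() { return bag; }
    }

    // Normal processing collects two rows, then close re-initializes the
    // same collector by clearing. The caller's reference to the process
    // result points at the very bag that has just been cleared.
    static List<String> runWithClearing() {
        Collector c = new Collector();
        c.initByClearing();
        c.collect("a");
        c.collect("b");
        List<String> processResult = c.getBag();
        c.initByClearing(); // close phase: wipes processResult as well
        return processResult;
    }

    // Same sequence, but init creates a new bag, so the result that was
    // handed out earlier survives the close phase.
    static List<String> runWithNewBag() {
        Collector c = new Collector();
        c.initWithNewBag();
        c.collect("a");
        c.collect("b");
        List<String> processResult = c.getBag();
        c.initWithNewBag(); // close phase: processResult is untouched
        return processResult;
    }

    public static void main(String[] args) {
        System.out.println(runWithClearing()); // prints []
        System.out.println(runWithNewBag());   // prints [a, b]
    }
}
```

In the clearing variant the rows produced by normal processing are lost as soon as close re-initializes the collector, because both phases share one bag object.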

One thing I still don't like: if close doesn't produce any new tuples because the UDF doesn't implement close at all, an empty tuple is still appended to the end of the output. I don't know how to handle this, since we can't tell two cases apart: either the Hive UDF actually did something in close but the result was empty (here I think we have to append the empty result to the output), or close was not implemented at all (here I think appending an empty tuple doesn't make sense). [~daijy], what do you think? Could you please help with the review?

> HiveUDTF fail if it is the first expression in projection
> ---------------------------------------------------------
>
>                 Key: PIG-5048
>                 URL: https://issues.apache.org/jira/browse/PIG-5048
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Nandor Kollar
>             Fix For: 0.17.0, 0.16.1
>
>         Attachments: PIG-5048-1.patch, PIG-5048-2.patch, PIG-5048.patch
>
>
> The following script fails:
> {code}
> define explode HiveUDTF('explode');
> A = load 'bag.txt' as (a0:{(b0:chararray)});
> B = foreach A generate explode(a0);
> dump B;
> {code}
> Message: Unimplemented at org.apache.pig.data.UnlimitedNullTuple.size(UnlimitedNullTuple.java:31)
> If it is not the first projection, the script passes:
> {code}
> define explode HiveUDTF('explode');
> A = load 'bag.txt' as (a0:{(b0:chararray)});
> B = foreach A generate a0, explode(a0);
> dump B;
> {code}
> Thanks [~nkollar] for reporting it!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)