Posted to dev@pig.apache.org by "Mathias Herberts (JIRA)" <ji...@apache.org> on 2011/02/23 21:34:38 UTC
[jira] Created: (PIG-1867) Allow UDFs that can generate multiple
output tuples from a single input tuple
Allow UDFs that can generate multiple output tuples from a single input tuple
-----------------------------------------------------------------------------
Key: PIG-1867
URL: https://issues.apache.org/jira/browse/PIG-1867
Project: Pig
Issue Type: New Feature
Reporter: Mathias Herberts
Hive offers this through UDTFs (User-Defined Table-Generating Functions). It would be very useful for Pig to offer something similar, since it would allow more complex processing.
One example of such a use would be an n-gram generating function.
I guess EvalFunc could be adapted so that exec returns an Iterator<T> instead of T.
As a first step, iterating over the results could be restricted to cases where the UDF is used alone in a GENERATE clause.
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Resolved: (PIG-1867) Allow UDFs that can generate multiple
output tuples from a single input tuple
Posted by "Mathias Herberts (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mathias Herberts resolved PIG-1867.
-----------------------------------
Resolution: Not A Problem
[jira] Commented: (PIG-1867) Allow UDFs that can generate multiple
output tuples from a single input tuple
Posted by "Mathias Herberts (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12998580#comment-12998580 ]
Mathias Herberts commented on PIG-1867:
---------------------------------------
You're right, I had never thought of returning a bag!
I guess that makes this issue moot.
Thanks for the heads-up.
[jira] Commented: (PIG-1867) Allow UDFs that can generate multiple
output tuples from a single input tuple
Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12998570#comment-12998570 ]
Alan Gates commented on PIG-1867:
---------------------------------
Pig already offers this. Have your UDF return a bag. You can then use flatten to iterate through that bag.
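To make the suggestion concrete, here is a rough, Pig-free sketch of the n-gram case from the issue description. It is only an illustration: the class name NGramSketch and the method ngrams are made up for this example, and a plain List<String> stands in for the bag. In an actual UDF the same loop would presumably live in the exec method of an EvalFunc<DataBag> and add one tuple per n-gram to the returned bag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Pig code: the returned List<String> plays the
// role of the bag, with one entry per n-gram.
public class NGramSketch {

    // Returns every run of n consecutive whitespace-separated tokens.
    // A single input string yields zero or more results, which is the
    // "multiple outputs from a single input" case raised in the issue.
    public static List<String> ngrams(String text, int n) {
        List<String> bag = new ArrayList<>();
        if (text == null || n <= 0) {
            return bag;
        }
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return bag;
        }
        String[] tokens = trimmed.split("\\s+");
        // Slide a window of n tokens across the input.
        for (int i = 0; i + n <= tokens.length; i++) {
            bag.add(String.join(" ", Arrays.copyOfRange(tokens, i, i + n)));
        }
        return bag;
    }

    public static void main(String[] args) {
        // Prints one bigram per line, much as FLATTEN would emit one
        // record per tuple in the bag.
        for (String gram : ngrams("allow udfs that emit many tuples", 2)) {
            System.out.println(gram);
        }
    }
}
```

On the Pig Latin side, the bag returned by such a UDF would then be un-nested with something like FOREACH lines GENERATE FLATTEN(MyNGrams(text)); where lines, text and MyNGrams are placeholder names.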