Posted to dev@pig.apache.org by "Alan Gates (JIRA)" <ji...@apache.org> on 2009/10/15 03:59:31 UTC

[jira] Commented: (PIG-928) UDFs in scripting languages

    [ https://issues.apache.org/jira/browse/PIG-928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765860#action_12765860 ] 

Alan Gates commented on PIG-928:
--------------------------------

Questions that we need to answer to get this patch ready for commit:

1) How do we do type conversion?  The current patch assumes a single string input and output.  We'll want to be able to do conversions between scripting-language types and Pig types that make sense.  How this can be done is tied up with #2 below.

2) Do we do this using the Bean Scripting Framework (BSF) or with specific bindings for each language?  This patch shows how to do the specific bindings for Groovy.  It can be done for Jython, and I'm reasonably sure it can be done for JRuby.  The obvious advantage of using the BSF is that we get all the languages it supports for free.  We need to understand the performance costs of each choice.  We should be able to use the existing patch to test the difference between using the BSF and direct Groovy bindings.  Also, it seems like type conversions will be much easier to do if we use specific bindings, as we can do explicit type mappings for each language.  Perhaps this is possible with BSF, but I'm not sure how.

3) Grammar for how to declare these.  I propose that we allow two options:  inlined in define and file referenced in define.  So these would roughly look like:

define myudf ScriptUDF('groovy', 'return input.get(0).split();');
define myudf ScriptUDF('python', 'myudf.py');

We could also support inlining in the Pig Latin itself, something like:

B = foreach A generate {'groovy', 'return input.get(0).split();'};

I'm not a fan of this type of inlining, as I think it makes the code hard to read.
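
To make question 1 a bit more concrete, here is a minimal sketch of what an explicit per-language type mapping could look like, written in Python purely for illustration.  The function name pig_type_of and the mapping itself (for example list to bag, int to long) are assumptions for discussion, not something the current patch implements.

```python
def pig_type_of(value):
    """Return the name of the Pig type a script value might map to.

    The mapping below is an assumed one for discussion only.
    """
    # bool is a subclass of int in Python, so handle it before int.
    if isinstance(value, bool):
        raise TypeError("no mapping chosen for bool in this sketch")
    if isinstance(value, str):
        return "chararray"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, (bytes, bytearray)):
        return "bytearray"
    if isinstance(value, tuple):
        return "tuple"
    if isinstance(value, list):
        return "bag"
    if isinstance(value, dict):
        return "map"
    raise TypeError("no Pig type chosen for %s" % type(value).__name__)


if __name__ == "__main__":
    # A script that splits a string returns a list, which this sketch
    # would hand back to Pig as a bag.
    print(pig_type_of("a b".split()))
```

With specific bindings, each language would get its own table like this one.  Whether the same thing can be expressed through BSF is the open question from #2.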


> UDFs in scripting languages
> ---------------------------
>
>                 Key: PIG-928
>                 URL: https://issues.apache.org/jira/browse/PIG-928
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Alan Gates
>         Attachments: package.zip
>
>
> It should be possible to write UDFs in scripting languages such as python, ruby, etc.  This frees users from needing to compile Java, generate a jar, etc.  It also opens Pig to programmers who prefer scripting languages over Java.
